#### Introduction

In scientific research, the events under study are described by mathematical models. These models make it possible to interpret the state the event will be in over time. Because statistical events cannot be interpreted with absolute certainty, transitions of events from one state to another occur. A fuzzy logic approach may be used to study problems of this kind (1).

The fuzzy logic framework was introduced in the article “Fuzzy Sets” by Zadeh (2). Whereas classical logic is dichotomous, taking values in {0,1} and admitting no uncertainty, fuzzy logic allows the membership of an element in a fuzzy set to take any value in the interval [0,1]. Human thinking describes events with approximate terms such as “a few”, “many” and “more” rather than crisp terms such as “present” and “absent” (3,4). From this point of view, fuzzy logic represents the real world and human thinking well.

Non-hierarchical fuzzy models (NHFMs) are built by adding all independent variables to the model at the same time, whereas hierarchical fuzzy models (HFMs) are created by combining lower-dimensional fuzzy sub-models. In the NHFM approach, the number of rules in the knowledge base that are used to decide on the dependent variable increases exponentially with the number of independent variables. This leads to the “curse of dimensionality”, because the number of adaptive parameters becomes very large when there are many independent variables (5). HFMs have been suggested to overcome this problem, since their number of rules increases only linearly (5-8). The aim of this study is to compare the classification performances of HFMs and NHFMs using different membership functions.

#### Materials and Methods

#### Adaptive Neuro-fuzzy Inference System (ANFIS)

ANFIS is a non-hierarchical hybrid network structure that represents a Sugeno fuzzy inference system (9-16). The rules of the ANFIS structure are as follows (8,11,17-21):

Rule 1: If X1=A1 and X2=B1 then Ŷ1 = f1(X1,X2) = p1 X1 + q1 X2 + r1

Rule 2: If X1=A1 and X2=B2 then Ŷ2 = f2(X1,X2) = p2 X1 + q2 X2 + r2

Rule 3: If X1=A2 and X2=B1 then Ŷ3 = f3(X1,X2) = p3 X1 + q3 X2 + r3

Rule 4: If X1=A2 and X2=B2 then Ŷ4 = f4(X1,X2) = p4 X1 + q4 X2 + r4

The ANFIS structure consists of five layers (Figure 1) (22-24):

**1**^{st} layer, fuzzification layer: Each node in this layer is adaptive, and its output is a membership degree that depends on the membership function used and on the value of the independent variable. The output O1,i of node i is calculated as follows:

O1,i = μAi (X1), i = 1,2

O1,i = μBi-2 (X2), i = 3,4

The backpropagation algorithm is used to estimate the parameters of this layer with minimum error (9,25,26).

**2**^{nd} layer, rule layer: The nodes in this layer are fixed (not adaptive) and are denoted by Π. Each node corresponds to one rule of the Sugeno fuzzy inference system, so the number of nodes equals the number of rules. The output O2,k of each rule node is the rule weight, calculated as the product of the incoming membership degrees (27,28):

O2,k = μAi (X1) * μBj (X2), k = 2(i – 1) + j, i = 1,2, j = 1,2

**3**^{rd} layer, normalization layer: All of the nodes in this layer are fixed. Each node gives the normalized weight of its rule, that is, the rule weight divided by the sum of all rule weights (29,30):

O3,i = O2,i / (O2,1 + O2,2 + O2,3 + O2,4), i = 1,2,3,4

**4**^{th} layer, weighting layer: Each of the nodes in this layer is adaptive, and the weighted output value O4,i of each rule is calculated. The least squares estimation method is used to estimate the output parameter set [pi,qi,ri] of the i^{th} rule with minimum error (16,31):

O4,i = O3,i * fi(X1,X2) = O3,i * (pi X1 + qi X2 + ri), i = 1,2,3,4

**5**^{th} layer, aggregation layer: There is only one node in this layer and it is fixed. The outputs of the weighting layer are summed in this layer and the overall output of the ANFIS system is obtained (11,24):

O5 = Ŷ = O4,1 + O4,2 + O4,3 + O4,4
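The five layers described above can be traced numerically. The following Python sketch shows only the forward pass of a two-input, four-rule first-order Sugeno structure; it assumes Gaussian membership functions, and the function names `gauss` and `anfis_forward` as well as every numeric parameter are illustrative assumptions rather than values from this study. Training by backpropagation and least squares estimation is not shown.

```python
import math


def gauss(x, center, sigma):
    """Gaussian membership degree of x (an assumed membership function)."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)


def anfis_forward(x1, x2, mf_a, mf_b, consequents):
    """Forward pass of a two-input, four-rule first-order Sugeno ANFIS.

    mf_a, mf_b  : two (center, sigma) pairs per input (sets A1, A2 and B1, B2)
    consequents : four (p, q, r) triples, ordered as Rules 1-4
    """
    # Layer 1: membership degrees of each input in each of its fuzzy sets
    mu_a = [gauss(x1, c, s) for c, s in mf_a]
    mu_b = [gauss(x2, c, s) for c, s in mf_b]
    # Layer 2: rule weights as products, ordered A1B1, A1B2, A2B1, A2B2
    w = [a * b for a in mu_a for b in mu_b]
    # Layer 3: normalized rule weights
    total = sum(w)
    w_norm = [wi / total for wi in w]
    # Layer 4: weighted first-order rule outputs
    weighted = [wn * (p * x1 + q * x2 + r)
                for wn, (p, q, r) in zip(w_norm, consequents)]
    # Layer 5: the overall output is the sum of the weighted rule outputs
    return sum(weighted), w_norm


# Hypothetical parameters, chosen only for illustration
mf_a = [(150.0, 30.0), (250.0, 30.0)]
mf_b = [(100.0, 20.0), (160.0, 20.0)]
consequents = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0),
               (0.0, 0.0, 1.0), (0.0, 0.0, 1.0)]
y_hat, w_norm = anfis_forward(200.0, 130.0, mf_a, mf_b, consequents)
```

In this example both inputs lie midway between the centers of their two fuzzy sets, so the four rules receive equal normalized weights and the output is the plain average of the four rule outputs.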

#### Hierarchical Fuzzy Model Structure

The use of NHFMs in complex and high-dimensional systems causes the curse of dimensionality problem. HFMs have been suggested to overcome this (5,7).

The number of rules increases exponentially with the number of independent variables in NHFMs, whereas it increases linearly in HFMs. Supposing that there are m independent variables and each of these variables has v membership functions, the number of rules equals v^{m} in NHFMs, whereas there are (m – 1) * v^{2} rules in HFMs (6,7,32,33). In the HFM with v fuzzy sets and m independent variables (Figure 2), the intermediate outputs (U1,U2,...,Um-2) and the dependent variable Ŷ = Um–1 are calculated by adding the independent variables (X1,X2,...,Xm) to the model hierarchically.
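The difference between the two growth rates is easy to check by direct arithmetic. The short Python sketch below evaluates the two rule counts stated above, v to the power m for an NHFM and (m – 1) times v squared for an HFM; the function names are assumptions introduced for illustration.

```python
def nhfm_rule_count(m, v):
    """Rules in a non-hierarchical model: one per combination of fuzzy sets."""
    return v ** m


def hfm_rule_count(m, v):
    """Rules in a hierarchical model built from (m - 1) two-input sub-models."""
    return (m - 1) * v ** 2


# Compare the two counts for a growing number of independent variables
for m in (3, 6, 10):
    print(m, nhfm_rule_count(m, 3), hfm_rule_count(m, 3))
```

For m = 3 variables with v = 3 fuzzy sets each, the functions return 27 and 18 rules, respectively.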

#### Simulation

In simulation, normally distributed data sets were generated and the number of units was set to n=1000. The data sets were randomly divided into 70% (700 units) training and 30% (300 units) test sets.

#### Simulation with Three Independent Variables

The independent variables were generated from normal distributions, X1~N(200,45), X2~N(130,30) and X3~N(60,14), and were correlated with one another (r12=0.704, r13=0.553, r23=0.372).
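One way to generate data with this structure is to correlate independent standard normal draws through a Cholesky factor of the correlation matrix. The Python sketch below does this with the standard library only and then applies the 70%/30% split. It is a sketch under stated assumptions: the second argument of N(·,·) is taken to be the standard deviation, which the text does not specify, and the seed, the helper names and the Cholesky approach are illustrative choices, not the procedure reported by the authors.

```python
import math
import random


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def simulate(n, means, sds, corr, rng):
    """Draw n rows of normal variables with the given target correlations."""
    low = cholesky(corr)
    rows = []
    for _ in range(n):
        z = [rng.gauss(0.0, 1.0) for _ in means]
        # Correlate the independent standard normals, then shift and rescale
        c = [sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(len(z))]
        rows.append([m + s * ci for m, s, ci in zip(means, sds, c)])
    return rows


def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(0)  # fixed seed, an arbitrary choice for reproducibility
corr = [[1.0, 0.704, 0.553],
        [0.704, 1.0, 0.372],
        [0.553, 0.372, 1.0]]
data = simulate(1000, [200.0, 130.0, 60.0], [45.0, 30.0, 14.0], corr, rng)
r12 = pearson([row[0] for row in data], [row[1] for row in data])

# Random 70% / 30% split into training and test sets
rng.shuffle(data)
train, test = data[:700], data[700:]
```

The sample correlation `r12` will be close to, but not exactly, the target value of 0.704, because it is estimated from a finite sample.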

The most correlated independent variables, X1 and X2, were placed in the first layer by creating a Sugeno fuzzy model (SFM), and the intermediate output U1 was obtained in both the training and test sets. The HFM was then built using U1 and X3, and the class of the dependent variable was predicted in the test set.
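The layered construction described here, in which the output of one two-input sub-model becomes an input of the next, can be sketched as a simple chain. In the Python below, `hierarchical_predict` is a hypothetical helper, and `layer1` and `layer2` are placeholder functions standing in for trained Sugeno sub-models; they are not the fitted models of this study.

```python
def hierarchical_predict(x, submodels):
    """Chain two-input sub-models: U1 = f1(X1, X2), U2 = f2(U1, X3), ...

    x         : independent variable values [X1, X2, ..., Xm]
    submodels : m - 1 callables, each mapping two inputs to one output
    """
    if len(submodels) != len(x) - 1:
        raise ValueError("need exactly m - 1 sub-models for m inputs")
    u = submodels[0](x[0], x[1])      # the first layer uses X1 and X2
    for f, xi in zip(submodels[1:], x[2:]):
        u = f(u, xi)                  # each later layer adds one variable
    return u                          # the last output is the prediction


def layer1(x1, x2):
    """Placeholder for the first Sugeno sub-model (returns U1)."""
    return 0.5 * x1 + 0.5 * x2


def layer2(u1, x3):
    """Placeholder for the second sub-model (returns a 0/1 class)."""
    return 1.0 if u1 + x3 > 200.0 else 0.0


y_hat = hierarchical_predict([200.0, 130.0, 60.0], [layer1, layer2])
```

Because each sub-model sees only two inputs, its rule base stays small regardless of how many variables the full chain contains.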

To build the NHFM, all the independent variables were used in the ANFIS structure. The class of the dependent variable in the test set was then predicted by running the created model.

#### Simulation with Six Independent Variables

The independent variables were generated as X1~N(150,35), X2~N(110,25), X3~N(130,30), X4~N(100,15), X5~N(85,20) and X6~N(50,10), and were correlated with one another (rmin = 0.400, rmax = 0.900).

First, the most correlated pairs of independent variables, X1 – X2, X3 – X4 and X5 – X6, were each placed in a layer, and the intermediate outputs U1, U2 and U3 of these layers were obtained in both the training and test sets. The HFM was constructed by combining the intermediate outputs, and the class of the dependent variable was predicted in the test set.
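The text does not state how the most correlated pairs were identified beyond their correlation. One plausible reading, sketched below in Python, is a greedy procedure that repeatedly takes the most correlated pair among the variables not yet assigned. The function `pair_by_correlation` and the correlation matrix are hypothetical; the matrix is invented for illustration and only respects the reported range of 0.400 to 0.900.

```python
def pair_by_correlation(names, corr):
    """Greedily pair variables, taking the most correlated free pair first."""
    candidates = sorted(
        ((abs(corr[i][j]), i, j)
         for i in range(len(names))
         for j in range(i + 1, len(names))),
        reverse=True,
    )
    used, pairs = set(), []
    for _, i, j in candidates:
        if i not in used and j not in used:
            pairs.append((names[i], names[j]))
            used.update((i, j))
    return pairs


names = ["X1", "X2", "X3", "X4", "X5", "X6"]
# Hypothetical correlation matrix, not the one used in the simulation
corr = [
    [1.00, 0.90, 0.50, 0.45, 0.40, 0.42],
    [0.90, 1.00, 0.55, 0.48, 0.41, 0.43],
    [0.50, 0.55, 1.00, 0.85, 0.60, 0.52],
    [0.45, 0.48, 0.85, 1.00, 0.58, 0.47],
    [0.40, 0.41, 0.60, 0.58, 1.00, 0.80],
    [0.42, 0.43, 0.52, 0.47, 0.80, 1.00],
]
pairs = pair_by_correlation(names, corr)
```

A greedy choice is simple but not guaranteed to maximize the total correlation over all pairs; it is used here only to make the idea concrete.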

To construct the NHFM, all independent variables were used in the ANFIS structure, and the class of each unit was predicted.

#### Hypertension Data Set

To construct the fuzzy models, fasting blood glucose (FBG) (mg/dL), body mass index (BMI) (kg/m^{2}) and triglyceride (TG) (g/dL) were chosen as independent variables, because they differed significantly between the hypertension and control groups, were correlated with each other and had fuzziness in their distributions (34).

Each independent variable of the hypertension data set was fuzzified by dividing it into three sub-groups. The mean ± standard deviation and the minimum-maximum values of each sub-group were calculated to estimate the fuzzy models (Table 1).
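As an illustration of how sub-group statistics can seed a fuzzification step, the Python sketch below builds one Gaussian membership function per sub-group, centred on the sub-group mean with the sub-group standard deviation as its width. This initialization is an assumption made for illustration, and the BMI values are hypothetical; they are not the values of Table 1.

```python
import math


def gaussian_mf(mean, sd):
    """Return a Gaussian membership function centred on a sub-group mean."""
    def mu(x):
        return math.exp(-0.5 * ((x - mean) / sd) ** 2)
    return mu


# Hypothetical BMI sub-group statistics (mean, SD) in kg/m^2; not Table 1
bmi_groups = {
    "normal": (22.0, 2.0),
    "overweight": (27.5, 1.5),
    "obese": (33.0, 3.0),
}
bmi_mfs = {name: gaussian_mf(m, s) for name, (m, s) in bmi_groups.items()}

# Membership degrees of one BMI value in all three fuzzy sets
degrees = {name: mu(26.0) for name, mu in bmi_mfs.items()}
```

A single BMI value thus belongs to all three fuzzy sets at once, to different degrees, which is what distinguishes this step from assigning the value to one crisp category.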

In the first step of the hypertension data set application, the data set was randomly separated into 70% (223 units) training and 30% (96 units) test sets. The correlation coefficients between the independent variables were rBMI–TG = 0.842, rTG–FBG = 0.210 and rBMI–FBG = 0.113. In the training set, an SFM was constructed in the first layer of the HFM using the most correlated variable pair, BMI and TG. The HFM was then constructed by using the intermediate output U1 of the SFM and FBG as the independent variables of the ANFIS structure.

To build the NHFM structure, all the independent variables of the training set and the descriptive statistics of these variables (Table 1) were used in the ANFIS structure.

The classes of each unit in both the training and test sets were then predicted by adapting the initial membership values of the fuzzified variables to obtain the minimum classification error.

#### Results

#### Simulation

The results showed that there was a significant difference (p<0.001) between classification performances of NHFMs and HFMs based on sensitivity (%), specificity (%), accuracy (%) and root mean square error (RMSE) (%).

The comparison results of the simulations with three (Table 2) and six (Table 3) independent variables showed that, in the test set, the sensitivity, specificity and accuracy rates of NHFMs were higher and their RMSE was lower than those of HFMs.
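The four comparison criteria can be computed from the observed classes and the model outputs as in the following Python sketch. It assumes a binary outcome coded 0/1, a classification threshold of 0.5, and an RMSE taken between the observed class and the model output and expressed in percent; the text does not specify these details, and the example values are invented.

```python
import math


def classification_metrics(y_true, y_score, threshold=0.5):
    """Return sensitivity, specificity, accuracy and RMSE, all in percent.

    y_true  : observed classes coded 1 (case) or 0 (control)
    y_score : model outputs on the same 0-1 scale
    """
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    mse = sum((t - s) ** 2 for t, s in zip(y_true, y_score)) / len(y_true)
    return {
        "sensitivity": 100.0 * tp / (tp + fn),
        "specificity": 100.0 * tn / (tn + fp),
        "accuracy": 100.0 * (tp + tn) / len(y_true),
        "rmse": 100.0 * math.sqrt(mse),
    }


# Invented example: eight units with observed classes and model outputs
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_score = [0.9, 0.8, 0.4, 0.1, 0.6, 0.2, 0.3, 0.7]
metrics = classification_metrics(y_true, y_score)
```

In this invented example one case and one control are misclassified, so sensitivity, specificity and accuracy are all 75%.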

#### Hypertension Data Set Application

In the hypertension data set application, different membership functions produced different classification results. For the models constructed with the Gaussian membership function, the sensitivity (%), specificity (%) and accuracy (%) rates in the test set were higher and the RMSE (%) was lower in NHFMs than in HFMs (Table 4).

The rule bases of the HFMs and NHFMs are as follows:

#### Rule Base in Non-hierarchical Fuzzy Models

**Rule 1:** If BMI_{normal} and TG_{normal} and FBG_{hypoglycaemia} then GROUP_{control}

**Rule 2:** If BMI_{normal} and TG_{normal} and FBG_{normal} then GROUP_{control}

**Rule 3:** If BMI_{normal} and TG_{normal} and FBG_{hyperglycaemia} then GROUP_{control}

**Rule 4:** If BMI_{normal} and TG_{high at limit} and FBG_{hypoglycaemia} then GROUP_{control}

**Rule 5:** If BMI_{normal} and TG_{high at limit} and FBG_{normal} then GROUP_{control}

**Rule 6:** If BMI_{normal} and TG_{high at limit} and FBG_{hyperglycaemia} then GROUP_{control}

**Rule 7:** If BMI_{normal} and TG_{high} and FBG_{hypoglycaemia} then GROUP_{control}

**Rule 8:** If BMI_{normal} and TG_{high} and FBG_{normal} then GROUP_{control}

**Rule 9:** If BMI_{normal} and TG_{high} and FBG_{hyperglycaemia} then GROUP_{control}

**Rule 10:** If BMI_{overweight} and TG_{normal} and FBG_{hypoglycaemia} then GROUP_{control}

**Rule 11:** If BMI_{overweight} and TG_{normal} and FBG_{normal} then GROUP_{control}

**Rule 12:** If BMI_{overweight} and TG_{normal} and FBG_{hyperglycaemia} then GROUP_{control}

**Rule 13:** If BMI_{overweight} and TG_{high at limit} and FBG_{hypoglycaemia} then GROUP_{control}

**Rule 14:** If BMI_{overweight} and TG_{high at limit} and FBG_{normal} then GROUP_{control}

**Rule 15:** If BMI_{overweight} and TG_{high at limit} and FBG_{hyperglycaemia} then GROUP_{hypertension}

**Rule 16:** If BMI_{overweight} and TG_{high} and FBG_{hypoglycaemia} then GROUP_{hypertension}

**Rule 17:** If BMI_{overweight} and TG_{high} and FBG_{normal} then GROUP_{hypertension}

**Rule 18:** If BMI_{overweight} and TG_{high} and FBG_{hyperglycaemia} then GROUP_{hypertension}

**Rule 19:** If BMI_{obese} and TG_{normal} and FBG_{hypoglycaemia} then GROUP_{hypertension}

**Rule 20:** If BMI_{obese} and TG_{normal} and FBG_{normal} then GROUP_{hypertension}

**Rule 21:** If BMI_{obese} and TG_{normal} and FBG_{hyperglycaemia} then GROUP_{hypertension}

**Rule 22:** If BMI_{obese} and TG_{high at limit} and FBG_{hypoglycaemia} then GROUP_{hypertension}

**Rule 23:** If BMI_{obese} and TG_{high at limit} and FBG_{normal} then GROUP_{hypertension}

**Rule 24:** If BMI_{obese} and TG_{high at limit} and FBG_{hyperglycaemia} then GROUP_{hypertension}

**Rule 25:** If BMI_{obese} and TG_{high} and FBG_{hypoglycaemia} then GROUP_{hypertension}

**Rule 26:** If BMI_{obese} and TG_{high} and FBG_{normal} then GROUP_{hypertension}

**Rule 27:** If BMI_{obese} and TG_{high} and FBG_{hyperglycaemia} then GROUP_{hypertension}

#### Rule Base in Hierarchical Fuzzy Models

Let U_{1i} (i=1,2,3) be the i^{th} sub-category of the intermediate output U_{1}; then the rules of the HFM are as follows:

**Rule 1:** If BMI_{normal} and TG_{normal} then U_{11}

**Rule 2:** If BMI_{normal} and TG_{high at limit} then U_{11}

**Rule 3:** If BMI_{normal} and TG_{high} then U_{11}

**Rule 4:** If BMI_{overweight} and TG_{normal} then U_{12}

**Rule 5:** If BMI_{overweight} and TG_{high at limit} then U_{12}

**Rule 6:** If BMI_{overweight} and TG_{high} then U_{12}

**Rule 7:** If BMI_{obese} and TG_{normal} then U_{13}

**Rule 8:** If BMI_{obese} and TG_{high at limit} then U_{13}

**Rule 9:** If BMI_{obese} and TG_{high} then U_{13}

**Rule 10:** If U_{11} and FBG_{hypoglycaemia} then GROUP_{control}

**Rule 11:** If U_{11} and FBG_{normal} then GROUP_{control}

**Rule 12:** If U_{11} and FBG_{hyperglycaemia} then GROUP_{control}

**Rule 13:** If U_{12} and FBG_{hypoglycaemia} then GROUP_{control}

**Rule 14:** If U_{12} and FBG_{normal} then GROUP_{hypertension}

**Rule 15:** If U_{12} and FBG_{hyperglycaemia} then GROUP_{hypertension}

**Rule 16:** If U_{13} and FBG_{hypoglycaemia} then GROUP_{hypertension}

**Rule 17:** If U_{13} and FBG_{normal} then GROUP_{hypertension}

**Rule 18:** If U_{13} and FBG_{hyperglycaemia} then GROUP_{hypertension}
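The two rule bases listed above can be compared directly by encoding them as lookup tables. The Python sketch below is a crisp simplification that assumes each input falls into exactly one category and ignores membership degrees; the variable and function names are introduced here for illustration. It enumerates the 27 non-hierarchical antecedents in the listed order and walks the same antecedents through the two hierarchical layers.

```python
from itertools import product

BMI = ["normal", "overweight", "obese"]
TG = ["normal", "high at limit", "high"]
FBG = ["hypoglycaemia", "normal", "hyperglycaemia"]

# Non-hierarchical rule base: in the listed order, Rules 1-14 conclude
# "control" and Rules 15-27 conclude "hypertension"
nhfm_rules = {}
for number, antecedent in enumerate(product(BMI, TG, FBG), start=1):
    nhfm_rules[antecedent] = "control" if number <= 14 else "hypertension"

# Hierarchical rule base, first layer (Rules 1-9): in the listed rules
# the sub-category of U1 follows the BMI category for every TG category
u1_of_bmi = {"normal": "U11", "overweight": "U12", "obese": "U13"}

# Hierarchical rule base, second layer (Rules 10-18): (U1, FBG) -> group
hfm_layer2 = {
    ("U11", "hypoglycaemia"): "control",
    ("U11", "normal"): "control",
    ("U11", "hyperglycaemia"): "control",
    ("U12", "hypoglycaemia"): "control",
    ("U12", "normal"): "hypertension",
    ("U12", "hyperglycaemia"): "hypertension",
    ("U13", "hypoglycaemia"): "hypertension",
    ("U13", "normal"): "hypertension",
    ("U13", "hyperglycaemia"): "hypertension",
}


def hfm_class(bmi, tg, fbg):
    """Crisp walk through both layers; tg does not alter U1 in these rules."""
    return hfm_layer2[(u1_of_bmi[bmi], fbg)]


# Antecedents for which the two rule bases reach different conclusions
disagreements = [a for a in nhfm_rules if nhfm_rules[a] != hfm_class(*a)]
```

All antecedents on which the two rule bases disagree involve the overweight BMI category, where the non-hierarchical rules take the TG category into account and the listed hierarchical rules do not.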

#### Discussion

Fuzzy models have been used in many studies on classification problems. As in many other research fields, the medical literature also contains many classification studies with fuzzy models built from health data sets.

A review of the literature shows that NHFMs are used in most classification problems. Karahoca et al. (22) compared the classification performances of the non-hierarchical fuzzy logic and multinomial logistic regression methods using the age, waist/hip and glucose ratio variables. They divided a 390-unit data set into training (300 units) and test (90 units) sets. To build the ANFIS structure, they fuzzified the crisp-valued age and glucose ratio variables by dividing them into three and five sub-categories, respectively. They reported that the RMSE of assigning diabetic individuals to the “hypoglycaemic”, “hypoglycaemia at low risk”, “healthy”, “diabetes at low risk” or “diabetic” classes was 17.45% with the NHFM and 23.43% with multinomial logistic regression, and concluded that the non-hierarchical fuzzy logic method classified better than multinomial logistic regression. Ankişhan and Ari (23) classified snore-related sounds with the non-hierarchical fuzzy logic method. They divided normal and sleep apnea-related sounds into segments and calculated the entropy and energy of those sounds as the independent variables of the model. They reported that the ANFIS structure they created allocated individuals to the ‘snoring’, ‘sleeping’ or ‘silent’ classes with 97.08% accuracy. Mahmoudi et al. (31) compared the performance of the ANFIS structure with that of the support vector machine, k-nearest neighbor, and classification and regression tree methods in classifying individuals into cancer types, using a total of six microchip gene expression data sets for breast, blood, colon, prostate, lung and lymphoma cancers. They found that, among the models created separately for each cancer data set, the highest classification performance was mostly achieved by the non-hierarchical fuzzy logic method.

In another study, Uçar et al. (24) aimed to use a quicker data mining method as an alternative to the medical diagnostic test for tuberculosis and stated that they preferred ANFIS to estimate the probability that individuals carry the bacterium that causes tuberculosis. They assigned the dependent variable to probability classes of 0, 0.25, 0.50, 0.75 or 1.00 and reported a classification success of 97% for the NHFM that used the 20 most important of the 30 risk factors of the disease. Yang et al. (35) performed a classification study on brain signals in which a total of 200 brain signals were recorded from patients with electrical status epilepticus in sleep (ESES) and control subjects in 8-second segments with a 16-channel electroencephalogram device. Each channel was used as an independent variable, two different entropies were calculated from the 8-second segments, and two NHFMs were constructed by building ANFIS structures. With these models, which used the bell membership function, individuals were assigned to the ESES or control class with 89% and 82% accuracy, respectively. Ziasabounchi and Askerzade (16) classified individuals according to their degree of cardiac disease with an NHFM using the Gaussian membership function. From the Cleveland heart disease data set of the University of California artificial intelligence database, which consists of 303 units and 13 independent variables, they selected the age, chest pain type, cholesterol, maximum heart rate, resting blood pressure, glucose and electrocardiographic variables. In the fuzzification step of the model, they divided the age, resting blood pressure and cholesterol variables into three sub-categories and the maximum heart rate into two. They then divided the data set into 80% (243 units) training and 20% (60 units) test data and reported that the classification model, built in the training set with 1% error, classified the test data set with 15% error and 92.3% accuracy.

In our study, HFMs as well as NHFMs were created using the simulated and hypertension data sets and different membership functions, and the classification performances of these models were compared according to the sensitivity, specificity, accuracy and RMSE criteria. This comparison showed that NHFMs performed better than HFMs.

When the number of independent variables is large, the hierarchical fuzzy logic method, which combines smaller fuzzy sub-models, is proposed. This is because, in the process of constructing the fuzzy model with the best classification, the number of parameters that need to be adapted optimally increases as the number of independent variables increases, which is also called the “dimension problem”. This causes both parameter complexity and time loss in the classification phase of the fuzzy inference process (5-8).

Few studies have used HFMs in the health field. Akbarzadeh-T and Moshtagh-Khorasani (36) administered a thirty-question test that measured the ability to repeat sentences, to comprehend and match names, and written language proficiency in 265 individuals with aphasia. They stated that, because of the large number of independent variables, they aimed to classify aphasia types with an HFM. In the first layer of the HFM, they constructed a fuzzy model with four rules using the six interrelated variables, out of the thirty independent variables, that best described the disease types. In the second layer, they created a second fuzzy model using the outputs of the first layer and four independent variables chosen from the thirty, and classified aphasia types with 92% accuracy. Amouzadi and Mirzaei (37) aimed to build an HFM to classify data sets with categorical dependent variables, using the “breast cancer” data set with nine independent variables and 699 units, the “pima” data set with eight independent variables and 768 units, the “wine” data set with thirteen independent variables and 178 units, the “haberman” data set with three independent variables and 306 units and the “iris” data set with four independent variables and 150 units.

They reported that they preferred the hierarchical fuzzy logic method as the classification method in order to avoid the curse of dimensionality caused by a large number of independent variables and the long classification time. They used as many layers as the number of sub-categories of each independent variable and divided the membership functions used in each layer into two to form the rule base. They reported correct classification rates of 96% in the “breast cancer” data set, 76% in the “pima” data set, 95% in the “wine” data set, 77% in the “haberman” data set and 95% in the “iris” data set. Shaeiri and Ghaderi (38) aimed to classify patients into cancer types using gene expression data sets for blood, prostate and colon cancers. They first divided the cancer data set, which consisted of 7129 genes of 72 patients, into training (38 units) and test (34 units) sets and classified the patients in the test data set into the “acute myeloblastic leukemia” or “acute myeloid leukemia” classes with 100% accuracy. They also used the prostate data set, which consisted of 12600 genes of 102 patients, and classified patients into the “tumor” or “normal” classes with 99.21% accuracy. In addition, they reported 98.84% accuracy in classifying patients into the “normal” or “tumor” classes with a fuzzy model created from the data set containing 2000 genes of 62 units, after dividing it into training and test sets.

In our study, the effect of the number of independent variables used in both HFMs and NHFMs on the classification performance of the model was examined. The simulations with three and six independent variables showed that the classification performance of the model increased as the number of independent variables increased. In addition, the classification performances of the models were found to approximate each other. However, the rule base expanded in both models as the number of independent variables increased. In the simulation, when the number of independent variables increased from 3 to 6, the number of rules increased from 8 to 64 in the NHFM and from 8 to 27 in the HFM. In the hypertension data set application, 18 rules were obtained in the HFM and 27 in the NHFM.

The analyses showed that the classification performance of fuzzy models depends on the distribution of the data, the number of sub-categories of each independent variable, the type of membership function used, the number of independent variables modelled and the correlation between them. Accordingly, histograms of the independent variables should be used in the fuzzification step. Where the distributions overlap heavily, the model should be refined by increasing the number of sub-categories in an attempt to eliminate the fuzziness. The extent to which fuzziness is eliminated should be determined from the overlapping regions in the plotted membership function graphs, and the model should be created using the membership function that gives the most appropriate result. Moreover, if the number of independent variables is too large, the variables associated with each other should be included in the same layer, and these layers should then be combined to form an HFM. However, the loss of information in the transitions between layers is a limitation of HFMs. It is predicted that increasing the number of independent variables and modelling highly correlated independent variables can prevent the loss of information due to the layers, so that the classification performance of the model will be better.

#### Conclusion

Health data contain many factors that cause diseases. When a disease is diagnosed, it is important to know which sub-category the value of each causal factor belongs to and how the sub-categories interact. For data structures of this kind, fuzzy logic methods should be used, since they allow output values to be estimated from factors whose categories are transitive and from the interactions of their sub-categories. Particularly in data sets with a large number of factors, an HFM should be used, since it allows a smaller rule base to be created by gathering highly correlated factors into the same layer. When it is important for classifying individuals as patient or control to infer which sub-categories of the factors interact with each other, an NHFM should be used. It should be noted, however, that the number of factors and the number of their sub-categories should be chosen so as not to produce an extremely large rule base.

**Ethics**

**Ethics Committee Approval:** It was not taken.

**Informed Consent:** It was not taken.

**Peer-review:** Externally peer-reviewed.

**Authorship Contributions**

Concept: İ.K.Ö., M.T., Design: F.C., İ.K.Ö., M.T., Data Collection or Processing: F.C., İ.K.Ö., Analysis or Interpretation: F.C., İ.K.Ö., M.T., Literature Search: F.C., Writing: F.C., İ.K.Ö.

**Conflict of Interest:** No conflict of interest was declared by the authors.

**Financial Disclosure:** The authors declared that this study received no financial support.