Atterberg Limits Prediction Comparing SVM with ANFIS Model

Support Vector Machine (SVM) and Adaptive Neuro-Fuzzy Inference System (ANFIS) are two analytical methods used here to predict the Atterberg limits: the liquid limit, plastic limit and plasticity index. The main objective of this study is to compare the two forecasting methods. Data from 54 soil samples taken from Peninsular Malaysia were used; the samples were tested for liquid limit, plastic limit, plasticity index and grain size distribution. The input parameters are the grain size fractions, namely the percentages of silt, clay and sand. The actual values of the Atterberg limits and the values predicted by the SVM and ANFIS models are compared using the correlation coefficient (R²) and the root mean squared error (RMSE). The results show that the ANFIS model is more accurate than the SVM model for the liquid limit (R² = 0.987), plastic limit (R² = 0.949) and plasticity index (R² = 0.966). The RMSE values obtained for both methods likewise show that the ANFIS model performs better than the SVM model in predicting the Atterberg limits as a whole.


Introduction
The Atterberg limits can be used to distinguish between sand, silt and clay, and between different types of sands, silts and clays. These limits were created by Albert Atterberg, a Swedish chemist, and were later refined by Arthur Casagrande. Knowledge of the grain size distribution is very important for understanding the behaviour of soil under load and for identifying how soil behaves when it comes into contact with water. Water is also a component of the soil, and its presence reduces the strength of the soil (Ali, 2011). If the grain size distribution of a particular soil is known, an accurate prediction can be made of how the soil will behave when acting as the foundation for, or as a component of, structural works such as buildings, dams and roads. Once the likely behaviour of the soil is known, engineers can design the most suitable foundation for a safer and more durable structure. The grain size distribution and other geological characteristics of soil have been studied previously; for example, Berbenni (2007) studied the impact of the grain size distribution on the yield stress. His results showed that the yield stress decreased with increasing grain size.
In this study, however, the grain size fractions of the soil, expressed as percentages, are used to predict the Atterberg limits with two analytical methods: Support Vector Machine (SVM) and Adaptive Neuro-Fuzzy Inference System (ANFIS). Since the main objective of this work is the prediction of the Atterberg limits, it is convenient to review the fundamental principles involved in comparing an SVM model with an ANFIS model. The Atterberg limits are a convenient means of describing the plasticity of a soil. They are defined as the boundaries between different types of soil behaviour, and each limit is expressed as a water content.
SVM is generally used for classification and regression problems (Chen et al. 2010). SVMs enable a learning machine to generalize well to unseen data; they have a firm grounding in statistical learning theory and very promising empirical performance (Lin & Yeh 2009). SVMs have a wide range of applications, such as regression, pattern recognition, bioinformatics and artificial intelligence (Tripathi et al. 2006). The support vector machine is a machine learning method that is widely used for data analysis and pattern recognition. The algorithm was invented by Vapnik, and the current standard incarnation was proposed by Cortes and Vapnik (1995). The aim here is to explain the concept of the support vector machine and how a simple support vector machine can be built using Matlab.
ANFIS has the ability to learn from data, like an artificial neural network. ANFIS models can also achieve optimal results quickly, even if the target is not given. Additionally, there is no ambiguity in ANFIS, unlike in a neural network. Because ANFIS combines neural networks and fuzzy logic, it can handle complex and non-linear problems.

A. Data Distribution
The samples are distributed over two areas, Area 1 (Fig 1) and Area 2 (Fig 2). The first set of samples was taken around the state of Pahang, while the second set was taken in the state of Johor. In this study, all sample data for the grain size distribution were prepared by IKRAM, and the soil classification and Atterberg limit values were obtained from the results of laboratory tests. The soil samples were taken at random, and the distance between the two sample distributions is almost 400 km. A total of 54 soil samples were taken across Peninsular Malaysia; their distribution is shown in Table 1.

B. Revision of Area
The Atterberg limit values and grain size distributions were obtained through laboratory tests carried out by the Malaysian Institute of Public Works (IKRAM). The ANFIS and SVM models were then examined by applying the 54 data records collected from these tests, and the actual values were compared with the predicted Atterberg limit values. The ANFIS and SVM models need a set of input and output data for use as a training data set. For the purpose of this study, the grain size distribution was employed as the input parameter in the development of the ANFIS and SVM models for the prediction of the Atterberg limit values. The soil samples were taken based on the occurrence of debris flow events across Peninsular Malaysia, as recorded in Table 1. Fig 1 presents the locations of the grain size distribution samples used in the study. The sampling area can effectively be divided into two areas: the states of Perak and Pahang (Area 1) and Johor (Area 2). All 54 soil samples were collected and tested for different parameters, including grain size distribution, liquid limit (LL), plastic limit (PL) and plasticity index (PI).
The data collection method for this study is to gather existing data for analysis with the SVM and ANFIS methods. Both the input and the output parameters, namely the soil grain size distribution, liquid limit (LL), plastic limit (PL) and plasticity index (PI), are identified and studied. The methodology was established so that the output parameters can be compared and analysed on the basis of the two methods, SVM and ANFIS.
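The input and output parameters named above can be organised as one record per soil sample. The following is a minimal sketch under stated assumptions, not the data format used in the study: the class name `SoilSample`, its field names and the numeric values are hypothetical, and the relation PI = LL - PL is the standard definition of the plasticity index rather than something stated in the text.

```python
from typing import NamedTuple, Tuple


class SoilSample(NamedTuple):
    """One laboratory record (hypothetical layout, for illustration only)."""
    sand_pct: float       # input: percentage of sand
    silt_pct: float       # input: percentage of silt
    clay_pct: float       # input: percentage of clay
    liquid_limit: float   # output: LL, water content in percent
    plastic_limit: float  # output: PL, water content in percent

    @property
    def plasticity_index(self) -> float:
        # Standard definition: PI = LL - PL (assumed, not taken from the study).
        return self.liquid_limit - self.plastic_limit

    def inputs(self) -> Tuple[float, float, float]:
        # The three grain size fractions used as model inputs.
        return (self.sand_pct, self.silt_pct, self.clay_pct)

    def outputs(self) -> Tuple[float, float, float]:
        # The three Atterberg limit values a model is asked to predict.
        return (self.liquid_limit, self.plastic_limit, self.plasticity_index)


# Made-up values, not one of the 54 samples from the study.
example = SoilSample(sand_pct=30.0, silt_pct=45.0, clay_pct=25.0,
                     liquid_limit=48.0, plastic_limit=26.0)
print(example.inputs(), example.outputs())
```

Keeping the inputs and outputs in separate accessors makes it straightforward to drop the sand fraction when a two-input variant of a model is examined.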
C. Support Vector Machine (SVM) Model
The support vector machine (SVM) is a valuable technique for data classification, regression and prediction. SVMs are a set of learning methods, first introduced in computer science, that analyse data and recognize patterns. The current standard SVM algorithm was proposed by Cortes and Vapnik (1995). SVM originated from statistical learning theory, pioneered by Boser et al. (1992). Since SVM is a relatively new technique, a brief explanation of how it works is given below; more detail can be found in many publications. The SVM, which is firmly based on statistical learning theory, is used here as a regression method. The SVM was developed to predict the plastic limit (PL), liquid limit (LL) and plasticity index (PI). Further, an attempt has been made to simplify the models so that only three input parameters (the percentages of sand, silt and clay) are required to predict the plastic limit, liquid limit and plasticity index.

D. Kernel Function
When applying the SVM to linearly separable data, we start by generating a matrix H from the dot products of the input variables, using

$$k(\mathbf{x}_i, \mathbf{x}_j) = \mathbf{x}_i \cdot \mathbf{x}_j = \mathbf{x}_i^{T}\mathbf{x}_j \qquad (2)$$

Here $k(\mathbf{x}_i, \mathbf{x}_j)$ is an example of a family of functions called kernel functions; this particular one is known as the linear kernel. The set of kernel functions is composed of variants of (2), in that they are all based on calculating inner products of two vectors. This means that if the inputs are recast into a higher-dimensional space by some potentially non-linear feature mapping function $\mathbf{x} \mapsto \phi(\mathbf{x})$, only the inner products of the mapped inputs in the feature space need to be determined, without us needing to calculate $\phi$ explicitly.
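The idea that only inner products are needed can be checked numerically. The sketch below is an illustration under assumptions and is not the kernel used in the study: it builds the matrix of linear-kernel values for a few made-up two-dimensional inputs, then uses a quadratic kernel (chosen here only because its feature map is easy to write down) to show that evaluating the kernel in the input space gives the same number as mapping both inputs explicitly and taking their dot product. All function names are hypothetical.

```python
import math


def linear_kernel(x, z):
    """Linear kernel: the plain dot product of two input vectors."""
    return sum(a * b for a, b in zip(x, z))


def gram_matrix(samples, kernel):
    """Matrix of kernel values k(x_i, x_j) for every pair of samples."""
    return [[kernel(xi, xj) for xj in samples] for xi in samples]


def quadratic_kernel(x, z):
    """Illustrative non-linear kernel: the squared dot product."""
    return linear_kernel(x, z) ** 2


def quadratic_feature_map(x):
    """Explicit feature map whose dot product equals quadratic_kernel (2-D inputs only)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)


# Made-up, scaled input vectors (not data from the study).
samples = [(0.2, 0.5), (0.6, 0.1), (0.3, 0.3)]

H = gram_matrix(samples, linear_kernel)

# Kernel trick: same value with and without the explicit mapping.
implicit = quadratic_kernel(samples[0], samples[1])
explicit = linear_kernel(quadratic_feature_map(samples[0]),
                         quadratic_feature_map(samples[1]))
print(H)
print(implicit, explicit)
```

The two printed values agree up to floating-point rounding, which is why the mapping never has to be computed in practice.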
One reason the kernel trick is valuable is that many regression and classification problems are not linearly regressable or separable in the space of the inputs x, but might be in a higher-dimensional feature space given a suitable mapping. In that case the kernel in equation (2) is replaced by a kernel defined on the mapped inputs.

E. Adaptive Neuro-Fuzzy Inference System (ANFIS)
Fig 5 (a) graphically illustrates the fuzzy reasoning mechanism used to derive an output f from a given input vector [x, y] (Jang, 1993). The firing strengths w1 and w2 are usually obtained as the product of the membership grades in the premise part, and the output f is the weighted average of each rule's output. To facilitate the learning (or adaptation) of the Sugeno fuzzy model, it is convenient to put the fuzzy model into the framework of an adaptive network that can compute gradient vectors systematically. The resulting network architecture, called ANFIS (Adaptive Neuro-Fuzzy Inference System), is shown in Fig. 1b; the nodes in its different layers are either fixed or adaptive (Jang, 1993).

F. Performance Evaluation
This part is important for a fair comparison of the prediction results obtained from ANFIS and SVM. In addition, the models involve many criteria that would be difficult to assess using conventional mathematical formulae alone. The results obtained with different SVM and ANFIS parameters are compared in order to see the effect on the output and the error when various modifications are made.

G. Root Mean Square Error (RMSE)
The correlation coefficient (R) and the root mean squared error (RMSE) were used to evaluate the performance of the proposed models. The RMSE measures the residual between the actual and predicted Atterberg limits. Larger errors in the predicted values affect it more strongly than smaller ones. The best fit is obtained when the RMSE is zero.
The RMSE is calculated using Equation (5):

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(h_i - t_i\right)^2} \qquad (5)$$

where n is the number of data points, $h_i$ is the observed value and $t_i$ is the predicted value.
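The RMSE definition above can be written as a short function. This is a minimal sketch using only the Python standard library; the function name and the four observed and predicted values are made up for illustration and are not data from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between observed (h_i) and predicted (t_i) values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and of equal length")
    n = len(observed)
    squared_errors = ((h - t) ** 2 for h, t in zip(observed, predicted))
    return math.sqrt(sum(squared_errors) / n)


# Made-up liquid limit values in percent (illustration only).
observed_ll = [40.0, 52.0, 35.0, 61.0]
predicted_ll = [42.0, 50.0, 36.0, 58.0]

# Residuals are -2, 2, -1 and 3, so the mean squared error is 18 / 4 = 4.5.
print(round(rmse(observed_ll, predicted_ll), 3))  # 2.121
```

Because the residuals are squared before averaging, the single residual of 3 contributes as much as the two residuals of 2 combined with the residual of 1, which is the sensitivity to large errors noted in the text.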

H. Correlation Coefficient (R)
In general, the correlation coefficient is the square root of the ratio of the explained variation to the total variation, and it measures the agreement between the actual and predicted values. It is given by Equation (6):

$$R = \frac{\sum_{i=1}^{n}\left(h_i - \bar{h}\right)\left(t_i - \bar{t}\right)}{\sqrt{\sum_{i=1}^{n}\left(h_i - \bar{h}\right)^2 \sum_{i=1}^{n}\left(t_i - \bar{t}\right)^2}} \qquad (6)$$

where n is the number of data points, $h_i$ is the observed value, $t_i$ is the predicted value, and $\bar{h}$ and $\bar{t}$ are the averages of the observed and predicted values respectively.
R² indicates the strength of the linear relationship between the two variables. An R² value closer to 1 indicates a more efficient model.
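The correlation coefficient and its square can be computed directly from the observed and predicted values. The sketch below assumes the usual Pearson form of R; the function name and the sample values are hypothetical and are not taken from the study.

```python
import math


def correlation_coefficient(observed, predicted):
    """Pearson correlation R between observed (h_i) and predicted (t_i) values."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two sequences of equal length with at least two values")
    n = len(observed)
    h_mean = sum(observed) / n
    t_mean = sum(predicted) / n
    covariation = sum((h - h_mean) * (t - t_mean) for h, t in zip(observed, predicted))
    h_variation = sum((h - h_mean) ** 2 for h in observed)
    t_variation = sum((t - t_mean) ** 2 for t in predicted)
    return covariation / math.sqrt(h_variation * t_variation)


# Made-up plastic limit values in percent (illustration only).
observed_pl = [22.0, 27.0, 19.0, 31.0]
predicted_pl = [23.0, 26.0, 20.0, 29.0]

r = correlation_coefficient(observed_pl, predicted_pl)
r_squared = r ** 2
print(r, r_squared)
```

A prediction that is an exact increasing linear function of the observations gives R = 1, so R² alone does not reveal a constant offset; this is one reason to report the RMSE alongside it.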

Results and Discussion
A comparison of the SVM and ANFIS methods of analysis is necessary to determine which of the two is better and to quantify the uncertainty of both models. Determining the best and most efficient method of analysis is important so that an accurate method can be used as a reference in future work on the Atterberg limits or the engineering properties of soil. For the SVM method, two criteria are discussed: modification of the model and modification of the input training data set. For the ANFIS method, the total number of inputs is modified for comparison purposes. All the data obtained were analysed, and the comparison is made through tables and graphs. From the figures, it can be seen that the ANFIS predictions are closer to the laboratory test data for the liquid limit, plastic limit and plasticity index; that is, the values predicted by the ANFIS model are closer to the actual values.

Comparison of SVM and ANFIS Best Models: RMSE and R of 3 Inputs
In this study, the performance of the ANFIS and SVM models is assessed by comparing the predicted and actual values using the correlation coefficient (R²) and the root mean squared error (RMSE). An R² value closer to 1 indicates a more efficient model, and a smaller RMSE value indicates smaller errors produced by the model. The R² values for the two models are summarized in Table 2. Referring to Table 2, the R² values obtained by ANFIS are better than those of the SVM model for the liquid limit, plastic limit and plasticity index. The results therefore indicate that ANFIS is more accurate than the SVM model.
The RMSE values of the two models are also compared. RMSE is a mathematical measure of the average magnitude of the error; the lower the RMSE value, the more accurate the predictions. Table 3 shows the RMSE values obtained for the three Atterberg limit analyses.
The results show that the ANFIS model obtained low RMSE values for the liquid limit, plastic limit and plasticity index analyses.
Overall, the ANFIS model shows a lower RMSE than the SVM model. In conclusion, for the three Atterberg limit tests conducted (liquid limit, plastic limit and plasticity index), the ANFIS model gives a more accurate prediction of the actual value than the SVM model.

Modification of SVM Model
To find out how the number of inputs changes the outcome of the SVM model's predictions, the model is analysed after modifying the number of inputs used. The number of inputs used for both models was changed from two (the percentages of silt and clay) to three (the percentages of sand, silt and clay). These modifications are summarized in Table 5 below.

A. Total Input SVM
Figs 9, 10 and 11 show the SVM model predictions for the three Atterberg limit tests according to the number of inputs used. As shown in Fig 9, the SVM predictions for the liquid limit test using three inputs (red line) are closer to the actual data (green line) than those using two inputs (yellow line). Large errors also occur in most of the samples; for example, for samples 2, 4, 6, 7, 15, 16, 17, 25, 26, 27, 30, 36, 43, 44, 53 and 54 the predictions of the two-input SVM model are far from the true values.
Similarly, Fig 10 shows that the plastic limit values predicted by the SVM model using three inputs are slightly more accurate than those of the two-input model. The predictions of the two-input SVM model deviate considerably from the actual values.
In conclusion, based on Figs 9, 10 and 11, the SVM model predictions indicate that the number of inputs used by the model affects the output values produced. This is also evidenced by the R² values obtained from the analysis. Table 5 below compares the R² coefficients obtained from the SVM model with two and with three inputs. The results show that the more inputs are used, the more accurate the model's predictions are. This is evidenced by the difference in the R² coefficients obtained for the SVM model with the larger number of inputs. The three tests (liquid limit, plastic limit and plasticity index) indicate that the more inputs are used, the higher the performance of the SVM model.
The RMSE values for the two numbers of inputs are compared in Table 6 below. Referring to Table 6, the SVM model performed better with more inputs for all three Atterberg limit tests: lower RMSE values were obtained with three inputs than with two inputs.
Modification of ANFIS Model
The ANFIS model was also modified in this study, for comparison and to examine its response to the modification. The modification was made to the inputs.

A. Total Input ANFIS
The number of inputs used for both models was changed from two (the percentages of silt and clay) to three (the percentages of sand, silt and clay). The ANFIS model predictions for the three Atterberg limit values are shown in Figs 12, 13 and 14. The ANFIS predictions using three inputs are represented by the blue line, while the predictions using two inputs are represented by the pink line.
For the liquid limit test, the ANFIS predictions using three inputs were found to be closer to the true values than those using two inputs. Similarly, the analyses for the plastic limit and plasticity index indicate that the ANFIS predictions with three inputs are closer to the true values than those with two inputs. The effect of the number of inputs on the ANFIS model is also reflected in the R² values shown in Table 7. The R² value obtained for the ANFIS model using two inputs for the liquid limit test is 0.838, increasing to 0.987 for the model using three inputs. Similar results were obtained for the plastic limit and plasticity index analyses, where the R² value also increased when the number of inputs was increased from two to three.
Referring to Table 8, the ANFIS model also obtained lower RMSE values for the liquid limit, plastic limit and plasticity index analyses when three inputs were used. The RMSE for the liquid limit decreased from 3.345 to 0.957. Similarly, the RMSE for the plastic limit test decreased from 1.647 to 0.615, and the RMSE for the plasticity index test decreased from 2.739 to 0.421. Thus, we can conclude that the RMSE obtained depends on the number of inputs used in the ANFIS model.
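To put the RMSE changes reported above on a common scale, the relative reduction can be computed for each test. The sketch below uses only the six RMSE values quoted in the text; the function and variable names are hypothetical, and the percentage reduction is an illustrative measure added here rather than one reported in the study.

```python
# RMSE of the ANFIS model with two inputs and with three inputs, as quoted in the text.
anfis_rmse = {
    "liquid limit": (3.345, 0.957),
    "plastic limit": (1.647, 0.615),
    "plasticity index": (2.739, 0.421),
}


def relative_reduction(before, after):
    """Percentage by which the error drops when going from `before` to `after`."""
    return (before - after) / before * 100.0


reductions = {
    test: relative_reduction(two_inputs, three_inputs)
    for test, (two_inputs, three_inputs) in anfis_rmse.items()
}

for test, percent in reductions.items():
    print(f"{test}: RMSE reduced by {percent:.1f} %")
```

On these figures the reduction is roughly 71 % for the liquid limit, 63 % for the plastic limit and 85 % for the plasticity index, so the added sand fraction helps the plasticity index prediction most in relative terms.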

Conclusion
From the results obtained, it can be concluded that prediction by the ANFIS method is more accurate than prediction by the SVM method for the liquid limit, plastic limit and plasticity index. The ANFIS model achieved higher accuracy than the SVM model for the liquid limit (R² = 0.987), plastic limit (R² = 0.949) and plasticity index (R² = 0.966). The RMSE values obtained for both methods also show that the ANFIS model performs better than the SVM model in predicting the Atterberg limits as a whole. Modifications of the SVM and ANFIS models were carried out in order to evaluate the response of the output to the modification and the efficiency of the models.