Modeling Software Effort Estimation Using Hybrid PSO ANFIS

2016 International Seminar on Intelligent Technology and Its Application

Keywords —Effort Estimation; ANFIS; COCOMO; Particle Swarm Optimization I.

Various parametric models have sprung up, such as COCOMO [18] COCOMO II [19], SLIM [20], ESTIMACS [21], all based Functional size Measurement (FSM) [22] in the estimation of effort. The quality of the data for this model bias depend son subjective assessments for each of the functional size. Model on COCOMO and SLIM depends on the number of SLOC (Source Lines of Code) to be estimated before starting the effort estimation process.

A. COCOMO

In four or five decades, there are many proposed estimation technique, referred to as the estimation model. To solve this problem, there are various methods of machine learning [3, 4] [5]. One of the methodological variations among them, Artificial Neural Networks [6, 7, 8], Fuzzy Logic [9, 10], Evolutionary Computing such as Genetic Algorithm [11] and Particle Swarm Optimization [12, 13, 14]. This method is suitable to solve real-world ambiguity. Artificial neuro-fuzzy inference systems have been applied in software effort estimation fields. Many studies on software effort prediction using artificial neuro-fuzzy inference system [15, 13, 16, 8] have been realized recently. Most of them [6] use a gradient descendent technique for optimizing the antecedent parameters and a least means square method for the consequent parameters of the ANFIS. More recently [17, 11] an optimization technique based on genetic algorithm was proposed for training the parameters in the antecedent part of a fuzzy system.

EVIEW

ITERATURE

II. L

I NTRODUCTION The development of the software industry is very rapid in recent causes the cost of the software to be one of the topics that interest [1]. Estimates of the cost to be one measure of success in a software project. An accurate estimate for the software effort, cost and scheduling is very important to manage financial issues and monitor the activities of development and on-time delivery. Based on data from the Standish Group's CHAOS Report 2012 [2] EXTREME, 30,000 software projects, 39% of project failures, 43% experienced problems, only18% of successful projects. In addition to financial losses, the company's IT project failure caused a decline in the company's reputation. To achieve an accurate estimate, many contributions proposed and validated estimation techniques to reduce and eliminate inaccuracies estimation. Highlights of this research is to design software evaluation effort using Adaptive Neuro- Fuzzy Inference System (ANFIS) the historical dataset COCOMO 81 and shows the significance of this technique compared to other machine learning techniques. This study will use a blend of techniques PSO with ANFIS tested for applicability to predict COCOMO feature. Two methods will be compared in terms of the accuracy of their predictions. The remainder of the paper is divided in different sections as follows: Section II includes a brief literature review about the concepts and techniques used in current model. Section III presents the proposed model based on ANFIS Optimization with PSO for software cost estimation. Experiments and results are described in section IV and conclusion of the paper is described in section V and in the last; Appendix shows the comparison between those models.

Abstract — Accurate estimating software development effort is essential in effective project management processes such as budgeting, project planning and control. To achieve an accurate estimate some algorithmic estimation techniques proposed to eliminate or reduce inaccuracies estimation. COCOMO is a parametric model used to estimate software effort. However, so far no model has proven successful to effectively and consistently predict software effort. Parametric models are considered vulnerable when faced with the problem of non-linearity of the complex in the parameters. In recent years, some estimation technique appears using intelligent systems to predict software effort. This study uses a model Neuro-fuzzy optimized with PSO to get the right model to improve the estimation effort at NASA dataset software project. Parameter cost driver, consisting of 17 feature COCOMO will then be optimized using PSO techniques to get a better prediction accuracy. Furthermore, the results of the optimization will be trained in using the algorithm to get a prediction Neuro-fuzzy effort. The performance of the proposed estimation model will be evaluated with some other intelligent system model parameters to evaluate several criteria such as Mean Standard Error (MSE), Mean Magnitude of Relative Error (MMER), and Level Prediction (Pred). The model that best shows the error rate MSE and MMER lowest to highest Pred.

Modeling Software Effort Estimation Using Hybrid

PSO-ANFIS

Suharjito

Bina Nusantara University Jakarta, Indonesia

Magister of Information Technology

Benfano Soewito

Bina Nusantara University Jakarta, Indonesia

Magister of Information Technology

Saka Nanda

Bina Nusantara University Jakarta, Indonesia

Magister of Information Technology

COCOMO (Constructive Cost Model) is a regression model developed by Prof. Barry W. Boehm [19]. This method introduces a non-linear approach in 1981. COCOMO categorizes projects into three levels, namely Basic, Intermediate, and Detail. When the data set is still widely used in the prediction of effort and budget planning software [15]. COCOMO II model can be described by the following equation:

(1) Where A and B are constants initial calibration, Size refers to the size of software project measured Source Lines of Code (KSLOC), SF is a scale factor, and EM is Effort multiplier. COCOMO II has 17 Effort Multiplier (EMS) used in the model Post-Architecture.

(5) Layer 5: There is a single node with symbol

Fig. 2 Pearson analysis of cost factor

(7) Where X is the cost factor and Y is the actual effort. is the average value of the cost factor of 63 data. Similarly is the average actual effort of 63 Data. Figure 2 shows the correlation between the cost factors with the actual effort by COCOMO 81 dataset.

cost drivers with the actual effort. The coefficient 'r' is calculated by the following equation:

In this study, Pearson Correlation was used to analyze linear relationship between the 15

ETHODOLOGY

III. M

(6) When the particles get Pbest and Gbest, using the above equation updating velocity of the displacement of each particle. Particle next coordinates for the current location coordinates are updated as low speed add the new coordinates.

PSO is a heuristic method aims to optimize iterative problem with trying to improve the candidate solutions linked to certain measures of quality. The method is known as Meta Heuristic [23]. PSO is a population-based search where each solution is represented as a particle of the population (called swarm). Each particle should be noted that the optimal solution path through them, the solution is called locally optimal solution (Pbest). Each particle should also have social behavior of each particle to find the optimal solution in the current search, the so-called global optimal solution (Gbest).

D. PSO

Σ which calculates the overall output of the incoming signal. This stage is also called defuzzification.

Π is a multiplier value. 3. Layer 3: Normalized firing strength. Each node labeled N which is calculated by taking the ratio of the k-th rule firing strength of the overall total of all rules firing strength. 4. Layer 4: Setting the consequent parameters. Each node k in a square layer node with function:

Where x is the input of node k, and is a fuzzy set of rating the node function. is a membership function hat determines the degree of membership . 2. Each node labeled

1. Layer1: each node k in a square layer node with the function: (4)

The function of each node is described as follows:

CDf ^CDRi _CDRi Fig. 1 ANFIS Structure

ẁ1CDi1 ẁ1CDi6 ... ... ... Layer 1 Layer 2 Layer 3 Layer 4 Layer 5

A2 _A6 N N N ... ^w1 _w6 ẁ6 ẁ1

ANFIS is an integrated model between Fuzzy Inference System (FIS) with Neural Network. Keys to success is finding the FIS rule base. FIS convert human knowledge into a rule base in order to maximize performance and minimize the model error output. _A1

C. ANFIS

(3) Wherel1is the left boundary value of membership, l2 and l3 and l4 is a middle limit is the right limit membership value (l1 <l2 <l3 <l4).

(2) Where l1 is the left boundary value of membership, mk is the value of capital and l2 is the right limit membership value (l1 <m <l2).

Several types of membership functions including: Triangular, Gaussian bell, trapezoidal, sigma, S function and the function of Z The three of them namely Triangular, Gaussian and trapezoidal very commonly used in software estimation model as more appropriate to describe the linguistic term.

Fuzzy Membership Function

Through the Pearson Correlation analysis, a relatively strong positive correlation (0.449) was found in factor Database Size (DATA) and the actual effort. This would suggest if the size of the database increases, the value of effort also increases. Likewise, linear associations were also found at factor cost MODP, RELY, and TURN to the actual effort. Furthermore, one-way ANOVA was used to test whether it

START

is true for variables are the same means. This is done to determine whether there are important differences between the means of two unrelated groups, namely the cost factor _{Input Data} and the actual group effort. Null hypothesis is made means of two variables are the same. _{Set Input Parameter dari} _{Membership Function} _{Preprocessing Data} _NO Model ANFIS PSO _Konvergen YES

Fig. 3 one way Anova analysis of Cost Factor _{Inference Result}

Factors to be selected from this process is ACAP, AEXP, PCAP and LEXP. Thus eight dominant features are selected _Decision will be processed in training.

In this study, ANFIS uses PSO to adjust the parameters _{Prediksi Effort} of the membership function. Membership function tested in this study are triangular and Gaussian The algorithm used in the approach described in Figure5.Featuresof data that all END seven dominant cost factor and one LOC preprocessing

Fig. 4 Flow Diagram of Proposed model

results incorporated into the ANFIS (see Fig. 2). in first stage, we conducting training ANFIS with data from the

IV. R ESULTS AND D

ISCUSSION

previous stage. The training process allows the system to adjust the parameters as input/output are included. The Evaluate the accuracy of estimation can be done by training process stops when the number of epoch is reached comparing the results of the production effort and the actual or the number of error-rates achieved. The second stage: effort to calculate the MSE (Mean Squared Error) and MRE create a vector with the number of dimensions N where N is (Magnitude of Relative Error) [24]. The average MER has the number of membership functions. Vector contains the better results than the average MRE (MMRE). MSE can be parameters of membership functions and will be optimized described by the following equation where i is the number by PSO algorithm. Mean-Squared Error is defined as fitness of observations: function. The third stage: Determine the initiation parameter

(8) PSO shown in Table I. Parameters randomly initiated in the first phase and then updated by PSO algorithm. The fourth

MER can be described by the following equation where i is stage: Parameter produced PSO then extracted to output the number of observations:

ANFIS. The output is an effort predictions of PSO-ANFIS (9) approach.

TABLE

I I NITIAL PARAMETER PSO

Pred(l) is used as a complement to the criteria for Parameter Value calculating the percentage estimates are under l to actual

Particle

25 effort. Generally, the valueusedforlwas25%. Iteration 2000 Cognitive Acceleration

2.0 (10)

Social Acceleration

2.0 Inertia weight

0.9 A.

Triangular Membership Result Final Inertia weight

0.4 For standard ANFIS models, training results indicate over

Parameters are initialized at random in the first stage and fitting if using110 epoch. Best fitting was found in the then updated using the PSO algorithm. In each iteration, one number of training10 epoch. Error training is lowest at 4.94. of the parameters of the membership function will be While ANFIS-PSO training results showed a decrease in updated. error until 2.22 in the epoch to 107. Since the model used is Partition Grid, then the number of rules generated is 256 rules.

Fig. 5 Training Error using default ANFIS in 120 epoch.

Fig. 6 Training Error using ANFIS-PSO in 200 epoch.

Fig. 7 Comparison of MSE in testing phase for Triangular Membership.

In the model of ANFIS Triangular, the average MSE ANFIS standards using back propagation optimization is 22.61 while the average MSE with ANFIS modification with optimization PSO is 9.30.

Gaussian Membership Result Training results indicate over fitting if using 500 epoch.

Best fitting was found in the number of training 20 epoch. Error lowest at training is 8.15. While on ANFIS models PSO error rate can still be decreased after the epoch to 500.

The lowest error when training is 7.04. Because the model used is Partition Grid, then the number of rules generated is 256 rules.

Fig. 8 Training Error using default ANFIS in 500 epoch Fig. 9 Training Error using ANFIS-PSO in 500 epoch.

Fig. 10 Comparison of MSE in testing phase for Gaussian Membership.

In ANFIS Gaussian models, the average MSEANFIS standards using back propagation optimization is 12:48 while the average MSE with ANFIS modification with optimization PSO is 10.63.

Subtractive Clustering Result

To ANFIS standards, training results indicate over fitting if using 150 epoch. Best fitting standard ANFIS was found in the number of training 2 epoch. Error training is lowest at 0:52. As for the best-fitting ANFIS PSO was found in the number of training 150 epoch. Lowest error when training is 0:07. Rule number generated by the model are the Subtractive Clustering are 40 rules.

TABLE

II P ERFORMANCE E

VALUATION R ESULT Best Least MMSE MMRE PRED(25) epoch Error

ANFIS Triangular 10 4.94 22.6138566 0.102283 0.9047619 ANFIS-

PSO Triangular 107 2.22 9.30948262 0.05998876 0.96825397 ANFIS

Fig. 11 Training Error using default ANFIS in 10 epoch.

Gaussian

20 8.15 12.4849286 0.06504668 0.95238095 ANFIS- PSO Gaussian 500

7.04 10.6328251 0.05544703 0.95238095 ANFIS SubClust

2 0.52 9.31011887 0.00651858

1 ANFIS- PSO SubClust 150 0.07 4.09384171 0.0040061

1 EFERENCES

[1] S. Grimstad and M. Jorgensen, "The Impact of Irrelevant Information on Estimates of Software Development Effort," Proceedings of the 2007 Fig. 12 Training Error using ANFIS-PSO in 150 epoch.

Australian Software Engineering Conference, ASWEC '07 IEEE Computer Society, pp. 359-368, 2007.

[2] "http://www.cin.ufpe.br/~gmp/docs/papers/extreme_chaos2001.pdf," 2001. [Online]. Available: http://www.cin.ufpe.br/~gmp/docs/papers/extreme_chaos2001.pdf. [3] F. Douglas Fisher, Srinivasan and Krishnamoorthy, "Machine learning approaches to estimating software development effort," IEEE

Transactions on Software Engineering vol.21, no.2, pp. 126-137, 1995.

[4] Prabhakar and Maitreyee Dutta, "Application of machine learning techniques for predicting software effort," Elixir Comp. Sci. & Engg, pp.

13677-13682, 2013. [5] Prabhakar and M. Dutta, "Prediction of Software Effort using Artificial Neural Network and Support Vector Machine," International Journal of

Advanced Research in Computer Science and Software Engineering, vol.3, no. 3, pp. 40-46, 2013.

[6] C. S. Reddy, P. S. Rao, KVSVN Raju and V. V. Kumari, "A New Fig. 13 Comparison of MSE in testing phase for Triangular Membership.

Approach for Estimating Software Effort Using RBFN Network," International Journal of Computer Science and Network Security vol.8, no.7, pp. 237-241, 2008.

In Sub .Clustering ANFIS models, the average MSE

[7] P. Reddy, "Software Effort Estimation using Radial Based and

ANFIS standards using back propagation optimization is

Generalized Regression Neural Network," Journal of Computing, vol.2,

9.31 while the average MSE with ANFIS modification with no. 5, pp. 87-92, 2010. optimization PSO is 4.09.

[8] R. Bhatnagar, V. Bhattacharjee and M. K. Ghose, "Software Development Effort Estimation - Neural Network Vs. Regression

ONCLUSION AND UTURE ORKS

V. C F W

Modeling Approach," International Journal of Engineering Science and Technology, vol.2, pp. 2950-2956, 2010.

Based on experiment conducted in Table II, it can be

[9] Azzeh M., Neagu D. and Cowling P. I., "Fuzzy Grey Relational Analysis

concluded that ANFIS-PSO model evaluation shows that the

for Software Effort Estimation," Empirical Software Engineering, pp. 60-

PSO implemented optimization results better estimate the 90, 2010. level MMRE, MSE lower and higher Pred than the previous

[10] S. Dingra and P. Singh, "An Adaptive Neuro Fuzzy Approach for

ANFIS approach. In addition, the number of epoch used

Software Development Time Estimation," International Journal of

during training are also shown. It can be concluded that the

Advanced Research in Computer Science and Software Engineering, vol.3, no. 5, pp. 586-590, 2013.

previous model of ANFIS is more efficient and stable in

[11] Lin J. C., Chang C. T. and Huang S. Y., "Research on Software Effort

terms of reduced error during training. By comparison

Evaluation Combined with Genetic Algorithm and Support Vector

epoch number greater than optimization PSO, previously

Regression," IEEE International Symposium on Computer Science and

ANFIS can approach the estimation accuracy of ANFIS- Society (11), pp. 349-352, 2011. PSO..

[12] K. Sweta and P. Shashankar, "Comparison and Analysis of Different

For future work, similar studies can be done to predict

Software Cost Estimation Method vol.4, no.1," International Journal of Advanced Computer Science and Applications, pp. 153-157, 2013.

software effort using optimization models based on machine learning algorithms, such as Genetic Algorithms (GA) and

[13] CH.VM.K. Hari and P. Reddy, "A Fine Parameter Tuning for COCOMO

81 Software Effort Estimation using Particle Swarm Optimization," Ant Colony Optimization (ACO).

Journal of Software Engineering (ISSN: 1819-4311/DOI:

10.3923/jse.2011), 2011.

[14] Lin J. C. and Tzeng H. Y., "Applying Particle Swarm Optimization to Estimate Software Effort by Multiple Factors Software Project Clustering," IEEE Computer Symposium (ICS), pp. 1039-1044, 2010. [15] X. Huang, D. Ho, J. Ren and L. F. Capretz, "Improving the COCOMO Model using a Neuro-Fuzzy Approach," Applied Soft Computing, no.7, pp. 29-40, 2007. [16] J. S. Pahariya, V. Ravi, M. Carr and M. Vasu, "Computational Intelligence Hybrids Applied to Software Cost Estimation," International

Journal of Computer Information Systems and Industrial Management Applications (IJCISIM) vol.2, pp. 104-112, 2010.

[17] L. Jin-Cherng Lin, C. Chu-Ting and H. Sheng-Yu, "Research on Software Effort Estimation Combined with Genetic Algorithm and Support Vector Regression," Computer Science and Society (ISCCS) International Symposium, pp. 349-352, 2011. [18] B. Boehm, "Software engineering economics," IEEE Transactions on Software Engineering, pp. (10) 4-21, 1984.

[19] B. Boehm, "Safe and Simple Software Cost Analysis," IEEE Software, pp. 17(5) 14-17, 2000. [20] L. H. Putnam, "SLIM: a quantitative tool for software cost and schedule estimation," in Proceedings of NBS/IEEE/ACM Software Tool Fair,

1981. [21] H. A. Rubin, "Macroestimation of software development parameters: The Estimacs System," in Proceedings of SOFTFAIR Conference on Software

Development Tools, Techniques and Alternatives , Arlington, 1983.

[22] A. J. Albrecht, "Measuring Application Development Productivity," in Proceedings of IBM Application Development Symp. , Monterey, California, 1979.

[23] Sheta A.F., Ayesh A. and Rine D., "Evaluating Software Cost Estimation Models using Particle Swarm Optimi-zation and Fuzzy Logic for NASA Projects: a Comparative Study," International Journal Bio-Inspired Computation, vol.2, no. 6, pp. 365-373, 2010.

[24] Foss T., Stensrud E., Kitchenham B. and Myrtveit L., "A Simulation Study of the Model Evaluation Criterion MMRE," IEEE Transactions on Software Engineering vol.29, pp. 985-995, 2003.

Modeling Software Effort Estimation Using Hybrid PSO ANFIS

0.4 For standard ANFIS models, training results indicate over

ONCLUSION AND UTURE ORKS

Dokumen yang terkait

Implementasi Hybrid (Content Based Dan Collaborative Filtering) Pada Sistem Rekomendasi Software Antivirus Dengan Multi-Criteria Rating

Estimation of Capacity of Lead Acid Battery Using RBF Model

Dynamic Modeling of Bellows Actuated Continuum Robots Using the Euler–Lagrange Formalism07339740

Quality Attributes Decision Modeling for Software Product Line Architecture

Break Even Point Estimation Using Fuzzy Cluster(FCM)

Business Modeling and Software Design

Implementasi Hybrid (Content Based Dan Collaborative Filtering) Pada Sistem Rekomendasi Software Antivirus Dengan Multi-Criteria Rating

Hybrid Modeling and Control of a Hydroelectric Power Plant

Testing Normality and Bandwith Estimation Using Kernel Method For Small Sample Size

Smart Grid Approach by Evaluating Power System Oscillation Mode Using PMU Based Eigenvalues Estimation

Dukungan

Links

Modeling Software Effort Estimation Using Hybrid PSO ANFIS

0.4 For standard ANFIS models, training results indicate over

ONCLUSION AND UTURE ORKS

Dokumen yang terkait

Implementasi Hybrid (Content Based Dan Collaborative Filtering) Pada Sistem Rekomendasi Software Antivirus Dengan Multi-Criteria Rating

Estimation of Capacity of Lead Acid Battery Using RBF Model

Dynamic Modeling of Bellows Actuated Continuum Robots Using the Euler–Lagrange Formalism07339740

Quality Attributes Decision Modeling for Software Product Line Architecture

Break Even Point Estimation Using Fuzzy Cluster(FCM)

Business Modeling and Software Design

Implementasi Hybrid (Content Based Dan Collaborative Filtering) Pada Sistem Rekomendasi Software Antivirus Dengan Multi-Criteria Rating

Hybrid Modeling and Control of a Hydroelectric Power Plant

Testing Normality and Bandwith Estimation Using Kernel Method For Small Sample Size

Smart Grid Approach by Evaluating Power System Oscillation Mode Using PMU Based Eigenvalues Estimation

Dokumen yang Anda mencari sudah siap untuk unduhkan