Title :
Effort estimation of ETL projects using Forward Stepwise Regression
Author :
Raza Rasool;Ali Afzal Malik
Author_Institution :
Department of Computer Science, National University of Computer and Emerging Sciences, Lahore, Pakistan
Abstract :
Effort estimation is a key component of planning a software development project. Estimation methods for traditional applications have been researched extensively, but these methods do not apply to Extract Transform Load (ETL) projects. Producing a systematic effort estimate for an ETL project is challenging because ETL development does not follow the traditional Software Development Life Cycle (SDLC): traditional application development is requirements-driven, whereas ETL application development is data-driven. This paper describes the development of an effort estimation model for ETL projects and compares it with the most widely used algorithmic effort estimation model, COCOMO II. The model is built using Forward Stepwise Regression on a dataset of 220 industrial projects from five different software houses. After eliminating 20 outliers from this dataset, the adjusted R² (a goodness-of-fit measure) of our model is 0.87. The training and prediction accuracy of the model are measured using the de facto standard accuracy measures MMRE and PRED(25). On the training dataset of 200 projects, PRED(25) is 81.16% and MMRE is 0.16. On a validation dataset of 58 projects, our model provides considerably better estimation accuracy than COCOMO II: PRED(25) is 49% for our model compared to 21% for COCOMO II, and MMRE is 0.31 compared to 0.99.
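The abstract reports accuracy as MMRE and PRED(25) but does not spell out the formulas. As a minimal sketch, assuming the commonly used definitions (MRE = |actual - estimated| / actual, MMRE = mean of the MREs, PRED(25) = percentage of projects with MRE of at most 0.25), the two measures could be computed as follows. The function names and the effort values are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def mre(actual: float, predicted: float) -> float:
    """Magnitude of relative error for a single project."""
    if actual <= 0:
        raise ValueError("actual effort must be positive")
    return abs(actual - predicted) / actual


def _all_mre(actuals: Sequence[float], predictions: Sequence[float]) -> list:
    """MRE of every project; inputs must be non-empty and of equal length."""
    if len(actuals) != len(predictions) or len(actuals) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    return [mre(a, p) for a, p in zip(actuals, predictions)]


def mmre(actuals: Sequence[float], predictions: Sequence[float]) -> float:
    """Mean magnitude of relative error (lower is better)."""
    errors = _all_mre(actuals, predictions)
    return sum(errors) / len(errors)


def pred(actuals: Sequence[float], predictions: Sequence[float],
         level: float = 25.0) -> float:
    """Percentage of projects whose MRE is at most `level` percent."""
    errors = _all_mre(actuals, predictions)
    hits = sum(1 for e in errors if e <= level / 100.0)
    return 100.0 * hits / len(errors)


# Hypothetical effort values in person-hours (illustration only).
actual_effort = [100.0, 200.0, 400.0, 80.0]
estimated_effort = [110.0, 150.0, 300.0, 160.0]

print(f"MMRE     = {mmre(actual_effort, estimated_effort):.2f}")
print(f"PRED(25) = {pred(actual_effort, estimated_effort):.0f}%")
```

For these made-up values the per-project MREs are 0.10, 0.25, 0.25 and 1.00, so MMRE is 0.40 and three of the four projects fall within the 25% band. A single badly estimated project inflates MMRE noticeably while PRED(25) only counts it as one miss, which is one reason the two measures are usually reported together.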
Keywords :
"Estimation","Predictive models","Software","Mathematical model","Prediction algorithms","Software algorithms","Standards"
Conference_Title :
Emerging Technologies (ICET), 2015 International Conference on
Print_ISBN :
978-1-5090-2013-3
DOI :
10.1109/ICET.2015.7389209