DocumentCode :
1834757
Title :
Searching for Rules to find Defective Modules in Unbalanced Data Sets
Author :
Rodriguez, David ; Riquelme, J.C. ; Ruiz, R. ; Aguilar-Ruiz, J.S.
Author_Institution :
Dept. of Comput. Sci., Univ. of Alcala, Alcala de Henares
fYear :
2009
fDate :
13-15 May 2009
Firstpage :
89
Lastpage :
92
Abstract :
The characterisation of defective modules in software engineering remains a challenge. In this work, we use data mining techniques to search for rules that indicate modules with a high probability of being defective. Using data sets from the PROMISE repository, we first apply feature selection (attribute selection) so that we work only with those attributes capable of predicting defective modules. On the reduced data set, a genetic algorithm then searches for rules characterising modules with a high probability of being defective. This algorithm overcomes the problem of unbalanced data sets, in which non-defective samples greatly outnumber defective ones.
Keywords :
data mining; genetic algorithms; probability; software reliability; PROMISE repository; data mining technique; defective module; feature selection; genetic algorithm; probability; rule searching; software engineering; unbalanced data set; Computer science; Data mining; Degradation; Electronic mail; Genetic algorithms; Pattern recognition; Robustness; Sampling methods; Software engineering; Defective Modules; Genetic Algorithm; Subgroup Discovery;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Search Based Software Engineering, 2009 1st International Symposium on
Conference_Location :
Windsor
Print_ISBN :
978-0-7695-3675-0
Type :
conf
DOI :
10.1109/SSBSE.2009.23
Filename :
5033185
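Illustrative sketch (not part of the catalogue record): the abstract describes a genetic algorithm that searches for rules covering defective modules in an unbalanced data set. The Python below is a minimal sketch of that idea under stated assumptions, not the authors' implementation. The abstract does not say which fitness function is used; weighted relative accuracy (WRAcc), a common subgroup discovery measure that rewards rules whose defect rate exceeds the base rate, is swapped in here. The attribute names (loc, complexity), the synthetic data and every function name are hypothetical, and the feature selection step is omitted.

```python
import random

def matches(rule, row):
    """A rule is one threshold (or None) per attribute; condition is row[i] >= t."""
    return all(t is None or row[i] >= t for i, t in enumerate(rule))

def wracc(rule, rows, labels):
    """Weighted relative accuracy for the defective class (label 1).

    Assumption: this fitness is a stand-in; the record does not name the
    measure used in the paper.
    """
    covered = [y for x, y in zip(rows, labels) if matches(rule, x)]
    if not covered:
        return 0.0
    coverage = len(covered) / len(rows)
    precision = sum(covered) / len(covered)
    base_rate = sum(labels) / len(rows)
    return coverage * (precision - base_rate)

def random_rule(rows, rng):
    # Seed thresholds from an observed row; drop each condition half the time.
    sample = rng.choice(rows)
    return [v if rng.random() < 0.5 else None for v in sample]

def crossover(a, b, rng):
    # Uniform crossover: each condition comes from one of the two parents.
    return [rng.choice(pair) for pair in zip(a, b)]

def mutate(rule, rows, rng):
    # Replace one condition with None or with a threshold seen in the data.
    child = list(rule)
    i = rng.randrange(len(child))
    child[i] = rng.choice([None, rng.choice(rows)[i]])
    return child

def search_rule(rows, labels, generations=50, pop_size=30, seed=0):
    """Elitist GA: keep the better half, refill with mutated offspring."""
    rng = random.Random(seed)
    pop = [random_rule(rows, rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda r: wracc(r, rows, labels), reverse=True)
        elite = pop[: pop_size // 2]
        children = [
            mutate(crossover(rng.choice(elite), rng.choice(elite), rng), rows, rng)
            for _ in range(pop_size - len(elite))
        ]
        pop = elite + children
    return max(pop, key=lambda r: wracc(r, rows, labels))

def make_data(n=200, seed=1):
    """Hypothetical unbalanced data: (loc, complexity), defective if complexity >= 18."""
    rng = random.Random(seed)
    rows = [(rng.randint(10, 500), rng.randint(1, 20)) for _ in range(n)]
    labels = [1 if cx >= 18 else 0 for _, cx in rows]
    return rows, labels

if __name__ == "__main__":
    rows, labels = make_data()
    best = search_rule(rows, labels)
    print("rule thresholds (loc, complexity):", best)
    print("WRAcc:", wracc(best, rows, labels))
```

Because WRAcc compares a rule's defect rate with the overall base rate, a rule that simply covers every module scores zero, so the majority class cannot dominate the search even when defective modules are rare.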