DocumentCode
1834757
Title
Searching for Rules to find Defective Modules in Unbalanced Data Sets
Author
Rodriguez, David ; Riquelme, J.C. ; Ruiz, R. ; Aguilar-Ruiz, J.S.
Author_Institution
Dept. of Comput. Sci., Univ. of Alcala, Alcala de Henares
fYear
2009
fDate
13-15 May 2009
Firstpage
89
Lastpage
92
Abstract
The characterisation of defective modules in software engineering remains a challenge. In this work, we use data mining techniques to search for rules that indicate modules with a high probability of being defective. Using data sets from the PROMISE repository, we first apply feature selection (attribute selection) so that we work only with those attributes capable of predicting defective modules. On the reduced data set, a genetic algorithm then searches for rules characterising modules with a high probability of being defective. This algorithm overcomes the problem of unbalanced data sets, in which the non-defective samples greatly outnumber the defective ones.
Keywords
data mining; genetic algorithms; probability; software reliability; PROMISE repository; data mining technique; defective module; feature selection; genetic algorithm; rule searching; software engineering; unbalanced data set; Computer science; Data mining; Degradation; Electronic mail; Genetic algorithms; Pattern recognition; Robustness; Sampling methods; Software engineering; Defective Modules; Genetic Algorithm; Subgroup Discovery;
fLanguage
English
Publisher
IEEE
Conference_Titel
Search Based Software Engineering, 2009 1st International Symposium on
Conference_Location
Windsor
Print_ISBN
978-0-7695-3675-0
Type
conf
DOI
10.1109/SSBSE.2009.23
Filename
5033185