DocumentCode :
3335292
Title :
Incomplete-Case Nearest Neighbor Imputation in Software Measurement Data
Author :
Van Hulse, Jason ; Khoshgoftaar, Taghi M.
Author_Institution :
Florida Atlantic Univ., Boca Raton
fYear :
2007
fDate :
13-15 Aug. 2007
Firstpage :
630
Lastpage :
637
Abstract :
Missing values are commonly encountered in software measurement data, and k nearest neighbor imputation (kNNI) is one of the most popular imputation procedures used by researchers and practitioners in empirical software engineering. Imputation techniques replace missing values with one or more alternatives. Traditionally, kNNI uses only complete cases as possible donors for imputation (called complete case kNNI or CCkNNI); however, a variant called incomplete case k nearest neighbor imputation (ICkNNI) is an attractive alternative that has received very little attention. We present a detailed comparative study of CCkNNI and ICkNNI with missing software measurement data, and demonstrate that using incomplete cases often increases the effectiveness of nearest neighbor imputation (especially at higher missingness levels), regardless of the type of missingness.
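The difference between the two donor pools can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `knn_impute`, the Euclidean distance over the recipient's observed attributes, the mean of the k donors' values as the imputed value, and the toy data are all assumptions made for illustration. Here an incomplete case is assumed to qualify as a donor when it has the attribute being imputed and every attribute the recipient has observed.

```python
import numpy as np

def knn_impute(X, k=1, complete_cases_only=True):
    """Fill NaNs with the mean of the k nearest donors' values (illustrative sketch)."""
    X = np.asarray(X, dtype=float)
    filled = X.copy()
    missing = np.isnan(X)
    complete = ~missing.any(axis=1)          # rows with no missing values
    for i, j in zip(*np.nonzero(missing)):
        obs = ~missing[i]                    # attributes observed in the recipient
        # Assumed donor rule: attribute j observed, plus all of the
        # recipient's observed attributes observed.
        donors = ~missing[:, j] & ~missing[:, obs].any(axis=1)
        if complete_cases_only:
            donors &= complete               # CCkNNI: restrict to complete cases
        idx = np.nonzero(donors)[0]
        if idx.size == 0:
            continue                         # no eligible donor; leave as NaN
        # Distances are computed on the original data, not on imputed values.
        dist = np.sqrt(((X[idx][:, obs] - X[i, obs]) ** 2).sum(axis=1))
        nearest = idx[np.argsort(dist, kind="stable")[:k]]
        filled[i, j] = X[nearest, j].mean()
    return filled

nan = np.nan
# Hypothetical toy data: rows 0-2 are incomplete, rows 3-4 are complete.
X_demo = np.array([
    [1.0, nan, nan],
    [1.2, nan, 10.0],
    [0.9, 3.0, nan],
    [5.0, 7.0, 50.0],
    [6.0, 9.0, 60.0],
])
cc = knn_impute(X_demo, k=1, complete_cases_only=True)   # complete-case donors
ic = knn_impute(X_demo, k=1, complete_cases_only=False)  # incomplete cases allowed
```

In this toy data, the first row's nearest neighbors are themselves incomplete, so the complete-case variant has to borrow from the distant complete rows, while the incomplete-case variant can use the nearby rows. The example only shows the mechanism; it says nothing about the relative accuracy reported in the paper.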
Keywords :
data analysis; software metrics; incomplete case k-nearest neighbor imputation; missing values; software engineering; software measurement data; Algorithm design and analysis; Computer science; Data analysis; Data engineering; Libraries; Nearest neighbor searches; Predictive models; Software engineering; Software measurement; State estimation;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Information Reuse and Integration, 2007. IRI 2007. IEEE International Conference on
Conference_Location :
Las Vegas, NV
Print_ISBN :
1-4244-1500-4
Electronic_ISBN :
1-4244-1500-4
Type :
conf
DOI :
10.1109/IRI.2007.4296691
Filename :
4296691