• Title of article

    Incomplete-case nearest neighbor imputation in software measurement data

  • Author/Authors

    Jason Van Hulse (author), Taghi M. Khoshgoftaar (author)

  • Issue Information
    Journal with serial number, year 2014
  • Pages
    15
  • From page
    596
  • To page
    610
  • Abstract
    k nearest neighbor imputation (kNNI) is one of the most popular methods in empirical software engineering for imputing missing values. kNNI typically uses only complete cases as possible donors for imputation (called complete case kNNI or CCkNNI). Though it often produces reasonable results, CCkNNI is severely limited when the amount of missing data is large (and hence the number of complete cases is small). In response, a variant of CCkNNI called incomplete case k nearest neighbor imputation (ICkNNI) has been proposed as an attractive alternative. This work presents a detailed simulation comparing CCkNNI and ICkNNI using two different software measurement datasets. The empirical results show that using incomplete cases often increases the effectiveness of nearest neighbor imputation (especially at higher missingness levels), regardless of the type of missingness (i.e., the distribution of missing values in the data).
  • Keywords
    Complete-case , Nearest neighbor imputation , Incomplete-case , Software measurement data
  • Journal title
    Information Sciences
  • Serial Year
    2014
  • Record number

    1216008
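
The distinction the abstract draws between CCkNNI and ICkNNI can be illustrated with a minimal Python sketch. It is not the authors' implementation, and everything specific in it is an assumption made for illustration: the function name `knn_impute`, the Euclidean distance over the attributes the recipient observes, the mean of the k nearest donor values as the imputed value, and the eligibility rule for incomplete donors (a donor must observe the attribute being imputed and every attribute the recipient observes).

```python
import math


def knn_impute(rows, k, use_incomplete_donors):
    """Return a copy of rows with None entries filled from the k nearest donors.

    Hypothetical helper for illustration only; missing values are None.
    """
    result = [list(r) for r in rows]
    for i, recipient in enumerate(rows):
        # Attributes the recipient actually observes; distances use only these.
        observed = [a for a, v in enumerate(recipient) if v is not None]
        for target, value in enumerate(recipient):
            if value is not None:
                continue
            needed = observed + [target]
            candidates = []
            for d, donor in enumerate(rows):
                if d == i:
                    continue
                if use_incomplete_donors:
                    # Assumed ICkNNI rule: donor may have gaps elsewhere.
                    eligible = all(donor[a] is not None for a in needed)
                else:
                    # CCkNNI: only fully observed rows may donate.
                    eligible = all(v is not None for v in donor)
                if eligible:
                    dist = math.sqrt(
                        sum((recipient[a] - donor[a]) ** 2 for a in observed)
                    )
                    candidates.append((dist, donor[target]))
            if not candidates:
                continue  # no eligible donor: leave the value missing
            candidates.sort(key=lambda c: c[0])
            nearest = candidates[:k]
            # Assumed aggregation: mean of the nearest donors' values.
            result[i][target] = sum(v for _, v in nearest) / len(nearest)
    return result


# Made-up toy data, not taken from the paper's datasets.
rows = [
    [1.0, 10.0, 100.0],  # complete
    [9.0, 90.0, 900.0],  # complete
    [1.2, 12.0, None],   # incomplete, but observes attributes 0 and 1
    [1.1, None, None],   # recipient: observes attribute 0 only
]

cc = knn_impute(rows, k=2, use_incomplete_donors=False)
ic = knn_impute(rows, k=2, use_incomplete_donors=True)
print("CCkNNI estimate for row 3, attribute 1:", cc[3][1])
print("ICkNNI estimate for row 3, attribute 1:", ic[3][1])
```

In this toy data only two rows are complete, so the complete-case variant has to average in a distant donor, while the incomplete-case variant can also draw on the nearby third row. That mirrors the situation the abstract describes, where complete cases become scarce as missingness grows.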