• DocumentCode
    3407509
  • Title

    Better cross company defect prediction

  • Author

    Peters, F. ; Menzies, T. ; Marcus, Andrian

  • Author_Institution
    Lane Dept. of CS & EE, West Virginia Univ., Morgantown, WV, USA
  • fYear
    2013
  • fDate
    18-19 May 2013
  • Firstpage
    409
  • Lastpage
    418
  • Abstract
    How can we find data for quality prediction? Early in the life cycle, projects may lack the data needed to build such predictors. Prior work assumed that relevant training data was found nearest to the local project. But is this the best approach? This paper introduces the Peters filter which is based on the following conjecture: When local data is scarce, more information exists in other projects. Accordingly, this filter selects training data via the structure of other projects. To assess the performance of the Peters filter, we compare it with two other approaches for quality prediction: within-company learning and cross-company learning with the Burak filter (the state-of-the-art relevancy filter). This paper finds that: 1) within-company predictors are weak for small data-sets; 2) the Peters filter+cross-company builds better predictors than both within-company and the Burak filter+cross-company; and 3) the Peters filter builds 64% more useful predictors than both within-company and the Burak filter+cross-company approaches. Hence, we recommend the Peters filter for cross-company learning.
  • Keywords
    data mining; learning (artificial intelligence); software quality; Burak filter-cross-company approach; Peters filter-cross-company approach; cross-company defect prediction; cross-company learning; local data; quality prediction; state-of-the-art relevancy filter; training data; within-company learning; within-company predictors; Companies; Estimation; Predictive models; Radio frequency; Software; Training data; Vegetation; Cross company; data mining; defect prediction
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Mining Software Repositories (MSR), 2013 10th IEEE Working Conference on
  • Conference_Location
    San Francisco, CA
  • ISSN
    2160-1852
  • Print_ISBN
    978-1-4799-0345-0
  • Type

    conf

  • DOI
    10.1109/MSR.2013.6624057
  • Filename
    6624057