• Title of article

    A study of subgroup discovery approaches for defect prediction

  • Author/Authors

    Rodriguez, Daniel and Ruiz, Roberto and Riquelme, Jose C. and Harrison, Rachel

  • Issue Information
    Monthly, serial issue, year 2013
  • Pages
    13
  • From page
    1810
  • To page
    1822
  • Abstract
    Context: Although many papers have been published on software defect prediction techniques, machine learning approaches have yet to be fully explored. Objective: In this paper we suggest using a descriptive approach for defect prediction rather than the precise classification techniques that are usually adopted. This allows us to characterise defective modules with simple rules that can easily be applied by practitioners and deliver a practical (or engineering) approach rather than a highly accurate result. Method: We describe two well-known subgroup discovery algorithms, the SD algorithm and the CN2-SD algorithm, to obtain rules that identify defect-prone modules. The empirical work is performed with publicly available datasets from the Promise repository and object-oriented metrics from an Eclipse repository related to defect prediction. Subgroup discovery algorithms mitigate characteristics of datasets that hinder the applicability of classification algorithms and so remove the need for preprocessing techniques. Results: The results show that the generated rules can be used to guide testing effort in order to improve the quality of software development projects. Such rules can indicate metrics, their threshold values and relationships between metrics of defective modules. Conclusions: The induced rules are simple to use and easy to understand as they provide a description rather than a complete classification of the whole dataset. Thus this paper represents an engineering approach to defect prediction, i.e., an approach which is useful in practice, easily understandable and can be applied by practitioners.
  • Keywords
    Subgroup discovery, Rules, Defect prediction, Imbalanced datasets
  • Journal title
    Information and Software Technology
  • Serial Year
    2013
  • Record number

    2375167