• DocumentCode
    3587357
  • Title
    Mining Developer Mailing List to Predict Software Defects
  • Author
    Yu Zhang ; Beijun Shen ; Yuting Chen
  • Author_Institution
    Sch. of Software, Shanghai JiaoTong Univ., Shanghai, China
  • Volume
    1
  • fYear
    2014
  • Firstpage
    383
  • Lastpage
    390
  • Abstract
    Prior studies suggest that communication among software stakeholders can be used to predict potential software defects. Yet researchers have rarely studied the relationship between software and its developers' mailing lists. In this paper, we investigate how to predict software defects by mining developer mailing lists. First, we extract both structural and unstructured information from the mailing lists as metrics. The structural information is computed by analyzing the social network hidden in the mailing lists, and the unstructured information is obtained through topical and textual analysis of the lists. Second, we design a mailing list-based approach to predicting software defects. We also analyzed the software repositories of several open source projects by linking their bug tracking databases to the mailing list archives. The experimental results provide empirical evidence that the mailing list metrics are related to software quality and can be used as predictors of defect-proneness. Furthermore, we found that (1) messages having certain structures may indicate some defect-related files, and (2) the sentiment and some topic-specific mailing models correlate strongly with software defects.
  • Keywords
    data mining; electronic mail; program debugging; program testing; public domain software; software quality; bug tracking databases; mailing list-based approach; open source projects; social network; software defect prediction; software developer mailing list mining; software repository; software stakeholders; structural information; textual analysis; topic-specific mailing models; topical analysis; unstructured information; Data mining; Electronic mail; Measurement; Postal services; Predictive models; Software; defect prediction; mailing list; software repository mining
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Software Engineering Conference (APSEC), 2014 21st Asia-Pacific
  • ISSN
    1530-1362
  • Print_ISBN
    978-1-4799-7425-2
  • Type
    conf
  • DOI
    10.1109/APSEC.2014.63
  • Filename
    7091334