Title :
Mining Developer Mailing List to Predict Software Defects
Author :
Yu Zhang ; Beijun Shen ; Yuting Chen
Author_Institution :
Sch. of Software, Shanghai JiaoTong Univ., Shanghai, China
Abstract :
Previous studies suggest that communication among software stakeholders can be used to predict potential software defects, yet researchers have rarely studied the relationship between software and its developers' mailing lists. In this paper, we investigate how to predict software defects by mining developer mailing lists. First, we extract both structural and unstructured information from the mailing lists as metrics. The structural information is computed by analyzing the social network hidden in the mailing lists, and the unstructured information is obtained through topical and textual analysis of the lists. Second, we design a mailing list-based approach to predicting software defects. We also analyze the software repositories of several open source projects by linking their bug tracking databases to the mailing list archives. The experimental results provide empirical evidence that mailing list metrics are related to software quality and can be used as predictors of defect-proneness. Furthermore, we find that (1) messages with certain structures may indicate defect-related files, and (2) sentiment and some topic-specific mailing models correlate strongly with software defects.
Keywords :
data mining; electronic mail; program debugging; program testing; public domain software; software quality; bug tracking databases; mailing list-based approach; open source projects; social network; software defect prediction; software developer mailing list mining; software repository; software stakeholders; structural information; textual analysis; topic-specific mailing models; topical analysis; unstructured information; Data mining; Electronic mail; Measurement; Postal services; Predictive models; Software; defect prediction; mailing list; software repository mining
Conference_Title :
Software Engineering Conference (APSEC), 2014 21st Asia-Pacific
Print_ISBN :
978-1-4799-7425-2
DOI :
10.1109/APSEC.2014.63