Title :
Combining Text Mining and Data Mining for Bug Report Classification
Author :
Yu Zhou ; Yanxiang Tong ; Ruihang Gu ; Gall, H.
Author_Institution :
Coll. of Comput. Sci., Nanjing Univ. of Aeronaut. & Astronaut., Nanjing, China
fDate :
Sept. 29 2014-Oct. 3 2014
Abstract :
Misclassification of bug reports inevitably degrades the performance of bug prediction models. Manual examination can reduce the noise but places a heavy burden on developers. In this paper, we propose a hybrid approach that combines text mining and data mining techniques on bug report data to automate the prediction process. The first stage leverages text mining techniques to analyze the summary parts of bug reports and classifies them into three levels of probability. These extracted features, together with other structured features of bug reports, are then fed into a machine learner in the second stage. Data grafting techniques are employed to bridge the two stages. In comparative experiments against previous studies on the same data (three large-scale open source projects), our approach consistently achieves a reasonable improvement over their best results in terms of overall performance (from 77.4% to 81.7%, 73.9% to 80.2%, and 87.4% to 93.7%, respectively). Additional comparative experiments on two other popular open source repositories confirm the findings and demonstrate the benefits of our approach.
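The two-stage pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the classifier, the probability cut-offs, or the structured fields, so everything concrete here is an assumption made for illustration: the small naive Bayes model, the 0.4 and 0.6 thresholds in `to_level`, the `severity` and `priority` fields, and the toy training summaries are all hypothetical. The `graft` helper shows only one plausible reading of "data grafting", namely attaching the stage-one output to the structured record.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a deliberately simple assumption)."""
    return text.lower().split()


class TinyNaiveBayes:
    """Multinomial naive Bayes with Laplace smoothing over summary words."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.classes for w in self.word_counts[c]}
        return self

    def prob(self, text, target):
        """Posterior probability that `text` belongs to class `target`."""
        total_docs = sum(self.doc_counts.values())
        log_scores = {}
        for c in self.classes:
            total_words = sum(self.word_counts[c].values())
            score = math.log(self.doc_counts[c] / total_docs)
            for w in tokenize(text):
                if w in self.vocab:  # words never seen in training are skipped
                    count = self.word_counts[c][w] + 1
                    score += math.log(count / (total_words + len(self.vocab)))
            log_scores[c] = score
        # Normalize in a numerically stable way.
        top = max(log_scores.values())
        weights = {c: math.exp(s - top) for c, s in log_scores.items()}
        return weights[target] / sum(weights.values())


def to_level(p, low=0.4, high=0.6):
    """Stage 1 output: map a probability to one of three levels.

    The cut-offs are hypothetical, not taken from the paper.
    """
    if p >= high:
        return "high"
    if p <= low:
        return "low"
    return "middle"


def graft(level, structured):
    """Attach the stage-1 level to the structured fields of a report."""
    row = dict(structured)
    row["text_level"] = level
    return row


# Toy training data (invented for this sketch).
summaries = [
    "crash when opening file",
    "null pointer exception on save",
    "add support for dark theme",
    "improve documentation for api",
]
labels = ["bug", "bug", "nonbug", "nonbug"]
model = TinyNaiveBayes().fit(summaries, labels)

# A new report: free-text summary plus hypothetical structured fields.
report = {"summary": "crash on save", "severity": "major", "priority": "P2"}
p_bug = model.prob(report["summary"], "bug")
structured = {k: v for k, v in report.items() if k != "summary"}
row = graft(to_level(p_bug), structured)
print(row)
```

Rows shaped like `row` would then be the input to the second-stage learner, which sees the text-derived level as one more categorical feature alongside the structured ones.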
Keywords :
data mining; learning (artificial intelligence); program debugging; public domain software; text analysis; bug prediction models; bug report classification; data grafting techniques; data mining techniques; hybrid approach; large-scale open source projects; machine learner; open source repositories; summary parts; text mining techniques; Bayes methods; Feature extraction; Predictive models; Software; Text mining; Training;
Conference_Title :
Software Maintenance and Evolution (ICSME), 2014 IEEE International Conference on
Conference_Location :
Victoria, BC
DOI :
10.1109/ICSME.2014.53