DocumentCode
237283
Title
An Improved Discriminative Model for Duplication Detection on Bug Reports with Cluster Weighting
Author
Meng-Jie Lin ; Cheng-Zen Yang
Author_Institution
Dept. of Comput. Sci. & Eng., Yuan Ze Univ., Chungli, Taiwan
fYear
2014
fDate
21-25 July 2014
Firstpage
117
Lastpage
122
Abstract
Processing bug reports plays an important role in software maintenance. Recently, the detection of duplicate bug reports has drawn attention because duplicates appear so frequently. Many NLP-based detection schemes have been proposed, but past studies have not extensively considered cluster-level correlation relationships. In this paper, we present an improved detection scheme that uses cluster weighting to enhance the detection performance of a previous SVM-based method. We conducted empirical studies on three open source software projects: Apache, ArgoUML, and SVN. Compared with the original SVM-based method, the proposed SVM-TC scheme improves the top-5 recall rates by 2.83-16.32% across the three projects.
Keywords
natural language processing; pattern clustering; program debugging; public domain software; software maintenance; support vector machines; Apache; ArgoUML; NLP-based detection scheme; SVM-TC scheme; SVM-based method; SVN; cluster weighting; cluster-level correlation relationship; detection performance; discriminative model; duplicate bug reports; duplication detection; open source software project; software maintenance; Correlation; Feature extraction; Mathematical model; Software; Support vector machines; Training; Vectors; Bug Reports; Cluster Weighting; Duplication Detection; Empirical Study;
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer Software and Applications Conference (COMPSAC), 2014 IEEE 38th Annual
Conference_Location
Vasteras
Type
conf
DOI
10.1109/COMPSAC.2014.18
Filename
6899208