• DocumentCode
    237283
  • Title
    An Improved Discriminative Model for Duplication Detection on Bug Reports with Cluster Weighting
  • Author
    Meng-Jie Lin; Cheng-Zen Yang
  • Author_Institution
    Dept. of Comput. Sci. & Eng., Yuan Ze Univ., Chungli, Taiwan
  • fYear
    2014
  • fDate
    21-25 July 2014
  • Firstpage
    117
  • Lastpage
    122
  • Abstract
    Processing bug reports plays an important role in software maintenance. Recently, the detection of duplicate bug reports has drawn attention because such duplicates appear frequently. Many NLP-based detection schemes have been proposed, but cluster-level correlation relationships have not been extensively considered in past studies. In this paper, we present an improved detection scheme that uses cluster weighting to enhance the detection performance of a previous SVM-based method. We conducted empirical studies with three open source software projects: Apache, ArgoUML, and SVN. Compared with the original SVM-based method, the proposed SVM-TC scheme improves top-5 recall rates by 2.83-16.32% across the three projects.
  • Keywords
    natural language processing; pattern clustering; program debugging; public domain software; software maintenance; support vector machines; Apache; ArgoUML; NLP-based detection scheme; SVM-TC scheme; SVM-based method; SVN; cluster weighting; cluster-level correlation relationship; detection performance; discriminative model; duplicate bug reports; duplication detection; open source software project; Correlation; Feature extraction; Mathematical model; Software; Support vector machines; Training; Vectors; Bug Reports; Cluster Weighting; Duplication Detection; Empirical Study
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Software and Applications Conference (COMPSAC), 2014 IEEE 38th Annual
  • Conference_Location
    Vasteras
  • Type
    conf
  • DOI
    10.1109/COMPSAC.2014.18
  • Filename
    6899208