Title :
Applying Fellegi-Sunter (FS) Model for Traceability Link Recovery between Bug Databases and Version Archives
Author :
Sureka, Ashish ; Lal, Sangeeta ; Agarwal, Lucky
Author_Institution :
IIIT-D, New Delhi, India
Abstract :
Defect tracking systems such as Bugzilla and JIRA, and source code version control systems such as CVS and SVN, are widely used to support software development and maintenance activities. Previous studies show that bug databases and version databases are often maintained as separate, standalone repositories without explicit links between issue reports and the corresponding commit transactions. This is because developers often do not tag commit transactions with the relevant bug report IDs or mention those IDs explicitly. The lack of explicit links between these two databases has been identified as a serious process data quality issue (incomplete and biased data) with implications for predictive model building (such as defect density and error proneness computation) and for hypothesis testing based on the dataset. Researchers have proposed solutions to link the two databases and have performed experiments on open-source projects such as Mozilla Firefox. We review previous approaches and propose a novel technique, based on the Fellegi-Sunter (FS) model for record linkage, that automatically integrates the two databases and overcomes some of the drawbacks of traditional methods. We validate the proposed approach by performing experiments on publicly available bug and version datasets obtained from two open-source projects (Apache HTTP Server and WikiMedia). The results of our experiments demonstrate that the proposed solution is effective in recovering traceability links (missing links) between bug-fixing commits and the corresponding bug reports.
Keywords :
database management systems; program debugging; system recovery; FS; Fellegi-Sunter application; Fire Fox Mozilla; HTTP Server; WikiMedia; bug databases; error proneness computation; open-source projects; software development; source code version control systems; traceability link recovery; tracking systems; version archives; version databases; Couplings; Databases; Joining processes; Software; Software engineering; Training; Vectors; Automated Software Engineering; Data Integration; Defect Tracking Systems; Fellegi-Sunter (FS) Model; Mining Software Repositories; Record Linkages; Software Engineering Process Data Analysis; Traceability Link Recovery; Version Archives;
Conference_Title :
Software Engineering Conference (APSEC), 2011 18th Asia Pacific
Conference_Location :
Ho Chi Minh
Print_ISBN :
978-1-4577-2199-1
DOI :
10.1109/APSEC.2011.12