DocumentCode :
3237163
Title :
A Case Study of Bias in Bug-Fix Datasets
Author :
Nguyen, Thanh H D ; Adams, Bram ; Hassan, Ahmed E.
Author_Institution :
Software Anal. & Intell. Lab. (SAIL), Queen's Univ., Kingston, ON, Canada
fYear :
2010
fDate :
13-16 Oct. 2010
Firstpage :
259
Lastpage :
268
Abstract :
Software quality researchers build software quality models by recovering traceability links between bug reports in issue tracking repositories and source code files. However, all too often the data stored in issue tracking repositories is not explicitly tagged or linked to source code. Researchers have to resort to heuristics to tag the data (e.g., to determine if an issue is a bug report or a work item), or to link a piece of code to a particular issue or bug. Recent studies by Bird et al. and by Antoniol et al. suggest that software models based on imperfect datasets, with missing links to the code and incorrect tagging of issues, exhibit biases that compromise the validity and generality of the quality models built on top of the datasets. In this study, we verify the effects of such biases for a commercial project that enforces strict development guidelines and rules on the quality of the data in its issue tracking repository. Our results show that even in such a perfect setting, with a near-ideal dataset, biases do exist, leading us to conjecture that biases are more likely a symptom of the underlying software development process than a result of the heuristics used.
Keywords :
program debugging; program diagnostics; software quality; source coding; bug fix dataset; software development process; software quality model; software traceability; source code files; Birds; Computer bugs; Couplings; Software engineering; Software quality; Tagging; bias; bug-fix; data quality; prediction; sample
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Reverse Engineering (WCRE), 2010 17th Working Conference on
Conference_Location :
Beverly, MA
ISSN :
1095-1350
Print_ISBN :
978-1-4244-8911-4
Type :
conf
DOI :
10.1109/WCRE.2010.37
Filename :
5645567