Title :
Codebook: discovering and exploiting relationships in software repositories
Author :
Begel, Andrew ; Khoo, Yit Phang ; Zimmermann, Thomas
Author_Institution :
Microsoft Res., Redmond, WA, USA
Abstract :
Large-scale software engineering requires communication and collaboration to successfully build and ship products. We conducted a survey with Microsoft engineers on inter-team coordination and found that the most impactful problems concerned finding and keeping track of other engineers. Since engineers are connected by their shared work, a tool that discovers connections in their work-related repositories can help. Here we describe the Codebook framework for mining software repositories. It is flexible enough to address all of the problems identified by our survey with a single data structure (graph of people and artifacts) and a single algorithm (regular language reachability). Codebook handles a larger variety of problems than prior work, analyzes more kinds of work artifacts, and can be customized by and for end-users. To evaluate our framework's flexibility, we built two applications, Hoozizat and Deep Intellisense. We evaluated these applications with engineers to show effectiveness in addressing multiple inter-team coordination problems.
Keywords :
data mining; data structures; software development management; team working; Deep Intellisense; Hoozizat Intellisense; codebook framework; data structure; interteam coordination; large-scale software engineering; microsoft engineers; software repositories; software repository mining; Computer bugs; Crawlers; Databases; Electronic mail; Programming; Servers; Software; inter-team coordination; knowledge management; mining software repositories; regular expression; regular language reachability; social networking;
Conference_Title :
Software Engineering, 2010 ACM/IEEE 32nd International Conference on
Conference_Location :
Cape Town
Print_ISBN :
978-1-60558-719-6
DOI :
10.1145/1806799.1806821