DocumentCode :
2673622
Title :
TopicXP: Exploring topics in source code using Latent Dirichlet Allocation
Author :
Savage, Trevor ; Dit, Bogdan ; Gethers, Malcom ; Poshyvanyk, Denys
Author_Institution :
Dept. of Comput. Sci., Coll. of William & Mary, Williamsburg, VA, USA
fYear :
2010
fDate :
12-18 Sept. 2010
Firstpage :
1
Lastpage :
6
Abstract :
Acquiring a general understanding of large software systems and the components from which they are built can be a time-consuming task, but such an understanding is an important prerequisite to adding features or fixing bugs. In this paper we propose a tool, TopicXP, that supports developers during such software maintenance tasks by extracting and analyzing the unstructured information in source code identifier names and comments using Latent Dirichlet Allocation. TopicXP enables developers to gain an overview of a software system under analysis by extracting and visualizing natural language topics, which generally correspond to concepts or features implemented in software classes. TopicXP is implemented as an open-source Eclipse plug-in that provides an interactive visualization of topics along with the structural dependencies between the underlying classes implementing these topics. The paper also presents the results of a preliminary user study aimed at evaluating TopicXP.
Keywords :
data visualisation; natural language processing; object-oriented programming; software maintenance; TopicXP; interactive visualization; latent Dirichlet allocation; natural language topics; open-source Eclipse plug-in; software components; source code identifier names; topics exploration; Resource management; Visualization
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Software Maintenance (ICSM), 2010 IEEE International Conference on
Conference_Location :
Timisoara
ISSN :
1063-6773
Print_ISBN :
978-1-4244-8630-4
Electronic_ISBN :
1063-6773
Type :
conf
DOI :
10.1109/ICSM.2010.5609654
Filename :
5609654