Title of article :
Clustering Classes in Packages for Program Comprehension
Author/Authors :
Sun, Xiaobing (School of Information Engineering, Yangzhou University, Yangzhou, China); Liu, Xiangyue (School of Information Engineering, Yangzhou University, Yangzhou, China); Li, Bin (School of Information Engineering, Yangzhou University, Yangzhou, China); Li, Bixin (School of Computer Science and Engineering, Southeast University, Nanjing, China); Lo, David (School of Information Systems, Singapore Management University, Singapore); Liao, Lingzhi (Nanjing University of Information Science & Technology, Nanjing, China)
Pages :
16
From page :
1
To page :
16
Abstract :
During software maintenance and evolution, developers must understand a system quickly and accurately. As an evolving system grows in size and complexity, program comprehension becomes increasingly difficult. Given a target system, developers may first focus on comprehending its packages, which vary in size. Small packages are easy to comprehend, whereas large packages are difficult to understand. In this article, we focus on these large packages and propose a novel program comprehension approach that uses the Latent Dirichlet Allocation (LDA) model to cluster the classes within them. Each large package is thereby separated into small clusters, which are easier for developers to comprehend. Empirical studies on four real-world software projects demonstrate the effectiveness of our approach: it is more effective than clustering approaches based on Latent Semantic Indexing (LSI) and Probabilistic Latent Semantic Analysis (PLSA). In addition, we find that the topic that labels each cluster is useful for program comprehension.
Keywords :
Clustering Classes , Program Comprehension , Packages
Journal title :
Scientific Programming
Serial Year :
2017
Full Text URL :
Record number :
2608084
Link To Document :