Title :
Clustering Source Code Elements by Semantic Similarity Using Wikipedia
Author :
Schindler, Mirco ; Fox, Oliver ; Rausch, Andreas
Author_Institution :
Dept. of Inf. - Software Syst. Eng., Clausthal Univ. of Technol., Clausthal-Zellerfeld, Germany
Abstract :
For humans, it is no problem to determine whether two words have a high or low semantic similarity in a given context. But can semantic data extracted from source code support a software developer or architect in the same way that other relations, such as typical source code relations, do? To answer this question, we developed an approach that computes semantic similarity using Wikipedia as a textual corpus. In a case study, we demonstrate this approach on a manageable software system. The results obtained with semantic similarities are compared with the outcome of using source code relations instead.
Keywords :
Web sites; information retrieval; pattern clustering; semantic Web; software development management; source code (software); text analysis; word processing; Wikipedia; clustering source code element; manageable software system; semantic data extraction; semantic similarity; software developer; source code relations; textual corpus; Electronic publishing; Encyclopedias; Internet; Semantics; Software systems; Component Structure; Information Retrieval; Modularization; Semantic Similarity; Software Architecture; Spectral Clustering; Wikipedia
Conference_Title :
Realizing Artificial Intelligence Synergies in Software Engineering (RAISE), 2015 IEEE/ACM 4th International Workshop on
Conference_Location :
Florence
DOI :
10.1109/RAISE.2015.10