Title :
Extracting, identifying and visualisation of the content in software projects
Author :
Uhlar, M. ; Polasek, Ivan
Author_Institution :
Fac. of Inf. & Inf. Technol., Slovak Univ. of Technol. in Bratislava, Bratislava, Slovakia
Abstract :
The paper proposes a method for extracting, identifying and visualising topics in software projects. In addition to standard information retrieval techniques, we use the abstract syntax tree (AST) and the WordNet ontology to enrich document vectors extracted from parsed source code, Latent Semantic Indexing (LSI) to reduce their dimensionality, and swarm intelligence, in the form of bee behaviour inspired algorithms, to cluster the documents. We extract topics from the identified clusters and visualise them in a 3D graph. The goal is to give development participants insight into software projects when analysing and reusing source code.
Keywords :
data visualisation; graph theory; information retrieval; ontologies (artificial intelligence); software engineering; source coding; vectors; 3D graph; AST; LSI; WordNet ontology; content extraction; content identification; content visualisation; document vectors; information retrieval; parsed source code; software projects; Clustering algorithms; Indexes; Large scale integration; Software; Software algorithms; Vectors; Visualization; AST; Bee Behaviour Inspired Algorithms; Latent Semantic Indexing; Software Project; Source Code; Swarm Intelligence; Topic Identification and Extraction; Visualisation; WordNet Ontology;
Conference_Title :
Nature and Biologically Inspired Computing (NaBIC), 2012 Fourth World Congress on
Conference_Location :
Mexico City
Print_ISBN :
978-1-4673-4767-9
DOI :
10.1109/NaBIC.2012.6402242