DocumentCode :
2878661
Title :
Extracting concepts from file names; a new file clustering criterion
Author :
Anquetil, Nicolas ; Lethbridge, Timothy
Author_Institution :
Sch. of Inf. Technol. & Eng., Ottawa Univ., Ont., Canada
fYear :
1998
fDate :
19-25 Apr 1998
Firstpage :
84
Lastpage :
93
Abstract :
Decomposing complex software systems into conceptually independent subsystems is a significant software engineering activity which has received considerable research attention. Most of the research in this domain considers the body of the source code, trying to cluster together files which are conceptually related. We discuss techniques for extracting concepts (abbreviations) from a more informal source of information: file names. The task is difficult because nothing indicates where to split the file names into substrings. In general, finding abbreviations would require domain knowledge to identify the concepts that are referred to in a name and intuition to recognize such concepts in abbreviated forms. We show by experiment that the techniques we propose allow about 90% of the abbreviations to be found automatically.
Keywords :
file organisation; reverse engineering; software maintenance; abbreviations; artificial intelligence; complex software decomposition; concept extraction; design recovery; domain knowledge; experiment; file clustering criterion; file names; independent subsystems; program understanding; research; reverse engineering; software engineering; source code; substrings; Buildings; Data mining; Information resources; Information technology; Organizing; Reverse engineering; Software maintenance; Software systems; Software tools; Tellurium;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Software Engineering, 1998. Proceedings of the 1998 International Conference on
Conference_Location :
Kyoto
ISSN :
0270-5257
Print_ISBN :
0-8186-8368-6
Type :
conf
DOI :
10.1109/ICSE.1998.671105
Filename :
671105