Title :
Input Data Representation for Self-Organizing Map in Software Classification
Author :
Lin, Yuqing ; Ye, Huilin
Author_Institution :
Sch. of Electr. Eng. & Comput. Sci., Univ. of Newcastle, Callaghan, NSW, Australia
fDate :
Nov. 30 2009-Dec. 1 2009
Abstract :
A self-organizing map (SOM) is used to classify software documents and their associated software components, with the aim of facilitating software reuse. A SOM learns from input stimuli rather than training data; therefore, the quality of the input data representation is crucial to its success. In this paper, we use an automatic indexing method to represent a document collection as the input data for training a SOM. The automatic indexing uses a phrase formation method to promote precision and a domain-dependent relational thesaurus to enhance recall. A retrieval experiment on a document collection containing 97 Unix manual pages was conducted to evaluate the effectiveness of this input data representation scheme. Promising retrieval results were observed.
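The pipeline outlined in the abstract (index documents into term vectors, then train a SOM on them) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the six toy documents, the `THESAURUS` and `STOPWORDS` tables, the helper names, the 3 x 3 grid, and the decay schedules are all hypothetical, and the phrase formation step is omitted entirely. The thesaurus here merely maps related words to one index term, which is a much cruder device than the relational thesaurus the paper describes.

```python
import math
import random

# Toy documents standing in for manual pages (illustrative, not the paper's data).
DOCS = {
    "cp": "copy files and directories",
    "mv": "move or rename files",
    "rm": "remove files or directories",
    "ps": "report process status",
    "kill": "terminate a process",
    "top": "display running process activity",
}

# Hypothetical thesaurus: maps related words to a single index term.
THESAURUS = {"directories": "file", "files": "file", "rename": "move",
             "terminate": "kill", "running": "process"}
STOPWORDS = {"and", "or", "a"}


def index_terms(text):
    """Tokenize, drop stopwords, and normalize words through the thesaurus."""
    words = [w for w in text.lower().split() if w not in STOPWORDS]
    return [THESAURUS.get(w, w) for w in words]


def build_vectors(docs):
    """Return (vocabulary, {name: unit-length term-frequency vector})."""
    indexed = {name: index_terms(text) for name, text in docs.items()}
    vocab = sorted({t for terms in indexed.values() for t in terms})
    vectors = {}
    for name, terms in indexed.items():
        vec = [float(terms.count(t)) for t in vocab]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        vectors[name] = [v / norm for v in vec]
    return vocab, vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_matching_unit(weights, vec):
    """Grid coordinate of the node whose weight vector is closest to vec."""
    return min(weights, key=lambda pos: sq_dist(weights[pos], vec))


def train_som(vectors, rows=3, cols=3, epochs=200, seed=0):
    """Train a rows x cols SOM with a Gaussian neighbourhood; return its weights."""
    rng = random.Random(seed)
    dim = len(next(iter(vectors.values())))
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    samples = list(vectors.values())
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = 0.5 * (1.0 - frac)                            # learning rate shrinks linearly
        radius = max(rows, cols) / 2 * (1.0 - frac) + 0.5  # neighbourhood shrinks too
        for vec in rng.sample(samples, len(samples)):      # shuffled pass over the inputs
            br, bc = best_matching_unit(weights, vec)
            for (r, c), w in weights.items():
                grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-grid_d2 / (2 * radius ** 2))
                for i in range(dim):
                    w[i] += lr * h * (vec[i] - w[i])       # pull node toward the input
    return weights


vocab, vectors = build_vectors(DOCS)
som = train_som(vectors)
placement = {name: best_matching_unit(som, vec) for name, vec in vectors.items()}
for name, pos in sorted(placement.items()):
    print(name, "->", pos)
```

Each document ends up assigned to the grid node that best matches its vector. In a sketch like this, a query could be indexed the same way and mapped to its best matching node to retrieve the documents placed there, though how the paper performs retrieval is not specified in the abstract.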
Keywords :
Unix; data structures; indexing; information retrieval; relational databases; self-organising feature maps; software reusability; Unix manual pages; automatic indexing method; document retrieval; domain dependent relational thesaurus; input data representation; phrase formation method; self-organizing map; software classification; software components; software documents; software reuse; Australia; Computer science; Content based retrieval; Humans; Knowledge acquisition; Machine assisted indexing; Thesauri; Training data; automatic indexing
Conference_Title :
Knowledge Acquisition and Modeling, 2009. KAM '09. Second International Symposium on
Conference_Location :
Wuhan
Print_ISBN :
978-0-7695-3888-4
DOI :
10.1109/KAM.2009.151