DocumentCode
1921793
Title
A Novel Self-Organizing Map for Text Document Organization
Author
Yang, Hsin-Chang ; Lee, Chung-Hong
Author_Institution
Dept. Inf. Manage., Nat. Univ. of Kaohsiung, Kaohsiung, Taiwan
fYear
2012
fDate
26-28 Sept. 2012
Firstpage
39
Lastpage
44
Abstract
The self-organizing map (SOM) is a well-known neural network model with a wide range of applications. Its two main characteristics are dimension reduction and topology preservation: a SOM maps a high-dimensional data space to a low-dimensional space while preserving the topological relations among the data. Because of these characteristics, the SOM is commonly applied to data clustering and visualization tasks. One major shortcoming of the classical SOM learning algorithm is that the map topology must be predefined. Furthermore, hierarchical relationships among data are difficult to reveal. In this work, we propose a novel SOM learning algorithm that incorporates several text mining techniques to expand the map both laterally and hierarchically, so that relationships among documents can be discovered from both perspectives. The proposed algorithm first clusters a set of training documents using the classical SOM algorithm. We then identify the topics of each cluster and use them to evaluate the criteria for expanding the map. We applied the algorithm to medium-size datasets and obtained promising results.
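As an illustration of the first step named in the abstract (clustering with the classical SOM algorithm), the following is a minimal sketch and not the authors' proposed method. It assumes documents have already been encoded as numeric vectors with components in [0, 1]; the function names `train_som` and `best_matching_unit`, the linear decay schedules, and all parameter defaults are hypothetical choices made for this example. The grid size is fixed in advance through `rows` and `cols`, which is the predefined-topology limitation the abstract describes.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid position whose weight vector is closest to x (squared Euclidean)."""
    return min(
        weights,
        key=lambda pos: sum((w - v) ** 2 for w, v in zip(weights[pos], x)),
    )


def train_som(data, rows, cols, epochs=50, lr0=0.5, sigma0=None, seed=0):
    """Train a classical SOM on a fixed rows x cols grid (illustrative sketch).

    data: list of equal-length numeric vectors (assumed document vectors).
    Returns a dict mapping (row, col) -> weight vector.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    if sigma0 is None:
        sigma0 = max(rows, cols) / 2.0  # assumed initial neighborhood radius

    # Random initial weights in [0, 1) for every neuron on the grid.
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }

    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(data)
        rng.shuffle(order)
        for x in order:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # learning rate decays linearly
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # radius decays, floored at 0.5
            bmu = best_matching_unit(weights, x)
            for pos, w in weights.items():
                # Gaussian neighborhood centered on the best matching unit.
                d2 = (pos[0] - bmu[0]) ** 2 + (pos[1] - bmu[1]) ** 2
                h = math.exp(-d2 / (2.0 * sigma * sigma))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
            step += 1
    return weights


if __name__ == "__main__":
    # Toy "document vectors": two loosely separated groups.
    docs = [[0.0, 0.1, 0.0], [0.1, 0.0, 0.1], [0.9, 1.0, 0.9], [1.0, 0.9, 1.0]]
    som = train_som(docs, rows=2, cols=2)
    for d in docs:
        print(d, "->", best_matching_unit(som, d))
```

Documents that share a best matching unit form one cluster. The topic identification and the lateral and hierarchical expansion criteria described in the abstract are not reproduced here, since the record does not specify them.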
Keywords
data mining; data visualisation; learning (artificial intelligence); pattern clustering; self-organising feature maps; text analysis; topology; SOM model; classical SOM learning algorithm; data clustering; data visualization; dimension reduction; high-dimensional data space; medium-size datasets; novel SOM learning algorithm; novel self-organizing map; text document organization; text mining techniques; topological relations; topology preservation; training documents; well-known neural network model; Clustering algorithms; Neural networks; Neurons; Text categorization; Topology; Training; Vectors; Hierarchy Generation; Self-organizing Map; Text Mining; Topic Identification;
fLanguage
English
Publisher
IEEE
Conference_Titel
Innovations in Bio-Inspired Computing and Applications (IBICA), 2012 Third International Conference on
Conference_Location
Kaohsiung
Print_ISBN
978-1-4673-2838-8
Type
conf
DOI
10.1109/IBICA.2012.53
Filename
6337634