DocumentCode :
948411
Title :
Ranked Centroid Projection: A Data Visualization Approach With Self-Organizing Maps
Author :
Yen, Gary G. ; Wu, Zheng
Author_Institution :
Oklahoma State Univ., Stillwater
Volume :
19
Issue :
2
Year :
2008
Firstpage :
245
Lastpage :
259
Abstract :
The self-organizing map (SOM) is an efficient tool for visualizing high-dimensional data. In this paper, the clustering and visualization capabilities of the SOM, especially in the analysis of textual data, i.e., document collections, are reviewed and further developed. A novel clustering and visualization approach based on the SOM is proposed for the task of text mining. The proposed approach first transforms the document space into a multidimensional vector space by means of document encoding. Afterwards, a growing hierarchical SOM (GHSOM) is trained and used as a baseline structure to automatically produce maps with various levels of detail. Following the GHSOM training, the new projection method, namely the ranked centroid projection (RCP), is applied to project the input vectors to a hierarchy of 2D output maps. The RCP is used as a data analysis tool as well as a direct interface to the data. In a set of simulations, the proposed approach is applied to an illustrative data set and two real-world scientific document collections to demonstrate its applicability.
Keywords :
data analysis; data mining; data visualisation; self-organising feature maps; text analysis; 2D output maps; data analysis tool; data clustering; data visualization; document encoding; growing hierarchical SOM; high-dimensional data; multidimensional vector space; ranked centroid projection; scientific document collections; self-organizing maps; text mining; textual data analysis; Data visualization; document clustering; self-organizing map (SOM); Artificial Intelligence; Cluster Analysis; Computer Graphics; Computer Simulation; Neural Networks (Computer); Nonlinear Dynamics; Pattern Recognition, Automated;
Language :
English
Journal_Title :
Neural Networks, IEEE Transactions on
Publisher :
IEEE
ISSN :
1045-9227
Type :
jour
DOI :
10.1109/TNN.2007.905858
Filename :
4359218