• DocumentCode
    1921793
  • Title
    A Novel Self-Organizing Map for Text Document Organization
  • Author
    Yang, Hsin-Chang; Lee, Chung-Hong
  • Author_Institution
    Dept. Inf. Manage., Nat. Univ. of Kaohsiung, Kaohsiung, Taiwan
  • fYear
    2012
  • fDate
    26-28 Sept. 2012
  • Firstpage
    39
  • Lastpage
    44
  • Abstract
    The self-organizing map (SOM) is a well-known neural network model with a wide range of applications. Its two main characteristics are dimension reduction and topology preservation: a SOM maps a high-dimensional data space to a low-dimensional space while preserving the topological relations among the data. Because of these characteristics, the SOM has commonly been applied to data clustering and visualization tasks. One major shortcoming of the classical SOM learning algorithm is that the map topology must be defined in advance. Furthermore, hierarchical relationships among data are difficult to reveal. In this work, we propose a novel SOM learning algorithm that incorporates several text mining techniques to expand the map both laterally and hierarchically, so that it can discover relationships among documents from both perspectives. The proposed algorithm first clusters a set of training documents using the classical SOM algorithm. We then identify the topics of each cluster and use them to evaluate the criteria for expanding the map. We applied the algorithm to medium-size datasets and obtained promising results.
  • Keywords
    data mining; data visualisation; learning (artificial intelligence); pattern clustering; self-organising feature maps; text analysis; topology; SOM model; classical SOM learning algorithm; data clustering; data visualization; dimension reduction; high-dimensional data space; medium-size datasets; novel SOM learning algorithm; novel self-organizing map; text document organization; text mining techniques; topological relations; topology preservation; training documents; well-known neural network model; Clustering algorithms; Neural networks; Neurons; Text categorization; Topology; Training; Vectors; Hierarchy Generation; Self-organizing Map; Text Mining; Topic Identification;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Innovations in Bio-Inspired Computing and Applications (IBICA), 2012 Third International Conference on
  • Conference_Location
    Kaohsiung
  • Print_ISBN
    978-1-4673-2838-8
  • Type
    conf
  • DOI
    10.1109/IBICA.2012.53
  • Filename
    6337634
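
The abstract states that the proposed algorithm first clusters the training documents with the classical SOM algorithm. The following is a minimal sketch of that first step only, assuming documents are already encoded as equal-length term vectors with values in [0, 1]. The function names (`train_som`, `best_matching_unit`), the linear decay schedules, and the toy vectors are illustrative assumptions, not taken from the paper, and the paper's topic identification and lateral/hierarchical map expansion are not shown.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return (row, col) of the neuron whose weight vector is closest to x."""
    best, best_dist = (0, 0), float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            dist = sum((wi - xi) ** 2 for wi, xi in zip(w, x))
            if dist < best_dist:
                best, best_dist = (r, c), dist
    return best


def train_som(data, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a classical SOM on a fixed rows x cols grid (assumed schedule)."""
    rng = random.Random(seed)
    dim = len(data[0])
    sigma0 = max(rows, cols) / 2.0  # initial neighborhood radius (assumption)
    weights = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
               for _ in range(rows)]
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = data[:]
        rng.shuffle(order)
        for x in order:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # learning rate decays linearly
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # radius shrinks, floor at 0.5
            br, bc = best_matching_unit(weights, x)
            for r in range(rows):
                for c in range(cols):
                    grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                    h = math.exp(-grid_d2 / (2.0 * sigma * sigma))
                    w = weights[r][c]
                    for i in range(dim):
                        # Pull the neuron toward x, scaled by grid proximity to the BMU.
                        w[i] += lr * h * (x[i] - w[i])
            step += 1
    return weights


# Toy term vectors for four documents (hypothetical data).
docs = [
    [1.0, 0.9, 0.0, 0.0],
    [0.9, 1.0, 0.1, 0.0],
    [0.0, 0.1, 1.0, 0.9],
    [0.0, 0.0, 0.9, 1.0],
]
som = train_som(docs, rows=2, cols=2)
clusters = [best_matching_unit(som, d) for d in docs]  # one grid cell per document
```

Each document is assigned to the grid cell of its best-matching neuron, so `clusters` gives the document clustering on the fixed map. The grid size must be chosen up front here, which is the predefined-topology limitation the abstract describes.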