• DocumentCode
    2864603
  • Title

    Hierarchy-regularized latent semantic indexing

  • Author

    Huang, Yi ; Yu, Kai ; Schubert, Matthias ; Yu, Shipeng ; Tresp, Volker ; Kriegel, Hans-Peter

  • Author_Institution
    Inst. for Comput. Sci., Munich Univ., Germany
  • fYear
    2005
  • fDate
    27-30 Nov. 2005
  • Abstract
    Organizing textual documents into a hierarchical taxonomy is a common practice in knowledge management. Besides textual features, the hierarchical structure of directories reflects additional and important knowledge annotated by experts. It is generally desirable to incorporate this information into text mining processes. In this paper, we propose hierarchy-regularized latent semantic indexing, which encodes the hierarchy into a similarity graph of documents and then formulates an optimization problem that maps each document into a low-dimensional vector space. The new feature space preserves the intrinsic structure of the original taxonomy and thus provides a meaningful basis for various learning tasks such as visualization and classification. Our approach employs information about class proximity and class specificity, and can naturally cope with multi-labeled documents. Our empirical studies show very encouraging results on two real-world data sets, the new Reuters (RCV1) benchmark and the Swissprot protein database.
  • Keywords
    data mining; indexing; knowledge management; text analysis; document mapping; hierarchical structure; hierarchical taxonomy; hierarchy-regularized latent semantic indexing; optimization problem; similarity graph; text mining; textual documents; Computer science; Data visualization; Indexing; Knowledge management; Navigation; Organizing; Proteins; Taxonomy; Text mining; Visual databases
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining, Fifth IEEE International Conference on
  • ISSN
    1550-4786
  • Print_ISBN
    0-7695-2278-5
  • Type

    conf

  • DOI
    10.1109/ICDM.2005.76
  • Filename
    1565677
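
The approach summarized in the abstract (encode the taxonomy as a similarity graph of documents, then map each document into a low-dimensional vector space) can be illustrated with a minimal sketch. The abstract gives no formulas, so everything here is an assumption rather than the authors' method: the toy taxonomy `PARENT`, the shared-path similarity measure, the max-over-label-pairs rule for multi-labeled documents, and the use of an unnormalized graph Laplacian eigendecomposition (a Laplacian-eigenmaps-style embedding) in place of the paper's actual optimization problem. Textual features are left out entirely.

```python
import numpy as np

# Hypothetical toy taxonomy: child -> parent (None marks the root).
PARENT = {
    "root": None,
    "sports": "root",
    "science": "root",
    "soccer": "sports",
    "tennis": "sports",
    "physics": "science",
}


def path_from_root(node):
    """Return the list of classes from the root down to `node`."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path[::-1]


def class_similarity(a, b):
    """Assumed measure: shared path prefix length / longer path length."""
    pa, pb = path_from_root(a), path_from_root(b)
    shared = 0
    for x, y in zip(pa, pb):
        if x != y:
            break
        shared += 1
    return shared / max(len(pa), len(pb))


def doc_similarity(labels_a, labels_b):
    """Multi-labeled documents: take the best-matching pair of labels."""
    return max(class_similarity(a, b) for a in labels_a for b in labels_b)


def embed(doc_labels, dim=2):
    """Build the similarity graph W and a spectral embedding Y (n x dim)."""
    n = len(doc_labels)
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            W[i, j] = W[j, i] = doc_similarity(doc_labels[i], doc_labels[j])
    L = np.diag(W.sum(axis=1)) - W  # unnormalized graph Laplacian
    _, eigvecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    # Skip the first (constant) eigenvector; keep the next `dim` columns.
    return W, eigvecs[:, 1:dim + 1]


# Four toy documents; the last one carries two labels.
docs = [["soccer"], ["tennis"], ["physics"], ["soccer", "physics"]]
W, Y = embed(docs, dim=2)
print(np.round(W, 3))
print(np.round(Y, 3))
```

Because every class shares at least the root, the toy graph is connected and only the first eigenvector is constant. Documents joined by larger weights in `W` are pulled closer together in `Y`, which is the sense in which such an embedding reflects the taxonomy.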