  • DocumentCode
    120116
  • Title
    Topic grouping by spectral clustering
  • Author
    Young-Seob Jeong ; Won-Jo Lee ; Ho-Jin Choi
  • Author_Institution
    Dept. of Comput. Sci., KAIST (Korea Adv. Inst. of Sci. & Technol.), Daejeon, South Korea
  • fYear
    2014
  • fDate
    16-19 Feb. 2014
  • Firstpage
    657
  • Lastpage
    661
  • Abstract
    With the growing number of web documents, it becomes difficult to analyze and obtain information from such an array of documents. Furthermore, unsupervised methods are preferable, as most web documents are unlabeled. Probabilistic topic modeling is one such method. It discovers latent structures among unstructured documents. While many traditional topic models usually assume that the topics are independent of each other, some models have been proposed to obtain correlations between the topics or a hierarchy of the topics. These models are designed to obtain both the topics and the correlations without using any other method. Therefore, very few studies apply other methods to determine a correlation between topics. In this paper, we apply spectral clustering to group the topics obtained from a traditional topic model, in this case the Latent Dirichlet Allocation model. To the best of our knowledge, this is the first approach that uses spectral clustering for the grouping of topics. We demonstrate the experimental results with various settings.
  • Keywords
    pattern clustering; probability; text analysis; Web document; latent Dirichlet allocation model; latent structure; probabilistic topic modeling; spectral clustering; topic grouping; unsupervised method; Clustering algorithms; Computational modeling; Correlation; Data models; Educational institutions; Hidden Markov models; Neural networks; Spectral clustering; Topic grouping; Topic model;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Advanced Communication Technology (ICACT), 2014 16th International Conference on
  • Conference_Location
    Pyeongchang
  • Print_ISBN
    978-89-968650-2-5
  • Type
    conf
  • DOI
    10.1109/ICACT.2014.6779044
  • Filename
    6779044
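
The abstract describes grouping topics from a Latent Dirichlet Allocation model with spectral clustering, but it gives no algorithmic detail. The following is a minimal sketch of that general idea, not the authors' implementation. It assumes cosine similarity between topic-word distributions as the affinity, an embedding from the top eigenvectors of the degree-normalized affinity matrix, and a plain k-means step. All function names and the toy `topics` matrix are hypothetical and stand in for the topic-word output of an LDA run.

```python
import numpy as np


def cosine_affinity(topic_word):
    """Pairwise cosine similarity between topic-word rows (assumed measure)."""
    unit = topic_word / np.linalg.norm(topic_word, axis=1, keepdims=True)
    affinity = unit @ unit.T
    np.fill_diagonal(affinity, 0.0)  # no self-loops in the topic graph
    return affinity


def spectral_embedding(affinity, n_groups):
    """Top eigenvectors of D^{-1/2} W D^{-1/2}, each row scaled to unit length."""
    degree = np.maximum(affinity.sum(axis=1), 1e-12)  # guard isolated topics
    inv_sqrt = 1.0 / np.sqrt(degree)
    normalized = affinity * inv_sqrt[:, None] * inv_sqrt[None, :]
    _, vectors = np.linalg.eigh(normalized)  # eigenvalues in ascending order
    embedding = vectors[:, -n_groups:]
    norms = np.maximum(np.linalg.norm(embedding, axis=1, keepdims=True), 1e-12)
    return embedding / norms


def kmeans(points, n_groups, n_iter=50):
    """Lloyd's k-means with deterministic farthest-point initialization."""
    centers = [points[0]]
    for _ in range(1, n_groups):
        dist = np.min([np.sum((points - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for g in range(n_groups):
            if np.any(labels == g):
                centers[g] = points[labels == g].mean(axis=0)
    return labels


def group_topics(topic_word, n_groups):
    """Assign each topic (row) to one of n_groups via spectral clustering."""
    affinity = cosine_affinity(np.asarray(topic_word, dtype=float))
    return kmeans(spectral_embedding(affinity, n_groups), n_groups)


# Toy topic-word matrix: 4 topics over a 6-word vocabulary (made-up numbers).
# Topics 0 and 1 favor words 0-2; topics 2 and 3 favor words 3-5.
topics = np.array([
    [0.40, 0.30, 0.20, 0.04, 0.03, 0.03],
    [0.30, 0.40, 0.20, 0.03, 0.04, 0.03],
    [0.03, 0.04, 0.03, 0.40, 0.30, 0.20],
    [0.04, 0.03, 0.03, 0.20, 0.30, 0.40],
])
labels = group_topics(topics, n_groups=2)
print(labels)
```

On this toy input the first two topics should share one group label and the last two the other. With a real model, `topics` would be replaced by the fitted topic-word matrix, and the choice of affinity (cosine here) is the main design decision to revisit.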