Title :
Topic grouping by spectral clustering
Author :
Young-Seob Jeong ; Won-Jo Lee ; Ho-Jin Choi
Author_Institution :
Dept. of Comput. Sci., KAIST (Korea Adv. Inst. of Sci. & Technol.), Daejeon, South Korea
Abstract :
As the number of web documents grows, it becomes difficult to analyze them and to extract information from them. Because most web documents are unlabeled, unsupervised methods are preferable. Probabilistic topic modeling is one such method: it discovers latent structure in unstructured documents. While many traditional topic models assume that topics are independent of each other, some models have been proposed to capture correlations between topics or a hierarchy of topics. These models are designed to obtain both the topics and their correlations without using any other method. As a result, very few studies apply a separate method to determine correlations between topics. In this paper, we apply spectral clustering to group the topics obtained from a traditional topic model, in this case the Latent Dirichlet Allocation model. To the best of our knowledge, this is the first approach that uses spectral clustering to group topics. We present experimental results under various settings.
Keywords :
pattern clustering; probability; text analysis; Web document; latent Dirichlet allocation model; latent structure; probabilistic topic modeling; spectral clustering; topic grouping; unsupervised method; Clustering algorithms; Computational modeling; Correlation; Data models; Educational institutions; Hidden Markov models; Neural networks; Spectral clustering; Topic grouping; Topic model;
Conference_Title :
Advanced Communication Technology (ICACT), 2014 16th International Conference on
Conference_Location :
Pyeongchang
Print_ISBN :
978-89-968650-2-5
DOI :
10.1109/ICACT.2014.6779044
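
The abstract describes grouping the topics of a Latent Dirichlet Allocation model with spectral clustering. The sketch below illustrates that general idea and is not the authors' implementation. It makes several assumptions the record does not state: the topic-word matrix is a hand-written toy rather than fitted LDA output, topic similarity is cosine similarity, the Laplacian is the symmetric normalized one, and the final step is a small k-means. The names `group_topics` and `_kmeans` are hypothetical.

```python
import numpy as np


def _kmeans(points, k, n_iter=50):
    """Small Lloyd's k-means with deterministic farthest-point initialization."""
    centers = [points[0]]
    for _ in range(1, k):
        # Squared distance from each point to its nearest chosen center.
        dists = np.min([np.sum((points - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(dists)])
    centers = np.array(centers)
    for _ in range(n_iter):
        sq = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = np.argmin(sq, axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def group_topics(topic_word, n_groups=2):
    """Group topics by spectral clustering of their word distributions.

    Assumption: cosine similarity between topic-word rows is the affinity.
    """
    unit = topic_word / np.linalg.norm(topic_word, axis=1, keepdims=True)
    affinity = unit @ unit.T
    np.fill_diagonal(affinity, 0.0)  # no self-loops in the topic graph

    # Symmetric normalized Laplacian: L = I - D^{-1/2} A D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(affinity.sum(axis=1))
    laplacian = np.eye(len(affinity)) - d_inv_sqrt[:, None] * affinity * d_inv_sqrt[None, :]

    # eigh returns eigenvalues in ascending order; keep the smallest n_groups.
    _, eigvecs = np.linalg.eigh(laplacian)
    embedding = eigvecs[:, :n_groups]
    embedding = embedding / np.linalg.norm(embedding, axis=1, keepdims=True)
    return _kmeans(embedding, n_groups)


# Toy topic-word matrix (4 topics x 6 words), standing in for LDA output.
topic_word = np.array([
    [0.40, 0.40, 0.10, 0.04, 0.03, 0.03],
    [0.35, 0.45, 0.10, 0.04, 0.03, 0.03],
    [0.03, 0.03, 0.04, 0.10, 0.40, 0.40],
    [0.03, 0.04, 0.03, 0.10, 0.45, 0.35],
])
labels = group_topics(topic_word, n_groups=2)
print(labels)
```

In this toy input the first two topics share their high-probability words, as do the last two, so the two pairs should land in different groups. With real data, the rows of `topic_word` would come from a fitted topic model, and the affinity measure and number of groups would need to be chosen for the corpus.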