Title :
A LDA-Based Approach for Interactive Web Mining of Topic Evolutionary Patterns
Author :
Zhou, Bin ; Huang, Jiuming ; Cui, Kai
Author_Institution :
Sch. of Comput., Nat. Univ. of Defense Technol., Changsha, China
Abstract :
Many real-world Web mining tasks need to discover topics interactively, which means that users are likely to intervene in the topic discovery and selection processes by expressing their preferences. In this paper, a new algorithm based on Latent Dirichlet Allocation (LDA) is proposed for interactive topic evolution pattern detection. To eliminate topics that are not of interest, it allows users to add supervised information by adjusting the posterior topic-word distributions, which may influence the inference process of the following iteration. A framework is designed to incorporate different kinds of supervised information. Experiments on English and Chinese corpora show that the extracted topics capture meaningful themes and that the supervised information can help to find better topics more efficiently.
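Note: the abstract does not give the update rule, so the following is only a minimal sketch of one way user feedback on a posterior topic-word distribution could be fed into the next inference pass. It is not the authors' algorithm. The function names (`adjust_topic_word`, `to_pseudo_counts`), the multiplicative feedback weights, and the `strength` and `floor` parameters are all assumptions made for illustration.

```python
def adjust_topic_word(phi, feedback):
    """Rescale selected entries of a topic-word matrix and renormalize rows.

    phi: list of rows, one per topic, each a list of word probabilities.
    feedback: dict mapping (topic_index, word_index) -> non-negative weight
              (hypothetical format); a weight below 1 demotes the word in
              that topic, a weight above 1 promotes it.
    """
    adjusted = []
    for k, row in enumerate(phi):
        scaled = [p * feedback.get((k, w), 1.0) for w, p in enumerate(row)]
        total = sum(scaled)
        if total == 0:
            raise ValueError(f"feedback zeroed out every word in topic {k}")
        adjusted.append([s / total for s in scaled])
    return adjusted


def to_pseudo_counts(phi, strength=100.0, floor=0.01):
    """Turn adjusted rows into Dirichlet pseudo-counts for the next pass.

    Assumption: the adjusted distribution acts as an informative prior,
    so the user's preference biases the following iteration of inference.
    """
    return [[floor + strength * p for p in row] for row in phi]


# Toy example: 2 topics over a 3-word vocabulary.
phi = [[0.5, 0.3, 0.2], [0.1, 0.1, 0.8]]
# The user removes word 0 from topic 0 and promotes word 2 in topic 1.
feedback = {(0, 0): 0.0, (1, 2): 2.0}
adjusted = adjust_topic_word(phi, feedback)
prior = to_pseudo_counts(adjusted)
```

Under these assumptions, `prior` would replace the symmetric topic-word hyperparameter in the next round of sampling, so demoted words start with almost no mass in the affected topic.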
Keywords :
Internet; data mining; inference mechanisms; probability; text analysis; LDA-based approach; interactive Web mining; interactive topic evolution pattern detection; latent Dirichlet allocation; posterior topic-word distributions; selection process; semisupervised inference process; topic discovery; Evolution (biology); Filtering; Probabilistic logic; Semantics; Smoothing methods; Web mining;
Conference_Title :
Internet Technology and Applications, 2010 International Conference on
Conference_Location :
Wuhan
Print_ISBN :
978-1-4244-5142-5
Electronic_ISBN :
978-1-4244-5143-2
DOI :
10.1109/ITAPP.2010.5566219