DocumentCode
3440718
Title
Evaluation of Stability and Similarity of Latent Dirichlet Allocation
Author
Jun Tang ; Ruilong Huo ; Jiali Yao
Author_Institution
China COE, Pivotal, Beijing, China
fYear
2013
fDate
3-4 Dec. 2013
Firstpage
78
Lastpage
83
Abstract
Latent Dirichlet Allocation (LDA) is an unsupervised statistical method that models documents, discovers latent semantic topics in a large set of documents, and categorizes the documents into the learned topics. In this paper, we first introduce LDA and its distributed version, Parallel LDA (PLDA), along with some popular implementations. We then propose a systematic solution for evaluating the stability and similarity of the trained models and classification results of LDA/PLDA. The solution addresses three key challenges: (i) topic matching in the Kullback-Leibler (KL) divergence calculation; (ii) calculating stability using KL divergence and interpreting the relationship between KL divergence and the stability of the trained model and the classification results; and (iii) calculating and evaluating the similarity of trained models and classification results. Finally, we experiment with real-life datasets to show that our solution is sufficient and efficient.
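Illustrative sketch (not taken from the paper): the abstract does not say how topics are matched or how KL divergence is turned into a stability score, so the following is only one plausible reading. It assumes each trained model is a list of topic-word distributions over a shared vocabulary, uses a symmetrised KL divergence, and matches topics by brute force over permutations, which is practical only for a handful of topics. The names kl_divergence, symmetric_kl, match_topics, run_1 and run_2 are hypothetical, and the numbers are toy data rather than results.

```python
import math
from itertools import permutations


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same vocabulary.

    eps is an assumed smoothing constant that avoids division by zero.
    """
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def symmetric_kl(p, q):
    """Average of KL(p || q) and KL(q || p), so topic order does not matter."""
    return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))


def match_topics(model_a, model_b):
    """Pair each topic of model_a with one topic of model_b.

    Tries every permutation and keeps the one with the lowest total
    symmetric KL divergence. Returns (matching, mean divergence), where
    matching[i] is the index in model_b paired with topic i of model_a.
    """
    k = len(model_a)
    best_perm, best_cost = None, float("inf")
    for perm in permutations(range(k)):
        cost = sum(symmetric_kl(model_a[i], model_b[perm[i]]) for i in range(k))
        if cost < best_cost:
            best_perm, best_cost = perm, cost
    return list(best_perm), best_cost / k


# Toy topic-word distributions from two hypothetical training runs.
# The same two topics appear in both runs, but in a different order.
run_1 = [[0.7, 0.2, 0.1], [0.1, 0.1, 0.8]]
run_2 = [[0.1, 0.2, 0.7], [0.6, 0.3, 0.1]]

matching, mean_divergence = match_topics(run_1, run_2)
print(matching, mean_divergence)
```

Under this reading, a mean divergence close to zero after matching would suggest that two runs learned nearly the same topics, while a large value would suggest an unstable model. The matching step matters because LDA assigns topic indices arbitrarily, so comparing topic i of one run with topic i of another is not meaningful without it.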
Keywords
data mining; distributed processing; document handling; pattern classification; statistical analysis; unsupervised learning; KL divergence calculation; Kullback-Leibler divergence calculation; LDA classification; PLDA classification; distributed parallel LDA; document modelling; latent Dirichlet allocation similarity evaluation; latent Dirichlet allocation stability evaluation; latent semantic topic discovery; topic matching; trained model similarity calculation; trained model similarity evaluation; unsupervised statistical method; Classification algorithms; Computational modeling; Electromagnetic compatibility; Google; Measurement; Stability analysis; Systematics; LDA; evaluation; similarity; stability
fLanguage
English
Publisher
IEEE
Conference_Titel
Software Engineering (WCSE), 2013 Fourth World Congress on
Conference_Location
Hong Kong
Print_ISBN
978-1-4799-2882-8
Type
conf
DOI
10.1109/WCSE.2013.17
Filename
6754267