Title :
Performance evaluation of Latent Dirichlet Allocation in text mining
Author :
Zelong Liu ; Maozhen Li ; Yang Liu ; Ponraj, M.
Author_Institution :
Sch. of Eng. & Design, Brunel Univ., Uxbridge, UK
Abstract :
This paper introduces three classic statistical topic models: Latent Semantic Indexing (LSI), Probabilistic Latent Semantic Indexing (PLSI) and Latent Dirichlet Allocation (LDA). It then briefly describes a text classification method that uses the LDA model for text representation, so that each document is represented as a probability distribution over a fixed set of latent topics. A Support Vector Machine (SVM) is chosen as the classification algorithm. In the evaluation, the LDA with SVM classification system scores higher on the evaluation metrics than the two other methods, LSI with SVM and Vector Space Model (VSM) with SVM, indicating better classification performance.
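The pipeline outlined in the abstract (LDA topic proportions as document features, followed by an SVM) might be sketched roughly as below. This is an illustrative sketch only, not the authors' implementation: the abstract does not state how LDA is inferred or which SVM solver is used, so collapsed Gibbs sampling and a linear SVM trained by stochastic subgradient descent on the hinge loss are assumptions made here. The function names (lda_gibbs, train_linear_svm, predict), the hyperparameter values and the four-document toy corpus are all hypothetical.

```python
import random


def lda_gibbs(docs, num_topics, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA (assumed inference method).

    Returns one topic distribution (list of num_topics floats) per document.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)
    doc_topic = [[0] * num_topics for _ in docs]        # topic counts per document
    topic_word = [[0] * V for _ in range(num_topics)]   # word counts per topic
    topic_total = [0] * num_topics                      # total words per topic
    assign = []                                         # topic of every token
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)               # random initial topic
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assign.append(zs)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v, k = word_id[w], assign[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta) / (topic_total[t] + V * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1
    # Smoothed topic proportions: the text representation fed to the SVM.
    return [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in counts]
        for counts, doc in zip(doc_topic, docs)
    ]


def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200, seed=0):
    """Linear SVM via stochastic subgradient descent on the L2-regularized hinge loss."""
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            margin = y[i] * (sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            w = [wj * (1 - lr * lam) for wj in w]       # regularization shrinkage
            if margin < 1:                              # hinge loss is active
                w = [wj + lr * y[i] * xj for wj, xj in zip(w, X[i])]
                b += lr * y[i]
    return w, b


def predict(w, b, x):
    """Return the predicted label, +1 or -1."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Hypothetical toy corpus: two finance documents (+1), two sports documents (-1).
docs = [
    "stock market trading price".split(),
    "market price shares stock".split(),
    "match team goal player".split(),
    "team player score match".split(),
]
labels = [1, 1, -1, -1]

theta = lda_gibbs(docs, num_topics=2)        # each document as a topic distribution
w, b = train_linear_svm(theta, labels)       # SVM trained on the topic features
preds = [predict(w, b, x) for x in theta]
```

Each row of theta sums to 1, which matches the abstract's description of a document as a probability distribution over a fixed set of latent topics. On a corpus this small the sampled topics and the resulting predictions are not meaningful; the sketch is intended only to show how the two stages connect.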
Keywords :
classification; data mining; indexing; statistical distributions; support vector machines; text analysis; SVM; latent Dirichlet allocation; performance evaluation; probabilistic latent semantic indexing; probability distribution; support vector machine; text classification; text mining; text representation; Computational modeling; Indexing; Large scale integration; Matrix decomposition; Semantics; Support vector machines; Text categorization; Latent Dirichlet Allocation; Statistical Topic Model; Support Vector Machine; Text Classification;
Conference_Title :
Fuzzy Systems and Knowledge Discovery (FSKD), 2011 Eighth International Conference on
Conference_Location :
Shanghai
Print_ISBN :
978-1-61284-180-9
DOI :
10.1109/FSKD.2011.6020066