DocumentCode :
155669
Title :
Sparse topic models by parameter sharing
Author :
Soleimani, Hossein ; Miller, David J.
Author_Institution :
Dept. of Electr. Eng., Pennsylvania State Univ., University Park, PA, USA
fYear :
2014
fDate :
21-24 Sept. 2014
Firstpage :
1
Lastpage :
6
Abstract :
We propose a sparse Bayesian topic model, based on parameter sharing, for modeling text corpora. In Latent Dirichlet Allocation (LDA), each topic models all words, even though many words are not topic-specific, i.e., they have similar occurrence frequencies across topics. We propose a sparser approach by introducing a universal shared model, which each topic uses to model the subset of words that are not topic-specific. A Bernoulli random variable is associated with each word under every topic, determining whether that word is modeled topic-specifically, with a free parameter, or by the shared model, with a common parameter. Our experiments show that this model achieves sparser topic presence in documents and higher test likelihood than LDA.
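The Bernoulli-switch idea in the abstract can be sketched in a few lines. This is a minimal illustrative sketch of the generative intuition only, not the paper's exact parameterization or inference procedure: the vocabulary size, topic count, switch prior, and renormalization scheme below are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

V, K = 1000, 10      # vocabulary size, number of topics (illustrative values)
p_specific = 0.2     # assumed prior prob. that a word is topic-specific

# Universal shared word distribution, used by every topic for
# non-topic-specific words.
shared = rng.dirichlet(np.ones(V))

topics = np.empty((K, V))
for k in range(K):
    # Bernoulli switch per word: 1 -> model with a free, topic-specific
    # parameter; 0 -> fall back to the common shared parameter.
    b = rng.random(V) < p_specific
    free = rng.dirichlet(np.ones(V))          # topic-specific parameters
    beta = np.where(b, free, shared)
    topics[k] = beta / beta.sum()             # renormalize to a distribution

# Each row of `topics` is a valid word distribution; words with b=0
# share their probability mass pattern with the universal model.
assert np.allclose(topics.sum(axis=1), 1.0)
```

The sketch shows why sparsity arises: only the fraction of words flagged topic-specific carries free parameters per topic, while the remainder reuses one shared set.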
Keywords :
Bayes methods; maximum likelihood estimation; text analysis; Bayesian topic model; Bernoulli random variable; LDA; documents; Latent Dirichlet Allocation; parameter sharing; sparse topic models; sparser topic presence; test likelihood; text corpora modeling; universal shared model; Abstracts; Lead; Parameter estimation; Resource management; Sparse models; Topic models; Variational inference;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
2014 IEEE International Workshop on Machine Learning for Signal Processing (MLSP)
Conference_Location :
Reims, France
Type :
conf
DOI :
10.1109/MLSP.2014.6958911
Filename :
6958911