DocumentCode
86044
Title
Nested Hierarchical Dirichlet Processes
Author
Paisley, John ; Wang, Chingyue ; Blei, David M. ; Jordan, Michael I.
Author_Institution
Department of Electrical Engineering, Columbia University, New York, NY
Volume
37
Issue
2
fYear
2015
fDate
Feb. 1 2015
Firstpage
256
Lastpage
270
Abstract
We develop a nested hierarchical Dirichlet process (nHDP) for hierarchical topic modeling. The nHDP generalizes the nested Chinese restaurant process (nCRP) to allow each word to follow its own path to a topic node according to a per-document distribution over the paths on a shared tree. This alleviates the rigid, single-path formulation assumed by the nCRP, allowing documents to easily express complex thematic borrowings. We derive a stochastic variational inference algorithm for the model, which enables efficient inference for massive collections of text documents. We demonstrate our algorithm on 1.8 million documents from The New York Times and 2.7 million documents from Wikipedia.
Keywords
Atomic measurements; Bayes methods; Data models; Indexes; Pattern analysis; Random variables; Stochastic processes; Bayesian nonparametrics; Dirichlet process; stochastic optimization; topic modeling
fLanguage
English
Journal_Title
Pattern Analysis and Machine Intelligence, IEEE Transactions on
Publisher
IEEE
ISSN
0162-8828
Type
jour
DOI
10.1109/TPAMI.2014.2318728
Filename
6802355