• DocumentCode
    86044
  • Title
    Nested Hierarchical Dirichlet Processes
  • Author
    Paisley, John ; Wang, Chingyue ; Blei, David M. ; Jordan, Michael I.
  • Author_Institution
    Department of Electrical Engineering, Columbia University, New York, NY
  • Volume
    37
  • Issue
    2
  • fYear
    2015
  • fDate
    Feb. 1 2015
  • Firstpage
    256
  • Lastpage
    270
  • Abstract
    We develop a nested hierarchical Dirichlet process (nHDP) for hierarchical topic modeling. The nHDP generalizes the nested Chinese restaurant process (nCRP) to allow each word to follow its own path to a topic node according to a per-document distribution over the paths on a shared tree. This alleviates the rigid, single-path formulation assumed by the nCRP, allowing documents to easily express complex thematic borrowings. We derive a stochastic variational inference algorithm for the model, which enables efficient inference for massive collections of text documents. We demonstrate our algorithm on 1.8 million documents from The New York Times and 2.7 million documents from Wikipedia.
  • Keywords
    Atomic measurements; Bayes methods; Data models; Indexes; Pattern analysis; Random variables; Stochastic processes; Bayesian nonparametrics; Dirichlet process; stochastic optimization; topic modeling
  • fLanguage
    English
  • Journal_Title
    Pattern Analysis and Machine Intelligence, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0162-8828
  • Type
    jour
  • DOI
    10.1109/TPAMI.2014.2318728
  • Filename
    6802355