• DocumentCode
    3166024
  • Title
    Latent Dirichlet Conditional Naive-Bayes Models
  • Author
    Banerjee, Arindam ; Shan, Hanhuai
  • Author_Institution
    Univ. of Minnesota, Minneapolis
  • fYear
    2007
  • fDate
    28-31 Oct. 2007
  • Firstpage
    421
  • Lastpage
    426
  • Abstract
    In spite of the popularity of probabilistic mixture models for latent structure discovery from data, mixture models do not have a natural mechanism for handling sparsity, where each data point has only a few non-zero observations. In this paper, we introduce conditional naive-Bayes (CNB) models, which generalize naive-Bayes mixture models to naturally handle sparsity by conditioning the model on observed features. Further, we present latent Dirichlet conditional naive-Bayes (LD-CNB) models, which constitute a family of powerful hierarchical Bayesian models for latent structure discovery from sparse data. The proposed family of models is quite general and can work with arbitrary regular exponential family conditional distributions. We present a variational-inference-based EM algorithm for learning, along with special case analyses for Gaussian and discrete distributions. The efficacy of the proposed models is demonstrated by extensive experiments on a wide variety of datasets.
  • Keywords
    Bayes methods; Gaussian distribution; data mining; data structures; expectation-maximisation algorithm; EM algorithm; Gaussian distributions; discrete distributions; latent Dirichlet conditional naive-Bayes models; latent sparse data structure discovery; naive-Bayes mixture models; nonzero observations; probabilistic mixture models; sparsity handling; Bayesian methods; Cities and towns; Computer science; Data engineering; Data mining; Inference algorithms; Linear discriminant analysis; Motion pictures; Niobium; Recommender systems
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining, 2007. ICDM 2007. Seventh IEEE International Conference on
  • Conference_Location
    Omaha, NE
  • ISSN
    1550-4786
  • Print_ISBN
    978-0-7695-3018-5
  • Type
    conf
  • DOI
    10.1109/ICDM.2007.55
  • Filename
    4470267