• DocumentCode
    1348707
  • Title
    A Dirichlet Process Mixture of Generalized Dirichlet Distributions for Proportional Data Modeling
  • Author
    Bouguila, Nizar; Ziou, Djemel
  • Author_Institution
    Concordia Inst. for Inf. Syst. Eng. (CIISE), Concordia Univ., Montreal, QC, Canada
  • Volume
    21
  • Issue
    1
  • fYear
    2010
  • Firstpage
    107
  • Lastpage
    122
  • Abstract
    In this paper, we propose a clustering algorithm based on both Dirichlet processes and the generalized Dirichlet distribution, which has been shown to be very flexible for proportional data modeling. Our approach can be viewed as an extension of the finite generalized Dirichlet mixture model to the infinite case. The extension is based on nonparametric Bayesian analysis. This clustering algorithm does not require the number of mixture components to be specified in advance and estimates it in a principled manner. Our approach is Bayesian and relies on the estimation of the posterior distribution of clusterings using a Gibbs sampler. Through applications involving real-data classification and image database categorization using visual words, we show that clustering via infinite mixture models offers more powerful and robust performance than classic finite mixtures.
  • Keywords
    belief networks; boundary-value problems; data models; pattern clustering; statistical distributions; visual databases; Dirichlet distribution; Dirichlet process mixture; Gibbs sampler; clustering algorithm; clusterings posterior distribution; finite generalized Dirichlet mixture model; image databases categorization; nonparametric Bayesian analysis; proportional data modeling; visual words application; Data classification; Dirichlet processes; Gibbs sampling; MCMC; Metropolis–Hastings (M–H); generalized Dirichlet distribution; image database categorization; mixture modeling; nonparametric Bayesian analysis; proportional data; Algorithms; Artificial Intelligence; Bayes Theorem; Cluster Analysis; Humans; Image Interpretation, Computer-Assisted; Information Storage and Retrieval; Models, Statistical;
  • fLanguage
    English
  • Journal_Title
    Neural Networks, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1045-9227
  • Type
    jour
  • DOI
    10.1109/TNN.2009.2034851
  • Filename
    5345703