  • DocumentCode
    1620951
  • Title
    Probabilistic clustering based on Langevin mixture
  • Author
    Amayri, Ola ; Bouguila, Nizar
  • Author_Institution
    Electr. & Comput. Eng. Dept., Concordia Univ., Montreal, QC, Canada
  • Volume
    2
  • fYear
    2011
  • Firstpage
    388
  • Lastpage
    391
  • Abstract
    In this paper, we propose a statistical framework for clustering spherical data, which commonly arise in machine learning, data mining, and computer vision applications. Our framework is based on finite Langevin mixture models, which provide a natural representation of normalized vectors in high-dimensional spaces, where the data lie on the unit hypersphere. Moreover, we develop a minimum message length (MML) criterion for selecting the components of finite Langevin mixtures, from which different probabilistic information divergence distances are then derived. Through empirical experiments, we demonstrate the merits of the proposed learning framework on challenging applications involving spam filtering using visual email content and email categorization.
  • Keywords
    pattern clustering; unsolicited e-mail; vectors; computer vision application; data mining; email categorization; finite Langevin mixture component; finite Langevin mixture model; machine learning; minimum message length criterion; normalized vector; probabilistic clustering; probabilistic information divergence distance; spam filtering; spherical data clustering; statistical framework; visual email content; Accuracy; Data models; Electronic mail; Machine learning; Probabilistic logic; Vectors;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Applications and Workshops (ICMLA), 2011 10th International Conference on
  • Conference_Location
    Honolulu, HI
  • Print_ISBN
    978-1-4577-2134-2
  • Type
    conf
  • DOI
    10.1109/icmla.2011.6174513
  • Filename
    6174513