DocumentCode
1620951
Title
Probabilistic clustering based on Langevin mixture
Author
Amayri, Ola ; Bouguila, Nizar
Author_Institution
Electr. & Comput. Eng. Dept., Concordia Univ., Montreal, QC, Canada
Volume
2
fYear
2011
Firstpage
388
Lastpage
391
Abstract
In this paper, we propose a statistical framework for clustering spherical data, which are common in machine learning, data mining, and computer vision applications. Our framework is based on finite Langevin mixture models, which provide a natural representation of normalized vectors in high-dimensional spaces, where the data lie on the unit hypersphere. Moreover, we develop a minimum message length (MML) criterion for selecting the finite Langevin mixture components, from which different probabilistic information divergence distances are then derived. Through empirical experiments, we demonstrate the merits of the proposed learning framework on challenging applications involving spam filtering using visual email content and email categorization.
Keywords
pattern clustering; unsolicited e-mail; vectors; computer vision application; data mining; email categorization; finite Langevin mixture component; finite Langevin mixture model; machine learning; minimum message length criterion; normalized vector; probabilistic clustering; probabilistic information divergence distance; spam filtering; spherical data clustering; statistical framework; visual email content; Accuracy; Data models; Electronic mail; Machine learning; Probabilistic logic; Vectors;
fLanguage
English
Publisher
IEEE
Conference_Titel
Machine Learning and Applications and Workshops (ICMLA), 2011 10th International Conference on
Conference_Location
Honolulu, HI
Print_ISBN
978-1-4577-2134-2
Type
conf
DOI
10.1109/icmla.2011.6174513
Filename
6174513