DocumentCode :
2711409
Title :
Learning the number of Gaussian components using hypothesis test
Author :
Heo, Gyeongyong ; Gader, Paul
Author_Institution :
Dept. of Comput. & Inf. Sci. & Eng., Univ. of Florida, Gainesville, FL, USA
fYear :
2009
fDate :
14-19 June 2009
Firstpage :
1206
Lastpage :
1212
Abstract :
This paper addresses the problem of estimating the correct number of components in a Gaussian mixture given a sample data set. In particular, an extension of the Gaussian-means (G-means) and projected Gaussian-means (PG-means) algorithms is proposed. All of these methods are based on one-dimensional statistical hypothesis tests. G-means and PG-means are wrapper algorithms around the k-means and expectation-maximization (EM) algorithms, respectively. Although G-means is simple and fast, it does not perform well when clusters overlap, since it is based on k-means. PG-means can handle overlapping clusters but requires more computation and sometimes fails to find the right number of clusters. In this paper, we propose an extension, called extended projected Gaussian means (XPG-means), which is a wrapper algorithm around the possibilistic fuzzy C-means (PFCM) algorithm. XPG-means integrates the advantages of both algorithms while resolving some of their disadvantages involving overlapping clusters, noise, and computational complexity. More specifically, XPG-means handles overlapping clusters better than G-means because it uses fuzzy clustering, and it handles noise better than both algorithms because it uses possibilistic clustering. XPG-means is also less computationally expensive than PG-means because it uses the local, Gaussian-specific hypothesis-testing scheme of G-means, whereas PG-means applies a more general Kolmogorov-Smirnov test to the Gaussian mixture as a whole. In addition, XPG-means exhibits less variance in estimating the number of components than either of the other algorithms.
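As background for the local test the abstract contrasts with PG-means' Kolmogorov-Smirnov test, the following is a minimal sketch of a G-means-style per-cluster check: project a cluster's points onto their principal axis and apply a one-dimensional Anderson-Darling normality test. This assumes NumPy and SciPy; the function name and the 5% significance level are illustrative choices, not taken from the paper.

import numpy as np
from scipy.stats import anderson

def cluster_is_gaussian(points, significance_index=2):
    # Hypothetical helper, not the authors' implementation.
    # Project the cluster onto its principal axis and test whether the
    # 1-D projection is consistent with a normal distribution.
    # significance_index selects one of SciPy's tabulated levels
    # (15%, 10%, 5%, 2.5%, 1%); index 2 corresponds to 5%.
    centered = points - points.mean(axis=0)
    # Principal axis via SVD: the direction of greatest variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    projection = centered @ vt[0]
    result = anderson(projection, dist='norm')
    # Accept the Gaussian hypothesis if the test statistic is below
    # the critical value at the chosen significance level.
    return result.statistic < result.critical_values[significance_index]

In a G-means-style wrapper, a cluster that fails this test is split in two and the clustering is rerun; the abstract's efficiency claim is that running such a test locally per cluster is cheaper than PG-means' more general test over projections of the full mixture.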
Keywords :
Gaussian processes; computational complexity; expectation-maximisation algorithm; fuzzy set theory; possibility theory; statistical testing; G-means; Gaussian components; Gaussian mixture; Kolmogorov-Smirnov test; PG-means; XPG-means; computational complexity; expectation-maximization algorithms; extended projected Gaussian means; fuzzy clustering; k-means; one-dimensional statistical hypothesis test; possibilistic fuzzy C-means algorithm; projected Gaussian-means algorithms; wrapper algorithms; Bayesian methods; Clustering algorithms; Computational complexity; Gaussian processes; Iterative algorithms; Maximum likelihood estimation; Neural networks; Noise robustness; Testing; Unsupervised learning;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
International Joint Conference on Neural Networks (IJCNN 2009)
Conference_Location :
Atlanta, GA
ISSN :
1098-7576
Print_ISBN :
978-1-4244-3548-7
Type :
conf
DOI :
10.1109/IJCNN.2009.5178886
Filename :
5178886