DocumentCode
2851105
Title
Mixture Modeling and Information Criteria for Discovering Patterns in Continuous Data
Author
Fonseca, Jaime R S
Author_Institution
Tech. Univ. of Lisbon, Lisbon
fYear
2008
fDate
10-12 Sept. 2008
Firstpage
543
Lastpage
548
Abstract
This study assesses the adequacy of several theoretical information criteria when finite mixture modelling is used to discover patterns in continuous data. Selecting an adequate number of clusters is a key issue in deriving complex mixture structures, so the information criteria used for this purpose should be effective. To choose among several information criteria that may support selection of the correct number of clusters, we conduct a simulation study to determine which criteria are most appropriate for mixture model selection on data sets with only continuous clustering base variables. The results show that BIC performs better than the other criteria considered: it indicates the correct number of clusters in the simulated structures more often when the mixtures involve continuous clustering base variables.
Keywords
data mining; pattern clustering; base variable clustering; continuous data; data set; finite mixture modeling; information criteria; mixture model selection; pattern discovery; Clustering algorithms; Data mining; Hybrid intelligent systems; Information analysis; Maximum likelihood estimation; Probability distribution; Proposals; Unsupervised learning; Continuous Clustering Base Variables; Finite Mixture Models; Model Selection; Patterns in Continuous Data; Quantitative Methods; Simulation experiments; Theoretical Information Criteria
fLanguage
English
Publisher
IEEE
Conference_Title
Hybrid Intelligent Systems, 2008. HIS '08. Eighth International Conference on
Conference_Location
Barcelona
Print_ISBN
978-0-7695-3326-1
Electronic_ISBN
978-0-7695-3326-1
Type
conf
DOI
10.1109/HIS.2008.32
Filename
4626686