DocumentCode
1368002
Title
ITERATE: a conceptual clustering algorithm for data mining
Author
Biswas, Gautam ; Weinberg, Jerry B. ; Fisher, Douglas H.
Author_Institution
Dept. of Comput. Sci., Vanderbilt Univ., Nashville, TN, USA
Volume
28
Issue
2
fYear
1998
fDate
5/1/1998 12:00:00 AM
Firstpage
219
Lastpage
230
Abstract
The data exploration task can be divided into three interrelated subtasks: 1) feature selection, 2) discovery, and 3) interpretation. This paper describes an unsupervised discovery method with biases geared toward partitioning objects into clusters that improve interpretability. The algorithm ITERATE employs: 1) a data ordering scheme and 2) an iterative redistribution operator to produce maximally cohesive and distinct clusters. Cohesion or intraclass similarity is measured in terms of the match between individual objects and their assigned cluster prototype. Distinctness or interclass dissimilarity is measured by an average of the variance of the distribution match between clusters. The authors demonstrate that interpretability, from a problem-solving viewpoint, is addressed by the intraclass and interclass measures. Empirical results demonstrate the properties of the discovery algorithm and its applications to problem solving.
Keywords
feature extraction; iterative methods; knowledge acquisition; problem solving; ITERATE; biases; cohesion; conceptual clustering algorithm; data exploration; data mining; data ordering scheme; discovery; distinctness; distribution match; feature selection; interclass dissimilarity; interclass measures; interpretation; intraclass measures; intraclass similarity; iterative redistribution operator; maximally cohesive clusters; maximally distinct clusters; object partitioning; unsupervised discovery method; Clustering algorithms; Computer science; Data analysis; Data mining; Databases; Iterative algorithms; Partitioning algorithms; Pattern analysis; Problem-solving; Prototypes
fLanguage
English
Journal_Title
Systems, Man, and Cybernetics, Part C: Applications and Reviews, IEEE Transactions on
Publisher
IEEE
ISSN
1094-6977
Type
jour
DOI
10.1109/5326.669556
Filename
669556