Title :
Pre-processing for data clustering
Author_Institution :
Dept. of Electr. & Comput. Eng., Memphis Univ., USA
Abstract :
We propose a data transformation approach that facilitates data clustering. Our approach, called MembershipMap, aims to automatically extract the underlying structure, or sub-concepts, of each raw attribute, and uses the orthogonal union of these sub-concepts to define a new, semantically richer space. The sub-concept labels of each point in the original space determine its position in the transformed space. Because these labels are subject to the uncertainty inherent in the original data and in the initial extraction process, we present a combination of labeling schemes based on different measures of uncertainty. We show that the transformed spaces can serve as flexible pre-processing tools for tasks such as sampling, data cleaning, and outlier detection, and that the information extracted from them can improve the performance of clustering algorithms.
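The transformation described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the extraction procedure or the labeling schemes, so everything below is an assumption made for illustration: a plain 1-D k-means stands in for sub-concept extraction, a single fuzzy-c-means-style membership (inverse squared distance, normalized to sum to 1) stands in for the labeling schemes, and the names kmeans_1d, fuzzy_labels, and membership_map are hypothetical. The transformed space is formed by concatenating the per-attribute membership vectors, which is one way to read "orthogonal union".

```python
# Illustrative sketch only: stand-in techniques, not the paper's actual method.

def kmeans_1d(values, k, iters=50):
    """Plain 1-D k-means (assumed stand-in for sub-concept extraction)."""
    ordered = sorted(values)
    n = len(ordered)
    # Start the centers at evenly spaced quantiles of the attribute.
    centers = [ordered[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            groups[nearest].append(v)
        # An empty group keeps its previous center.
        updated = [sum(g) / len(g) if g else centers[j]
                   for j, g in enumerate(groups)]
        if updated == centers:
            break
        centers = updated
    return centers


def fuzzy_labels(v, centers, eps=1e-12):
    """Memberships proportional to inverse squared distance; they sum to 1."""
    weights = [1.0 / ((v - c) ** 2 + eps) for c in centers]
    total = sum(weights)
    return [w / total for w in weights]


def membership_map(points, k_per_attribute):
    """Map each point to the concatenation of its per-attribute memberships."""
    columns = list(zip(*points))
    centers = [kmeans_1d(col, k) for col, k in zip(columns, k_per_attribute)]
    mapped = [
        [u for v, c in zip(p, centers) for u in fuzzy_labels(v, c)]
        for p in points
    ]
    return mapped, centers


# Six 2-D points, two assumed sub-concepts per attribute -> 4-D transformed space.
points = [(0.0, 10.0), (0.2, 10.5), (5.0, 10.2),
          (5.1, 20.0), (0.1, 20.3), (4.9, 19.8)]
mapped, centers = membership_map(points, [2, 2])
```

In this sketch each transformed coordinate lies in [0, 1] and can be read as "how strongly this point belongs to that sub-concept of that attribute". A point whose memberships are nearly uniform within an attribute is ambiguous for that attribute, which hints at how such a space could support the sampling, cleaning, and outlier-detection tasks the abstract mentions.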
Keywords :
data mining; pattern clustering; MembershipMap; data clustering; data transformation; extraction process; flexible pre-processing tools; labeling schemes; orthogonal union; Cleaning; Data engineering; Data mining; Data preprocessing; Data visualization; Databases; Labeling; Marine vehicles; Phase measurement; Principal component analysis
Conference_Title :
IEEE Annual Meeting of the Fuzzy Information Processing Society, 2004. NAFIPS '04.
Print_ISBN :
0-7803-8376-1
DOI :
10.1109/NAFIPS.2004.1337437