DocumentCode :
3322417
Title :
Visualizing Attribute Interdependencies Using Mutual Information, Hierarchical Clustering, Multidimensional Scaling, and Self-organizing Maps
Author :
Nazareth, Derek L. ; Soofi, Ehsan S. ; Zhao, Huimin
Author_Institution :
Sheldon B. Lubar Sch. of Bus., Wisconsin Univ., Milwaukee, WI
fYear :
2007
fDate :
Jan. 2007
Firstpage :
53
Lastpage :
53
Abstract :
Data pre-processing tends to be the most critical and time-consuming step in data mining processes. Understanding the interdependencies among the attributes is especially important for attribute selection and model structure design. Correlation measures, such as the Pearson correlation coefficient, have typically been used to measure attribute dependencies. Correlation is useful for capturing linear dependency among quantitative attributes, and is invariant only under linear transformations of the variables. More recently, mutual information has been used to measure interdependencies among attributes measured on a continuous scale. Mutual information is applicable to quantitative and categorical variables, captures any type of functional dependency between variables, and is invariant under one-to-one transformations. In this paper, we employ mutual information as a unified measure of interdependencies among attributes, by extending it to accommodate attributes measured on continuous and categorical scales. We further visualize the attribute interdependencies using a host of techniques, including hierarchical clustering, multidimensional scaling, and self-organizing maps. The use of mutual information permits identification of some salient interdependencies between attributes. We demonstrate the utility of the proposed methodology using real data mining applications.
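The contrast the abstract draws between correlation and mutual information can be illustrated with a minimal sketch. This is not the authors' estimator or their extension to mixed scales: it is a plain plug-in (frequency-count) estimate for discrete values, and the function names `mutual_information` and `pearson`, the toy data, and the category labels are all assumptions made for illustration. Continuous attributes would first need discretization or a density estimate, which is not shown here.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits for two equal-length discrete sequences."""
    n = len(xs)
    px = Counter(xs)            # marginal counts of X
    py = Counter(ys)            # marginal counts of Y
    pxy = Counter(zip(xs, ys))  # joint counts of (X, Y)
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), expressed with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


def pearson(xs, ys):
    """Pearson correlation coefficient of two numeric sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical toy data: y is a deterministic but nonlinear function of x.
x = [-2, -1, 0, 1, 2] * 4
y = [v * v for v in x]

# A one-to-one relabeling of y into categories (illustrative labels only).
labels = {0: "low", 1: "mid", 4: "high"}
y_cat = [labels[v] for v in y]

r = pearson(x, y)                          # linear dependency only
mi_xy = mutual_information(x, y)           # any functional dependency
mi_relabel = mutual_information(x, y_cat)  # numeric vs. categorical attribute

print(f"Pearson r      = {r:.4f}")
print(f"I(X;Y)         = {mi_xy:.4f} bits")
print(f"I(X;relabel(Y)) = {mi_relabel:.4f} bits")
```

In this toy case the Pearson coefficient vanishes because the dependency is symmetric around zero, while the mutual information stays positive and is unchanged when `y` is relabeled into categories. A matrix of such pairwise values could then serve as the input to a clustering or scaling step of the kind the abstract mentions.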
Keywords :
data mining; data visualisation; pattern clustering; self-organising feature maps; statistical analysis; Pearson correlation coefficient; attribute interdependency; attribute selection; data mining; data pre-processing; functional dependency; hierarchical clustering; linear dependency; model structure design; multidimensional scaling; mutual information; self-organizing map; visualization; Data mining; Data visualization; Multidimensional systems; Mutual information; Self organizing feature maps;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
System Sciences, 2007. HICSS 2007. 40th Annual Hawaii International Conference on
Conference_Location :
Waikoloa, HI
ISSN :
1530-1605
Electronic_ISBN :
1530-1605
Type :
conf
DOI :
10.1109/HICSS.2007.608
Filename :
4076479