Title :
Biclustering multivariate data for correlated subspace mining
Author :
Watanabe, Kazuho ; Hsiang-Yun Wu ; Niibe, Yusuke ; Takahashi, Shigeo ; Fujishiro, Issei
Author_Institution :
Toyohashi Univ. of Technol., Toyohashi, Japan
Abstract :
Exploring feature subspaces is one of the promising approaches to analyzing and understanding important patterns in multivariate data. When the analysis relies too heavily on manual intervention, however, the results depend strongly on the knowledge and skills of the users performing it. This paper presents a novel approach to extracting feature subspaces from multivariate data by incorporating biclustering techniques. The approach is maximally automated in the sense that highly correlated dimensions are grouped automatically to form subspaces, which effectively supports their further exploration. The key idea is a new mathematical formulation of asymmetric biclustering that combines spherical k-means clustering, for grouping highly correlated dimensions, with ordinary k-means clustering, for identifying subsets of data samples. Lower-dimensional representations of the data in the feature subspaces are visualized with parallel coordinate plots, in which the data samples of correlated dimensions are projected onto a single composite axis through dimensionality reduction schemes. Several experimental results, together with discussion, are provided to assess the capability of the approach.
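The combination described in the abstract can be illustrated with a minimal sketch. It is not the paper's formulation: the abstract describes a single asymmetric biclustering objective, whereas this sketch simply runs the two clusterings independently. All function names, the farthest-first initialization, the use of the first principal component as the composite axis, and the toy data are assumptions made for illustration. Columns are assumed to be non-constant, so that each centered dimension has nonzero norm.

```python
import numpy as np

def spherical_kmeans(vectors, k, n_iter=20):
    """Group row vectors by cosine similarity (assumed farthest-first init)."""
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    chosen = [0]
    while len(chosen) < k:
        # Pick the vector least similar to every centroid chosen so far.
        best_sim = (unit @ unit[chosen].T).max(axis=1)
        chosen.append(int(np.argmin(best_sim)))
    centroids = unit[chosen].copy()
    for _ in range(n_iter):
        labels = np.argmax(unit @ centroids.T, axis=1)
        for j in range(k):
            total = unit[labels == j].sum(axis=0)
            norm = np.linalg.norm(total)
            if norm > 0:
                centroids[j] = total / norm  # re-project onto the unit sphere
    return labels

def kmeans(points, k, n_iter=20):
    """Ordinary k-means with squared Euclidean distance."""
    chosen = [0]
    while len(chosen) < k:
        # Pick the point farthest from its nearest chosen centroid.
        diff = points[:, None, :] - points[chosen][None, :, :]
        chosen.append(int(np.argmax((diff ** 2).sum(axis=2).min(axis=1))))
    centroids = points[chosen].astype(float)
    for _ in range(n_iter):
        diff = points[:, None, :] - centroids[None, :, :]
        labels = np.argmin((diff ** 2).sum(axis=2), axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members):
                centroids[j] = members.mean(axis=0)
    return labels

def bicluster(data, n_dim_groups, n_sample_groups):
    """Cluster dimensions (columns) and samples (rows) independently."""
    # After centering, cosine similarity between columns equals their
    # Pearson correlation, so spherical k-means groups correlated dimensions.
    centered = data - data.mean(axis=0)
    dim_labels = spherical_kmeans(centered.T, n_dim_groups)
    sample_labels = kmeans(data, n_sample_groups)
    return dim_labels, sample_labels

def composite_axis(data, dims):
    """Project one group of dimensions onto its first principal component."""
    sub = data[:, dims] - data[:, dims].mean(axis=0)
    _, _, vt = np.linalg.svd(sub, full_matrices=False)
    return sub @ vt[0]

# Toy data (hypothetical): dimensions 0-2 follow factor a, 3-5 follow factor b.
rng = np.random.default_rng(0)
a = np.repeat([-10.0, 10.0], 100)   # also separates samples into two blocks
b = np.tile([1.0, -1.0], 100)       # uncorrelated with a
data = np.column_stack([a, a, a, b, b, b]) + rng.normal(scale=0.01, size=(200, 6))

dim_labels, sample_labels = bicluster(data, n_dim_groups=2, n_sample_groups=2)
axis = composite_axis(data, np.flatnonzero(dim_labels == dim_labels[0]))
```

In this sketch each group of correlated dimensions would contribute one composite axis to a parallel coordinate plot. Note that plain cosine similarity keeps strongly negatively correlated dimensions apart; whether the paper treats them as belonging to one subspace is not stated in the abstract.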
Keywords :
data analysis; data mining; data reduction; data visualisation; feature extraction; pattern clustering; asymmetric biclustering; biclustering techniques; correlated subspace mining; data analysis; dimensionality reduction; feature subspaces extraction; lower-dimensional data representations; mathematical formulation; multivariate data biclustering; multivariate data patterns; ordinary k-means clustering; parallel coordinate plot; spherical k-means clustering; visualization; Clustering algorithms; Correlation; Data mining; Data models; Data visualization; History; Linear programming; Multivariate data; biclustering; correlation; subspaces;
Conference_Title :
Visualization Symposium (PacificVis), 2015 IEEE Pacific
Conference_Location :
Hangzhou
DOI :
10.1109/PACIFICVIS.2015.7156389