• DocumentCode
    2218122
  • Title
    ClusterSculptor: A Visual Analytics Tool for High-Dimensional Data
  • Author
    Nam, Eun J. ; Han, Yiping ; Mueller, Klaus ; Zelenyuk, Alla ; Imre, Dan
  • Author_Institution
    Stony Brook Univ., Stony Brook
  • fYear
    2007
  • fDate
    Oct. 30 2007-Nov. 1 2007
  • Firstpage
    75
  • Lastpage
    82
  • Abstract
    Cluster analysis (CA) is a powerful strategy for the exploration of high-dimensional data in the absence of a priori hypotheses or data classification models, and the results of CA can then be used to form such models. But even though formal models and classification rules may not exist in these data exploration scenarios, domain scientists and experts generally have a vast amount of non-compiled knowledge and intuition that they can bring to bear in this effort. In CA, there are various popular mechanisms to generate the clusters; however, the results from their non-supervised deployment rarely fully agree with this expert knowledge and intuition. To this end, our paper describes a comprehensive and intuitive framework to aid scientists in the derivation of classification hierarchies in CA, using k-means as the overall clustering engine, but allowing them to tune its parameters interactively based on a non-distorted compact visual presentation of the inherent characteristics of the data in high-dimensional space. These include cluster geometry, composition, spatial relations to neighbors, and others. In essence, we provide all the tools necessary for a high-dimensional activity we call cluster sculpting, and the evolving hierarchy can then be viewed in a space-efficient radial dendrogram. We demonstrate our system in the context of the mining and classification of a large collection of millions of data items of aerosol mass spectra, but our framework readily applies to any high-dimensional CA scenario.
  • Keywords
    aerosols; data analysis; data mining; data visualisation; environmental science computing; mass spectroscopy; pattern classification; pattern clustering; ClusterSculptor visual analytics tool; aerosol mass spectra; atmospheric science; cluster analysis; cluster sculpting; clustering engine; data classification models; data mining; high-dimensional data analysis; Analysis of variance; Clustering algorithms; Couplings; Data mining; Displays; Iterative methods; Shape measurement; Space exploration; Visual analytics; Visualization; High-Dimensional Data; Space and Environmental Sciences; Visual Analytics; Visual Data Mining; Visualization in Earth;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Visual Analytics Science and Technology, 2007. VAST 2007. IEEE Symposium on
  • Conference_Location
    Sacramento, CA
  • Print_ISBN
    978-1-4244-1659-2
  • Type
    conf
  • DOI
    10.1109/VAST.2007.4388999
  • Filename
    4388999