• DocumentCode
    1365727
  • Title
    Improved Similarity Trees and their Application to Visual Data Classification
  • Author
    Paiva, Jose Gustavo S ; Florian, L. ; Pedrini, Helio ; Telles, Guilherme P. ; Minghim, Rosane
  • Author_Institution
    Univ. of Sao Paulo, Sao Paulo, Brazil
  • Volume
    17
  • Issue
    12
  • fYear
    2011
  • Firstpage
    2459
  • Lastpage
    2468
  • Abstract
    An alternative to multidimensional projections for the visual analysis of data represented in multidimensional spaces is the deployment of similarity trees, such as Neighbor Joining trees. They organize data objects on the visual plane, emphasizing their levels of similarity, with a high capability for detecting and separating groups and subgroups of objects. Besides this similarity-based hierarchical data organization, some of their advantages include the ability to decrease point clutter; high precision; and a consistent view of the data set during focusing, offering a very intuitive way to view the general structure of the data set as well as to drill down to groups and subgroups of interest. Disadvantages of similarity trees based on neighbor joining strategies include their computational cost and the presence of virtual nodes that utilize too much of the visual space. This paper presents a highly improved version of the similarity tree technique. The improvements in the technique are given by two procedures. The first is a strategy that replaces virtual nodes by promoting real leaf nodes to their place, saving large portions of space in the display while maintaining the expressiveness and precision of the technique. The second improvement is an implementation that significantly accelerates the algorithm, impacting its use for larger data sets. We also illustrate the applicability of the technique in visual data mining, showing its advantages in supporting visual classification of data sets, with special attention to the case of image classification. We demonstrate the capabilities of the tree for analysis and iterative manipulation and employ those capabilities to support evolving toward a satisfactory data organization and classification.
  • Keywords
    data analysis; data mining; image classification; iterative methods; trees (mathematics); iterative manipulation; multidimensional projection; neighbor joining trees; similarity tree deployment; similarity trees; similarity-based hierarchical data organization; visual classification; visual data analysis; visual data classification; visual data mining; Algorithm design and analysis; Biomedical image processing; Data visualization; Phylogeny
  • fLanguage
    English
  • Journal_Title
    Visualization and Computer Graphics, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1077-2626
  • Type
    jour
  • DOI
    10.1109/TVCG.2011.212
  • Filename
    6065013