• DocumentCode
    2132007
  • Title
    Using Betweenness Centrality to Identify Manifold Shortcuts
  • Author
    Cukierski, William J. ; Foran, David J.
  • Author_Institution
    Rutgers Univ., Piscataway, NJ
  • fYear
    2008
  • fDate
    15-19 Dec. 2008
  • Firstpage
    949
  • Lastpage
    958
  • Abstract
    High-dimensional data presents a significant challenge to a broad spectrum of pattern recognition and machine-learning applications. Dimensionality reduction (DR) methods serve to remove unwanted variance and make such problems tractable. Several nonlinear DR methods, such as the well-known ISOMAP algorithm, rely on a neighborhood graph to compute geodesic distances between data points. These graphs may sometimes contain unwanted edges that connect disparate regions of one or more manifolds. This topological sensitivity is well known, yet managing high-dimensional, noisy data in the absence of a priori knowledge remains an open and difficult problem. This manuscript introduces a divisive, edge-removal method based on graph betweenness centrality that can robustly identify manifold-shorting edges. The problem of graph construction in high dimensions is discussed, and the proposed algorithm is inserted into the ISOMAP workflow. ROC analysis is performed, and performance is tested on both synthetic and real datasets.
  • Keywords
    data reduction; graph theory; learning (artificial intelligence); dimensionality reduction method; graph construction; graph edge-removal method; high-dimensional data; isometric mapping algorithm; machine-learning application; manifold shortcut; pattern recognition; Clustering algorithms; Conferences; Data mining; Dentistry; Geophysics computing; Knowledge management; Manifolds; Nonlinear distortion; Pattern recognition; Robustness; betweenness; centrality; dimensionality reduction; graph theory; isomap
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining Workshops, 2008. ICDMW '08. IEEE International Conference on
  • Conference_Location
    Pisa
  • Print_ISBN
    978-0-7695-3503-6
  • Electronic_ISBN
    978-0-7695-3503-6
  • Type
    conf
  • DOI
    10.1109/ICDMW.2008.39
  • Filename
    4734026
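
Illustrative sketch (not part of the record). The abstract describes removing neighborhood-graph edges that score highly on betweenness centrality before geodesic distances are computed. The Python below is a minimal sketch of that idea only, under stated assumptions: the toy graph (two 8-node strips joined by one spurious edge), the helper names `strip_edges` and `edge_betweenness`, the unweighted Brandes-style scoring, and the single removal step are all hypothetical choices made for illustration. The abstract does not specify the paper's actual scoring or stopping rule.

```python
from collections import deque


def strip_edges(nodes):
    """Toy neighborhood graph of a 1-D curve: link each node to the next two."""
    n = len(nodes)
    return [(nodes[i], nodes[i + d]) for i in range(n) for d in (1, 2) if i + d < n]


def edge_betweenness(adj):
    """Unweighted edge betweenness (Brandes-style accumulation), unordered pairs."""
    score = {(u, v): 0.0 for u in adj for v in adj[u] if u < v}
    for s in adj:
        dist, sigma, preds = {s: 0}, {s: 1.0}, {s: []}
        order, queue = [], deque([s])
        while queue:  # BFS that counts shortest paths from s
            u = queue.popleft()
            order.append(u)
            for v in adj[u]:
                if v not in dist:
                    dist[v], sigma[v], preds[v] = dist[u] + 1, 0.0, []
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    sigma[v] += sigma[u]
                    preds[v].append(u)
        delta = {v: 0.0 for v in order}
        for w in reversed(order):  # push path counts back toward the source
            for u in preds[w]:
                c = sigma[u] / sigma[w] * (1.0 + delta[w])
                score[(min(u, w), max(u, w))] += c
                delta[u] += c
    # Every unordered pair was visited from both endpoints, so halve the totals.
    return {e: b / 2.0 for e, b in score.items()}


# Assumed toy data: two separate strips (nodes 0-7 and 8-15) plus one
# hypothetical "manifold-shorting" edge between them.
shortcut = (3, 12)
edges = strip_edges(list(range(0, 8))) + strip_edges(list(range(8, 16))) + [shortcut]
adj = {v: set() for v in range(16)}
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)

scores = edge_betweenness(adj)
top_edge = max(scores, key=scores.get)
top_score = scores[top_edge]

# One divisive step: drop the highest-scoring edge from the graph.
adj[top_edge[0]].discard(top_edge[1])
adj[top_edge[1]].discard(top_edge[0])
```

In this toy graph every shortest path between the two strips must cross the spurious edge, so it receives the highest score and is the edge removed. In an ISOMAP-style workflow, geodesic distances would then be computed on the pruned graph. High betweenness alone can also flag legitimate bottleneck edges, so a real pipeline would need a more careful criterion than this single-step sketch.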