• DocumentCode
    257691
  • Title
    Clustering high-dimensional data via random sampling and consensus
  • Author
    Traganitis, Panagiotis A. ; Slavakis, Konstantinos ; Giannakis, Georgios B.
  • Author_Institution
    Dept. of ECE & Digital Technology Center, Univ. of Minnesota, Minneapolis, MN, USA
  • fYear
    2014
  • fDate
    3-5 Dec. 2014
  • Firstpage
    307
  • Lastpage
    311
  • Abstract
    In response to the urgent need for learning tools tuned to big data analytics, the present paper introduces a feature selection approach to efficient clustering of high-dimensional vectors. The resultant method leverages random sampling and consensus (RANSAC) arguments, originally developed for robust regression tasks in computer vision, to yield novel dimensionality reduction schemes. The advocated random sampling and consensus K-means (RSC-Kmeans) algorithm can operate in either batch or sequential mode, with the latter affording a lower computational footprint than the former. Extensive numerical tests on synthetic and real datasets highlight the potential of the proposed algorithms and demonstrate their competitive performance relative to state-of-the-art random projection alternatives.
  • Keywords
    feature selection; pattern clustering; random processes; sampling methods; Big Data analytics; RANSAC arguments; RSC-Kmeans algorithm; batch modes; computational footprint; dimensionality reduction schemes; feature selection approach; high-dimensional data clustering; high-dimensional vector clustering; learning tools; numerical tests; random sampling-and-consensus K-means algorithm; real datasets; sequential modes; synthetic datasets; Accuracy; Big data; Clustering algorithms; Information processing; Pattern recognition; Robustness; Vectors; Clustering; K-means; high-dimensional data; random sampling and consensus
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Signal and Information Processing (GlobalSIP), 2014 IEEE Global Conference on
  • Conference_Location
    Atlanta, GA
  • Type
    conf
  • DOI
    10.1109/GlobalSIP.2014.7032128
  • Filename
    7032128