  • DocumentCode
    3587844
  • Title
    Big data clustering via random sketching and validation
  • Author
    Traganitis, Panagiotis A. ; Slavakis, Konstantinos ; Giannakis, Georgios B.
  • Author_Institution
    Dept. of ECE & Digital Technol. Center, Univ. of Minnesota, Minneapolis, MN, USA
  • fYear
    2014
  • Firstpage
    1046
  • Lastpage
    1050
  • Abstract
    As the number and dimensionality of data increase, the development of new, efficient processing tools has become a necessity. The present paper introduces a novel dimensionality reduction approach for fast and efficient clustering of high-dimensional data. The new methods extend random sampling and consensus (RANSAC) arguments, originally developed for robust regression tasks in computer vision, to the dimensionality reduction problem. The advocated random sketching and validation K-means (SkeVa K-means) and Divergence SkeVa algorithms can achieve high performance, with the latter affording a lower computational footprint than the former. Extensive numerical tests on synthetic and real datasets highlight the potential of the proposed algorithms and demonstrate their competitive performance relative to state-of-the-art random projection alternatives.
  • Keywords
    Big Data; data reduction; pattern clustering; random processes; sampling methods; RANSAC arguments; big data clustering; computer vision; data dimensionality; dimensionality reduction problem; divergence SkeVa algorithms; random sampling and consensus arguments; random sketching; random sketching and validation k-means algorithms; random validation; real datasets; robust regression tasks; synthetic datasets; Accuracy; Clustering algorithms; Complexity theory; Computer vision; Data models; Kernel; Robustness; Clustering; K-means; big data; feature selection; high-dimensional data; random sampling and consensus; random sketching and validation;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Signals, Systems and Computers, 2014 48th Asilomar Conference on
  • Print_ISBN
    978-1-4799-8295-0
  • Type
    conf
  • DOI
    10.1109/ACSSC.2014.7094614
  • Filename
    7094614