• DocumentCode
    654995
  • Title
    Towards a Moderate-Granularity Incremental Clustering Algorithm for GPU
  • Author
    Chunlei Chen ; Dejun Mu ; Huixiang Zhang ; Wei Hu
  • Author_Institution
    Sch. of Autom., Northwestern Polytech. Univ., Xi'an, China
  • fYear
    2013
  • fDate
    10-12 Oct. 2013
  • Firstpage
    194
  • Lastpage
    201
  • Abstract
    Incremental clustering algorithms play a vital role in big data processing. Massive data sets place high computational demands on the hardware platform, and GPU-based parallel computing is a promising way to meet them. However, existing incremental clustering algorithms face an accuracy-parallelism dilemma when accelerated on a GPU. Block-wise algorithms evolve the clusters at coarse granularity and sacrifice accuracy for parallelism, while point-wise algorithms proceed at fine granularity and trade parallelism for accuracy. We propose a moderate-granularity algorithm that continually generates micro-clusters from the incoming data blocks and then evolves the clusters at the granularity of a micro-cluster. The proposed algorithm has three advantages: first, unlike block-wise algorithms, it avoids predefining a search range for the number of clusters; second, it alleviates the accuracy problem caused by coarse granularity; third, it uses a parallel-friendly algorithm to generate micro-clusters and reduces the number of serial operations, so its parallelism scales better than that of point-wise algorithms. Experiments on a CPU-GPU hybrid platform show that our algorithm achieves accuracy comparable to its batch counterpart and is scalable in terms of parallelism.
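    The two-stage idea in the abstract (summarise each incoming block into micro-clusters, then evolve the clusters one micro-cluster at a time) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' method: the abstract does not name the micro-cluster generator or the merge rule, so the grid binning, the centroid-distance merge, and the names `make_micro_clusters`, `absorb`, `cell_size`, and `merge_dist` are all assumptions made here.

```python
import math
from collections import defaultdict


def make_micro_clusters(block, cell_size):
    """Summarise one data block as (count, coordinate_sum) pairs.

    ASSUMPTION: grid binning stands in for the unspecified
    "parallel-friendly" generator; each point's cell is computed
    independently of every other point.
    """
    cells = defaultdict(lambda: [0, None])
    for point in block:
        key = tuple(math.floor(x / cell_size) for x in point)
        entry = cells[key]
        entry[0] += 1
        if entry[1] is None:
            entry[1] = list(point)
        else:
            entry[1] = [a + b for a, b in zip(entry[1], point)]
    return [(count, total) for count, total in cells.values()]


def centroid(cluster):
    """Mean position of a (count, coordinate_sum) summary."""
    count, total = cluster
    return [v / count for v in total]


def absorb(clusters, micro_clusters, merge_dist):
    """Evolve the clusters one micro-cluster at a time.

    ASSUMPTION: a micro-cluster joins the nearest cluster whose centroid
    lies within merge_dist; otherwise it starts a new cluster, so no
    cluster count has to be fixed in advance.
    """
    for count, total in micro_clusters:
        center = [v / count for v in total]
        best, best_dist = None, merge_dist
        for i, cluster in enumerate(clusters):
            dist = math.dist(center, centroid(cluster))
            if dist <= best_dist:
                best, best_dist = i, dist
        if best is None:
            clusters.append((count, list(total)))
        else:
            old_count, old_total = clusters[best]
            merged = [a + b for a, b in zip(old_total, total)]
            clusters[best] = (old_count + count, merged)
    return clusters
```

    In this sketch the per-point work in `make_micro_clusters` is the part that would map naturally onto GPU threads, while the serial loop in `absorb` runs once per micro-cluster rather than once per point.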
  • Keywords
    Big Data; graphics processing units; parallel processing; pattern clustering; CPU-GPU hybrid platform; GPU-based parallel computing; accuracy-parallelism dilemma; big data processing; block-wise algorithm; cluster evolution; data blocks; data problems; hardware platform; microcluster generation; microcluster granularity; moderate-granularity incremental clustering algorithm; parallel-friendly algorithm; point-wise algorithm; serial operations; Accuracy; Algorithm design and analysis; Clustering algorithms; Graphics processing units; Measurement; Parallel processing; Vectors; GPU; incremental clustering; moderate-granularity;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC), 2013 International Conference on
  • Conference_Location
    Beijing
  • Type
    conf
  • DOI
    10.1109/CyberC.2013.38
  • Filename
    6685679