Title :
Parallel Based on Cloud Computing to Achieve Large Data Sets Clustering
Author :
Li, Heng ; Yang, Dan ; Fang, WeiTao
Author_Institution :
Coll. of Comput. Sci., Chongqing Univ., Chongqing, China
Abstract :
This paper presents the CPCluster MapReduce algorithm, which parallelizes the clustering of large, high-dimensional datasets on a cloud computing platform. The proposed algorithm is based on the MapReduce paradigm and improves on a traditional clustering algorithm by parallelizing it. It is scalable and has good acceleration capability: adding compute nodes yields speedup. Experimental results show that the CPCluster MapReduce algorithm performs much better than the traditional clustering algorithm, especially as the number of samples in the data set increases.
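The abstract does not describe the internals of CPCluster, so the following is not the authors' algorithm. It is a minimal sketch, assuming k-means as a stand-in clustering method, of how one clustering iteration can be split into a map step (each data split computes partial sums per cluster) and a reduce step (partial sums are merged into new centroids). All function names here are hypothetical, and the map calls run sequentially in this sketch, whereas on a real cluster each would run on a separate compute node.

```python
def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centroids[i])),
    )


def map_phase(chunk, centroids):
    """Map step for one data split: per-cluster (vector_sum, count) partials."""
    partial = {}
    for point in chunk:
        cid = nearest(point, centroids)
        vec_sum, count = partial.get(cid, ([0.0] * len(point), 0))
        partial[cid] = ([a + b for a, b in zip(vec_sum, point)], count + 1)
    return partial


def reduce_phase(partials, centroids):
    """Reduce step: merge partials from all splits and recompute centroids."""
    totals = {}
    for partial in partials:
        for cid, (vec_sum, count) in partial.items():
            tot_sum, tot_count = totals.get(cid, ([0.0] * len(vec_sum), 0))
            totals[cid] = ([a + b for a, b in zip(tot_sum, vec_sum)], tot_count + count)
    new_centroids = []
    for cid, old in enumerate(centroids):
        if cid in totals:
            vec_sum, count = totals[cid]
            new_centroids.append([x / count for x in vec_sum])
        else:
            # An empty cluster keeps its previous centroid.
            new_centroids.append(list(old))
    return new_centroids


def kmeans_mapreduce(chunks, centroids, iterations=10):
    """Run a fixed number of map/reduce rounds over pre-split data."""
    for _ in range(iterations):
        # Sequential here; a MapReduce framework would distribute these calls.
        partials = [map_phase(chunk, centroids) for chunk in chunks]
        centroids = reduce_phase(partials, centroids)
    return centroids


if __name__ == "__main__":
    chunks = [[[0.0, 0.0], [0.0, 2.0]], [[10.0, 10.0], [10.0, 12.0]]]
    print(kmeans_mapreduce(chunks, [[1.0, 1.0], [9.0, 9.0]]))
```

Because each map call touches only its own split and emits a small summary, adding nodes reduces the per-node workload, which is the general mechanism behind the kind of speedup the abstract reports.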
Keywords :
cloud computing; data handling; parallel processing; pattern clustering; set theory; CPCluster map reduce algorithm; acceleration capability; cloud computing platform; clustering algorithm; large data sets clustering; parallel based on cloud computing; Algorithm design and analysis; Clustering algorithms; Computational modeling; Data models; Distributed databases; Educational institutions; File systems; cluster; parallel computing
Conference_Title :
Computer Science and Electronics Engineering (ICCSEE), 2012 International Conference on
Conference_Location :
Hangzhou
Print_ISBN :
978-1-4673-0689-8
DOI :
10.1109/ICCSEE.2012.287