DocumentCode :
1895442
Title :
Parallel Based on Cloud Computing to Achieve Large Data Sets Clustering
Author :
Li, Heng ; Yang, Dan ; Fang, WeiTao
Author_Institution :
Coll. of Comput. Sci., Chongqing Univ., Chongqing, China
Volume :
1
fYear :
2012
fDate :
23-25 March 2012
Firstpage :
411
Lastpage :
415
Abstract :
This paper presents the CPCluster MapReduce algorithm, which achieves parallelism on a cloud computing platform for clustering large, high-dimensional datasets. The proposed MapReduce-based clustering algorithm improves on the traditional clustering algorithm by parallelizing it. It is scalable and has good acceleration capability: adding compute nodes yields speedup. Experimental results show that the CPCluster MapReduce algorithm performs much better than the traditional clustering algorithm, especially as the number of samples in the data sets increases.
Keywords :
cloud computing; data handling; parallel processing; pattern clustering; set theory; CPCluster map reduce algorithm; acceleration capability; cloud computing platform; clustering algorithm; large data sets clustering; parallel based on cloud computing; Algorithm design and analysis; Clustering algorithms; Computational modeling; Data models; Distributed databases; Educational institutions; File systems; cloud computing; cluster; parallel computing
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer Science and Electronics Engineering (ICCSEE), 2012 International Conference on
Conference_Location :
Hangzhou
Print_ISBN :
978-1-4673-0689-8
Type :
conf
DOI :
10.1109/ICCSEE.2012.287
Filename :
6187874