Regional Information Center for Science and Technology (RICEST) - Parallel Based on Cloud Computing to Achieve Large Data Sets Clustering

DocumentCode :

1895442

Title :

Parallel Based on Cloud Computing to Achieve Large Data Sets Clustering

Author :

Li, Heng ; Yang, Dan ; Fang, WeiTao

Author_Institution :

Coll. of Comput. Sci., Chongqing Univ., Chongqing, China

Volume :

fYear :

2012

fDate :

23-25 March 2012

Firstpage :

411

Lastpage :

415

Abstract :

This paper presents the CPCluster MapReduce algorithm, which achieves parallelism on a cloud computing platform for clustering large, high-dimensional datasets. The proposed MapReduce-based clustering algorithm improves on the traditional clustering algorithm by parallelizing it. It is scalable and has good acceleration capability: adding compute nodes yields speedup. Experimental results show that the CPCluster MapReduce algorithm performs much better than the traditional clustering algorithm, especially as the number of samples in the data sets increases.

Keywords :

cloud computing; data handling; parallel processing; pattern clustering; set theory; CPCluster map reduce algorithm; acceleration capability; cloud computing platform; clustering algorithm; large data sets clustering; parallel based on cloud computing; Algorithm design and analysis; Clustering algorithms; Computational modeling; Data models; Distributed databases; Educational institutions; File systems; cluster; parallel computing

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Computer Science and Electronics Engineering (ICCSEE), 2012 International Conference on

Conference_Location :

Hangzhou

Print_ISBN :

978-1-4673-0689-8

Type :

conf

DOI :

10.1109/ICCSEE.2012.287

Filename :

6187874

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1895442