DocumentCode :
1682613
Title :
A Switch Criterion for Hybrid Datasets Merging on Top of Map Reduce
Author :
Ma, Lili ; Liao, Huaming ; He, Yongqiang ; Li, Feng ; Gao, Qiang
Author_Institution :
Key Lab. of Net Sci., Chinese Acad. of Sci., Beijing, China
fYear :
2009
Firstpage :
293
Lastpage :
298
Abstract :
Because of MapReduce's restricted structure, the multi-dataset merging problem, which is common in many data mining applications, cannot be solved efficiently with MapReduce. This paper proposes HDMA, a novel hybrid datasets merging algorithm on top of MapReduce. HDMA automatically determines the better of two methods, DMCM and DPM, which are effective in different situations. HDMA retains the advantages of both methods and makes good use of the memory of data nodes. Experiments show that HDMA achieves the best performance in most situations.
Keywords :
data mining; data structures; merging; HDMA; MapReduce restricted structure; data mining; data node; hybrid dataset; multidataset merging problem; switch criterion; Application software; Computers; Costs; Data mining; Grid computing; Laboratories; Merging; Partitioning algorithms; Switches; Uniform resource locators; DMCM; DPM; Hash table; MapReduce; auto-tuning; memory;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Grid and Cooperative Computing, 2009. GCC '09. Eighth International Conference on
Conference_Location :
Lanzhou, Gansu
Print_ISBN :
978-0-7695-3766-5
Type :
conf
DOI :
10.1109/GCC.2009.28
Filename :
5279579