DocumentCode :
2839116
Title :
MapReduce-Based Balanced Mining for Closed Frequent Itemset
Author :
Chen, Guang-Peng ; Yang, Yu-Bin ; Zhang, Yao
Author_Institution :
State Key Lab. for Novel Software Technol., Nanjing Univ., Nanjing, China
fYear :
2012
fDate :
24-29 June 2012
Firstpage :
652
Lastpage :
653
Abstract :
Mining closed frequent itemsets (CFI) plays an essential role in many real-world data mining applications. With the emergence of abundant large-scale data sets, mining CFI in parallel has become a significant and challenging problem. This paper proposes a parallel balanced mining algorithm for CFI based on the MapReduce platform. The proposed algorithm adopts a greedy strategy to group items, aiming to balance the computational load among all parallel tasks, and consists of three main steps: (1) Parallel Counting, (2) Global Construction of Frequent List (F_list) and Group Map (G_map), (3) Parallel Mining for Closed Frequent Itemsets. Experimental results validate the method and show its effectiveness, as satisfactory speedup and scalability are both achieved in large-scale CFI mining tasks.
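The first two steps named in the abstract can be illustrated with a small single-machine sketch. This is not the authors' implementation: the abstract does not say how the load of an item is estimated or how the greedy grouping is carried out, so the code below assumes, purely for illustration, that an item's support count stands in for its mining cost and that each item goes to the currently lightest group. The function names (count_items, build_f_list, build_g_map) are hypothetical, and step 3 (the per-group closed-itemset mining) is not shown.

```python
import heapq
from collections import Counter


def count_items(transactions):
    """Step 1 (run serially here): support count of every item."""
    counts = Counter()
    for transaction in transactions:
        # Use a set so an item repeated within one transaction counts once.
        counts.update(set(transaction))
    return counts


def build_f_list(counts, min_support):
    """Step 2a: frequent items, sorted by descending support, ties by name."""
    frequent = [(item, c) for item, c in counts.items() if c >= min_support]
    return sorted(frequent, key=lambda pair: (-pair[1], pair[0]))


def build_g_map(f_list, num_groups):
    """Step 2b: greedily assign each item to the currently lightest group.

    ASSUMPTION: support count is used as the load estimate; the paper may
    use a different estimate.
    """
    heap = [(0, group_id) for group_id in range(num_groups)]
    heapq.heapify(heap)
    g_map = {}
    for item, support in f_list:
        load, group_id = heapq.heappop(heap)  # lightest group so far
        g_map[item] = group_id
        heapq.heappush(heap, (load + support, group_id))
    return g_map


transactions = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["a"], ["b", "d"]]
f_list = build_f_list(count_items(transactions), min_support=2)
g_map = build_g_map(f_list, num_groups=2)
print(f_list)
print(g_map)
```

In a MapReduce setting, the counting step would typically be split across mappers and summed in reducers, while the F_list and G_map would be built once and shared with every task in the mining step.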
Keywords :
data mining; parallel processing; CFI mining; MapReduce-based balanced mining; closed frequent itemset mining; data mining application; global construction of frequent list and group map task; greedy strategy; parallel balanced mining algorithm; parallel counting task; parallel mining for closed frequent itemset task; parallel task; Algorithm design and analysis; Clustering algorithms; Conferences; Data mining; Educational institutions; Itemsets; Closed frequent itemset; Cloud computing; Data mining; Hadoop; MapReduce;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Web Services (ICWS), 2012 IEEE 19th International Conference on
Conference_Location :
Honolulu, HI
Print_ISBN :
978-1-4673-2131-0
Type :
conf
DOI :
10.1109/ICWS.2012.19
Filename :
6257941