DocumentCode
2839116
Title
MapReduce-Based Balanced Mining for Closed Frequent Itemset
Author
Chen, Guang-Peng ; Yang, Yu-Bin ; Zhang, Yao
Author_Institution
State Key Lab. for Novel Software Technol., Nanjing Univ., Nanjing, China
fYear
2012
fDate
24-29 June 2012
Firstpage
652
Lastpage
653
Abstract
Mining closed frequent itemsets (CFI) plays an essential role in many real-world data mining applications. With the emergence of abundant large-scale data sets, mining CFI in parallel has become a significant and challenging problem. This paper proposes a parallel balanced mining algorithm for CFI based on the MapReduce platform. The proposed algorithm adopts a greedy strategy to group items, aiming to balance the computational load among all parallel tasks. It consists of three main steps: (1) Parallel Counting, (2) Global Construction of Frequent List (F_list) and Group Map (G_map), and (3) Parallel Mining for Closed Frequent Itemsets. Experimental results validate the method and show its effectiveness: satisfactory speedup and scalability are both achieved in large-scale CFI mining tasks.
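The abstract does not spell out how the greedy grouping works, so the following is only a minimal single-machine sketch of what steps (1) and (2) might look like. It assumes that an item's support count serves as a proxy for its mining load and that each item goes to the currently lightest group; both are assumptions, not details confirmed by the record. The function names `build_f_list`, `build_g_map`, and `group_loads` are hypothetical, and the counting is a sequential stand-in for the MapReduce job.

```python
import heapq
from collections import Counter


def build_f_list(transactions, min_support):
    """Sequential stand-in for Parallel Counting plus F_list construction.

    Returns (item, support) pairs for frequent items, sorted by
    descending support (ties broken by item name).
    """
    counts = Counter(item for t in transactions for item in set(t))
    frequent = [(item, c) for item, c in counts.items() if c >= min_support]
    frequent.sort(key=lambda pair: (-pair[1], str(pair[0])))
    return frequent


def build_g_map(f_list, num_groups):
    """Greedy grouping (assumed): assign each item to the lightest group.

    Load is approximated by support count, which is an assumption of
    this sketch rather than the paper's stated estimate.
    """
    heap = [(0, g) for g in range(num_groups)]  # (accumulated load, group id)
    heapq.heapify(heap)
    g_map = {}
    for item, load in f_list:
        total, g = heapq.heappop(heap)
        g_map[item] = g
        heapq.heappush(heap, (total + load, g))
    return g_map


def group_loads(f_list, g_map, num_groups):
    """Sum the assumed load assigned to each group."""
    loads = [0] * num_groups
    for item, load in f_list:
        loads[g_map[item]] += load
    return loads


transactions = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["a"], ["b", "d"]]
f_list = build_f_list(transactions, min_support=2)
g_map = build_g_map(f_list, num_groups=2)
```

In a real deployment each group would then be mined by its own task in step (3); that closed-itemset mining stage is omitted here because the abstract gives no detail about it.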
Keywords
data mining; parallel processing; CFI mining; MapReduce-based balanced mining; closed frequent itemset mining; data mining application; global construction of frequent list and group map task; greedy strategy; parallel balanced mining algorithm; parallel counting task; parallel mining for closed frequent itemset task; parallel task; Algorithm design and analysis; Clustering algorithms; Conferences; Data mining; Educational institutions; Itemsets; Closed frequent itemset; Cloud computing; Data mining; Hadoop; MapReduce;
fLanguage
English
Publisher
IEEE
Conference_Titel
Web Services (ICWS), 2012 IEEE 19th International Conference on
Conference_Location
Honolulu, HI
Print_ISBN
978-1-4673-2131-0
Type
conf
DOI
10.1109/ICWS.2012.19
Filename
6257941