• DocumentCode
    2839116
  • Title
    MapReduce-Based Balanced Mining for Closed Frequent Itemset
  • Author
    Chen, Guang-Peng ; Yang, Yu-Bin ; Zhang, Yao
  • Author_Institution
    State Key Lab. for Novel Software Technol., Nanjing Univ., Nanjing, China
  • fYear
    2012
  • fDate
    24-29 June 2012
  • Firstpage
    652
  • Lastpage
    653
  • Abstract
    Mining closed frequent itemsets (CFIs) plays an essential role in many real-world data mining applications. With the emergence of abundant large-scale data sets, mining CFIs in parallel has become a significant and challenging problem. This paper proposes a parallel balanced mining algorithm for CFIs based on the MapReduce platform. The proposed algorithm adopts a greedy strategy to group items so as to balance the computational load among all parallel tasks, and consists of three main steps: (1) Parallel Counting, (2) Global Construction of the Frequent List (F_list) and Group Map (G_map), and (3) Parallel Mining for Closed Frequent Itemsets. Experimental results validate the method and show its effectiveness: satisfactory speedup and scalability are both achieved in large-scale CFI mining tasks.
  • Keywords
    data mining; parallel processing; CFI mining; MapReduce-based balanced mining; closed frequent itemset mining; data mining application; global construction of frequent list and group map task; greedy strategy; parallel balanced mining algorithm; parallel counting task; parallel mining for closed frequent itemset task; parallel task; Algorithm design and analysis; Clustering algorithms; Conferences; Educational institutions; Itemsets; Closed frequent itemset; Cloud computing; Hadoop; MapReduce
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Web Services (ICWS), 2012 IEEE 19th International Conference on
  • Conference_Location
    Honolulu, HI
  • Print_ISBN
    978-1-4673-2131-0
  • Type
    conf
  • DOI
    10.1109/ICWS.2012.19
  • Filename
    6257941
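
The first two steps named in the abstract (Parallel Counting, then global construction of F_list and G_map with a greedy strategy) can be illustrated with a minimal single-process sketch. This is not the authors' implementation: the abstract does not say how task load is estimated, so the sketch assumes an item's support count as a stand-in for its mining cost, and the function names `parallel_count`, `build_f_list`, and `build_g_map` are hypothetical. A real deployment would run the counting as a MapReduce job over input splits rather than as a local loop.

```python
from collections import Counter
import heapq


def parallel_count(transactions):
    """Step 1 stand-in: count the support of each item.

    In MapReduce, mappers would emit (item, 1) per transaction and
    reducers would sum them; here we simply loop locally.
    """
    counts = Counter()
    for t in transactions:
        counts.update(set(t))  # count each item once per transaction
    return counts


def build_f_list(counts, min_support):
    """Step 2a: frequent items sorted by descending support (ties by name)."""
    return sorted(
        ((item, c) for item, c in counts.items() if c >= min_support),
        key=lambda pair: (-pair[1], pair[0]),
    )


def build_g_map(f_list, num_groups):
    """Step 2b: greedy grouping. Each item goes to the lightest group so far.

    ASSUMPTION: load is approximated by support count; the paper's actual
    load estimate is not given in the abstract.
    """
    heap = [(0, g) for g in range(num_groups)]  # (estimated load, group id)
    heapq.heapify(heap)
    g_map = {}
    for item, support in f_list:
        load, group = heapq.heappop(heap)
        g_map[item] = group
        heapq.heappush(heap, (load + support, group))
    return g_map


transactions = [
    ["a", "b", "c"],
    ["a", "b"],
    ["a", "c", "d"],
    ["b", "c"],
    ["a", "d"],
]
counts = parallel_count(transactions)
f_list = build_f_list(counts, min_support=2)
g_map = build_g_map(f_list, num_groups=2)
print(f_list)
print(g_map)
```

In the third step, each parallel task would mine closed frequent itemsets only for the items assigned to its group in `g_map`; that part is omitted here because the abstract gives no detail on it.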