• DocumentCode
    3739241
  • Title

    PaMPa-HD: A Parallel MapReduce-Based Frequent Pattern Miner for High-Dimensional Data

  • Author

    Daniele Apiletti;Elena Baralis;Tania Cerquitelli;Paolo Garza;Pietro Michiardi;Fabio Pulvirenti

  • Author_Institution
    Dipt. di Autom. e Inf., Politec. di Torino, Turin, Italy
  • fYear
    2015
  • Firstpage
    839
  • Lastpage
    846
  • Abstract
    Frequent closed itemset mining is among the most complex exploratory techniques in data mining, and provides the ability to discover hidden correlations in transactional datasets. The explosion of Big Data is leading to new parallel and distributed approaches. Unfortunately, most of them are designed to cope with low-dimensional datasets, whereas no distributed high-dimensional frequent closed itemset mining algorithm exists. This work introduces PaMPa-HD, a parallel MapReduce-based frequent closed itemset mining algorithm for high-dimensional datasets, based on Carpenter. Experiments performed on both real and synthetic datasets show the efficiency and scalability of PaMPa-HD.
  • Keywords
    "Itemsets","Data mining","Big data","Conferences","Electronic mail","Explosions","Algorithm design and analysis"
  • Publisher
    ieee
  • Conference_Titel
    Data Mining Workshop (ICDMW), 2015 IEEE International Conference on
  • Electronic_ISBN
    2375-9259
  • Type

    conf

  • DOI
    10.1109/ICDMW.2015.18
  • Filename
    7395755