• DocumentCode
    1825988
  • Title
    Efficient mining of frequent itemsets in social network data based on MapReduce framework
  • Author
    Farzanyar, Zahra; Cercone, Nick
  • Author_Institution
    Comput. Sci. & Eng. Dept., York Univ., Toronto, ON, Canada
  • fYear
    2013
  • fDate
    25-28 Aug. 2013
  • Firstpage
    1183
  • Lastpage
    1188
  • Abstract
    Social networks promote information sharing between people everywhere and at all times. Mining data produced in this data-rich environment can be extremely useful. Frequent itemset mining plays an important role in mining associations, correlations, sequential patterns, causality, episodes, multidimensional patterns, max-patterns, partial periodicity, emerging patterns, and many other significant data mining tasks in social networks. With the exponential growth of social network data towards a terabyte or more, most traditional frequent itemset mining algorithms become ineffective due to either huge resource requirements or large communication overhead. Cloud computing has shown that very large datasets can be processed over commodity clusters, given the right programming model. MapReduce, a parallel programming model and one of the most important techniques for cloud computing, has emerged for mining datasets of terabyte scale or larger on clusters of computers. In this paper, we propose an efficient frequent itemset mining algorithm, called IMRApriori, based on the MapReduce framework for the Hadoop cloud, a parallel storage and computing platform. The paper presents experimental results to corroborate the theoretical claims.
  • Keywords
    cloud computing; data mining; parallel programming; social networking (online); Hadoop cloud; IMRApriori algorithm; MapReduce framework; association mining; causality mining; cloud computing; computer clusters; correlation mining; data mining; emerging pattern mining; episode mining; frequent itemset mining algorithm; information sharing; max-pattern mining; multidimensional pattern mining; parallel computing platform; parallel programming model; parallel storage platform; partial-periodicity mining; sequential pattern mining; social network data; Algorithm design and analysis; Clustering algorithms; Computational modeling; Data mining; Itemsets; Social network services; Cloud Computing; Frequent Itemset Mining; MapReduce; Social networks;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Advances in Social Networks Analysis and Mining (ASONAM), 2013 IEEE/ACM International Conference on
  • Conference_Location
    Niagara Falls, ON
  • Type
    conf
  • Filename
    6785853