• DocumentCode
    166098
  • Title

    Modified MapReduce framework for enhancing performance of graph based algorithms by fast convergence in distributed environment

  • Author

    Singhal, Harshit ; Guddeti, Ram Mohana Reddy

  • Author_Institution
    Dept. of Inf. Technol., Nat. Inst. of Technol. Karnataka, Surathkal, India
  • fYear
    2014
  • fDate
    24-27 Sept. 2014
  • Firstpage
    1240
  • Lastpage
    1245
  • Abstract
    The amount of data produced today is huge and, more importantly, is increasing exponentially. Traditional data storage and processing techniques are ineffective in handling such large volumes of data [10]. Many real-life applications require iterative computations; in particular, most machine learning and data mining algorithms iterate over large datasets such as web link structures and social network graphs. MapReduce is a software framework for easily writing applications that process large amounts of data (multi-terabyte) in parallel on large clusters (thousands of nodes) of commodity hardware. However, because MapReduce is batch oriented, its benefits are difficult to realize in iterative computations. Our proposed work focuses on optimizing three factors to improve the performance of iterative algorithms in a MapReduce environment. In this paper, we address the key issues: the execution of tasks, the unnecessary creation of new tasks in each iteration, and the excessive shuffling of data in each iteration. Our preliminary experiments have shown promising results over the basic MapReduce framework. A comparative study with existing MapReduce-based solutions such as HaLoop also shows better performance with respect to algorithm run time and the amount of data traffic over the Hadoop cluster.
  • Keywords
    data mining; graph theory; iterative methods; learning (artificial intelligence); Hadoop cluster; MapReduce framework; data storage; distributed environment; graph based algorithm; iterative algorithm; machine learning; Algorithm design and analysis; Performance evaluation; Graph algorithms; Iterative Computations; MapReduce
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Advances in Computing, Communications and Informatics (ICACCI), 2014 International Conference on
  • Conference_Location
    New Delhi
  • Print_ISBN
    978-1-4799-3078-4
  • Type

    conf

  • DOI
    10.1109/ICACCI.2014.6968416
  • Filename
    6968416