• DocumentCode
    2693104
  • Title

    Kahuna: Problem diagnosis for MapReduce-based cloud computing environments

  • Author

    Tan, Jiaqi ; Pan, Xinghao ; Marinelli, Eugene ; Kavulya, Soila ; Gandhi, Rajeev ; Narasimhan, Priya

  • Author_Institution
    DSO National Laboratories, Singapore, Singapore
  • fYear
    2010
  • fDate
    19-23 April 2010
  • Firstpage
    112
  • Lastpage
    119
  • Abstract
    We present Kahuna, an approach that aims to diagnose performance problems in MapReduce systems. Central to Kahuna's approach is our insight on peer-similarity, that nodes behave alike in the absence of performance problems, and that a node that behaves differently is the likely culprit of a performance problem. We present applications of Kahuna's insight in techniques and their algorithms to statistically compare black-box (OS-level performance metrics) and white-box (Hadoop-log statistics) data across the different nodes of a MapReduce cluster, in order to identify the faulty node(s). We also present empirical evidence of our peer-similarity observations from the 4000-processor Yahoo! M45 Hadoop cluster. In addition, we demonstrate Kahuna's effectiveness through experimental evaluation of two algorithms for a number of reported performance problems, on four different workloads in a 100-node Hadoop cluster running on Amazon's EC2 infrastructure.
  • Keywords
    Internet; distributed processing; Hadoop-log statistics; Kahuna; MapReduce-based cloud computing; OS-level performance metrics; Yahoo! M45 Hadoop cluster; peer-similarity; problem diagnosis; Cloud computing; Clustering algorithms; Data mining; Facebook; Fault diagnosis; Large-scale systems; Measurement; Open source software; Peer to peer computing; Statistics;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Network Operations and Management Symposium (NOMS), 2010 IEEE
  • Conference_Location
    Osaka
  • ISSN
    1542-1201
  • Print_ISBN
    978-1-4244-5366-5
  • Electronic_ISBN
    1542-1201
  • Type

    conf

  • DOI
    10.1109/NOMS.2010.5488446
  • Filename
    5488446