• DocumentCode
    2850247
  • Title

    A distribution aware scheduling method in MapReduce

  • Author

    Zhang, Xiaohong ; Ding, Yang

  • Author_Institution
    Sch. of Comput. Sci. & Technol., Henan Polytech. Univ., Jiaozuo, China
  • fYear
    2012
  • fDate
    24-27 June 2012
  • Firstpage
    128
  • Lastpage
    131
  • Abstract
    Data locality is one of the critical factors affecting system performance. In this paper, we focus on the data locality problem in Hadoop MapReduce and propose a scheduling method to improve it. After receiving a request from a node, the method selects a task from the node's first level, followed by its second and third levels. It then checks whether the selected task is the only one on the first level of the requesting node. If so, the method skips the selected task and selects another task for the requesting node. Otherwise, it schedules the selected task to the node. We have analyzed the method. Compared with the default scheduling method of Hadoop MapReduce, the proposed method can improve the efficiency of data locality.
  • Keywords
    Internet; data handling; public domain software; scheduling; software performance evaluation; Hadoop MapReduce; Internet technologies; data locality problem; distribution aware scheduling method; system performance; Educational institutions; Nonhomogeneous media; Data intensive applications; Data locality; MapReduce; Scheduling;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Electrical & Electronics Engineering (EEESYM), 2012 IEEE Symposium on
  • Conference_Location
    Kuala Lumpur
  • Print_ISBN
    978-1-4673-2363-5
  • Type

    conf

  • DOI
    10.1109/EEESym.2012.6258605
  • Filename
    6258605