  • DocumentCode
    257187
  • Title
    ScalScheduling: A Scalable Scheduling Architecture for MPI-based interactive analysis programs
  • Author
    Jiangling Yin ; Andrew Foran ; Xuhong Zhang ; Jun Wang
  • Author_Institution
    EECS, Univ. of Central Florida, Orlando, FL, USA
  • fYear
    2014
  • fDate
    4-7 Aug. 2014
  • Firstpage
    1
  • Lastpage
    8
  • Abstract
    In today's large-scale clusters, running tasks with high degrees of parallelism allows interactive data visualization/analysis to complete in seconds. However, conventional, centralized scheduling poses significant challenges for these interactive applications. As the amount of data to be processed grows, it becomes too heavy to move across the network. Thus, data processing tasks should be scheduled such that the amount of transferred data is minimized, i.e., so that computation is data-local. To implement this, a scheduler process should collect and analyze data distribution metadata prior to making scheduling decisions, which usually causes milliseconds or seconds of latency. Such scheduling delay is unacceptable for interactive data applications. In this paper, we present a Scalable Scheduling Architecture for conventional interactive data programs and refer to it as ScalScheduling. ScalScheduling is proposed to reduce task scheduling latency while ensuring that the worker processes achieve a high degree of data-local computation and load balance in heterogeneous environments. In our proposed architecture, each worker process uses a novel modulo-based priority method to schedule its local tasks independently. Multiple scheduler processes are employed, according to the number of worker processes, to resolve the issue of concurrent requests and to assign remote tasks with respect to load balance. We perform experiments using thousands of parallel processes, and the experimental results show the benefits of our proposed scheduling architecture as well as its potential for future oversize task scheduling problems on large-scale clusters.
  • Keywords
    message passing; parallel processing; MPI; ScalScheduling; data distribution metadata; data locality computation; interactive analysis program; interactive data visualization; large scale cluster; modulo-based priority method; parallel process; scalable scheduling architecture; scheduling delay; task scheduling latency; Computer architecture; Data processing; Distributed databases; Process control; Processor scheduling; Schedules; Scheduling;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Communication and Networks (ICCCN), 2014 23rd International Conference on
  • Conference_Location
    Shanghai
  • Type
    conf
  • DOI
    10.1109/ICCCN.2014.6911753
  • Filename
    6911753