Title :
High Performance Adaptive Distributed Scheduling Algorithm
Author :
Narang, Arun; Srivastava, Anurag; Shyamasundar, R.K.
Author_Institution :
IBM India Res. Lab., New Delhi, India
Abstract :
Exascale computing requires complex runtime systems that must account for affinity, load balancing, and low time and message complexity when scheduling massive-scale parallel computations. Considering these objectives simultaneously makes online distributed scheduling a very challenging problem. Prior distributed scheduling approaches are limited to shared memory, rely primarily on work stealing across distributed-memory nodes for load balancing, or depend on programmer-specified affinity. However, the performance of affinity-driven and work-stealing-based algorithms degrades when the input is irregular (e.g., UTS) and/or sparse. In this paper we present a novel adaptive distributed scheduling algorithm (ALDS) for multi-place parallel computations that uses a unique combination of remote (inter-place) spawns and remote work steals to reduce scheduler overhead, dynamically maintain load balance across the compute nodes of the system, and preserve affinity as far as possible. Using parallel machine learning algorithms such as Support Vector Regression, running concurrently with program execution on the target architecture, ALDS automatically and adaptively tunes its parameters for scalable performance. Our design was implemented using the GASNet API and POSIX threads. For the UTS (Unbalanced Tree Search) benchmark, using up to 2048 nodes of Blue Gene/P, we deliver superior performance over Charm++ [1] and [2].
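To make the spawn-versus-steal idea concrete, the sketch below is a minimal, single-process C++ illustration, not the authors' GASNet/POSIX-threads implementation: each place keeps a task deque, a remote (inter-place) spawn pushes a task to its affinity place unless that place is already overloaded relative to the spawning place, and an idle place steals from its most loaded peer. The load threshold kStealThreshold is a hypothetical stand-in for the parameters that ALDS tunes online with Support Vector Regression, which is omitted here.

// Illustrative single-process sketch of spawn-vs-steal decisions (assumptions noted above).
#include <cstdio>
#include <deque>
#include <vector>
#include <random>

struct Task { int affinity; int work; };

struct Place {
    std::deque<Task> deque_;                 // local task deque of this place
    size_t load() const { return deque_.size(); }
};

// Hypothetical threshold; in ALDS such parameters are tuned adaptively online.
constexpr size_t kStealThreshold = 4;

void spawn(std::vector<Place>& places, int here, Task t) {
    // Remote (inter-place) spawn: send the task to its affinity place
    // unless that place is already overloaded relative to "here".
    Place& target = places[t.affinity];
    Place& local  = places[here];
    if (target.load() <= local.load() + kStealThreshold)
        target.deque_.push_back(t);          // honour affinity
    else
        local.deque_.push_back(t);           // keep locally to preserve balance
}

bool steal(std::vector<Place>& places, int thief) {
    // Remote work steal: an idle place takes one task from the most loaded peer.
    int victim = -1;
    size_t best = kStealThreshold;
    for (int p = 0; p < (int)places.size(); ++p)
        if (p != thief && places[p].load() > best) { victim = p; best = places[p].load(); }
    if (victim < 0) return false;            // no peer worth stealing from
    places[thief].deque_.push_back(places[victim].deque_.front());
    places[victim].deque_.pop_front();
    return true;
}

int main() {
    std::vector<Place> places(4);
    std::mt19937 rng(42);
    std::uniform_int_distribution<int> aff(0, 3);
    for (int i = 0; i < 32; ++i)
        spawn(places, /*here=*/0, Task{aff(rng), 1});   // place 0 spawns tasks with random affinity
    while (steal(places, /*thief=*/3)) {}               // place 3 steals until peers are near the threshold
    for (size_t p = 0; p < places.size(); ++p)
        std::printf("place %zu: %zu tasks\n", p, places[p].load());
}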
Keywords :
application program interfaces; communication complexity; concurrency control; learning (artificial intelligence); parallel algorithms; regression analysis; resource allocation; scheduling; support vector machines; tree searching; ALDS; Blue Gene/P nodes; GASNet API; POSIX threads; UTS benchmark; affinity; automatic-adaptive parameter tuning; complex runtime systems; concurrency; dynamic load balance maintenance; exascale computing; high-performance adaptive distributed scheduling algorithm; load balancing; massive-scale parallel computation scheduling; message complexity; multiplace parallel computations; online distributed scheduling; overhead reduction; parallel machine learning algorithms; program execution; remote interplace spawns; support vector regression; target architecture; time complexity; unbalanced tree search benchmark; Load management; Machine learning algorithms; Message systems; Scheduling; Scheduling algorithms; Support vector machines; Adaptive Scheduling; Distributed Scheduling; Performance Analysis;
Conference_Title :
2013 IEEE 27th International Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW)
Conference_Location :
Cambridge, MA
Print_ISBN :
978-0-7695-4979-8
DOI :
10.1109/IPDPSW.2013.232