• DocumentCode
    2534507
  • Title

    Hierarchical Load Balancing for Charm++ Applications on Large Supercomputers

  • Author

    Zheng, Gengbin ; Meneses, Esteban ; Bhatelé, Abhinav ; Kalé, Laxmikant V.

  • Author_Institution
    Dept. of Comput. Sci., Univ. of Illinois at Urbana-Champaign, Urbana, IL, USA
  • fYear
    2010
  • fDate
    13-16 Sept. 2010
  • Firstpage
    436
  • Lastpage
    444
  • Abstract
    Large parallel machines with hundreds of thousands of processors are being built. Recent studies have shown that ensuring good load balance is critical for scaling certain classes of parallel applications on even thousands of processors. Centralized load balancing algorithms suffer from scalability problems, especially on machines with a relatively small amount of memory. Fully distributed load balancing algorithms, on the other hand, tend to yield poor load balance on very large machines. In this paper, we present an automatic dynamic hierarchical load balancing method that overcomes the scalability challenges of centralized schemes and the poor solutions of traditional distributed schemes. This is done by creating multiple levels of aggressive load balancing domains which form a tree. This hierarchical method is demonstrated within a measurement-based load balancing framework in Charm++. We present techniques to deal with the scalability challenges of load balancing at very large scale. We show performance data of the hierarchical load balancing method on up to 16,384 cores of Ranger (at TACC) for a synthetic benchmark. We also demonstrate the successful deployment of the method in a scientific application, NAMD, with results on the Blue Gene/P machine at ANL.
  • Keywords
    mainframes; parallel machines; parallel processing; resource allocation; Blue Gene-P machine; Charm++ applications; automatic dynamic hierarchical load balancing method; centralized load balancing algorithms; measurement-based load balancing framework; supercomputers; Databases; Lead; Load management; Load modeling; Memory management; Program processors; Scalability; hierarchical algorithms; load balancing; scalability
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel Processing Workshops (ICPPW), 2010 39th International Conference on
  • Conference_Location
    San Diego, CA
  • ISSN
    1530-2016
  • Print_ISBN
    978-1-4244-7918-4
  • Electronic_ISBN
    1530-2016
  • Type

    conf

  • DOI
    10.1109/ICPPW.2010.65
  • Filename
    5599103