  • DocumentCode
    3245928
  • Title
    Understanding Network Saturation Behavior on Large-Scale Blue Gene/P Systems
  • Author
    Balaji, P. ; Naik, H. ; Desai, N.
  • Author_Institution
    Math. & Comput. Sci. Div., Argonne Nat. Lab., Argonne, IL, USA
  • fYear
    2009
  • fDate
    8-11 Dec. 2009
  • Firstpage
    586
  • Lastpage
    593
  • Abstract
    As researchers continue to architect massive-scale systems, it is becoming clear that these systems will utilize a significant amount of shared hardware between processing units. Systems such as the IBM Blue Gene (BG) and Cray XT have started utilizing flat (i.e., scalable) networks, which differ from switched fabrics in that they use a 3D torus or similar topology. This allows the network to grow only linearly with system scale, instead of the superlinear growth needed for full fat-tree switched topologies, but at the cost of increased network sharing between processing nodes. While in many cases a full fat-tree is an overestimate of the needed bisectional bandwidth, it is not clear whether the other extreme of a flat topology is sufficient to move data around the network efficiently. In this paper, we study the network behavior of the IBM BG/P using several application communication kernels, and we monitor network congestion behavior based on detailed hardware counters. Our studies scale from small systems to 8 racks (32,768 cores) of BG/P and provide insights into the network communication characteristics of the system.
  • Keywords
    Cray computers; computer network management; large-scale systems; parallel machines; telecommunication congestion control; telecommunication network topology; 3D torus; Cray XT; IBM Blue Gene; fat-tree switched topologies; large-scale Blue Gene/P systems; massive-scale systems; network congestion; network saturation behavior; network sharing; similar topology; Bandwidth; Communication switching; Costs; Counting circuits; Fabrics; Hardware; Kernel; Large-scale systems; Monitoring; Network topology; Blue Gene/P; Fat Tree; Petascale; Saturation; Torus;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel and Distributed Systems (ICPADS), 2009 15th International Conference on
  • Conference_Location
    Shenzhen
  • ISSN
    1521-9097
  • Print_ISBN
    978-1-4244-5788-5
  • Type
    conf
  • DOI
    10.1109/ICPADS.2009.117
  • Filename
    5395352