• DocumentCode
    2549245
  • Title

    Designing OS for HPC Applications: Scheduling

  • Author

    Gioiosa, Roberto ; McKee, Sally A. ; Valero, Mateo

  • Author_Institution
    Comput. Sci. Div., Barcelona Supercomput. Center, Barcelona, Spain
  • fYear
    2010
  • fDate
    20-24 Sept. 2010
  • Firstpage
    78
  • Lastpage
    87
  • Abstract
    Operating systems have historically been implemented as independent layers between hardware and applications. User programs communicate with the OS through a set of well-defined system calls and do not have direct access to the hardware. The OS, in turn, communicates with the underlying architecture via control registers. Except for these interfaces, the three layers are practically oblivious to each other. While this structure improves portability and transparency, it may not deliver optimal performance. This is especially true for High Performance Computing (HPC) systems, where modern parallel applications and multi-core architectures pose new challenges in terms of performance, power consumption, and system utilization. The hardware, the OS, and the applications can no longer remain isolated and should instead cooperate to deliver high performance with minimal power consumption. In this paper, we present our experience with the design and implementation of High Performance Linux (HPL), an operating system designed to optimize the performance of HPC applications running on a state-of-the-art compute cluster. We show how characterizing parallel applications through hardware and software performance counters drives the design of the OS and how including knowledge about the architecture improves performance and efficiency. We perform experiments on a dual-socket IBM POWER6 machine, showing performance improvements and stability (performance variation of 2.11% on average) for NAS, a widely used parallel benchmark suite.
  • Keywords
    Linux; performance evaluation; processor scheduling; HPC application; dual socket IBM POWER6; high performance Linux; high performance computing; multicore architecture; multiprocessor scheduling; operating system; performance optimization; power consumption; Computer architecture; Hardware; Kernel; Load management; Noise; Real time systems; Multiprocessor Systems; Operating system kernels; Performance; Scheduling; Supercomputers
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Cluster Computing (CLUSTER), 2010 IEEE International Conference on
  • Conference_Location
    Heraklion, Crete
  • Print_ISBN
    978-1-4244-8373-0
  • Electronic_ISBN
    978-0-7695-4220-1
  • Type

    conf

  • DOI
    10.1109/CLUSTER.2010.16
  • Filename
    5600318