DocumentCode
2263453
Title
PromisQoS: an architecture for delivering QoS to high-performance applications on Myrinet clusters
Author
Neelamegam, Jothi P. ; Chakravarthi, Srigurunath ; Apte, Manoj ; Skjellum, Anthony
Author_Institution
Dept. of Comput. Sci. & Eng., Mississippi State Univ., MS, USA
fYear
2003
fDate
20-24 Oct. 2003
Firstpage
510
Lastpage
517
Abstract
Clusters of workstations are being extensively used for solving computationally intensive scientific problems. However, there is limited support for quality of service (QoS) based distributed computing on commercial off-the-shelf (COTS) clusters. This limitation has restricted successful deployment of distributed real-time high-performance computing applications to customized and dedicated embedded multi-processor systems. This paper describes research work that attempts to provide a cluster platform that can guarantee distributed applications access to computational and communication resources. The authors have developed PromisQoS, an architecture that supports execution of hard real-time distributed applications on a Linux cluster while providing high-throughput and low-latency communication using Myrinet. PromisQoS consists of three major components: Hare, BDM-RT, and Turtle. Hare is a prototype implementation of time-based QoS channels specified by the real-time message passing interface (MPI/RT 1.1) standard. BDM-RT is a low-level messaging library that provides deterministic communication latency and bandwidth on Myrinet. Turtle, a variant of RT-Linux, is the real-time operating system that provides guaranteed computation time. This work demonstrates that it is possible to deploy hard real-time distributed applications on COTS clusters and underlines the significance of the MPI/RT API in the realm of distributed high-performance computing applications that require QoS.
Keywords
message passing; network operating systems; quality of service; workstation clusters; BDM-RT; Hare; Linux cluster; Myrinet clusters; QoS; Turtle; commercial off-the-shelf clusters; distributed computing; embedded multi-processor systems; quality of service; real-time message passing interface; real-time operating system; Computer applications; Computer architecture; Distributed computing; Embedded computing; Linux; Message passing; Prototypes; Quality of service; Real time systems; Workstations;
fLanguage
English
Publisher
IEEE
Conference_Titel
Local Computer Networks, 2003. LCN '03. Proceedings. 28th Annual IEEE International Conference on
ISSN
0742-1303
Print_ISBN
0-7695-2037-5
Type
conf
DOI
10.1109/LCN.2003.1243177
Filename
1243177