Title :
Large-scale experiment of co-allocation strategies for Peer-to-Peer supercomputing in P2P-MPI
Author :
Genaud, Stéphane ; Rattanapoka, Choopan
Author_Institution :
AlGorille Team - LORIA, Vandoeuvre-les-Nancy
Abstract :
High-performance computing generally involves parallel applications deployed across multiple computing resources. The problem of scheduling an application across distributed resources is termed co-allocation. In a grid context, co-allocation is difficult because the grid middleware must cope with a dynamic environment. Middleware architectures built on a peer-to-peer (P2P) basis have been proposed to tackle most limitations of centralized systems. Some of the issues addressed by P2P systems are fault tolerance, ease of maintenance, and scalability in resource discovery. However, the lack of global knowledge makes scheduling difficult in P2P systems. In this paper, we present the new developments concerning locality awareness as well as the co-allocation strategies available in the latest release of P2P-MPI: i) the spread strategy tries to map processes onto hosts so as to maximize the total amount of available memory, while maintaining locality of processes as a secondary objective; ii) the concentrate strategy tries to maximize locality between processes by using as many cores as hosts offer. The co-allocation scheme has been devised to be simple for the user and to meet the main high-performance computing requirement, which is locality. Extensive experiments have been conducted on Grid5000 with up to 600 processes on 6 sites throughout France. Results show that we achieved the targeted goals under these real conditions.
Keywords :
grid computing; message passing; middleware; peer-to-peer computing; resource allocation; P2P-MPI; application scheduling; coallocation strategy; distributed resources; dynamic environment; fault tolerance; grid middleware; high performance computing; maintenance; middleware architecture; parallel applications; peer-to-peer supercomputing; scalability; Computer architecture; Concurrent computing; Floods; Grid computing; High performance computing; Large-scale systems; Middleware; Peer to peer computing; Processor scheduling; Resource management;
Conference_Title :
Parallel and Distributed Processing, 2008. IPDPS 2008. IEEE International Symposium on
Conference_Location :
Miami, FL
Print_ISBN :
978-1-4244-1693-6
Electronic_ISBN :
1530-2075
DOI :
10.1109/IPDPS.2008.4536212