Title :
Predictive and Distributed Routing Balancing on High-Speed Cluster Networks
Author :
Castillo, Carlos Núñez ; Lugones, Diego ; Franco, Daniel ; Luque, Emilio
Author_Institution :
Comput. Archit. & Oper. Syst. Dept., Univ. Autonoma de Barcelona, Barcelona, Spain
Abstract :
In high-performance clusters, the communication needs of current parallel applications, such as traffic pattern and communication volume, change over time and are difficult to know in advance. Such needs often exceed or do not match the available resources, causing resource-use imbalance, network congestion, throughput reduction and increased message latency, thus degrading overall system performance. Studies on parallel applications show repetitive behavior that can be characterized by a set of representative phases. This work presents Predictive and Distributed Routing Balancing (PRDRB), a new method developed to gradually control network congestion, based on path expansion, traffic distribution, application pattern repetitiveness and speculative adaptive routing, in order to maintain low latency values. PRDRB monitors message latencies on routers and logs solutions to congestion so that it can respond quickly to similar situations in the future. Traffic congestion experiments were conducted to evaluate the performance of the method, and improvements were observed.
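The idea of logging solutions to congestion and reusing them when a similar situation recurs can be illustrated with a minimal sketch. This is not the authors' implementation: the `SolutionLog` class, the `choose_paths` function, the latency threshold, and the use of the set of contending flows as the key for a "similar situation" are all assumptions made for illustration. The sketch also stores an expanded path set as soon as it is computed, without checking that it actually relieved the congestion.

```python
from typing import Callable, Dict, FrozenSet, Iterable, List, Optional, Tuple

Flow = Tuple[int, int]   # (source node, destination node); hypothetical representation
Path = List[int]         # sequence of node ids


class SolutionLog:
    """Hypothetical per-router log: congestion pattern -> paths used to relieve it."""

    def __init__(self, latency_threshold: float) -> None:
        self.latency_threshold = latency_threshold
        self._solutions: Dict[FrozenSet[Flow], List[Path]] = {}

    def is_congested(self, latency: float) -> bool:
        # Assumption: congestion is flagged when latency exceeds a fixed threshold.
        return latency > self.latency_threshold

    def record(self, flows: Iterable[Flow], paths: List[Path]) -> None:
        # The key ignores flow order, so the same contending flows always match.
        self._solutions[frozenset(flows)] = [list(p) for p in paths]

    def lookup(self, flows: Iterable[Flow]) -> Optional[List[Path]]:
        return self._solutions.get(frozenset(flows))


def choose_paths(
    log: SolutionLog,
    flows: Iterable[Flow],
    latency: float,
    default_path: Path,
    expand: Callable[[Path], List[Path]],
) -> List[Path]:
    """Return the paths to distribute traffic over for the observed latency."""
    flows = list(flows)
    if not log.is_congested(latency):
        return [default_path]          # no congestion: keep the original path
    known = log.lookup(flows)
    if known is not None:
        return known                   # seen before: reuse the logged solution
    paths = expand(default_path)       # new situation: expand to alternative paths
    log.record(flows, paths)
    return paths
```

In this sketch, `expand` stands in for whatever path-expansion procedure the router uses; the point is only that a repeated congestion pattern skips the expansion step and is answered directly from the log.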
Keywords :
parallel processing; telecommunication congestion control; telecommunication network routing; telecommunication traffic; workstation clusters; applications pattern repetitiveness; communication volume; control network congestion; distributed routing balancing; high-speed cluster network; message latency; parallel application communication; paths expansion; predictive routing balancing; speculative adaptive routing; throughput reduction; traffic congestion experiment; traffic distribution; traffic pattern; Databases; Heuristic algorithms; Monitoring; Multiprocessor interconnection; Network topology; Routing; Topology; Interconnection networks; application aware routing; parallel applications; predictive routing;
Conference_Title :
Computer Architecture and High Performance Computing (SBAC-PAD), 2011 23rd International Symposium on
Conference_Location :
Vitoria, Espirito Santo
Print_ISBN :
978-1-4577-2050-5
DOI :
10.1109/SBAC-PAD.2011.27