Title :
Low-Latency Collectives for the Intel SCC
Author :
Kohler, Adan ; Radetzki, Martin ; Gschwandtner, Philipp ; Fahringer, Thomas
Author_Institution :
Inst. of Comput. Archit. & Comput. Eng., Univ. of Stuttgart, Stuttgart, Germany
Abstract :
Message passing has been adopted as the main programming paradigm for many-core processors with on-chip networks for inter-core communication. To this end, message-passing libraries such as MPI can be used, as they provide well-known interfaces to application developers. Since MPI implementations were originally developed for macroscopic computer networks, the different characteristics of on-chip networks may require rethinking existing solutions. Using Allreduce as an example, we identify points where collective operations benefit from routines optimized for on-chip networks. The resulting optimizations are then applied to additional collectives, including Broadcast, Allgather, and Alltoall. The effectiveness of the proposed optimizations is demonstrated on the Single-Chip Cloud Computer (SCC), a many-core research chip created by Intel Labs. Experiments show that collective operations subjected to the identified optimizations are accelerated by factors of roughly 2 to 3 compared to current state-of-the-art implementations. In addition to synthetic benchmarks, we show that the use of the optimized routines accelerates a scientific application by more than 40%.
Keywords :
cloud computing; message passing; multiprocessing systems; network-on-chip; software libraries; Allgather; Allreduce; Alltoall; Broadcast; Intel Labs; Intel SCC; MPI implementations; SCC; art implementations; intercore communication; low-latency collectives; macroscopic computer networks; many-core processors; message-passing libraries; on-chip networks; programming paradigm; scientific application; single-chip cloud computer; synthetic benchmarks; Computer architecture; Libraries; Optimization; Program processors; Synchronization; System-on-a-chip; Vectors; Collective operations; MPI; Many-core processors;
Conference_Title :
Cluster Computing (CLUSTER), 2012 IEEE International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4673-2422-9
DOI :
10.1109/CLUSTER.2012.58