DocumentCode :
2958962
Title :
Optimizing Large-scale Graph Analysis on Multithreaded, Multicore Platforms
Author :
Cong, Guojing ; Makarychev, Konstantin
Author_Institution :
IBM TJ Watson Res. Center, Yorktown Heights, NY, USA
fYear :
2012
fDate :
21-25 May 2012
Firstpage :
414
Lastpage :
425
Abstract :
The erratic memory access patterns of graph algorithms make them hard to optimize on cache-based architectures. While multithreading hides memory latency, it is unclear how hardware threads combined with caches affect the performance of typical graph workloads. As modern architectures strike different balances between caching and multithreading, it remains an open question whether the benefit of optimizing locality behavior outweighs the cost. We study parallel graph algorithms on two different multithreaded, multicore platforms: IBM Power7 and Sun Niagara2. Our experiments first demonstrate their performance advantage over prior architectures. We find, nonetheless, that the number of hardware threads on either platform is not sufficient to fully mask memory latency. Our cache-friendly scheduling of memory accesses improves performance by up to 2.6 times on Power7 and prior cache-based architectures, yet the same technique significantly degrades performance on Niagara2. Software prefetching and manipulating the storage of the input to improve spatial locality improve performance by up to 2.1 times and 1.3 times on both platforms. Our study reveals an interesting interplay between architecture and algorithm.
Keywords :
cache storage; graph theory; multi-threading; multiprocessing systems; optimisation; parallel memories; processor scheduling; IBM Power7; Sun Niagara2; cache-based architecture; cache-friendly scheduling; memory access pattern; memory accesses scheduling; memory latency; multicore platform; multithreading; optimisation; parallel graph algorithm; software prefetching; storage manipulation; Hardware; Multicore processing; Pipelines; Prefetching; Multi-threading; Parallel Graph Algorithms; Software Prefetch; Traversal;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Parallel & Distributed Processing Symposium (IPDPS), 2012 IEEE 26th International
Conference_Location :
Shanghai
ISSN :
1530-2075
Print_ISBN :
978-1-4673-0975-2
Type :
conf
DOI :
10.1109/IPDPS.2012.46
Filename :
6267878