Regional Information Center for Science and Technology - Optimizing Large-scale Graph Analysis on Multithreaded, Multicore Platforms

DocumentCode :

2958962

Title :

Optimizing Large-scale Graph Analysis on Multithreaded, Multicore Platforms

Author :

Cong, Guojing ; Makarychev, Konstantin

Author_Institution :

IBM TJ Watson Res. Center, Yorktown Heights, NY, USA

fYear :

2012

fDate :

21-25 May 2012

Firstpage :

414

Lastpage :

425

Abstract :

The erratic memory access pattern of graph algorithms makes them hard to optimize on cache-based architectures. While multithreading hides memory latency, it is unclear how hardware threads combined with caches impact the performance of typical graph workloads. As modern architectures strike different balances between caching and multithreading, it remains an open question whether the benefit of optimizing locality behavior outweighs the cost. We study parallel graph algorithms on two different multi-threaded, multi-core platforms, namely IBM Power7 and Sun Niagara2. Our experiments first demonstrate their performance advantage over prior architectures. We nonetheless find that the number of hardware threads on either platform is not sufficient to fully mask memory latency. Our cache-friendly scheduling of memory accesses improves performance by up to 2.6 times on Power7 and prior cache-based architectures, yet the same technique significantly degrades performance on Niagara2. Software prefetching, and manipulating the storage of the input to improve spatial locality, improve performance by up to 2.1 times and 1.3 times on both platforms. Our study reveals an interesting interplay between architecture and algorithm.

Keywords :

cache storage; graph theory; multi-threading; multiprocessing systems; optimisation; parallel memories; processor scheduling; IBM Power7; Sun Niagara2; cache-based architecture; cache-friendly scheduling; memory access pattern; memory accesses scheduling; memory latency; multicore platform; multithreading; optimisation; parallel graph algorithm; software prefetching; storage manipulation; Hardware; Multicore processing; Pipelines; Prefetching; Multi-threading; Parallel Graph Algorithms; Software Prefetch; Traversal;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Parallel & Distributed Processing Symposium (IPDPS), 2012 IEEE 26th International

Conference_Location :

Shanghai

ISSN :

1530-2075

Print_ISBN :

978-1-4673-0975-2

Type :

conf

DOI :

10.1109/IPDPS.2012.46

Filename :

6267878

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2958962