• DocumentCode
    2958962
  • Title

    Optimizing Large-scale Graph Analysis on Multithreaded, Multicore Platforms

  • Author

    Cong, Guojing ; Makarychev, Konstantin

  • Author_Institution
    IBM TJ Watson Res. Center, Yorktown Heights, NY, USA
  • fYear
    2012
  • fDate
    21-25 May 2012
  • Firstpage
    414
  • Lastpage
    425
  • Abstract
    The erratic memory access pattern of graph algorithms makes them hard to optimize on cache-based architectures. While multithreading hides memory latency, it is unclear how hardware threads combined with caches affect the performance of typical graph workloads. As modern architectures strike different balances between caching and multithreading, it remains an open question whether the benefit of optimizing locality behavior outweighs the cost. We study parallel graph algorithms on two different multithreaded, multicore platforms: IBM Power7 and Sun Niagara2. Our experiments first demonstrate their performance advantage over prior architectures. We find, nonetheless, that the number of hardware threads on either platform is not sufficient to fully mask memory latency. Our cache-friendly scheduling of memory accesses improves performance by up to 2.6 times on Power7 and prior cache-based architectures, yet the same technique significantly degrades performance on Niagara2. Software prefetching, and manipulating the storage of the input to improve spatial locality, improve performance by up to 2.1 times and 1.3 times on both platforms. Our study reveals an interesting interplay between architecture and algorithm.
  • Keywords
    cache storage; graph theory; multi-threading; multiprocessing systems; optimisation; parallel memories; processor scheduling; IBM Power7; Sun Niagara2; cache-based architecture; cache-friendly scheduling; memory access pattern; memory accesses scheduling; memory latency; multicore platform; multithreading; optimisation; parallel graph algorithm; software prefetching; storage manipulation; Hardware; Multicore processing; Pipelines; Prefetching; Multi-threading; Parallel Graph Algorithms; Software Prefetch; Traversal;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel & Distributed Processing Symposium (IPDPS), 2012 IEEE 26th International
  • Conference_Location
    Shanghai
  • ISSN
    1530-2075
  • Print_ISBN
    978-1-4673-0975-2
  • Type

    conf

  • DOI
    10.1109/IPDPS.2012.46
  • Filename
    6267878