• DocumentCode
    1684440
  • Title

    Lattice Boltzmann simulation optimization on leading multicore platforms

  • Author

    Williams, Samuel ; Carter, Jonathan ; Oliker, Leonid ; Shalf, John ; Yelick, Katherine

  • Author_Institution
    CRD/NERSC, Lawrence Berkeley Nat. Lab., Berkeley, CA
  • fYear
    2008
  • Firstpage
    1
  • Lastpage
    14
  • Abstract
    We present an auto-tuning approach to optimize application performance on emerging multicore architectures. The methodology extends the idea of search-based performance optimizations, popular in linear algebra and FFT libraries, to application-specific computational kernels. Our work applies this strategy to a lattice Boltzmann application (LBMHD) that historically has made poor use of scalar microprocessors due to its complex data structures and memory access patterns. We explore one of the broadest sets of multicore architectures in the HPC literature, including the Intel Clovertown, AMD Opteron X2, Sun Niagara2, and STI Cell, as well as the single-core Intel Itanium2. Rather than hand-tuning LBMHD for each system, we develop a code generator that allows us to identify a highly optimized version for each platform, while amortizing the human programming effort. Results show that our auto-tuned LBMHD application achieves up to a 14× improvement compared with the original code. Additionally, we present a detailed analysis of each optimization, which reveals surprising hardware bottlenecks and software challenges for future multicore systems and applications.
  • Keywords
    digital simulation; lattice Boltzmann methods; optimisation; parallel memories; physics computing; Lattice Boltzmann simulation optimization; auto-tuning approach; complex data structures; memory access patterns; multicore platforms; Application software; Computational modeling; Computer applications; Computer architecture; Kernel; Lattice Boltzmann methods; Libraries; Linear algebra; Multicore processing; Optimization
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel and Distributed Processing, 2008. IPDPS 2008. IEEE International Symposium on
  • Conference_Location
    Miami, FL
  • ISSN
    1530-2075
  • Print_ISBN
    978-1-4244-1693-6
  • Electronic_ISBN
    1530-2075
  • Type

    conf

  • DOI
    10.1109/IPDPS.2008.4536295
  • Filename
    4536295