• DocumentCode
    1914775
  • Title

    A Task Parallel Implementation of Fast Multipole Methods

  • Author

    Taura, Koichi ; Nakashima, Jun ; Yokota, Rio ; Maruyama, Naoya

  • Author_Institution
    Grad. Sch. of Inf. Sci. & Technol., Univ. of Tokyo, Tokyo, Japan
  • fYear
    2012
  • fDate
    10-16 Nov. 2012
  • Firstpage
    617
  • Lastpage
    625
  • Abstract
    This paper describes a task parallel implementation of ExaFMM, an open source implementation of fast multipole methods (FMM), using the lightweight task parallel library MassiveThreads. Although there have been many attempts at parallelizing FMM, experience has been limited almost exclusively to formulations based on flat homogeneous parallel loops. FMM in fact contains operations that cannot be readily expressed in such conventional but restrictive models. We show that task parallelism, or parallel recursions in particular, allows us to parallelize all operations of FMM naturally and scalably. Moreover, it allows us to parallelize a “mutual interaction” for force/potential evaluation, which is roughly twice as efficient as a more conventional, unidirectional force/potential evaluation. The net result is an open source FMM that is clearly among the fastest single-node implementations, including those on GPUs; with a million particles on a 32-core 2.20 GHz Sandy Bridge node, it completes a single time step, including tree construction and force/potential evaluation, in 65 milliseconds. The study clearly showcases both the programmability and the performance benefits of flexible parallel constructs over more monolithic parallel loops.
  • Keywords
    multi-threading; public domain software; software libraries; tree data structures; ExaFMM; GPU; MassiveThreads task parallel library; Sandy Bridge node; fast multipole methods; flat-homogeneous parallel loops; flexible-parallel constructs; force-potential evaluation; mutual interaction parallelization; open source FMM parallelization; open source implementation; parallel recursions; task parallel implementation; tree construction; FMM; MassiveThreads; divide and conquer; task parallelism
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    High Performance Computing, Networking, Storage and Analysis (SCC), 2012 SC Companion
  • Conference_Location
    Salt Lake City, UT
  • Print_ISBN
    978-1-4673-6218-4
  • Type

    conf

  • DOI
    10.1109/SC.Companion.2012.86
  • Filename
    6495868
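
The abstract's claim that a "mutual interaction" is roughly twice as efficient as a unidirectional evaluation can be illustrated with a minimal direct-summation sketch. This is not code from ExaFMM or MassiveThreads; the function names, the 1/r potential kernel, and the serial loops are assumptions made purely for illustration. The unidirectional version evaluates the kernel once per ordered pair, while the mutual version evaluates it once per unordered pair and updates both particles.

```python
import math

def unidirectional_potential(pos, charge):
    """Each target i sums over all sources j: n*(n-1) kernel evaluations."""
    n = len(pos)
    phi = [0.0] * n
    evals = 0
    for i in range(n):
        for j in range(n):
            if i != j:
                phi[i] += charge[j] / math.dist(pos[i], pos[j])
                evals += 1
    return phi, evals

def mutual_potential(pos, charge):
    """Each unordered pair is visited once and both ends are updated:
    n*(n-1)/2 kernel evaluations."""
    n = len(pos)
    phi = [0.0] * n
    evals = 0
    for i in range(n):
        for j in range(i + 1, n):
            inv_r = 1.0 / math.dist(pos[i], pos[j])
            phi[i] += charge[j] * inv_r  # effect of j on i
            phi[j] += charge[i] * inv_r  # effect of i on j, reusing inv_r
            evals += 1
    return phi, evals

# Hypothetical sample data: four particles with distinct charges.
positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
charges = [1.0, 2.0, 3.0, 4.0]

phi_uni, evals_uni = unidirectional_potential(positions, charges)
phi_mut, evals_mut = mutual_potential(positions, charges)
```

Because the mutual version writes to both `phi[i]` and `phi[j]`, running its iterations concurrently in a flat parallel loop would cause conflicting updates unless extra synchronization is added. That is one reason such a formulation is harder to express with homogeneous parallel loops.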