• DocumentCode
    580120
  • Title
    Dynamic task scheduling for linear algebra algorithms on distributed-memory multicore systems
  • Author
    Song, Fengguang; YarKhan, Asim; Dongarra, Jack

  • Author_Institution
    EECS Dept., Univ. of Tennessee, Knoxville, TN, USA
  • fYear
    2009
  • fDate
    14-20 Nov. 2009
  • Firstpage
    1
  • Lastpage
    11
  • Abstract
    This paper presents a dynamic task scheduling approach to executing dense linear algebra algorithms on multicore systems (either shared-memory or distributed-memory). We use a task-based library to replace existing linear algebra subroutines such as PBLAS, transparently providing the same interface and computational function as the ScaLAPACK library. Linear algebra programs are written with the task-based library and executed by a dynamic runtime system. The design of the runtime system focuses mainly on performance scalability. We propose a distributed algorithm that resolves data dependences without process cooperation. We have implemented the runtime system and applied it to three linear algebra algorithms: Cholesky, LU, and QR factorizations. Our experiments on both shared-memory machines (16 and 32 cores) and distributed-memory machines (1024 cores) demonstrate that the runtime system achieves good scalability. Furthermore, we provide an analysis that shows why the tiled algorithms are scalable and gives their expected execution time.
  • Keywords
    distributed algorithms; distributed memory systems; linear algebra; mathematics computing; matrix decomposition; processor scheduling; shared memory systems; software libraries; Cholesky factorization; LU factorization; PBLAS; QR factorization; ScaLAPACK library; analytical analysis; computational function; data dependency; dense linear algebra algorithms; distributed algorithm; distributed-memory machines; distributed-memory multicore systems; distributed-memory system; dynamic runtime system; dynamic task scheduling; linear algebra programs; linear algebra subroutines; performance scalability; process cooperation; runtime system design; shared-memory machines; shared-memory system; task-based library; tiled algorithms;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    High Performance Computing Networking, Storage and Analysis, Proceedings of the Conference on
  • Conference_Location
    Portland, OR
  • Type
    conf
  • DOI
    10.1145/1654059.1654079
  • Filename
    6375569