• DocumentCode
    3228653
  • Title

    Adaptive Strassen and ATLAS's DGEMM: a fast square-matrix multiply for modern high-performance systems

  • Author

    D'Alberto, Paolo ; Nicolau, Alexandru

  • Author_Institution
    Dept. of Electr. & Comput. Eng., Carnegie Mellon Univ., Pittsburgh, PA
  • fYear
    2005
  • fDate
    1-1 July 2005
  • Lastpage
    52
  • Abstract
    Strassen's algorithm has practical performance benefits for architectures with simple memory hierarchies, because it trades computationally expensive matrix multiplications (MM) for cheaper matrix additions (MA). However, it presents no advantages for high-performance architectures with deep memory hierarchies, because MAs exhibit limited data reuse. We present an easy-to-use adaptive algorithm combining Strassen's recursion and a highly tuned version of ATLAS MM. In fact, we introduce a last step in the ATLAS installation process that determines whether Strassen's algorithm can achieve any speedup. We present a recursive algorithm achieving up to 30% speedup versus ATLAS alone. We show experimental results for 14 different systems.
  • Keywords
    adaptive codes; mathematics computing; matrix multiplication; memory architecture; ATLAS DGEMM; adaptive Strassen algorithm; data reuse; deep memory hierarchy; high-performance systems; matrix additions; matrix multiplications; recursive algorithm; Adaptive coding; Computer architecture; Computer science; Equations; High performance computing; IEEE members; Kernel; Memory architecture; Personal digital assistants; Software packages
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    High-Performance Computing in Asia-Pacific Region, 2005. Proceedings. Eighth International Conference on
  • Conference_Location
    Beijing
  • Print_ISBN
    0-7695-2486-9
  • Type

    conf

  • DOI
    10.1109/HPCASIA.2005.18
  • Filename
    1592249
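
The idea summarized in the abstract, Strassen's recursion on large operands with a hand-off to a tuned multiply below some size, can be illustrated with a minimal sketch. This is not the authors' implementation: `base_mm` is a plain pure-Python stand-in for a tuned kernel such as ATLAS DGEMM, and the `cutoff` parameter is a hypothetical name for the size threshold that, per the abstract, would be determined at installation time. Odd-sized operands simply fall back to `base_mm`, which is a simplification made here.

```python
def mat_add(x, y, sign=1):
    """Element-wise x + sign*y for two equally sized list-of-lists matrices."""
    return [[p + sign * q for p, q in zip(r, s)] for r, s in zip(x, y)]


def base_mm(x, y):
    """Plain row-by-column multiply; a stand-in for a tuned kernel (assumption)."""
    cols = list(zip(*y))
    return [[sum(p * q for p, q in zip(row, col)) for col in cols] for row in x]


def split(m, h):
    """Return the four h-by-h quadrants of a 2h-by-2h matrix."""
    return ([r[:h] for r in m[:h]], [r[h:] for r in m[:h]],
            [r[:h] for r in m[h:]], [r[h:] for r in m[h:]])


def adaptive_strassen(a, b, cutoff=64):
    """Multiply square matrices a and b.

    Uses Strassen's seven-product recursion while the size is even and above
    `cutoff` (hypothetical threshold); otherwise calls base_mm directly.
    """
    n = len(a)
    if n <= cutoff or n % 2:
        return base_mm(a, b)
    h = n // 2
    a11, a12, a21, a22 = split(a, h)
    b11, b12, b21, b22 = split(b, h)

    def rec(x, y):
        return adaptive_strassen(x, y, cutoff)

    # Seven half-size products replace the eight of the classical scheme.
    m1 = rec(mat_add(a11, a22), mat_add(b11, b22))
    m2 = rec(mat_add(a21, a22), b11)
    m3 = rec(a11, mat_add(b12, b22, -1))
    m4 = rec(a22, mat_add(b21, b11, -1))
    m5 = rec(mat_add(a11, a12), b22)
    m6 = rec(mat_add(a21, a11, -1), mat_add(b11, b12))
    m7 = rec(mat_add(a12, a22, -1), mat_add(b21, b22))

    # Recombine the products into the four quadrants of the result.
    c11 = mat_add(mat_add(m1, m4), mat_add(m7, m5, -1))
    c12 = mat_add(m3, m5)
    c21 = mat_add(m2, m4)
    c22 = mat_add(mat_add(m1, m2, -1), mat_add(m3, m6))
    top = [r1 + r2 for r1, r2 in zip(c11, c12)]
    bottom = [r1 + r2 for r1, r2 in zip(c21, c22)]
    return top + bottom
```

Setting `cutoff` at or above the matrix size disables the recursion entirely, which corresponds to the case where Strassen's algorithm offers no speedup over the tuned multiply on a given system.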