Regional Information Center for Science and Technology

DocumentCode :

1681700

Title :

Build to order linear algebra kernels

Author :

Siek, Jeremy G. ; Karlin, Ian ; Jessup, E.R.

Author_Institution :

Dept. of Electr. & Comput. Eng., Univ. of Colorado, Denver, CO

fYear :

2008

Firstpage :

Lastpage :

Abstract :

The performance bottleneck for many scientific applications is the cost of memory access inside linear algebra kernels. Tuning such kernels for memory efficiency is a complex task that reduces the productivity of computational scientists. Software libraries such as the Basic Linear Algebra Subprograms (BLAS) ameliorate this problem by providing a standard interface for which computer scientists and hardware vendors have created highly-tuned implementations. Scientific applications often require a sequence of BLAS operations, which presents further opportunities for memory optimization. However, because BLAS are tuned in isolation, they do not take advantage of these opportunities. This phenomenon motivated the recent addition to the BLAS of several routines that perform sequences of operations. Unfortunately, the exact sequence of operations needed in a given situation is highly application dependent, so many more routines are needed. In this paper we present preliminary work on a domain-specific compiler that generates implementations for arbitrary sequences of basic linear algebra operations and tunes them for memory efficiency. We report experimental results for dense kernels and show speedups of 25% to 120% relative to sequences of calls to GotoBLAS and vendor-tuned BLAS on Intel Xeon and IBM PowerPC platforms.

Keywords :

linear algebra; mathematics computing; program compilers; software libraries; build to order linear algebra kernels; domain-specific compiler; linear algebra subprograms; memory access; memory efficiency; Application software; Computer interfaces; Computer science; Costs; Kernel; Optimizing compilers; Productivity; Read-write memory

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Parallel and Distributed Processing, 2008. IPDPS 2008. IEEE International Symposium on

Conference_Location :

Miami, FL

ISSN :

1530-2075

Print_ISBN :

978-1-4244-1693-6

Electronic_ISBN :

1530-2075

Type :

conf

DOI :

10.1109/IPDPS.2008.4536183

Filename :

4536183

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1681700