Regional Information Center for Science and Technology (RICeST) - Combining loop transformations considering caches and scheduling

DocumentCode :

1573266

Title :

Combining loop transformations considering caches and scheduling

Author :

Wolf, Michael E. ; Maydan, Dror E. ; Chen, Ding-Kai

Author_Institution :

Silicon Graphics Comput. Syst., Mountain View, CA, USA

Year :

1996

Firstpage :

274

Lastpage :

286

Abstract :

The performance of modern microprocessors is greatly affected by cache behavior, instruction scheduling, register allocation and loop overhead. High-level loop transformations such as fission, fusion, tiling, interchanging and outer loop unrolling (e.g., unroll and jam) are well known to be capable of improving all these aspects of performance. Difficulties arise because these machine characteristics and these optimizations are highly interdependent. Interchanging two loops might, for example, improve cache behavior but make it impossible to allocate registers in the inner loop. Similarly, unrolling or interchanging a loop might individually hurt performance, but doing both simultaneously might help performance. Little work has been published on how to combine these transformations into an efficient and effective compiler algorithm. In this paper we present a model that estimates total machine cycle time, taking into account cache misses, software pipelining, register pressure and loop overhead. We then develop an algorithm to intelligently search through the various possible transformations, using our machine model to select the set of transformations leading to the best overall performance. We have implemented this algorithm as part of the MIPSPro commercial compiler system. We give experimental results showing that our approach is both effective and efficient in optimizing numerical programs.
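
For readers unfamiliar with the "unroll and jam" transformation named in the abstract, the following is a minimal illustrative sketch and is not taken from the paper. It assumes a simple matrix-vector product as the loop nest; the function names `matvec_baseline` and `matvec_unroll_and_jam` and the unroll factor of 2 are hypothetical choices made for this example. Python is used only to show that the transformed nest computes the same result; the performance effects the abstract discusses (register reuse, scheduling, loop overhead) would arise in compiled code, not here.

```python
def matvec_baseline(a, x):
    """y[i] = sum_j a[i][j] * x[j], written as a plain two-deep loop nest."""
    n, m = len(a), len(x)
    y = [0.0] * n
    for i in range(n):
        for j in range(m):
            y[i] += a[i][j] * x[j]
    return y


def matvec_unroll_and_jam(a, x):
    """Same computation with the outer i loop unrolled by 2 and the two
    resulting copies of the inner j loop fused ("jammed") into one."""
    n, m = len(a), len(x)
    y = [0.0] * n
    i = 0
    while i + 1 < n:
        s0 = 0.0
        s1 = 0.0
        for j in range(m):
            xj = x[j]              # read once, reused for two rows
            s0 += a[i][j] * xj
            s1 += a[i + 1][j] * xj
        y[i] = s0
        y[i + 1] = s1
        i += 2
    if i < n:                      # remainder row when n is odd
        s = 0.0
        for j in range(m):
            s += a[i][j] * x[j]
        y[i] = s
    return y
```

In the jammed version each `x[j]` is read once per pair of rows, which hints at why such a transformation can reduce memory traffic while also raising register pressure, the kind of trade-off the abstract says must be weighed jointly.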

Keywords :

cache storage; performance evaluation; pipeline processing; scheduling; MIPSPro commercial compiler; cache behavior; cache misses; cache tiling; caches; fission; fusion; instruction scheduling; loop interchange; loop overhead; loop transformations; machine model; microprocessors; outer loop unrolling; overall performance; performance; register allocation; register pressure; scheduling; software pipelining; Algorithms; Graphics; Machine intelligence; Microprocessors; Modems; Pipeline processing; Processor scheduling; Silicon;

Language :

English

Publisher :

IEEE

Conference_Title :

Microarchitecture, 1996. MICRO-29. Proceedings of the 29th Annual IEEE/ACM International Symposium on

Conference_Location :

Paris

Print_ISBN :

0-8186-7641-8

Type :

conf

DOI :

10.1109/MICRO.1996.566468

Filename :

566468

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1573266