DocumentCode
2236006
Title
Exploiting Narrow Accelerators with Data-Centric Subgraph Mapping
Author
Hormati, Amir ; Clark, Nathan ; Mahlke, Scott
Author_Institution
Adv. Comput. Archit. Lab., Michigan Univ., Ann Arbor, MI
fYear
2007
fDate
11-14 March 2007
Firstpage
341
Lastpage
353
Abstract
The demand for high performance has driven acyclic computation accelerators into extensive use in modern embedded and desktop architectures. Accelerators that are ideal from a software perspective are difficult or impossible to integrate in many modern architectures, though, due to area and timing requirements. This reality is coupled with the observation that many application domains under-utilize accelerator hardware because of the narrow data they operate on and the nature of their computation. In this work, we take advantage of these facts to design accelerators capable of executing in modern architectures by narrowing datapath width and reducing interconnect. Novel compiler techniques are developed to generate high-quality code for the reduced-cost accelerators and prevent performance loss to the extent possible. First, data width profiling is used to statistically determine how wide program data will be at run time. This information is used by the subgraph mapping algorithm to optimally select subgraphs for execution on targeted narrow accelerators. Overall, our data-centric compilation techniques achieve on average 6.5%, and up to 12%, speedup over previous subgraph mapping algorithms for 8-bit accelerators. We also show that, with appropriate compiler support, the increase in the total number of execution cycles in reduced-interconnect accelerators is less than 1% relative to the fully-connected accelerator.
Keywords
embedded systems; microcomputers; acyclic computation accelerators; data-centric subgraph mapping; desktop architectures; embedded architectures; narrow accelerators; software perspective; Application specific integrated circuits; Computational efficiency; Computer architecture; Electronic mail; Embedded computing; Hardware; High performance computing; Integrated circuit interconnections; Logic; Process design
fLanguage
English
Publisher
IEEE
Conference_Titel
Code Generation and Optimization, 2007. CGO '07. International Symposium on
Conference_Location
San Jose, CA
Print_ISBN
0-7695-2764-7
Type
conf
DOI
10.1109/CGO.2007.11
Filename
4145126