DocumentCode :
1814254
Title :
Dynamic Reconfigurable Shaders with Load Balancing for Embedded Graphics Processing
Author :
Chen, Yi-Chi ; Yang, Hui-Chin ; Chung, Chung-Ping ; Wang, Wei-Ting
Author_Institution :
Dept. of Comput. Sci., Nat. Chiao Tung Univ., Hsinchu, Taiwan
Volume :
2
fYear :
2009
fDate :
29-31 Aug. 2009
Firstpage :
31
Lastpage :
36
Abstract :
The increasing demand for application-specific processing on portable devices is driving designs toward highly efficient hardware. Many applications are streamlined, and the delays of all streamline stages should ideally be equalized. Unfortunately, the load at each stage of streamlined processing may vary depending on the input data, making load monitoring and balancing very desirable hardware features. The aim of this paper is to solve the load-imbalance problem of heterogeneous shading processors using a dynamically reconfigurable architecture. We begin with an arithmetic analysis of the computations required at the different processors, determining the basic building blocks for assorted functional units. Next, we design a reconfiguration controller capable of configuring a number of these blocks into the desired functional units. Then, we measure the load distribution and variations of the processors concerned, and design a load monitor that generates the stimulus for the reconfiguration controller. Finally, we combine these components into the proposed design. Vertex and pixel processors are used in this study because the number of vertices to be processed relative to the number of pixels varies noticeably over time. To evaluate this idea, a clock-cycle-based simulation using 3Dmark05 is conducted. Silicon area and processing cycles are measured in the experiments, and the results are compared against a design using the same number of (vertex + pixel) shaders under the best partition for the test workloads. The measured data show that a 60% improvement in speed and a 30% improvement in hardware utilization can be achieved. In addition, the same performance can be achieved using 27-35% less silicon area. The most valuable contribution goes beyond the area and time savings of the stages involved: with load monitoring and balancing, the delays of the different streamline stages can be well balanced, squeezing out the ultimate performance of the entire system.
Keywords :
coprocessors; microcontrollers; resource allocation; 3Dmark05; application specific processing; arithmetic analysis; clock cycle-based simulation; dynamic reconfigurable shaders; dynamically reconfigurable architecture; embedded graphics processing; heterogeneous shading processors; load balancing; load distribution measurement; load monitoring; load-imbalance problem; pixel processor; portable devices; reconfiguration controller; vertex processor; Aerodynamics; Arithmetic; Delay; Graphics; Hardware; Load management; Measurement units; Monitoring; Reconfigurable architectures; Silicon; graphic processing units; graphics hardware; load balancing; reconfigurable shader;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computational Science and Engineering, 2009. CSE '09. International Conference on
Conference_Location :
Vancouver, BC
Print_ISBN :
978-1-4244-5334-4
Electronic_ISBN :
978-0-7695-3823-5
Type :
conf
DOI :
10.1109/CSE.2009.470
Filename :
5283803