Author :
Gaudiot, Jean-Luc
Author_Institution :
Dept. of Electr. Eng. & Comput. Sci., Univ. of California, Irvine, CA, USA
Abstract :
Summary form only given. The newly emerging many-core-on-a-chip designs have renewed intense interest in parallel processing. By applying Amdahl's formulation to the programs in the PARSEC and SPLASH-2 benchmark suites, we find that most applications may not have sufficient parallelism to utilize modern parallel machines efficiently. The long sequential portions of these application programs are caused by computation latency as well as communication latency. However, value prediction techniques may allow the "parallelization" of the sequential portion by predicting values before they are produced. In conventional superscalar architectures, computation latency dominates the sequential sections, so value prediction may be used to predict a computation result before it is produced. In many-core architectures, communication latency increases with the number of cores, so value prediction may be used to reduce both communication and computation latency. We extend these ideas by using GPUs to accelerate programs that contain limited parallelism and those that are hard to parallelize.
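As a rough illustration of the Amdahl's-law reasoning the abstract refers to, the sketch below computes the ideal speedup for a given parallel fraction and core count. The function names and any parallel fractions passed in are hypothetical and chosen for illustration only; they are not measurements from the PARSEC or SPLASH-2 study.

```python
def amdahl_speedup(parallel_fraction: float, n_cores: int) -> float:
    """Ideal speedup under Amdahl's law: 1 / ((1 - p) + p / n)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be in [0, 1]")
    if n_cores < 1:
        raise ValueError("n_cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_cores)


def speedup_limit(parallel_fraction: float) -> float:
    """Upper bound on speedup as the core count grows without bound: 1 / (1 - p)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be in [0, 1]")
    serial_fraction = 1.0 - parallel_fraction
    return float("inf") if serial_fraction == 0.0 else 1.0 / serial_fraction


if __name__ == "__main__":
    # Hypothetical parallel fractions, not values reported by the paper.
    for p in (0.50, 0.90, 0.99):
        for n in (4, 64, 1024):
            print(f"p={p:.2f} cores={n:5d} speedup={amdahl_speedup(p, n):8.2f}")
        print(f"p={p:.2f} limit={speedup_limit(p):.2f}")
```

Under this model the sequential fraction caps the achievable speedup no matter how many cores are added, which is why the abstract looks to techniques such as value prediction that shorten the sequential portion itself.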
Keywords :
application program interfaces; computer graphic equipment; coprocessors; integrated circuit design; parallel architectures; parallel machines; GPU; PARSEC benchmark suite; SPLASH-2 benchmark suite; application programs; communication latency; computation latency; many-core architectures; many-core-on-a-chip designs; parallel processing; superscalar architectures; value prediction techniques
Conference_Title :
Computational Science and Engineering (CSE), 2011 IEEE 14th International Conference on
Conference_Location :
Dalian
Print_ISBN :
978-1-4577-0974-6
DOI :
10.1109/CSE.2011.14