Title :
A distributed multiple-SIMD processor in memory
Author :
Rangan, Krishna Kumar ; Abu-Ghazaleh, Nael B. ; Wilsey, Philip A.
Author_Institution :
Experimental Comput. Lab., Cincinnati Univ., OH, USA
Abstract :
The integration of processing and DRAM offers a potential solution to the memory bottleneck problem. The bandwidth available within the chip is several orders of magnitude higher than that at the memory bus, and the access time is lower. As workloads shift towards data-intensive and multimedia applications, this wide bandwidth can be utilized effectively by harnessing the parallelism available in these applications. However, developing architectures and programming models that expose the available bandwidth to the application poses difficult challenges. This paper presents the design of an intelligent memory based on a distributed data-parallel architecture with limited support for control parallelism. We investigate some of the relevant design issues and evaluate the success of such an architecture in supporting data-intensive applications. The design is evaluated both as a stand-alone system and as a co-processor acting as a memory access filter. A cycle-accurate simulator is developed and used to study the performance of the architecture for data-intensive applications. The performance is compared against that of a modern superscalar processor.
Keywords :
distributed memory systems; memory architecture; parallel architectures; DRAM; cycle-accurate simulator; distributed data-parallel architecture; distributed multiple-SIMD processor; intelligent memory; memory access filter; memory bottleneck problem; programming models; stand-alone system; superscalar processor; Application software; Bandwidth; Computer architecture; Coprocessors; Laboratories; Logic programming; Microprocessors; Parallel processing; Plasma welding; Random access memory;
Conference_Title :
Parallel Processing, 2001. International Conference on
Conference_Location :
Valencia, Spain
Print_ISBN :
0-7695-1257-7
DOI :
10.1109/ICPP.2001.952098