DocumentCode :
2925232
Title :
Novel micro-threading techniques on the Cell Broadband Engine
Author :
Ahmed, Mohamed F. ; Ammar, Reda A. ; Rajasekaran, Sanguthevar
Author_Institution :
Dept. of Comput. Sci. & Eng., Univ. of Connecticut, Storrs, CT, USA
fYear :
2009
fDate :
5-8 July 2009
Firstpage :
570
Lastpage :
575
Abstract :
The Cell Broadband Engine (CBE) is a heterogeneous multi-core processor with unique design properties for high-performance computing. It consists of one power processing element (PPE) and eight synergistic processing elements (SPEs) connected by the Element Interconnect Bus (EIB). It employs some novel techniques, such as a software-managed cache, to hide memory latency and guarantees, by default, maximum utilization of the overall system resources. However, utilizing these facilities requires complex algorithm designs and implementations to get the best performance. In this paper we discuss our micro-threading model, realized by a nano-kernel implemented on top of each SPE. The SPE's nano-kernel, or SPENK, employs the micro-threading model to increase CBE resource utilization while simplifying the programming model. Our framework boosted the processor's overall performance by a factor of five compared to the current threading model. It allowed us to build a distributed model for SPE task management and automated local storage (LS) management. We tested our framework on two types of algorithms: (1) uniform memory access algorithms, such as parallel summation, and (2) non-uniform or irregular memory access algorithms, specifically the parallel tree spanning algorithm. We have also investigated the optimal parameterization of micro-threads on each SPE to automatically reach the best possible performance. Using proper parameterization of micro-threads, we could achieve a three- to fivefold performance improvement.
Keywords :
microprocessor chips; operating system kernels; parallel machines; resource allocation; storage management; CBE resource utilization; Cell Broadband Engine; SPENK; automated local storage management; distributed model; elements interconnect network; heterogeneous multicore processor; high-performance computing; irregular memory access algorithm; maximum overall system resource utilization; memory latency; microthreading technique; nanokernel; nonuniform memory access algorithm; parallel summation; parallel tree spanning algorithm; power processing element; software managed cache; synergistic processing element; task management; uniform memory access algorithm; Algorithm design and analysis; Delay; Engines; Memory management; Multicore processing; Power system interconnection; Power system management; Process design; Resource management; Storage automation;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computers and Communications, 2009. ISCC 2009. IEEE Symposium on
Conference_Location :
Sousse
ISSN :
1530-1346
Print_ISBN :
978-1-4244-4672-8
Electronic_ISBN :
1530-1346
Type :
conf
DOI :
10.1109/ISCC.2009.5202256
Filename :
5202256