Optimization of GALS CMP architecture with DCT as case study

Author

Menon, Arjun S. ; Gini, J.R. ; Aishwarya, B. ; Balaji, C.C.G. ; Jaswanth, R. ; Krishnadas, Archana

Author_Institution

Dept. of Electron. & Commun., Amrita Vishwa Vidyapeetham, Coimbatore, India

Volume

3

fYear

2011

fDate

8-10 April 2011

Firstpage

330

Lastpage

333

Abstract

Globally Asynchronous Locally Synchronous (GALS) chip multiprocessors (CMPs), which use separate clocks for separate modules inside the chip, are well suited to number-crunching and processing jobs that must be done with limited energy. GALS applies clock and voltage scaling jointly in system sub-modules to achieve low energy consumption. One advantage is that different modules in the chip can run at different clock rates. GALS allows up to 25% energy savings in addition to clock and voltage scalability. However, a chronic drawback of GALS is the additional communication latency between the various clock domains. In addition, the component processors consume power even when they are idle. Latency is reduced to a great extent by implementing large inter-processor FIFO buffers. This work proposes to improve latency minimization by alternatively enhancing one processor domain to optimally manage latency and power wastage. Real-time system algorithms are used to manage inter-clock-domain communication. This technique can be used either to enhance the FIFO buffer technique, when area is not a constraint and only efficiency matters, or as a standalone manager that handles inter-clock-domain communication efficiently with reduced area and increased resource-handling capability. Tri-state buffers are used to switch the clock and supply voltage to individual clock domains. For performance evaluation, component processors that compute the DCT were implemented with FIFOs between the modules for inter-processor communication, which in turn reduces latency. Comparing the GALS chip enhanced with the latency management processor against the FIFO buffer chip revealed approximately a 15% increase in throughput and 40% energy savings, as compared to 10% and 25% for the latter.

Keywords

buffer storage; discrete cosine transforms; energy conservation; microprocessor chips; optimisation; parallel architectures; power aware computing; power consumption; real-time systems; CMP architecture; DCT; FIFO buffer; GALS; chip microprocessor; energy savings; globally asynchronous locally synchronous; inter clock domain communication; latency management processor; optimization; power consumption; real time system; Clocks; Discrete cosine transforms; Energy efficiency; Performance evaluation; Power demand; Program processors; Protocols; Discrete cosine transform (DCT); First in first out (FIFO); Globally asynchronous locally synchronous (GALS); chip multiprocessor; control processor; low power; multiple clock; tri state buffer

fLanguage

English

Publisher

IEEE

Conference_Titel

Electronics Computer Technology (ICECT), 2011 3rd International Conference on

Conference_Location

Kanyakumari

Print_ISBN

978-1-4244-8678-6

Electronic_ISBN

978-1-4244-8679-3

Type

conf

DOI

10.1109/ICECTECH.2011.5941766

Filename

5941766