DocumentCode
2254872
Title
DLWAP-buffer: A Novel HW/SW Architecture to Alleviate the Cache Coherence on Streaming-like Data in CMP
Author
Huang, Xiaoping ; Fan, Xiaoya ; Zhang, Shengbing ; Chen, Yuhui
Author_Institution
Sch. of Comput. Sci., Northwestern Polytech. Univ., Xi'an, China
fYear
2012
fDate
20-22 Sept. 2012
Firstpage
23
Lastpage
28
Abstract
In a shared-memory Chip Multiprocessor (CMP), data shared between different cores must be exchanged through the last-level shared cache, and cache coherence must be maintained at the same time. As the number of cores increases, the cache coherence wall becomes increasingly serious. For multimedia applications dominated by streaming-like data, existing multicore cache coherence protocols perform poorly and cannot meet timeliness requirements. In this paper, considering the poor temporal locality and strict real-time characteristics of multimedia data, we propose the distributed light-weight active-push buffer (DLWAP-buffer) architecture to alleviate the cache coherence latency on streaming-like data in CMP. The architecture introduces a dedicated shared-data exchange channel between adjacent cores. The channel bridges the internal register files and reduces the shared-data communication latency. Supported by the control protocol, the architecture can adaptively balance the rate mismatch in the producer-consumer pipeline model. We build a quad-core CMP simulation platform with the DLWAP-buffers. Our experiment indicates that, compared with the last-level-shared-cache method, the architecture improves average performance by 13% and reduces the snooping operations caused by maintaining cache coherence by 26%.
Keywords
cache storage; media streaming; memory architecture; shared memory systems; CMP; DLWAP-buffer; HW-SW architecture; cache coherence latency; dedicated shared-data exchange channel; distributed light-weight active-push buffer architecture; internal register files; last-level-shared-cache method; multicore cache coherence protocols; multimedia applications; producer-consumer pipeline model; quad-core CMP simulation platform; shared-data communication latency reduction; shared-memory chip multiprocessor; streaming-like data; Coherence; Computer architecture; Instruction sets; Multimedia communication; Pipelines; Protocols; Registers; DLWAP-buffer; adaptive; cache coherence; multicore; streaming-like data;
fLanguage
English
Publisher
IEEE
Conference_Titel
Embedded Multicore SoCs (MCSoC), 2012 IEEE 6th International Symposium on
Conference_Location
Aizu-Wakamatsu
Print_ISBN
978-1-4673-2535-6
Electronic_ISBN
978-0-7695-4800-5
Type
conf
DOI
10.1109/MCSoC.2012.19
Filename
6354674