Title :
Reducing design complexity of the load/store queue
Author :
Park, Il ; Ooi, Chong Liang ; Vijaykumar, T.N.
Author_Institution :
Sch. of Electr. & Comput. Eng., Purdue Univ., West Lafayette, IN, USA
Abstract :
With faster CPU clocks and wider pipelines, all relevant microarchitecture components should scale accordingly. There have been many proposals for scaling the issue queue, register file, and cache hierarchy. However, nothing has been done to scale the load/store queue, despite the increasing pressure on it in terms of capacity and search bandwidth. The load/store queue is a content-addressable memory (CAM) structure that holds in-flight memory instructions and supports simultaneous searches to honor memory dependencies and memory consistency models. It is therefore difficult to scale. In this study, we introduce novel techniques to scale the load/store queue. We propose two techniques, a store-load pair predictor and a load buffer, to reduce the search bandwidth requirement, and one technique, segmentation, to scale the size. We show that a load/store queue using our predictor and load buffer needs only one port to outperform a conventional two-ported load/store queue. Compared to the same base case, segmentation alone achieves speedups of 5% for integer benchmarks and 19% for floating-point benchmarks. A one-ported load/store queue using all of our techniques improves performance on average by 6% and 23%, and up to 15% and 59%, for integer and floating-point benchmarks, respectively, over a two-ported conventional load/store queue.
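The associative search that the abstract describes can be illustrated with a small software model. This is only a minimal sketch of a conventional store-queue lookup (store-to-load forwarding), not the paper's predictor, load buffer, or segmentation scheme; the names `StoreEntry`, `seq`, and `forward_from_stores` are hypothetical and chosen for illustration. Hardware would compare every entry in parallel, which is what makes each extra search port costly.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StoreEntry:
    seq: int    # program-order sequence number (hypothetical field)
    addr: int   # resolved memory address of the store
    value: int  # data waiting to be written to memory


def forward_from_stores(
    store_queue: List[StoreEntry], load_seq: int, load_addr: int
) -> Optional[int]:
    """Return the value of the youngest store that is older than the load
    and writes the same address, or None if no such store is in flight."""
    best: Optional[StoreEntry] = None
    # A CAM performs these comparisons in parallel; the loop models them serially.
    for entry in store_queue:
        if entry.seq < load_seq and entry.addr == load_addr:
            if best is None or entry.seq > best.seq:
                best = entry
    return best.value if best is not None else None
```

In this model, a load with sequence number 4 to address 0x100 would take its value from the matching store with the highest sequence number below 4, ignoring younger stores to the same address.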
Keywords :
benchmark testing; cache storage; parallel architectures; pipeline processing; queueing theory; CAM structure; CPU clocks; cache hierarchy; capacity bandwidth; design complexity; floating point benchmarks; in-flight memory instructions; issue queue; load buffer; load-store queue; microarchitectures; out-of-order microprocessor; pipelines; register file; search bandwidth; store-load pair predictor; Bandwidth; Buffer storage; CADCAM; Clocks; Computer aided manufacturing; Microarchitecture; Microprocessors; Out of order; Pipelines; Registers;
Conference_Title :
Microarchitecture, 2003. MICRO-36. Proceedings. 36th Annual IEEE/ACM International Symposium on
Print_ISBN :
0-7695-2043-X
DOI :
10.1109/MICRO.2003.1253245