DocumentCode :
124024
Title :
A scalable, high-performance customized priority queue
Author :
Muhuan Huang ; Lim, Khan ; Cong, J.
Author_Institution :
Comput. Sci. Dept., Univ. of California, Los Angeles, Los Angeles, CA, USA
fYear :
2014
fDate :
2-4 Sept. 2014
Firstpage :
1
Lastpage :
4
Abstract :
Priority queues are abstract data structures where each element is associated with a priority, and the highest priority element is always retrieved first from the queue. The data structure is widely used within databases, including the last stage of a merge-sort, forecasting read-ahead I/O to stream data for the merge-sort, and replacement selection sort. Typical software implementations use a balanced binary tree-based structure, providing O(log N) time for both enqueue and dequeue operations. To improve the performance, we propose several scalable and high-speed FPGA-based implementations of a priority queue. Our insight is that the above listed applications primarily use priority queues through “replace” operations, which remove the highest priority element and place a new element into the queue. Thus, our designs are customized for this operation, allowing for a simple and scalable architecture. We implement three priority queue designs, including use of a register-based array, register-based tree, and BRAM-based tree, which have different benefits and trade-offs of throughput, frequency, and maximum size. More importantly, all designs achieve O(1) time between replace operations. To incorporate the best aspects of our designs, we propose a Hybrid Priority Queue (H-PQ), which combines a register-based array with multiple BRAM-based trees. This design provides, on average, very fast access times to the top items in the queue (through the register-based array), while scaling to large priority queue sizes (through the BRAM-based trees). In our evaluations, we find that H-PQ achieves 4.3x speedup and 21.5x energy efficiency, compared with the Xeon CPU implementations.
Keywords :
data structures; field programmable gate arrays; BRAM-based tree; FPGA-based implementation; Xeon CPU implementation; abstract data structures; dequeue operation; energy efficiency; enqueue operation; field programmable gate array; high-performance customized priority queue; merge-sort; merge-sort forecasting read-ahead input-output; priority element; queue size; register-based array; register-based tree; replace operation; replacement selection sort; Arrays; Clocks; Database systems; Field programmable gate arrays; Registers; Software; Throughput;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Field Programmable Logic and Applications (FPL), 2014 24th International Conference on
Conference_Location :
Munich
Type :
conf
DOI :
10.1109/FPL.2014.6927413
Filename :
6927413
Link To Document :
بازگشت