Regional Information Center for Science and Technology - A scalable, serially-equivalent, high-quality parallel placement methodology suitable for modern multicore and GPU architectures

DocumentCode :

124092

Title :

A scalable, serially-equivalent, high-quality parallel placement methodology suitable for modern multicore and GPU architectures

Author :

Fobel, Christian ; Grewal, Gary ; Stacey, Deborah

Author_Institution :

Sch. of Comput. Sci., Univ. of Guelph, Guelph, ON, Canada

fYear :

2014

fDate :

2-4 Sept. 2014

Firstpage :

Lastpage :

Abstract :

Placement and routing run-times continue to dominate the automated FPGA design flow. As the size of FPGA architectures continues to grow exponentially, it remains critical to develop parallel tools for FPGA design where the amount of exposed concurrent work scales with the size of the designs to be synthesized. In this paper, we propose a novel algorithm for parallel placement, based on simulated annealing, where the amount of parallel work directly scales with the size of the net-list to be placed. Our approach concurrently evaluates and conditionally applies very large sets of non-conflicting swaps using common parallel computing primitives, including stream compaction, category reduction, and sort. While our design is suitable for targeting all modern parallel computing platforms, we present results from our implementation, which targets NVIDIA's CUDA platform, where we achieve a mean speed-up of 19x over VPR with post-routing critical-path-delay and wire-length quality that matches or exceeds VPR. We believe that this work is an important step towards the development of a scalable, high-quality placement tool.
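
The idea of evaluating a set of non-conflicting swaps and conditionally applying them can be illustrated with a small serial sketch. This is not the authors' implementation: the wire-length cost (half-perimeter), the random pairing of blocks, the reading of "category reduction" as a reduce-by-key, and all function names (`hpwl`, `disjoint_swaps`, `reduce_by_key`, `compact`, `anneal_step`) are assumptions made for illustration. The sketch also evaluates each swap against the unmodified placement and ignores interactions between swaps that share a net, so it makes no claim of serial equivalence.

```python
import math
import random
from itertools import groupby


def hpwl(net, pos):
    """Half-perimeter wire-length of one net (assumed cost model)."""
    xs = [pos[b][0] for b in net]
    ys = [pos[b][1] for b in net]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def disjoint_swaps(blocks, rng):
    """Pair blocks so that no block appears in more than one swap."""
    order = blocks[:]
    rng.shuffle(order)
    return [(order[i], order[i + 1]) for i in range(0, len(order) - 1, 2)]


def reduce_by_key(pairs):
    """Sort (key, value) pairs by key, then sum the values under each key."""
    pairs = sorted(pairs, key=lambda kv: kv[0])
    return {k: sum(v for _, v in grp)
            for k, grp in groupby(pairs, key=lambda kv: kv[0])}


def compact(items, flags):
    """Stream compaction: keep only the items whose flag is true."""
    return [item for item, keep in zip(items, flags) if keep]


def anneal_step(pos, nets, temperature, rng):
    """Evaluate one set of disjoint swaps and apply the accepted ones."""
    blocks = sorted(pos)
    swaps = disjoint_swaps(blocks, rng)

    # Map each block to the indices of the nets it belongs to.
    nets_of = {b: [] for b in blocks}
    for n, net in enumerate(nets):
        for b in net:
            nets_of[b].append(n)

    # Emit one (swap id, per-net cost change) record per affected net.
    records = []
    for s, (a, b) in enumerate(swaps):
        trial = dict(pos)
        trial[a], trial[b] = pos[b], pos[a]
        for n in set(nets_of[a]) | set(nets_of[b]):
            records.append((s, hpwl(nets[n], trial) - hpwl(nets[n], pos)))

    # Reduce the per-net records to one cost change per swap.
    delta = reduce_by_key(records)

    # Simulated-annealing acceptance test for each swap.
    flags = []
    for s in range(len(swaps)):
        d = delta.get(s, 0)
        flags.append(d <= 0 or (temperature > 0
                                and rng.random() < math.exp(-d / temperature)))

    # Compact to the accepted swaps and apply them to a copy of the placement.
    accepted = compact(swaps, flags)
    new_pos = dict(pos)
    for a, b in accepted:
        new_pos[a], new_pos[b] = pos[b], pos[a]
    return new_pos, accepted
```

In this sketch every loop over swaps or records is independent per element, which is what would let the same steps map onto data-parallel primitives; for example, calling `anneal_step(pos, nets, 0.0, random.Random(0))` on a dictionary of block positions accepts only swaps whose individually evaluated cost change is not positive.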

Keywords :

field programmable gate arrays; graphics processing units; logic design; multiprocessing systems; network routing; parallel architectures; simulated annealing; sorting; FPGA architectures; GPU architectures; NVIDIA CUDA platform; automated FPGA design flow; category reduction; common parallel computing primitives; high-quality parallel placement methodology; modern multicore; net-list; nonconflicting swaps; parallel tools; parallel work; post-routing critical-path-delay; routing run-times; scalable parallel placement methodology; serially-equivalent parallel placement methodology; simulated annealing; sort; stream compaction; wire-length quality; Annealing; Computational modeling; Computer architecture; Educational institutions; Field programmable gate arrays; Indexes; Instruction sets;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Field Programmable Logic and Applications (FPL), 2014 24th International Conference on

Conference_Location :

Munich

Type :

conf

DOI :

10.1109/FPL.2014.6927481

Filename :

6927481

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=124092