Title :
Generation of Heterogeneous Distributed Architectures for Memory-Intensive Applications Through High-Level Synthesis
Author :
Huang, Chao ; Ravi, Srivaths ; Raghunathan, Anand ; Jha, Niraj K.
Author_Institution :
Virginia Polytech. Inst. & State Univ., Blacksburg
Abstract :
Memory-intensive applications present unique challenges to an application-specific integrated circuit (ASIC) designer in terms of the choice of memory organization, memory size requirements, bandwidth, access latencies, etc. The high potential of single-chip distributed logic-memory architectures in addressing many of these issues has been recognized in general-purpose computing and, more recently, in ASIC design. The high-level synthesis (HLS) techniques presented in this paper are motivated by the fact that many memory-intensive applications exhibit irregular array data access patterns. Synthesis should, therefore, be capable of determining a partitioned architecture in which array data and computations may have to be heterogeneously distributed to achieve the best performance speed-up. We use a combination of clustering and min-cut style partitioning techniques to yield distributed architectures, based on simulation profiling, while considering various factors including data access locality, balanced workloads, inter-partition communication, etc. Our experiments with several benchmark applications show that the proposed techniques yield two-way partitioned architectures that achieve up to 2.1x (average of 1.9x) performance speed-up over conventional HLS solutions and up to 1.5x (average of 1.4x) performance speed-up over the best feasible homogeneous partitioning solution. At the same time, the reduction in the energy-delay product over conventional single-memory designs is up to 2.7x (average of 2.0x). More extensive partitioning can improve system performance further, at the cost of chip area.
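The trade-off between inter-partition communication and balanced workloads that the abstract mentions can be illustrated with a small sketch. This is not the paper's algorithm: it replaces the clustering and min-cut heuristics with an exhaustive search that is only practical for tiny graphs. The node names, workloads, access counts, and the `max_imbalance` parameter are all hypothetical values chosen for illustration.

```python
from itertools import combinations


def two_way_partition(workloads, access_counts, max_imbalance):
    """Exhaustively find the balanced two-way partition with the smallest cut.

    workloads:     dict mapping node name -> workload estimate (assumed units)
    access_counts: dict mapping (u, v) -> profiled access/communication count
    max_imbalance: largest allowed difference between the two partition loads
    Returns (cut_weight, side_a, side_b), or None if no balanced split exists.
    """
    names = sorted(workloads)
    total = sum(workloads.values())
    anchor, rest = names[0], names[1:]
    best = None
    # Pin the first node to side A so mirrored partitions are not enumerated
    # twice; k < len(rest) keeps side B non-empty.
    for k in range(len(rest)):
        for extra in combinations(rest, k):
            side_a = {anchor, *extra}
            load_a = sum(workloads[n] for n in side_a)
            if abs(load_a - (total - load_a)) > max_imbalance:
                continue  # violates the workload-balance constraint
            # Cut weight: traffic on edges whose endpoints are separated.
            cut = sum(
                w for (u, v), w in access_counts.items()
                if (u in side_a) != (v in side_a)
            )
            if best is None or cut < best[0]:
                best = (cut, side_a, set(names) - side_a)
    return best


# Hypothetical profile: two arrays and three computations.
workloads = {"arrA": 1, "arrB": 1, "op1": 2, "op2": 2, "op3": 2}
access_counts = {
    ("op1", "arrA"): 90,
    ("op2", "arrA"): 10,
    ("op2", "arrB"): 80,
    ("op3", "arrB"): 70,
    ("op1", "op2"): 5,
}

cut, side_a, side_b = two_way_partition(workloads, access_counts, max_imbalance=2)
print(cut, sorted(side_a), sorted(side_b))
# prints: 15 ['arrA', 'op1'] ['arrB', 'op2', 'op3']
```

In this toy profile the cheapest balanced split is uneven: one partition holds one array and one computation, the other holds one array and two computations. That mirrors, in a very small way, why a heterogeneous distribution can beat a homogeneous one when access patterns are irregular.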
Keywords :
application specific integrated circuits; high level synthesis; integrated circuit design; logic partitioning; memory architecture; ASIC design; application-specific integrated circuit; balanced workloads; clustering; data access locality; general-purpose computing; heterogeneous distributed architectures; high-level synthesis techniques; interpartition communication; irregular array data access patterns; memory organization; memory size requirements; memory-intensive applications; min-cut style partitioning; partitioned architecture; single-chip distributed logic-memory architectures; Application specific integrated circuits; Bandwidth; Computational modeling; Computer architecture; Costs; Delay; Distributed computing; High level synthesis; Memory architecture; System performance; Application-specific integrated circuit (ASIC); behavioral synthesis; high-level synthesis; memory-intensive application; partitioning;
Journal_Title :
Very Large Scale Integration (VLSI) Systems, IEEE Transactions on
DOI :
10.1109/TVLSI.2007.904096