• DocumentCode
    3081790
  • Title
    Balanced Block Design Architecture for Parallel Computing in Mobile CPUs/GPUs
  • Author
    Mani, G. ; Berkovich, Simon ; Duoduo Liao
  • Author_Institution
    George Washington Univ., Washington, DC, USA
  • fYear
    2013
  • fDate
    22-24 July 2013
  • Firstpage
    140
  • Lastpage
    141
  • Abstract
    To increase performance, processor manufacturers extract parallelism by shrinking transistors, adding more of them to single-core chips, and creating multi-core systems. Although microprocessor performance continues to grow at an exponential rate, this approach generates too much heat and consumes too much power. These architectures not only introduce several complications but also require tremendous effort to organize special software for parallel processing. In many cases, these difficulties are insurmountable. Programmers have to write complex code to prioritize tasks or to perform them in parallel, for example by extracting parallelism through threads in GPUs. One of the key issues for programmers is how to divide tasks into sub-tasks. A faulty calculation may lead to increased data dependency, which will slow the processor. A processor that performs more parallel operations can simultaneously increase queuing delays. In both of the scenarios mentioned above, the cost of communication (also known as data transportation energy) between processing elements in a microprocessor (or objects in parallel programming) is increasing relative to that of computation. This trend is resulting in larger caches for every new processor generation and in more complex and costly latency-tolerant mechanisms. Here we introduce a combinatorial architecture that has a unique property: multiple cores running sequential code. This architecture can be used for both CPUs and GPUs. Some minor adjustments to a regular compiler are needed for loading. In particular, current mobile GPU technologies are still relatively immature and require substantial improvements to enable wireless devices to perform complex graphics-related functions. Our new architecture is more suitable for mobile GPUs/CPUs, i.e., mobile heterogeneous computing, with limited resources and relatively greater performance.
  • Keywords
    combinatorial mathematics; graphics processing units; mobile computing; parallel architectures; balanced block design architecture; combinatorial architecture; complex graphics-related functions; data dependency; latency tolerant mechanisms; microprocessors performance; mobile CPUs-GPUs; mobile heterogeneous computing; multicore systems; parallel computing; parallel processing; processor generation; queuing delays; regular compiler; sequential code; transistors; Computer architecture; Fault tolerance; Fault tolerant systems; Instruction sets; Mobile communication; Parallel processing; combinatorial architecture; fault-tolerance; mobile gpu; parallel computing;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computing for Geospatial Research and Application (COM.Geo), 2013 Fourth International Conference on
  • Conference_Location
    San Jose, CA
  • Type
    conf
  • DOI
    10.1109/COMGEO.2013.27
  • Filename
    6602058