  • DocumentCode
    1995657
  • Title
    Toward a Generic Hybrid CPU-GPU Parallelization of Divide-and-Conquer Algorithms
  • Author
    Lopez-Ortiz, A. ; Salinger, Alejandro ; Suderman, Robert
  • Author_Institution
    Cheriton Sch. of Comput. Sci., Univ. of Waterloo, Waterloo, ON, Canada
  • fYear
    2013
  • fDate
    20-24 May 2013
  • Firstpage
    601
  • Lastpage
    610
  • Abstract
    The increasing power and decreasing cost of Graphics Processing Units (GPUs), together with the development of programming languages for General Purpose Computing on GPUs (GPGPU), have led to the development and implementation of fast parallel algorithms for this architecture across a large spectrum of applications. Given the stream-processing characteristics of GPUs, most practical applications so far rely on highly data-parallel algorithms. Many problems, however, allow for task-parallel solutions or a combination of task- and data-parallel algorithms. For these, a hybrid CPU-GPU parallel algorithm that combines the highly parallel stream-processing power of GPUs with the higher scalar power of multi-cores is likely to be superior. In this paper we describe a generic translation of any recursive sequential implementation of a divide-and-conquer algorithm into an implementation that benefits from running in parallel on both multi-cores and GPUs. This translation is generic in the sense that it requires little knowledge of the particular algorithm. We then present a schedule and work division scheme that adapts to the characteristics of each algorithm and the underlying architecture, efficiently balancing the workload between GPU and CPU. Our experiments show a 4.5x speedup over a single-core recursive implementation, while demonstrating the accuracy and practicality of the approach.
  • Keywords
    divide and conquer methods; graphics processing units; multiprocessing systems; parallel algorithms; performance evaluation; power aware computing; GPGPU; GPU streaming-processing characteristics; data-parallel algorithms; divide-and-conquer algorithms; general purpose computing on GPU; generic hybrid CPU-GPU parallelization; generic translation; graphic processing units; hybrid CPU-GPU parallel algorithm; parallel stream-processing power; programming languages; schedule scheme; task-parallel solutions; work division scheme; Algorithm design and analysis; Central Processing Unit; Computer architecture; Graphics processing units; Instruction sets; Parallel processing; Schedules; GPU; divide-and-conquer; heterogeneous architectures; hybrid algorithms; multi-core; parallel algorithms; performance modeling;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW), 2013 IEEE 27th International
  • Conference_Location
    Cambridge, MA
  • Print_ISBN
    978-0-7695-4979-8
  • Type
    conf
  • DOI
    10.1109/IPDPSW.2013.200
  • Filename
    6650936