Regional Information Center for Science and Technology - Pushing the Performance Envelope of Modular Exponentiation Across Multiple Generations of GPUs

DocumentCode :

3199164

Title :

Pushing the Performance Envelope of Modular Exponentiation Across Multiple Generations of GPUs

Author :

Emmart, Niall ; Weems, Charles

Author_Institution :

Sch. of Comput. Sci., Univ. of Massachusetts, Amherst, MA, USA

fYear :

2015

fDate :

25-29 May 2015

Firstpage :

166

Lastpage :

176

Abstract :

Multiprecision modular exponentiation is a key operation in popular encryption schemes such as RSA, but is computationally expensive. Contexts such as handling many secure web connections in a server can demand higher rates of exponent operations than a traditional multicore can support. Graphics processors offer an opportunity to accelerate batches of exponent calculations both by executing them in parallel and by parallelizing the operations within the multiprecision arithmetic itself. However, obtaining performance close to the theoretical peak can be extremely challenging. Furthermore, each new generation of GPU architecture can require a substantially different approach to achieve maximum performance. In this paper we show how we improve modular exponentiation performance over prior results by factors ranging from 2.6 to 24, across generations of NVIDIA GPU, from compute capability 1.1 onward. Of particular interest is the parameter space that must be searched to find the optimal configuration of memory layout, launch geometry, and algorithm for each architecture at different problem sizes. Our efforts have resulted in a set of tools for generating library functions in the PTX assembly language and searching to find these optima. From our experience it can be argued that a new programming paradigm is needed to achieve full performance potential on core library components as GPUs evolve through multiple generations.

Keywords :

assembly language; graphics processing units; software libraries; GPU architecture; NVIDIA GPU; PTX assembly language; RSA; compute capability; core library components; encryption schemes; exponent operations; graphics processing unit; graphics processors; launch geometry; library functions; memory layout; multiprecision modular exponentiation performance; multiprocessing arithmetic; optimal configuration; secure Web connections; Computational modeling; Computer architecture; Generators; Graphics processing units; Load modeling; Message systems; Registers; GPU accelerated modular exponentiation; SSL acceleration with GPUs; asymmetric cryptography on GPUs; modular exponentiation

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Parallel and Distributed Processing Symposium (IPDPS), 2015 IEEE International

Conference_Location :

Hyderabad

ISSN :

1530-2075

Type :

conf

DOI :

10.1109/IPDPS.2015.69

Filename :

7161506

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3199164