DocumentCode :
1791594
Title :
Identification of SNP interactions using data-parallel primitives on GPUs
Author :
Altinigneli, Can ; Konte, Bettina ; Rujescu, Dan ; Bohm, Christian ; Plant, Claudia
Author_Institution :
Univ. of Munich, Munich, Germany
fYear :
2014
fDate :
27-30 Oct. 2014
Firstpage :
539
Lastpage :
548
Abstract :
A major goal of a Genome Wide Association Study (GWAS) is to find associations between genetic variations, such as Single-Nucleotide Polymorphisms (SNPs), and the risk of developing a complex disease, such as cancer or schizophrenia. Logic Feature Selection (logicFS) is a technique for finding interactions between SNPs that may increase the risk of developing a particular disease. Composed of several hundred processors, the Graphics Processing Unit (GPU) has become a very attractive platform for computationally demanding tasks on massive data. Its special hierarchy of processors and fast memory units allows very powerful and efficient parallelization but also demands novel parallel algorithms. In this paper, we formulate the LogicFS-GPU algorithm, which is particularly suited to data-parallel architectures such as GPUs. For this purpose, we employ low-level (device-level) and high-level data-parallel primitives, e.g. map, compaction, parallel prefix sum (scan) and parallel reduction. The primary idea of our algorithm is to let the parallel threads cooperatively develop their own private, high-quality binary interaction models to predict the affection status of subjects. We demonstrate (1) how to formulate the parallel LogicFS-GPU algorithm so that it exploits most of the potential parallelism hidden in the base logicFS algorithm and (2) how to utilize the special memory and processor architecture of a modern GPU in order to share this information among threads in an optimal way. As a perspective, LogicFS-GPU is not limited to examining SNP interactions, but can also be applied to any problem in which interactions of multivariate binary predictors are to be associated with observations. Furthermore, LogicFS-GPU is not restricted to GPUs as its target architecture, and it may be possible to port our formulation to any other data-parallel architecture.
Keywords :
biology computing; genomics; graphics processing units; memory architecture; multi-threading; parallel algorithms; parallel architectures; GWAS; SNP interaction identification; SNP interaction search; affection status prediction; base logicFS algorithm; cancer; complex disease development risk; data compaction; data mapping; data parallel architectures; data-parallel primitives; genetic variations; genome wide association study; graphics processing unit; high-level data parallel primitives; information sharing; logic feature selection; low-level data parallel primitives; memory architecture; memory units; multivariate binary predictor interactions; parallel LogicFS-GPU algorithm; parallel algorithms; parallel reduction; parallel threads; parallel-prefix-sum; parallelization; private high-quality binary interaction models; processor architecture; processor hierarchy; schizophrenia; single-nucleotide polymorphisms; target data-parallel architecture; Computer architecture; Data models; Diseases; Graphics processing units; Instruction sets; Silicon;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Big Data (Big Data), 2014 IEEE International Conference on
Conference_Location :
Washington, DC
Type :
conf
DOI :
10.1109/BigData.2014.7004271
Filename :
7004271