DocumentCode :
945537
Title :
A Cryptographic Approach to Securely Share and Query Genomic Sequences
Author :
Kantarcioglu, Murat ; Jiang, Wei ; Liu, Ying ; Malin, Bradley
Author_Institution :
Dept. of Comput. Sci., Univ. of Texas, Dallas, TX
Volume :
12
Issue :
5
fYear :
2008
Firstpage :
606
Lastpage :
617
Abstract :
To support large-scale biomedical research projects, organizations need to share person-specific genomic sequences without violating the privacy of their data subjects. In the past, organizations protected subjects' identities by removing identifiers, such as name and social security number; however, recent investigations illustrate that deidentified genomic data can be "reidentified" to named individuals using simple automated methods. In this paper, we present a novel cryptographic framework that enables organizations to support genomic data mining without disclosing the raw genomic sequences. Organizations contribute encrypted genomic sequence records into a centralized repository, where the administrator can perform queries, such as frequency counts, without decrypting the data. We evaluate the efficiency of our framework with existing databases of single nucleotide polymorphism (SNP) sequences and demonstrate that the time needed to complete count queries is feasible for real-world applications. For example, our experiments indicate that a count query over 40 SNPs in a database of 5000 records can be completed in approximately 30 min with off-the-shelf technology. We further show that approximation strategies can be applied to significantly speed up query execution times with minimal loss in accuracy. The framework can be implemented on top of existing information and network technologies in biomedical environments.
Keywords :
biology computing; biomedical engineering; cryptography; data mining; genetics; query processing; SNP sequence database; encrypted genomic sequence records; genomic data mining; large scale biomedical research projects; query execution time; secure genomic sequence querying; secure genomic sequence sharing; single nucleotide polymorphism; Databases; Genomics; Homomorphic Encryption; Privacy; Security; Base Sequence; Chromosome Mapping; Computer Security; Genome; Humans; Information Storage and Retrieval; Molecular Sequence Data; Polymorphism, Single Nucleotide; Security Measures; Sequence Analysis, DNA
fLanguage :
English
Journal_Title :
Information Technology in Biomedicine, IEEE Transactions on
Publisher :
IEEE
ISSN :
1089-7771
Type :
jour
DOI :
10.1109/TITB.2007.908465
Filename :
4358920