DocumentCode
564802
Title
Cloud statistical significance estimation for optimal local alignment of huge DNA sequences
Author
Hosny, Ahmad M. ; Shedeed, Howida A. ; Hussein, Ashraf S. ; Tolba, Mohamed F.
Author_Institution
Dept. of Sci. Comput., Ain Shams Univ., Cairo, Egypt
fYear
2012
fDate
14-16 May 2012
Abstract
Assessing confidence in a pairwise local sequence alignment is a fundamental problem in bioinformatics. For huge DNA sequences, this problem is highly compute-intensive because it involves evaluating thousands of local alignments to construct an empirical score distribution. Recent parallel solutions support only small sequence sizes and/or rely on sophisticated infrastructures that are not available to most research labs. This paper presents an efficient parallel solution for evaluating the statistical significance of the alignment of a pair of huge DNA sequences using cloud infrastructures. The solution can receive requests from multiple researchers via a web portal and allocate resources according to demand. Because it is cloud-based, it improves robustness, scalability, and performance. The fundamental innovation of this work is an efficient solution that exploits both shared-memory and distributed-memory architectures through cloud technology to speed up the evaluation of statistical significance for a pair of DNA sequences. In this manner, the restriction on sequence size is relaxed to the megabyte scale, which was not supported before. The solution was verified against other recent parallel solutions, and the performance evaluation was carried out on Microsoft's cloud. The results show that performance scales with approximately linear speedup as the number of instances increases.
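The empirical score distribution mentioned in the abstract can be illustrated with a minimal serial sketch. It is not the authors' cloud implementation: the function names (`sw_score`, `empirical_p_value`), the scoring values (match 2, mismatch -1, linear gap -2), and the shuffle count are all assumptions chosen for illustration. The idea shown is the generic one: score the real pair, score many shuffled versions of one sequence, and report how often a shuffled score reaches the observed score.

```python
import random


def sw_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best Smith-Waterman local alignment score with a linear gap penalty.

    Scoring values are illustrative assumptions, not taken from the paper.
    Uses two rows of the dynamic-programming table, so memory is O(len(b)).
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            cell = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(cell)
            if cell > best:
                best = cell
        prev = curr
    return best


def empirical_p_value(a, b, n_shuffles=200, seed=0):
    """Estimate significance of the a/b alignment from shuffled copies of b.

    Each shuffle is independent of the others, which is what makes this
    workload easy to spread over many cores or cloud instances.
    """
    rng = random.Random(seed)
    observed = sw_score(a, b)
    letters = list(b)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(letters)  # preserves base composition of b
        if sw_score(a, "".join(letters)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_shuffles + 1)


if __name__ == "__main__":
    seq = "ACGTACGTAC" * 3
    score, p = empirical_p_value(seq, seq, n_shuffles=50)
    print(score, p)
```

Because every alignment costs time proportional to the product of the two sequence lengths, and thousands of shuffles are needed, the total cost grows quickly for megabyte-scale inputs, which is the motivation for the parallel solution the abstract describes.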
Keywords
DNA; bioinformatics; cloud computing; portals; statistical analysis; DNA sequences; Microsoft cloud; Web-portal; cloud infrastructures; cloud statistical significance estimation; cloud-based solution; empirical score distribution; megabyte-scale; optimal local alignment; pairwise local sequence alignment; parallel solutions; Computer architecture; Computers; Estimation; Parallel processing; megabase DNA sequence; multi-core architectures; sequence alignment; statistical significance estimation
fLanguage
English
Publisher
IEEE
Conference_Titel
Informatics and Systems (INFOS), 2012 8th International Conference on
Conference_Location
Cairo
Print_ISBN
978-1-4673-0828-1
Type
conf
Filename
6236505