Title :
A MapReduce-based Algorithm for Motif Search
Author :
Huo, Hongwei ; Lin, Shuai ; Yu, Qiang ; Zhang, Yipu ; Stojkovic, Vojislav
Author_Institution :
Sch. of Comput. Sci. & Technol., Xidian Univ., Xi'an, China
Abstract :
Motif search plays an important role in gene finding and in understanding gene regulatory relationships, and it is one of the most challenging problems in bioinformatics. In this paper, we present three data partitions for the PMSP algorithm and propose the PMSP MapReduce algorithm (PMSPMR) for solving the motif search problem. For problem instances of varying difficulty, experimental results on a Hadoop cluster demonstrate that PMSPMR scales well. In particular, for the more difficult motif search problems, PMSPMR shows its advantage: the speedup grows almost linearly with the number of nodes in the Hadoop cluster. We also present experimental results on realistic biological data, identifying known transcriptional regulatory motifs in eukaryotes as well as in actual promoter sequences extracted from Saccharomyces cerevisiae.
Keywords :
bioinformatics; genetics; search problems; Hadoop cluster; PMSP MapReduce algorithm; PMSPMR; Saccharomyces cerevisiae; data partitions; gene finding; gene regulation relationship; motif search; Algorithm design and analysis; Clustering algorithms; DNA; Hamming distance; Partitioning algorithms; Hadoop; MapReduce; data partition; scalability
Conference_Title :
Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW), 2012 IEEE 26th International
Conference_Location :
Shanghai
Print_ISBN :
978-1-4673-0974-5
DOI :
10.1109/IPDPSW.2012.255