Title :
Fast Search of Sequences with Complex Symbol Correlations using Profile Context-Sensitive HMMs and Pre-Screening Filters
Author :
Byung-Jun Yoon ; Vaidyanathan, P.P.
Author_Institution :
Dept. of Electr. Eng., California Inst. of Technol., Pasadena, CA, USA
Abstract :
Profile context-sensitive HMMs (profile-csHMMs) were recently proposed and are very effective at modeling the common patterns and motifs in related symbol sequences. Profile-csHMMs can represent long-range correlations between distant symbols, even when these correlations are entangled in a complicated manner. This makes profile-csHMMs a useful tool in computational biology, especially for modeling noncoding RNAs (ncRNAs) and finding new ncRNA genes. However, a profile-csHMM based search is quite slow and therefore impractical for searching a large database. In this paper, we propose a practical scheme that makes the search significantly faster without any degradation in prediction accuracy. The proposed method uses a pre-screening filter based on a profile-HMM, which filters out most sequences that would not be predicted as a match by the original profile-csHMM. Experimental results show that the proposed approach can make the search eighty times faster.
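The pre-screening idea described in the abstract can be illustrated with a minimal two-stage sketch. This is not the authors' implementation: `prescreened_search`, `toy_cheap_score`, `toy_full_score`, and both thresholds are hypothetical names and values chosen for illustration, and the toy scorers only stand in for a profile-HMM filter and a profile-csHMM scorer.

```python
from typing import Callable, Iterable, List, Tuple


def prescreened_search(
    sequences: Iterable[str],
    cheap_score: Callable[[str], float],
    full_score: Callable[[str], float],
    filter_threshold: float,
    match_threshold: float,
) -> Tuple[List[Tuple[str, float]], int]:
    """Run the expensive scorer only on sequences that pass the cheap filter.

    Returns the list of (sequence, full score) hits and the number of
    expensive evaluations that were actually performed.
    """
    hits: List[Tuple[str, float]] = []
    full_evaluations = 0
    for seq in sequences:
        # Stage 1: cheap filter (stand-in for a profile-HMM score).
        if cheap_score(seq) < filter_threshold:
            continue
        # Stage 2: expensive scorer (stand-in for a profile-csHMM score).
        full_evaluations += 1
        score = full_score(seq)
        if score >= match_threshold:
            hits.append((seq, score))
    return hits, full_evaluations


def toy_cheap_score(seq: str) -> float:
    """Hypothetical cheap score: number of G or C symbols."""
    return float(sum(1 for base in seq if base in "GC"))


def toy_full_score(seq: str) -> float:
    """Hypothetical expensive score: number of complementary pairs between
    position i and its mirror position, a crude proxy for long-range
    symbol correlations."""
    pairs = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
    half = len(seq) // 2
    return float(sum(1 for i in range(half) if (seq[i], seq[-1 - i]) in pairs))


if __name__ == "__main__":
    database = ["GCAUGC", "AAAAAA", "GGGCCC", "GCGAAA"]
    hits, evaluated = prescreened_search(
        database, toy_cheap_score, toy_full_score,
        filter_threshold=3, match_threshold=3,
    )
    print(hits, evaluated)
```

Whether such a cascade preserves accuracy depends on choosing the filter threshold loosely enough that sequences the full model would accept are not discarded; the sketch does not enforce that property.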
Keywords :
filtering theory; hidden Markov models; macromolecules; medical computing; sequences; common patterns modeling; complex symbol correlations; computational biology; long-range correlations; motifs modeling; noncoding RNA; prescreening filters; profile context-sensitive HMM; symbol sequences; Accuracy; Biological system modeling; Computational biology; Context modeling; Databases; Degradation; Hidden Markov models; Matched filters; RNA; Sequences; context-sensitive HMM (csHMM); homology search; noncoding RNA (ncRNA); profile-csHMM; pseudoknot;
Conference_Title :
Acoustics, Speech and Signal Processing, 2007. ICASSP 2007. IEEE International Conference on
Conference_Location :
Honolulu, HI
Print_ISBN :
1-4244-0727-3
DOI :
10.1109/ICASSP.2007.366687