DocumentCode :
2702550
Title :
INDARE - An indexed DAG of regular expressions for selecting position frequency matrices
Author :
Park, Meeyoung ; Sanghvi, Jubin ; Dinakarpandian, Deendayal
Author_Institution :
Univ. of Missouri-Kansas City, Kansas City
fYear :
2007
fDate :
2-4 Nov. 2007
Firstpage :
191
Lastpage :
196
Abstract :
The identification of putative motifs in biomolecular sequences or whole genomes/proteomes is frequently based on window-based scanning with position frequency matrices (PFMs). The exponential increase in the amount of sequence data and the growing number of patterns to be screened have created a need for rapid screening methods. In recognition of this, we have developed the Indexed DAG of Regular Expressions extractor (INDARE), a tool that dynamically extracts regular expressions (REs) for each PFM in the database and creates a directed acyclic graph (DAG) of REs. The INDARE-generated DAG is very effective in pruning the search space and easily outperforms the naive exhaustive sequential search approach. The method is general enough to be applicable to the identification of motifs in any domain.
Keywords :
biology computing; molecular biophysics; INDARE tool; Indexed DAG of Regular Expressions Extractor; sequential search approach; biomolecular sequences; genomes; position frequency matrices; proteomes; Bioinformatics; Cities and towns; Computer science; Data mining; Databases; Frequency; Genomics; Informatics; Inverse problems; Pattern recognition
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Bioinformatics and Biomedicine Workshops, 2007. BIBMW 2007. IEEE International Conference on
Conference_Location :
Fremont, CA
Print_ISBN :
978-1-4244-1604-2
Type :
conf
DOI :
10.1109/BIBMW.2007.4425418
Filename :
4425418