Title :
Detection of Transcription Factor Binding Sites via Motif Clustering and Matching
Author :
Li-fang, Liu ; Li-cheng, Jiao
Author_Institution :
Sch. of Comput. Sci. & Technol., Xidian Univ., Xi'an
Abstract :
The identification of transcription factor binding sites in promoter sequences is an important problem, since it reveals information about the transcription regulation of genes. In this paper, a novel motif discovery method based on motif clustering and matching is proposed. Each L-mer in the dataset is matched against a precompiled library of motifs, represented as position weight matrices (PWMs), and assigned to a motif based on the P-value of its match score; the PWMs are then updated and clustered according to their similarity. Motif features are ranked in terms of statistical significance (P-value). The advantage of this approach is that it can simultaneously characterize every feature present in the dataset, thus lessening the chance that weaker signals will be missed. We apply our method (implemented as a computer program called MotifCM) to a benchmark of 56 datasets and demonstrate that MotifCM achieves improved performance over several other popular motif discovery tools.
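The matching step described in the abstract can be illustrated with a minimal sketch. This is not the MotifCM implementation: the log-odds scoring, the uniform background, the Monte Carlo estimate of the P-value, and all names (`make_pwm`, `log_odds_score`, `empirical_p_value`, `best_motif`, and the two toy motifs) are assumptions made for illustration only. The PWM update and clustering steps are not shown.

```python
import math
import random

BASES = "ACGT"

def make_pwm(consensus, strong=0.85):
    """Build a toy PWM (hypothetical helper): one dict of base
    probabilities per position, peaked at the consensus base."""
    weak = (1.0 - strong) / 3.0
    return [{b: (strong if b == c else weak) for b in BASES} for c in consensus]

def log_odds_score(lmer, pwm, background=0.25):
    """Sum of per-position log-odds scores, assuming a uniform background."""
    return sum(math.log2(pwm[i][b] / background) for i, b in enumerate(lmer))

def empirical_p_value(score, pwm, n_samples=2000, seed=0):
    """Monte Carlo estimate: fraction of random background L-mers
    whose score is at least `score` (with add-one smoothing)."""
    rng = random.Random(seed)
    length = len(pwm)
    hits = 0
    for _ in range(n_samples):
        rand_lmer = "".join(rng.choice(BASES) for _ in range(length))
        if log_odds_score(rand_lmer, pwm) >= score:
            hits += 1
    return (hits + 1) / (n_samples + 1)

def best_motif(lmer, library):
    """Assign an L-mer to the library motif with the smallest P-value."""
    p_values = {
        name: empirical_p_value(log_odds_score(lmer, pwm), pwm)
        for name, pwm in library.items()
    }
    name = min(p_values, key=p_values.get)
    return name, p_values[name]

# Toy library of two assumed motifs; real libraries would be precompiled.
library = {"motif_A": make_pwm("TATAAT"), "motif_B": make_pwm("GGGCGG")}
name, p_value = best_motif("TATAAT", library)
print(name, p_value)
```

In this sketch each L-mer is compared by P-value rather than by raw score, so that matches to PWMs of different information content are placed on a common scale before assignment.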
Keywords :
biology computing; genetics; matrix algebra; pattern clustering; pattern matching; sequences; statistical analysis; MotifCM; genes; motif clustering; motif discovery method; motif discovery tools; motif matching; position weight matrices; promoter sequences; statistical significance; transcription factor binding sites identification; transcription regulation; Accuracy; Computational biology; Computer science; DNA; Information processing; Libraries; Monte Carlo methods; Pulse width modulation; Sequences; Software engineering; Motif discovery; P-value; Statistical significance; Transcription factor binding site;
Conference_Title :
Computer Science and Software Engineering, 2008 International Conference on
Conference_Location :
Wuhan, Hubei
Print_ISBN :
978-0-7695-3336-0
DOI :
10.1109/CSSE.2008.628