DocumentCode :
2008307
Title :
Ensemble Machine Methods for DNA Binding
Author :
Fan, Yue ; Kon, Mark A. ; DeLisi, Charles
Author_Institution :
Dept. of Math. & Stat., Boston Univ., Boston, MA, USA
fYear :
2008
fDate :
11-13 Dec. 2008
Firstpage :
709
Lastpage :
716
Abstract :
We introduce three ensemble machine learning methods for analysis of biological DNA binding by transcription factors (TFs). The goal is to identify both TF target genes and their binding motifs. Subspace-valued weak learners (formed from an ensemble of different motif finding algorithms) combine candidate motifs as probability weight matrices (PWMs), which are then translated into subspaces of a DNA k-mer (string) feature space. Assessing and then integrating highly informative subspaces by machine methods gives more reliable target classification and motif prediction. We compare these target identification methods with PWM rescanning and with support vector machines on the full k-mer space of the yeast S. cerevisiae. This method, SVMotif-PWM, can significantly improve accuracy in computational identification of TF targets. The software is publicly available at http://cagt10.bu.edu/SVMotif.
Keywords :
DNA; biology computing; genetics; learning (artificial intelligence); matrix algebra; pattern classification; probability; biological DNA binding analysis; ensemble machine learning method; motif prediction; probability weight matrix; target classification; transcription factor target gene; Bioinformatics; DNA; Genomics; Learning systems; Machine learning; Mathematics; Pulse width modulation; Sequences; Statistics; Systems biology; DNA; bioinformatics; ensembles; machine learning; transcription factor;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Machine Learning and Applications, 2008. ICMLA '08. Seventh International Conference on
Conference_Location :
San Diego, CA
Print_ISBN :
978-0-7695-3495-4
Type :
conf
DOI :
10.1109/ICMLA.2008.114
Filename :
4725053