Title :
Motif Discovery as a Multiple-Instance Problem
Author :
Zhang, Ya ; Chen, Yixin ; Ji, Xiang
Author_Institution :
Dept. of Electr. Eng. & Comput. Sci., Kansas Univ., Lawrence, KS
Abstract :
Motif discovery from biological sequences, a challenging task both experimentally and computationally, has been a topic of intense study in recent years. In this paper, we formulate the motif discovery problem as a multiple-instance problem and employ a multiple-instance learning method, the MILES method, to identify motifs in biological sequences. Each sequence is mapped into a feature space defined by the instances in the training sequences, using a novel instance-bag similarity measure. We employ a 1-norm SVM to select important features and construct classifiers simultaneously. The high-ranked features correspond to the discovered motifs. We apply this method to discover transcription factor binding sites in promoters, a typical motif-finding problem in biology, and show that the method is at least comparable to existing methods.
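The embedding step described in the abstract can be illustrated with a minimal sketch. This record does not define the paper's instance-bag similarity measure, so the sketch substitutes an assumed stand-in: instances are taken to be overlapping k-mers, and the similarity of an instance to a sequence is the best fraction of matching positions over that sequence's k-mers. The function names, the choice of k, and the toy sequences are all hypothetical and are not taken from the paper.

```python
from typing import List

def kmers(seq: str, k: int) -> List[str]:
    """All overlapping length-k substrings: the 'instances' of a sequence 'bag'."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

def match_fraction(a: str, b: str) -> float:
    """Fraction of positions where two equal-length k-mers agree.
    Assumed stand-in, not the paper's similarity measure."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def instance_bag_similarity(instance: str, seq: str) -> float:
    """Similarity of one instance to a bag: its best match among the bag's k-mers."""
    return max(match_fraction(instance, w) for w in kmers(seq, len(instance)))

def embed(seqs: List[str], instances: List[str]) -> List[List[float]]:
    """Map each sequence to a vector with one feature per candidate instance."""
    return [[instance_bag_similarity(c, s) for c in instances] for s in seqs]

# Toy data (made up): the first two sequences contain the planted word "TATAAT".
train = ["GGCTATAATGC", "ATATAATCCGA", "GGCCGGCCGGC"]
k = 6  # assumed motif width
candidates = sorted({w for s in train for w in kmers(s, k)})
features = embed(train, candidates)
```

In the approach the abstract outlines, a 1-norm SVM would then be trained on these feature vectors with sequence-level labels. Its sparsity-inducing penalty drives most feature weights to zero, and the candidate instances that keep non-zero weights are the ones reported as motifs. That training step is not shown here.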
Keywords :
biology computing; genetics; learning (artificial intelligence); pattern classification; support vector machines; 1-norm SVM; MILES method; bio sequences; biological sequences; feature space; instance bag similarity measure; motif discovery problem; multiple instance learning; multiple instance problem; promoters; training sequences; transcriptional factor binding sites; Bioinformatics; Biology computing; Computational Intelligence Society; DNA; Extraterrestrial measurements; Genetic mutations; Learning systems; Random sequences; Support vector machine classification; Support vector machines
Conference_Title :
Tools with Artificial Intelligence, 2006. ICTAI '06. 18th IEEE International Conference on
Conference_Location :
Arlington, VA
Print_ISBN :
0-7695-2728-0
DOI :
10.1109/ICTAI.2006.89