Title :
Positive sample only learning (PSOL) for predicting RNA genes in E. coli
Author :
Meraz, Richard F. ; He, Xiaofeng ; Ding, Chris H Q ; Holbrook, Stephen R.
Author_Institution :
Lawrence Berkeley Nat. Lab., CA, USA
Abstract :
RNA genes lack most of the signals used for protein gene identification. A major shortcoming of previous discriminative methods for distinguishing functional RNA (fRNA) genes from other non-coding genomic sequences is that only positive examples of fRNAs are known: there are no confirmed negatives, only intergenic sequences that may be positive or negative. To address this problem, we developed the "positive sample only learning" (PSOL) method. This method identifies the most likely negative examples from an unlabeled set and can therefore distinguish putative functional RNA genes from other non-coding sequences. We compare RNA gene predictions made with the PSOL method against previous large-scale analyses of the E. coli K12 genome.
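The abstract does not spell out how the likely negatives are chosen, so the following is only a minimal sketch of the general idea of learning from positive and unlabeled examples, not the authors' algorithm. It assumes each sequence has already been reduced to a numeric feature vector and swaps in a simple nearest-centroid rule in place of whatever classifier the paper uses. The function name `select_likely_negatives` and the parameters `n_seed`, `step`, `max_rounds`, and `margin` are hypothetical.

```python
import math


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def dist(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_likely_negatives(positives, unlabeled, n_seed=2, step=1,
                            max_rounds=10, margin=0.0):
    """Illustrative positive/unlabeled selection (an assumption, not PSOL itself).

    1. Seed the negative set with the unlabeled points farthest from the
       centroid of the known positives.
    2. Repeatedly move the unlabeled points that sit closest to the negative
       centroid (relative to the positive centroid) into the negative set.
    3. Stop when no remaining point is confidently on the negative side.

    Returns (negatives, remaining); the remaining points are the putative
    positives.
    """
    pos_c = centroid(positives)

    # Step 1: farthest-from-positives points become the initial negatives.
    remaining = sorted(unlabeled, key=lambda u: dist(u, pos_c), reverse=True)
    negatives = remaining[:n_seed]
    remaining = remaining[n_seed:]

    for _ in range(max_rounds):
        neg_c = centroid(negatives)

        def score(u):
            # Negative score: closer to the negatives than to the positives.
            return dist(u, neg_c) - dist(u, pos_c)

        # Step 2: most negative-looking candidates first.
        remaining.sort(key=score)
        confident = [u for u in remaining[:step] if score(u) < -margin]

        # Step 3: nothing left that looks confidently negative.
        if not confident:
            break

        negatives.extend(confident)
        remaining = remaining[len(confident):]

    return negatives, remaining
```

In this sketch, whatever is left in `remaining` after the loop plays the role of the putative functional RNA candidates. Raising `margin` makes the negative set more conservative, at the cost of leaving more unlabeled points unassigned.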
Keywords :
biology computing; genetics; learning (artificial intelligence); macromolecules; molecular biophysics; proteins; E. coli K12 genome; discriminative methods; functional RNA gene prediction; noncoding genomic sequences; positive sample only learning; protein gene identification; Bioinformatics; Genomics; Helium; Iterative algorithms; Laboratories; Large-scale systems; Physics computing; Proteins; RNA; Signal processing;
Conference_Title :
Computational Systems Bioinformatics Conference, 2004. CSB 2004. Proceedings. 2004 IEEE
Print_ISBN :
0-7695-2194-0
DOI :
10.1109/CSB.2004.1332488