DocumentCode
86161
Title
Predicting Protein Relationships to Human Pathways through a Relational Learning Approach Based on Simple Sequence Features
Author
Garcia-Jimenez, Beatriz ; Pons, Tirso ; Sanchis, Araceli ; Valencia, Alfonso
Author_Institution
Comput. Sci. Dept., Univ. Carlos III de Madrid, Leganés, Spain
Volume
11
Issue
4
fYear
2014
fDate
July-Aug. 1 2014
Firstpage
753
Lastpage
765
Abstract
Biological pathways are important elements of systems biology, and in the past decade an increasing number of pathway databases have been set up to document the growing understanding of complex cellular processes. Although more genome-sequence data are becoming available, a large fraction of these data remains functionally uncharacterized. Thus, it is important to be able to predict the mapping of poorly annotated proteins to original pathway models. Results: We have developed a Relational Learning-based Extension (RLE) system to investigate pathway membership through a function prediction approach that mainly relies on combinations of simple properties attributed to each protein. RLE searches for proteins with molecular similarities to specific pathway components. Using RLE, we associated 383 uncharacterized proteins with 28 pre-defined human Reactome pathways, demonstrating relative confidence after proper evaluation. Indeed, in specific cases, manual inspection of the database annotations and the related literature supported the proposed classifications. Examples of possible additional components of the Electron transport system, Telomere maintenance and Integrin cell surface interactions pathways are discussed in detail. Availability: All the predicted human proteins for Reactome releases 30 (2009) and 40 (2012) are available at http://rle.bioinfo.cnio.es.
Keywords
DNA; biochemistry; bioinformatics; cellular biophysics; feature extraction; genomics; learning (artificial intelligence); molecular biophysics; molecular configurations; pattern classification; proteins; annotated proteins; biological pathways; classifications; complex cellular processes; database annotations; electron transport system; genome-sequence data; human pathways; integrin cell surface interactions pathways; pathway databases; predefined human Reactome pathways; protein relationships; relational learning approach; relational learning-based extension system; simple sequence features; systems biology; telomere maintenance; Bioinformatics; Computational biology; Databases; Decision trees; Prediction algorithms; Proteins; Pathway relationship prediction; function prediction; human reactome pathways; knowledge relational representation; machine learning; sequence-based prediction;
fLanguage
English
Journal_Title
Computational Biology and Bioinformatics, IEEE/ACM Transactions on
Publisher
IEEE
ISSN
1545-5963
Type
jour
DOI
10.1109/TCBB.2014.2318730
Filename
6802366