DocumentCode :
3673196
Title :
An information extraction system for protein function prediction
Author :
Kamal Taha;Paul D. Yoo
Author_Institution :
Electrical and Computer Engineering Department, Khalifa University, UAE
fYear :
2015
Firstpage :
1
Lastpage :
7
Abstract :
We present a Natural Language Processing extraction system called IESforPFP, which can retrieve useful information from biomedical abstracts. IESforPFP aims at enhancing the state of the art of biological text mining by applying a novel linguistic computational technique. By retrieving significant patterns of associations between proteins and molecules from biomedical abstracts, IESforPFP can determine the functions of unannotated proteins. The system determines the semantic relationship between each protein-molecule pair in a sentence using novel semantic rules. It applies a semantic relationship extraction model that retrieves information from the different structural forms of constituents in sentences. In the framework of IESforPFP, each protein p is represented by a vector of weights. Each weight reflects the significance of a molecule m in the biomedical abstracts associated with p; that is, it quantifies the likelihood of the association between m and p. IESforPFP determines the set of annotated proteins that are semantically similar to p by comparing their vectors. It then annotates p with the functions of these annotated proteins. We evaluated the quality of IESforPFP by comparing it experimentally with two other systems. Results showed marked improvement.
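The vector representation and comparison step described in the abstract can be illustrated with a minimal sketch. The abstract does not state how weights are computed or how vectors are compared, so everything here is an assumption made for illustration: weights are relative mention frequencies, similarity is cosine similarity, and the function names (`weight_vector`, `cosine`, `predict_functions`), the parameter `k`, and the toy proteins and molecules are all hypothetical. This is not the IESforPFP implementation.

```python
from collections import Counter
from math import sqrt


def weight_vector(molecule_mentions):
    """Map each molecule to its relative frequency among the mentions.

    Assumption: the paper's actual weighting scheme is not given in the
    abstract; relative frequency is used here only as a stand-in.
    """
    counts = Counter(molecule_mentions)
    total = sum(counts.values())
    return {m: c / total for m, c in counts.items()} if total else {}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts.

    Assumption: the abstract only says vectors are compared; cosine
    similarity is a swapped-in, commonly used measure.
    """
    dot = sum(w * v.get(m, 0.0) for m, w in u.items())
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def predict_functions(target, annotated, k=1):
    """Return the union of functions of the k annotated proteins most similar to target.

    annotated maps a protein name to a (weight vector, set of functions) pair.
    """
    ranked = sorted(
        annotated.items(),
        key=lambda item: cosine(target, item[1][0]),
        reverse=True,
    )
    functions = set()
    for _name, (_vector, funcs) in ranked[:k]:
        functions |= funcs
    return functions


# Toy data (hypothetical): molecules co-mentioned with each protein.
annotated = {
    "protA": (weight_vector(["ATP", "ATP", "Mg"]), {"kinase activity"}),
    "protB": (weight_vector(["DNA", "zinc"]), {"DNA binding"}),
}
target = weight_vector(["ATP", "Mg", "ATP", "ATP"])
print(predict_functions(target, annotated, k=1))
```

In this toy setup the unannotated protein shares its molecules only with `protA`, so it inherits that protein's function label. A real system would derive the mentions from the extracted protein-molecule relationships rather than from hand-written lists.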
Keywords :
"Proteins","Protein engineering","Semantics","Pragmatics","Information retrieval","Syntactics"
Publisher :
IEEE
Conference_Title :
Computational Intelligence in Bioinformatics and Computational Biology (CIBCB), 2015 IEEE Conference on
Type :
conf
DOI :
10.1109/CIBCB.2015.7300300
Filename :
7300300