Title :
PPIExtractor: A Protein Interaction Extraction and Visualization System for Biomedical Literature
Author :
Zhihao Yang ; Zhehuan Zhao ; Yanpeng Li ; Yuncui Hu ; Hongfei Lin
Author_Institution :
Coll. of Comput. Sci. & Technol., Dalian Univ. of Technol., Dalian, China
Abstract :
Protein-protein interactions (PPIs) play a key role in various aspects of the structural and functional organization of the cell, and knowledge about them unveils the molecular mechanisms of biological processes. However, the biomedical literature on protein interactions is growing rapidly, making it difficult for interaction database curators to detect and curate protein interaction information manually. In this paper, we present a PPI extraction system, termed PPIExtractor, which automatically extracts PPIs from biomedical text and visualizes them. Given a set of Medline records, PPIExtractor proceeds in four steps: it first applies Feature Coupling Generalization (FCG) to tag protein names in the text, next uses the extended semantic similarity-based method to normalize them, then combines feature-based, convolution tree, and graph kernels to extract PPIs, and finally visualizes the resulting PPI network. Experimental evaluations show that PPIExtractor can achieve state-of-the-art performance on a DIP subset relative to comparable evaluations. PPIExtractor is freely available for academic purposes at: http://202.118.75.18:8080/PPIExtractor/.
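The kernel-combination step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the three kernels are combined, so the normalized weighted sum below is an assumption, not the authors' method. The two toy kernels (`bow_kernel`, `bigram_kernel`), the weights, and the `PROT1`/`PROT2` placeholder tokens are likewise hypothetical stand-ins; real convolution tree and graph kernels operate on parse structures and are not reproduced here.

```python
import math
from collections import Counter


def bow_kernel(a, b):
    """Linear kernel over word counts (toy stand-in for a feature-based kernel)."""
    ca, cb = Counter(a.split()), Counter(b.split())
    return float(sum(ca[w] * cb[w] for w in ca))


def bigram_kernel(a, b):
    """Linear kernel over word-bigram counts (toy stand-in for a structural kernel)."""
    ta, tb = a.split(), b.split()
    ca, cb = Counter(zip(ta, ta[1:])), Counter(zip(tb, tb[1:]))
    return float(sum(ca[g] * cb[g] for g in ca))


def normalized(kernel):
    """Cosine-normalize a kernel so that k(x, x) == 1 whenever kernel(x, x) > 0."""
    def k(a, b):
        denom = math.sqrt(kernel(a, a) * kernel(b, b))
        return kernel(a, b) / denom if denom > 0 else 0.0
    return k


def combine(kernels, weights):
    """Weighted sum of normalized kernels (assumed combination scheme).

    Non-negative weights keep the result a valid kernel; dividing by the
    weight total keeps self-similarity at 1.
    """
    if len(kernels) != len(weights) or any(w < 0 for w in weights):
        raise ValueError("need one non-negative weight per kernel")
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    parts = [normalized(k) for k in kernels]

    def k(a, b):
        return sum(w * p(a, b) for w, p in zip(weights, parts)) / total
    return k


# Hypothetical weights; in practice they would be tuned on held-out data.
combined = combine([bow_kernel, bigram_kernel], [0.6, 0.4])
s1 = "PROT1 interacts with PROT2"
s2 = "PROT1 binds PROT2"
print(combined(s1, s1), combined(s1, s2))
```

A combined kernel of this form can be evaluated on all pairs of candidate sentences to build a Gram matrix for any classifier that accepts precomputed kernels.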
Keywords :
biology computing; cellular biophysics; feature extraction; molecular biophysics; molecular configurations; proteins; semantic networks; Medline record dataset; PPI network; PPIExtractor; biological processes; biomedical literature; biomedical text; cell; curate protein interaction information; database curators interaction; extended semantic similarity-based method; feature coupling generalization; feature-based convolution tree; functional organization; graph kernels; molecular mechanisms; protein interaction extraction; protein tag; protein visualization system; protein-protein interactions; state-of-the-art performance; structural organization; Convolution; Databases; Dictionaries; Kernel; Protein engineering; information extraction; multiple kernels learning; protein-protein interaction; Algorithms; Computational Biology; Data Mining; Databases, Protein; Protein Interaction Maps; User-Computer Interface
Journal_Title :
NanoBioscience, IEEE Transactions on
DOI :
10.1109/TNB.2013.2263837