Title :
Protein interaction network constructing based on text mining and reinforcement learning with application to prostate cancer
Author :
Fei Zhu ; Quan Liu ; Xiaofang Zhang ; Bairong Shen
Author_Institution :
Sch. of Comput. Sci. & Technol., Soochow Univ., Suzhou, China
Abstract :
Constructing interaction networks from biomedical texts is an important and interesting task. The authors combine text mining and reinforcement learning approaches to establish a protein interaction network. Because co-occurrence-based interaction extraction approaches are computationally efficient and linguistic-pattern approaches are precise, the authors propose an interaction extraction algorithm that first uses frequently used linguistic patterns to extract interactions from texts and then finds further interactions in extended, unprocessed texts following the basic idea of the co-occurrence approach, while discounting the interactions extracted from the extended texts. They put forward a reinforcement learning-based algorithm to establish a protein interaction network, in which nodes represent proteins and edges denote interactions. During the evolutionary process, a node selects another node, and the attained reward determines which predicted interaction should be reinforced. The agent updates the topology of the network until an optimal network is formed. They used texts downloaded from PubMed to construct a prostate cancer protein interaction network with the proposed methods. The results show that the method achieved a good matching rate. Network topology analysis also demonstrates that the node degree distribution, node degree probability and probability distribution curves of the constructed network agree well with those of a scale-free network.
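The two-stage extraction idea described in the abstract (pattern matches at full weight, co-occurrence matches at a discounted weight) can be illustrated with a minimal sketch. This is not the authors' algorithm: the verb list `PATTERN_VERBS`, the discount value 0.5, the function names, and the simplification of treating every sentence without a pattern match as "extended" text are all assumptions made for illustration only.

```python
import re
from collections import defaultdict
from itertools import combinations

# Hypothetical interaction verbs; the abstract does not list the actual patterns.
PATTERN_VERBS = ("interacts with", "binds", "activates", "inhibits")
VERB_ALT = "|".join(re.escape(v) for v in PATTERN_VERBS)

# Assumed discount for pairs found only by co-occurrence.
CO_OCCURRENCE_DISCOUNT = 0.5


def matches_pattern(sentence, a, b):
    """Return True if the sentence reads '<a> <verb> <b>' or '<b> <verb> <a>'."""
    for x, y in ((a, b), (b, a)):
        pattern = rf"\b{re.escape(x)}\s+(?:{VERB_ALT})\s+{re.escape(y)}\b"
        if re.search(pattern, sentence):
            return True
    return False


def extract_interactions(sentences, proteins, discount=CO_OCCURRENCE_DISCOUNT):
    """Score protein pairs per sentence.

    A linguistic-pattern match adds 1.0 to the pair's score; a bare
    co-occurrence in the same sentence adds only `discount`.
    """
    scores = defaultdict(float)
    for sentence in sentences:
        # Proteins mentioned in this sentence, sorted so pair keys are stable.
        present = sorted(
            p for p in proteins if re.search(rf"\b{re.escape(p)}\b", sentence)
        )
        for a, b in combinations(present, 2):
            if matches_pattern(sentence, a, b):
                scores[(a, b)] += 1.0
            else:
                scores[(a, b)] += discount
    return dict(scores)


if __name__ == "__main__":
    example_sentences = [
        "AR binds PSA in prostate cells.",
        "PTEN and AKT1 were both measured.",
    ]
    example_proteins = {"AR", "PSA", "PTEN", "AKT1"}
    print(extract_interactions(example_sentences, example_proteins))
```

In this sketch the resulting scores could serve as initial edge weights for a network-construction step; how the paper's reinforcement learning agent actually uses the extracted interactions is not specified in the abstract.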
Keywords :
cancer; data mining; learning (artificial intelligence); medical computing; molecular biophysics; proteins; statistical distributions; text analysis; topology; cooccurrence-based interaction extraction approach; matching rate; network topology; node degree distribution; node degree probability; probability distribution; prostate cancer protein interaction network; reinforcement learning; reinforcement learning-based algorithm; scale-free network; text mining
Journal_Title :
Systems Biology, IET
DOI :
10.1049/iet-syb.2014.0050