Title :
Extracting and mining protein-protein interaction network from biomedical literature
Author :
Hu, Xiaohua ; Yoo, Illhoi ; Song, Il-Yeol ; Song, Min ; Han, Jianchao ; Lechner, Mark
Author_Institution :
Coll. of Inf. Sci. & Technol., Drexel Univ., Philadelphia, PA, USA
Abstract :
We present SPIE-DM (Scalable and Portable Information Extraction and Data Mining), a biomedical literature data mining system that extracts and mines protein-protein interaction networks from biomedical literature such as MEDLINE. SPIE-DM works in two phases. In phase 1, we develop a scalable and portable information extraction method (SPIE) to extract protein-protein interactions from the literature. The extracted interactions form a scale-free network graph. In phase 2, we apply a novel clustering method, SFCluster, to mine this network. The clusters in the network graph represent potential protein complexes, which are important to biologists studying protein function. The clustering algorithm takes the characteristics of scale-free network graphs into account and is based on the local density of each vertex and on its neighborhood functions, which allows it to find more meaningful clusters at different density levels. Experiments with SPIE-DM on around 1600 chromatin proteins indicate that the system is promising for extracting and mining interaction networks from biomedical literature databases.
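The idea of grouping vertices by local density at different density levels can be illustrated with a minimal sketch. This is not the SFCluster algorithm, whose details the abstract does not give. It assumes that "local density" can be approximated by the local clustering coefficient (the fraction of a vertex's neighbor pairs that are themselves linked), and the function names `build_graph`, `local_density`, and `dense_groups`, as well as the protein labels, are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_graph(pairs):
    """Build an undirected adjacency map from extracted interaction pairs."""
    adj = defaultdict(set)
    for a, b in pairs:
        if a != b:  # ignore self-interactions in this sketch
            adj[a].add(b)
            adj[b].add(a)
    return dict(adj)


def local_density(adj, v):
    """Assumed density measure: fraction of neighbor pairs of v that are linked."""
    nbrs = adj[v]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))


def dense_groups(adj, threshold):
    """Connected groups among vertices whose local density meets the threshold."""
    keep = {v for v in adj if local_density(adj, v) >= threshold}
    seen, groups = set(), []
    for start in sorted(keep):
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], set()
        while stack:  # depth-first walk restricted to kept vertices
            v = stack.pop()
            comp.add(v)
            for w in adj[v]:
                if w in keep and w not in seen:
                    seen.add(w)
                    stack.append(w)
        groups.append(comp)
    return groups


# Hypothetical interaction pairs; the labels are placeholders, not real proteins.
pairs = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"), ("D", "E")]
graph = build_graph(pairs)
strict = dense_groups(graph, 0.5)  # only the tightly linked core
loose = dense_groups(graph, 0.3)   # a lower density level admits more vertices
```

Lowering the threshold widens the groups, which mirrors the abstract's point about finding clusters at different density levels. A real system would need a density measure suited to scale-free graphs, where a few hub vertices have very high degree.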
Keywords :
biology computing; data mining; molecular biophysics; proteins; statistical analysis; SPIE-DM; biomedical literature; biomedical literature database; chromatin protein; clustering algorithm; data mining system; neighborhood function; novel clustering method; potential protein complex; protein-protein interaction network; scale-free network graph; Abstracts; Bioinformatics; Clustering algorithms; Clustering methods; Data mining; Databases; Genomics; Natural languages; Protein engineering; Text mining;
Conference_Title :
Computational Intelligence in Bioinformatics and Computational Biology, 2004. CIBCB '04. Proceedings of the 2004 IEEE Symposium on
Print_ISBN :
0-7803-8728-7
DOI :
10.1109/CIBCB.2004.1393960