• DocumentCode
    11315
  • Title

    iPFPi: A System for Improving Protein Function Prediction through Cumulative Iterations

  • Author

    Taha, Kamal ; Yoo, Paul D. ; Alzaabi, Mohammed

  • Author_Institution
    Dept. of Electr. & Comput. Eng., Khalifa Univ., Abu Dhabi, United Arab Emirates
  • Volume
    12
  • Issue
    4
  • fYear
    2015
  • fDate
    July-Aug. 1 2015
  • Firstpage
    825
  • Lastpage
    836
  • Abstract
    We propose a classifier system called iPFPi that predicts the functions of un-annotated proteins. iPFPi assigns an un-annotated protein P the functions of GO annotation terms that are semantically similar to P. An un-annotated protein P and a GO annotation term T are represented by their characteristics. The characteristics of P are GO terms found within the abstracts of biomedical literature associated with P. The characteristics of T are GO terms found within the abstracts of biomedical literature associated with the proteins annotated with the function of T. Let F and F′ be the important (dominant) sets of characteristic terms representing T and P, respectively. iPFPi annotates P with the function of T if F and F′ are semantically similar. We constructed a novel semantic similarity measure that takes several factors into consideration, such as the dominance degree of each characteristic term t in set F based on its score, a value that reflects the dominance status of t relative to the other characteristic terms and is computed using a pairwise beats-and-loses procedure. Every time a protein P is annotated with the function of T, iPFPi updates and optimizes the current scores of the characteristic terms for T based on the weights of the characteristic terms for P, and set F is updated accordingly. Thus, the accuracy of predicting the function of T as the function of subsequent proteins improves. This prediction accuracy keeps improving iteratively over time through the cumulative weights of the characteristic terms representing proteins that are successively annotated with the function of T. We evaluated the quality of iPFPi by comparing it experimentally with two recent protein function prediction systems. Results showed marked improvement.
  • Keywords
    bioinformatics; iterative methods; molecular biophysics; pattern classification; proteins; GO annotation; biomedical literature; classifier system; cumulative iterations; iPFPi; protein function prediction systems; semantic similarity; unannotated proteins; Abstracts; Equations; Mathematical model; Protein engineering; Proteins; Semantics; Vectors; Protein function prediction; biomedical literature; protein annotation; semantic similarity;
  • fLanguage
    English
  • Journal_Title
    Computational Biology and Bioinformatics, IEEE/ACM Transactions on
  • Publisher
    IEEE
  • ISSN
    1545-5963
  • Type

    jour

  • DOI
    10.1109/TCBB.2014.2344681
  • Filename
    6871330
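
The abstract describes three steps: scoring characteristic terms by pairwise "beats and loses" comparisons, comparing the dominant sets F and F′, and folding the weights of a newly annotated protein into the weights for T. The sketch below is only a toy illustration of that flow, not the paper's method. The abstract gives no formulas, so the following are all assumptions: the score is taken as (number of terms beaten) minus (number of terms lost to), the dominant set keeps terms with a positive score, plain Jaccard overlap stands in for the paper's semantic similarity measure, the threshold of 0.5 is arbitrary, and the term labels such as "GO:A" are hypothetical placeholders rather than real GO identifiers.

```python
from collections import Counter
from itertools import combinations


def pairwise_scores(weights):
    """Assumed scoring: (#terms a term beats) - (#terms it loses to), by weight."""
    scores = {term: 0 for term in weights}
    for a, b in combinations(weights, 2):
        if weights[a] > weights[b]:
            scores[a] += 1
            scores[b] -= 1
        elif weights[b] > weights[a]:
            scores[b] += 1
            scores[a] -= 1
        # Equal weights: neither term beats the other, scores unchanged.
    return scores


def dominant_set(weights):
    """Assumed rule: a term is dominant if its pairwise score is positive."""
    return {term for term, score in pairwise_scores(weights).items() if score > 0}


def jaccard(f, f_prime):
    """Jaccard overlap, used here as a stand-in for the semantic similarity measure."""
    union = f | f_prime
    return len(f & f_prime) / len(union) if union else 0.0


def annotate_and_update(term_weights, protein_weights, threshold=0.5):
    """Return True if P would be annotated with T; if so, accumulate P's weights into T.

    term_weights must be a Counter so that update() adds weights instead of
    overwriting them.
    """
    f = dominant_set(term_weights)
    f_prime = dominant_set(protein_weights)
    similar = jaccard(f, f_prime) >= threshold
    if similar:
        term_weights.update(protein_weights)  # cumulative weights for T
    return similar


if __name__ == "__main__":
    t_weights = Counter({"GO:A": 5, "GO:B": 3, "GO:C": 1})  # hypothetical terms for T
    p_weights = Counter({"GO:A": 4, "GO:D": 2, "GO:C": 1})  # hypothetical terms for P
    print(annotate_and_update(t_weights, p_weights))
    print(dominant_set(t_weights))
```

Because the weights for T accumulate across successive annotations, the dominant set for T is recomputed from more evidence each time, which is the cumulative-iteration idea the abstract refers to.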