• DocumentCode
    3807271
  • Title

    Predicting Novel Human Gene Ontology Annotations Using Semantic Analysis

  • Author

    Bogdan Done;Purvesh Khatri;Arina Done;Sorin Draghici

  • Author_Institution
    Wayne State University, Detroit
  • Volume
    7
  • Issue
    1
  • fYear
    2010
  • Firstpage
    91
  • Lastpage
    99
  • Abstract
    The correct interpretation of many molecular biology experiments depends in an essential way on the accuracy and consistency of the existing annotation databases. Such databases are meant to act as repositories for our biological knowledge as we acquire and refine it. Hence, by definition, they are incomplete at any given time. In this paper, we describe a technique that improves our previous method for predicting novel Gene Ontology (GO) annotations by extracting implicit semantic relationships between genes and functions. In this work, we use a vector space model and a number of weighting schemes in addition to our previous latent semantic indexing approach. The technique described here takes into consideration the hierarchical structure of GO and can assign different weights to GO terms situated at different depths. The prediction abilities of 15 different weighting schemes are compared and evaluated. Nine of these schemes were previously used in other problem domains, while six are introduced in this paper. The best weighting scheme was a novel scheme, n2tn. Out of the top 50 functional annotations predicted using this weighting scheme, we found support in the literature for 84 percent of them, while 6 percent of the predictions were contradicted by the existing literature. For the remaining 10 percent, we did not find any relevant publications to confirm or contradict the predictions. The n2tn weighting scheme also outperformed the simple binary scheme used in our previous approach.
  • Keywords
    "Humans","Ontologies","Databases","Indexing","Organisms","Extrapolation","Biological system modeling","Predictive models","Biology","Biological processes"
  • Journal_Title
    IEEE/ACM Transactions on Computational Biology and Bioinformatics
  • Publisher
    IEEE
  • ISSN
    1545-5963
  • Type

    jour

  • DOI
    10.1109/TCBB.2008.29
  • Filename
    4459308