• DocumentCode
    1266557
  • Title
    Molecular Function Prediction Using Neighborhood Features
  • Author
    Bogdanov, Petko; Singh, Ambuj K.
  • Author_Institution
    Dept. of Comput. Sci., Univ. of California, Santa Barbara, CA, USA
  • Volume
    7
  • Issue
    2
  • fYear
    2010
  • Firstpage
    208
  • Lastpage
    217
  • Abstract
    The recent advent of high-throughput methods has generated large amounts of gene interaction data. This has allowed the construction of genomewide networks. A significant number of genes in such networks remain uncharacterized and predicting the molecular function of these genes remains a major challenge. A number of existing techniques assume that genes with similar functions are topologically close in the network. Our hypothesis is that genes with similar functions observe similar annotation patterns in their neighborhood, regardless of the distance between them in the interaction network. We thus predict molecular functions of uncharacterized genes by comparing their functional neighborhoods to genes of known function. We propose a two-phase approach. First, we extract functional neighborhood features of a gene using Random Walks with Restarts. We then employ a KNN classifier to predict the function of uncharacterized genes based on the computed neighborhood features. We perform leave-one-out validation experiments on two S. cerevisiae interaction networks and show significant improvements over previous techniques. Our technique provides a natural control of the trade-off between accuracy and coverage of prediction. We further propose and evaluate prediction in sparse genomes by exploiting features from well-annotated genomes.
  • Keywords
    bioinformatics; genetics; molecular biophysics; pattern classification; random processes; KNN classifier; Saccharomyces cerevisiae interaction networks; annotation patterns; functional neighborhood feature extraction; gene interaction data; gene interaction network; gene molecular function; genomewide networks; high throughput methods; leave one out validation experiments; molecular function prediction; neighborhood features; random walks with restarts; Gene function prediction; classification; feature extraction; functional interaction network; Animals; Databases, Genetic; Gene Regulatory Networks; Genes; Genomics; Models, Statistical; Pattern Recognition, Automated; ROC Curve; Saccharomyces cerevisiae
  • fLanguage
    English
  • Journal_Title
    Computational Biology and Bioinformatics, IEEE/ACM Transactions on
  • Publisher
    IEEE
  • ISSN
    1545-5963
  • Type
    jour
  • DOI
    10.1109/TCBB.2009.81
  • Filename
    5313794
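
The two-phase approach summarized in the abstract (neighborhood features from Random Walks with Restarts, then a KNN classifier) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the restart probability, the way walk probabilities are aggregated into features, the distance measure, or the value of k, so all of those are assumptions here, as are the function names `rwr`, `neighborhood_features`, and `knn_predict` and the toy network itself.

```python
import numpy as np

def rwr(adj, restart=0.3, tol=1e-10, max_iter=1000):
    """Random Walk with Restarts from every node.

    Returns a matrix whose row i is the stationary visiting distribution
    of a walk that restarts at node i. The restart value is an assumption.
    """
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    col_sums = adj.sum(axis=0)
    col_sums[col_sums == 0] = 1.0          # avoid dividing by zero for isolated nodes
    W = adj / col_sums                     # column-normalized transition matrix
    E = np.eye(n)                          # column i is the restart vector of node i
    P = E.copy()
    for _ in range(max_iter):
        P_next = (1 - restart) * (W @ P) + restart * E
        converged = np.abs(P_next - P).max() < tol
        P = P_next
        if converged:
            break
    return P.T

def neighborhood_features(P, annotations):
    """Assumed aggregation: sum walk probability over nodes carrying each function.

    A gene's own probability mass is dropped so that its own label
    does not leak into its feature vector.
    """
    P = P.copy()
    np.fill_diagonal(P, 0.0)
    return P @ annotations

def knn_predict(features, annotations, query, known, k=3):
    """Score each function by the fraction of the k nearest known genes that have it."""
    candidates = np.array([g for g in known if g != query])
    dists = np.linalg.norm(features[candidates] - features[query], axis=1)
    nearest = candidates[np.argsort(dists)[:k]]
    return annotations[nearest].mean(axis=0)

# Toy network (hypothetical): three disconnected stars with hubs 0, 3, and 6.
edges = [(0, 1), (0, 2), (3, 4), (3, 5), (6, 7), (6, 8)]
adj = np.zeros((9, 9))
for a, b in edges:
    adj[a, b] = adj[b, a] = 1.0

# Function columns (hypothetical labels): X, Y, F, G.
annotations = np.zeros((9, 4))
annotations[[1, 2, 4, 5], 0] = 1           # leaves around hubs 0 and 3 have X
annotations[[7, 8], 1] = 1                 # leaves around hub 6 have Y
annotations[[0, 3], 2] = 1                 # hubs 0 and 3 have F
annotations[6, 3] = 1                      # hub 6 has G

visible = annotations.copy()
visible[0] = 0                             # treat gene 0 as uncharacterized

P = rwr(adj)
features = neighborhood_features(P, visible)
scores = knn_predict(features, visible, query=0, known=range(1, 9), k=1)
print(scores)
```

In this toy setup, gene 0 and gene 3 lie in different connected components, yet both are surrounded by genes annotated with X, so their feature vectors match and gene 3 becomes the nearest neighbor of gene 0. That mirrors the hypothesis stated in the abstract: similarity of annotation patterns in the neighborhood, not proximity in the network, drives the prediction.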