• DocumentCode
    2866296
  • Title
    Gradient-Based Feature Selection for Conditional Random Fields and its Applications in Computational Genetics
  • Author
    Chen, Minmin ; Chen, Yixin ; Brent, Michael R. ; Tenney, Aaron E.
  • Author_Institution
    Dept. of Comput. Sci. & Eng., Washington Univ. in St. Louis, St. Louis, MO, USA
  • fYear
    2009
  • fDate
    2-4 Nov. 2009
  • Firstpage
    750
  • Lastpage
    757
  • Abstract
    Gene prediction is one of the first and most important steps in understanding the genome of a species, and different approaches have been proposed. In 2007, a de novo gene predictor called CONTRAST, based on Conditional Random Fields (CRFs), was introduced and shown to substantially outperform previous predictors. However, the oversized feature set used in the model poses several issues, such as overfitting and excessive computational demand. To resolve these issues, we conducted a thorough survey of two existing feature selection methods for CRFs, namely the gain-based and gradient-based methods, and applied the latter to CONTRAST. The results show that with the gradient-based feature selection scheme, we are able to achieve comparable or even better prediction accuracy on test data, using only a very small fraction of the features from the candidate pool. The feature selection method also helps researchers better understand the underlying structure of genomic sequences and provides further insight into the function and evolutionary dynamics of genomes.
  • Keywords
    biology computing; genomics; gradient methods; learning (artificial intelligence); probability; Contrast gene predictor; computational genetics application; conditional random fields; feature selection method; gain based method; genomic structure; gradient based method; Artificial intelligence; Bioinformatics; Computational efficiency; Computer applications; Data mining; Filtering; Genetics; Genomics; Hidden Markov models; Testing;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Tools with Artificial Intelligence, 2009. ICTAI '09. 21st International Conference on
  • Conference_Location
    Newark, NJ
  • ISSN
    1082-3409
  • Print_ISBN
    978-1-4244-5619-2
  • Electronic_ISBN
    1082-3409
  • Type
    conf
  • DOI
    10.1109/ICTAI.2009.82
  • Filename
    5366367