• DocumentCode
    3237643
  • Title
    GMM-Based Classification of Genomic Sequences
  • Author
    Akhtar, Mahmood ; Ambikairajah, Eliathamby ; Epps, Julien
  • Author_Institution
    Univ. of New South Wales, Sydney
  • fYear
    2007
  • fDate
    1-4 July 2007
  • Firstpage
    103
  • Lastpage
    106
  • Abstract
    At present, many digital-signal-processing-based techniques are available to predict genomic protein coding regions. However, accurate identification of these regions at the level of individual nucleotides remains a challenge. In this paper, we propose the novel use of a multi-dimensional feature and Gaussian mixture models for the classification between protein coding and non-coding nucleotides. Employing signal-processing-based time-domain and frequency-domain features, the novel system described herein is shown to produce identification accuracies of more than 75% and 79%, respectively, for protein coding and non-coding nucleotides when evaluated on the GENSCAN data set.
  • Keywords
    Gaussian processes; cellular biophysics; feature extraction; genetics; medical computing; molecular biophysics; proteins; signal processing; time-frequency analysis; GENSCAN data set; GMM; Gaussian mixture models; classification; digital signal processing; frequency-domain features; genomic protein coding regions; genomic sequences; multidimensional feature; noncoding nucleotides; time-domain features; Accuracy; Bioinformatics; DNA; Digital signal processing; Discrete Fourier transforms; Genomics; Multidimensional signal processing; Proteins; Sequences; Signal processing algorithms; Gaussian mixture models; Genomic signal processing; digital filters; discrete Fourier transforms; discrete cosine transforms
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Digital Signal Processing, 2007 15th International Conference on
  • Conference_Location
    Cardiff
  • Print_ISBN
    1-4244-0882-2
  • Electronic_ISBN
    1-4244-0882-2
  • Type
    conf
  • DOI
    10.1109/ICDSP.2007.4288529
  • Filename
    4288529