• DocumentCode
    3517636
  • Title

    DNA coding using finite-context models and arithmetic coding

  • Author

    Pinho, Armando J. ; Neves, António J R ; Bastos, Carlos A C ; Ferreira, Paulo J S G

  • Author_Institution
    Signal Process. Lab., Univ. of Aveiro, Aveiro
  • fYear
    2009
  • fDate
    19-24 April 2009
  • Firstpage
    1693
  • Lastpage
    1696
  • Abstract
    The interest in DNA coding has been growing with the availability of extensive genomic databases. Although only two bits are sufficient to encode the four DNA bases, efficient lossless compression methods are still needed due to the size of DNA sequences and because standard compression algorithms do not perform well on DNA sequences. As a result, several specific coding methods have been proposed. Most of these methods are based on searching procedures for finding exact or approximate repeats. Low-order finite-context models have only been used as secondary, fallback mechanisms. In this paper, we show that finite-context models can also be used as main DNA encoding methods. We propose a coding method based on two finite-context models that compete for the encoding of data, on a block-by-block basis. The experimental results confirm the effectiveness of the proposed method.
  • Keywords
    DNA; arithmetic codes; biology computing; data compression; scientific information systems; DNA coding; DNA sequence; arithmetic coding; fall back mechanism; finite-context model; genomic database; lossless compression method; Arithmetic; Availability; Bioinformatics; Biomedical signal processing; Encoding; Genomics; Humans; Proteins; Sequences; finite-context modeling; source coding
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on
  • Conference_Location
    Taipei
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4244-2353-8
  • Electronic_ISBN
    1520-6149
  • Type

    conf

  • DOI
    10.1109/ICASSP.2009.4959928
  • Filename
    4959928
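
The abstract describes two finite-context models that compete, block by block, for the encoding of a DNA sequence. The sketch below illustrates that general idea only; it is not the authors' implementation. The model orders (2 and 8), the block size (100), the additive smoothing constant, the one-bit-per-block side information, and the names `FiniteContextModel` and `estimate_bits` are all assumptions introduced for illustration. Instead of running an arithmetic coder, it sums the ideal code length, -log2(p), of each base, which is the length an arithmetic coder would approach.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"


class FiniteContextModel:
    """Adaptive order-k model: base counts conditioned on the previous k bases."""

    def __init__(self, order, alpha=1.0):
        self.order = order
        self.alpha = alpha  # additive smoothing constant (assumed, not from the paper)
        self.counts = defaultdict(lambda: [0, 0, 0, 0])

    def cost_and_update(self, seq, start, end):
        """Return the ideal code length in bits of seq[start:end], updating counts."""
        bits = 0.0
        for i in range(start, end):
            # Near the start of the sequence the context is simply shorter.
            context = seq[max(0, i - self.order):i]
            row = self.counts[context]
            sym = ALPHABET.index(seq[i])
            p = (row[sym] + self.alpha) / (sum(row) + 4 * self.alpha)
            bits -= math.log2(p)
            row[sym] += 1
        return bits


def estimate_bits(seq, low_order=2, high_order=8, block_size=100):
    """Estimate the coded size when two models compete for each block.

    Both models are updated on every block, so a decoder could keep them
    in sync; only the cheaper model's cost is charged, plus one bit per
    block to signal which model was used (an assumed signalling scheme).
    """
    models = (FiniteContextModel(low_order), FiniteContextModel(high_order))
    total = 0.0
    choices = []
    for start in range(0, len(seq), block_size):
        end = min(start + block_size, len(seq))
        costs = [m.cost_and_update(seq, start, end) for m in models]
        best = 0 if costs[0] <= costs[1] else 1
        choices.append(best)
        total += costs[best] + 1  # winning model's bits plus one side-information bit
    return total, choices
```

Calling `estimate_bits("ACGT" * 250)` returns the estimated total number of bits and the list of per-block model choices. For such a highly repetitive input the estimate falls well below the baseline of two bits per base.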