• DocumentCode
    1936122
  • Title
    A Fast Method for Determining the Repeat Pattern Size in DNA Sequences
  • Author
    Zhou, Hong-Xia; Yan, Hong
  • Author_Institution
    City Univ. of Hong Kong, Kowloon
  • Volume
    6
  • fYear
    2007
  • fDate
    19-22 Aug. 2007
  • Firstpage
    3314
  • Lastpage
    3318
  • Abstract
    Tandem repeats occur frequently in the human genome. Their functions are still largely unclear, but some have been shown to cause human disease and to be associated with regulatory functions. Detecting tandem repeats is therefore of considerable significance. Because the length of the repeat pattern is not known in advance and a tandem repeat may contain indels and substitutions, identifying a tandem repeat in genomic sequence data is a difficult task. In this paper, an efficient algorithm based on the autoregressive (AR) model is proposed. We analyze the residual errors of AR models of different orders fitted to a DNA sequence. From the changes in the residual errors, we can determine whether a sequence contains a tandem repeat and what the pattern size is. Examples show that this algorithm can detect not only exact tandem repeats but also approximate ones.
  • Keywords
    DNA; autoregressive processes; genetics; DNA sequences; autoregressive model; genomic sequence data; human disease; human genome; regulatory functions; repeat pattern size determination; tandem repeat detection; Bioinformatics; DNA; Diseases; Frequency; Genomics; Humans; Machine learning; Pattern analysis; Sequences; Testing; Autoregressive model; Pattern size; Residual error; Tandem repeat;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Cybernetics, 2007 International Conference on
  • Conference_Location
    Hong Kong
  • Print_ISBN
    978-1-4244-0973-0
  • Electronic_ISBN
    978-1-4244-0973-0
  • Type
    conf
  • DOI
    10.1109/ICMLC.2007.4370720
  • Filename
    4370720
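
The idea summarized in the abstract (fit AR models of increasing order and watch how the residual error changes) can be illustrated with a minimal sketch. This is not the authors' implementation. The four-channel 0/1 encoding of the bases, the shared-coefficient least-squares fit, the "largest relative drop" decision rule, and every function name below are assumptions made for illustration only.

```python
import numpy as np

BASES = "ACGT"


def indicator_channels(seq):
    """Map a DNA string to four 0/1 sequences, one per base (assumed encoding)."""
    seq = seq.upper()
    return np.array([[1.0 if ch == b else 0.0 for ch in seq] for b in BASES])


def ar_residual_errors(seq, max_order):
    """Mean squared residual of least-squares AR fits for orders 0..max_order."""
    x = indicator_channels(seq)
    n = x.shape[1]
    if n <= 2 * max_order:
        raise ValueError("sequence too short for the requested max_order")
    # Use the same target samples for every order so the errors are comparable.
    y = x[:, max_order:].ravel()
    errors = [float(np.mean(y ** 2))]  # order 0: predict zero everywhere
    for p in range(1, max_order + 1):
        # Column k-1 holds the samples at lag k, stacked over the four channels.
        lagged = np.stack(
            [x[:, max_order - k : n - k].ravel() for k in range(1, p + 1)],
            axis=1,
        )
        coef, *_ = np.linalg.lstsq(lagged, y, rcond=None)
        errors.append(float(np.mean((y - lagged @ coef) ** 2)))
    return errors


def estimate_pattern_size(seq, max_order=10, eps=1e-12):
    """Return (order with the largest relative error drop, size of that drop).

    Assumed decision rule: a tandem repeat of size p makes the residual error
    fall sharply when the AR order reaches p.
    """
    errors = ar_residual_errors(seq, max_order)
    ratios = [
        (errors[p - 1] + eps) / (errors[p] + eps)
        for p in range(1, max_order + 1)
    ]
    best = int(np.argmax(ratios)) + 1
    return best, ratios[best - 1]


if __name__ == "__main__":
    exact = "ACGTT" * 20
    approx = list(exact)
    approx[52] = "A"  # one substitution (G -> A) inside the repeat
    print(estimate_pattern_size(exact))
    print(estimate_pattern_size("".join(approx)))
```

A large drop ratio suggests that a repeat is present, while a ratio near 1 at every order suggests that none is; any threshold between the two would have to be chosen empirically. Under this encoding some patterns satisfy a linear recurrence shorter than their period (a repeated AACC unit is one), so the sketch can underestimate the pattern size in such cases.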