DocumentCode :
1936122
Title :
A Fast Method for Determining the Repeat Pattern Size in DNA Sequences
Author :
Zhou, Hong-Xia ; Yan, Hong
Author_Institution :
City Univ. of Hong Kong, Kowloon
Volume :
6
fYear :
2007
fDate :
19-22 Aug. 2007
Firstpage :
3314
Lastpage :
3318
Abstract :
Tandem repeats occur frequently in the human genome. Their functions are still largely unclear, but some have been shown to cause human disease and to be related to regulatory functions. Detecting tandem repeats is therefore of considerable significance. Because the length of the repeat pattern is not known in advance and a tandem repeat may contain indels and substitutions, identifying a tandem repeat in genomic sequence data is a difficult task. In this paper, an efficient algorithm based on the autoregressive (AR) model is proposed. We analyze the residual errors of AR models of different orders for a DNA sequence. From the changes in the residual errors, we can determine whether a sequence contains a tandem repeat and what the pattern size is. Examples show that this algorithm can detect not only exact tandem repeats but also approximate ones.
Keywords :
DNA; autoregressive processes; genetics; DNA sequences; autoregressive model; genomic sequence data; human disease; human genome; regulatory functions; repeat pattern size determination; tandem repeat detection; Bioinformatics; DNA; Diseases; Frequency; Genomics; Humans; Machine learning; Pattern analysis; Sequences; Testing; Autoregressive model; Pattern size; Residual error; Tandem repeat;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Machine Learning and Cybernetics, 2007 International Conference on
Conference_Location :
Hong Kong
Print_ISBN :
978-1-4244-0973-0
Electronic_ISBN :
978-1-4244-0973-0
Type :
conf
DOI :
10.1109/ICMLC.2007.4370720
Filename :
4370720