Title of article :
Correlations in DNA sequences across the three domains of life
Author/Authors :
S.K. Guharay, Sabyasachi and Hunt, Brian R. and Yorke, James A. and White, Owen R.
Issue Information :
Journal with serial number, year 2000
Abstract :
We report statistical studies of the correlation properties of ∼7500 gene sequences, covering coding (exon) and non-coding (intron) sequences for DNA and primary amino acid sequences for proteins, across all three domains of life, namely Eukaryotes (cells with nuclei), Prokaryotes (bacteria) and Archaea (archaebacteria). Mutual information function, power spectrum and Hölder exponent analyses show that exons have somewhat greater correlation content than the introns studied. These results are further confirmed with hypothesis testing. While ∼30% of the Eukaryote coding sequences show distinct correlations above the noise threshold, this is true for only ∼10% of the Prokaryote and Archaea coding sequences. For protein sequences, we observe correlation lengths similar to those of “random” sequences.
Keywords :
DNA correlations , Long-range correlations , Protein sequences , Introns versus exons , Mathematical biology , statistical genetics , Three domains of life
Journal title :
Physica D Nonlinear Phenomena