Author/Authors :
Arquès, Didier G. and Michel, Christian J.
Abstract :
The mutation process is a classical evolutionary genetic process mainly based on the (random) substitution of one base (A = Adenine, C = Cytosine, G = Guanine, T = Thymine) for another. Two analytical solutions derived here allow us to analyse the occurrence probabilities of motifs (e.g. dinucleotides) in genes after substitutions (in the evolutionary sense: from the past to the present) and, unexpectedly, also before substitutions (after back substitutions, in the inverse evolutionary sense: from the present to the past). We generalize to the alphabet {A, C, G, T} the analytical solutions and the properties derived on the alphabet {R, Y} (R = purine = A or G, Y = pyrimidine = C or T). Application of the theory is based on the analytical solution giving the probabilities of the 16 dinucleotides AA, ..., TT in the protein (coding) genes of (nuclear) eukaryotes, viruses and prokaryotes and in (eukaryotic) introns after back substitutions (called primitive genes). After back substitutions, four of the 16 dinucleotides (CG, TA, GT and AC) occur with low probabilities in each of these four primitive gene populations, except for CG in the primitive prokaryotic protein genes. In the primitive eukaryotic protein genes, the dinucleotide AT also has a significantly low probability.
We present the properties of the two analytical solutions, and the functions that these five dinucleotides may have in primitive genes are described in terms of biological signals.
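
To make the notion of dinucleotide occurrence probabilities after substitutions concrete, the following is a minimal illustrative sketch only. It is not the analytical solutions of the paper. It assumes a simple hypothetical model in which each base is substituted independently with probability p per step, uniformly to one of the three other bases; the function names and the model itself are assumptions introduced for illustration.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"
DINUCLEOTIDES = ["".join(pair) for pair in product(BASES, repeat=2)]


def dinucleotide_probabilities(seq):
    """Empirical occurrence probabilities of the 16 dinucleotides,
    counted over all overlapping windows of length 2."""
    seq = seq.upper()
    counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    # Windows containing characters other than A, C, G, T are ignored.
    total = sum(counts[d] for d in DINUCLEOTIDES)
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T dinucleotide")
    return {d: counts[d] / total for d in DINUCLEOTIDES}


def substitute_once(probs, p):
    """One substitution step under the ASSUMED model (not the paper's):
    each of the two bases changes independently with probability p,
    uniformly to one of the three other bases."""
    def m(a, x):
        # Probability that base a is read as base x after one step.
        return 1 - p if a == x else p / 3

    return {
        x + y: sum(probs[a + b] * m(a, x) * m(b, y)
                   for a, b in product(BASES, repeat=2))
        for x, y in product(BASES, repeat=2)
    }
```

Under this assumed model, p = 0 leaves the probabilities unchanged and p = 3/4 makes all 16 dinucleotides equiprobable, which gives a quick sanity check on the implementation.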