Title of article :
Two-Sample Tests for Comparing Intra-Individual Genetic Sequence Diversity between Populations
Author/Authors :
Gilbert, Peter B. (author), Rossini, A. J. (author), Shankarappa, Raj (author)
Issue Information :
Journal with serial number, year 2005
Abstract :
Consider a study of two groups of individuals, each infected with a genetically related, heterogeneous mixture of viruses, in which multiple viral sequences are sampled from each person. Based on estimates of genetic distances between pairs of aligned viral sequences within individuals, we develop four new tests to compare intra-individual genetic sequence diversity between the two groups. This problem is complicated by two levels of dependency in the data structure: (i) within an individual, any pairwise distances that share a common sequence are positively correlated; and (ii) for any two pairings of individuals that share a person, the two differences in intra-individual distances between the paired individuals are positively correlated. The first proposed test is based on the difference in mean intra-individual pairwise distances pooled over all individuals in each group, standardized by a variance estimate that corrects for the correlation structure using U-statistic theory. The second procedure is a nonparametric rank-based analog of the first test, and the third test contrasts the set of subject-specific average intra-individual pairwise distances between the groups. These tests are very easy to use and solve correlation problem (i). The fourth procedure is based on a linear combination of all possible U-statistics calculated on independent, identically distributed sequence subdatasets, over the two levels (i) and (ii) of dependencies in the data; it is more complicated than the other tests but can be more powerful. Although the proposed methods are empirical and do not fully utilize knowledge from population genetics, the tests reflect biology through the evolutionary models used to derive the pairwise sequence distances. The new tests are evaluated theoretically and in a simulation study, and are applied to a dataset of 200 HIV sequences sampled from 21 children.
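The idea behind the third test, contrasting subject-specific average intra-individual pairwise distances between the groups, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not the authors' procedure: the function names are hypothetical, the distance is a plain proportion of differing sites (the abstract refers to distances derived from evolutionary models), the Welch-type t statistic is one possible way to contrast the per-subject averages, and the sequences are toy data.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, variance


def p_distance(s1, s2):
    """Proportion of aligned sites at which two sequences differ.

    Assumption: a simple stand-in for a model-based evolutionary distance.
    """
    if len(s1) != len(s2):
        raise ValueError("sequences must be aligned to equal length")
    return sum(a != b for a, b in zip(s1, s2)) / len(s1)


def subject_mean_distance(seqs):
    """Average distance over all pairs of sequences from one person."""
    return mean(p_distance(a, b) for a, b in combinations(seqs, 2))


def welch_t(x, y):
    """Welch-type two-sample t statistic on per-subject summaries.

    Assumption: chosen here for illustration; the abstract does not
    name the statistic used to contrast the subject-level averages.
    """
    se = sqrt(variance(x) / len(x) + variance(y) / len(y))
    return (mean(x) - mean(y)) / se


# Toy data (hypothetical): each inner list holds one person's aligned sequences.
group1 = [["ACGT", "ACGA", "ACGT"], ["ACGT", "ACGT", "ACGT"]]
group2 = [["ACGT", "TGCA", "ACGA"], ["AAAA", "ACGT", "ACGA"]]

# Reducing each person to a single number removes dependency (i),
# because correlated within-person distances never enter the test directly.
x = [subject_mean_distance(s) for s in group1]
y = [subject_mean_distance(s) for s in group2]
t_stat = welch_t(x, y)
```

Because each person contributes one summary value, the per-subject averages can be treated as independent observations, which is why a standard two-sample contrast is usable at this step.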
Keywords :
Wilcoxon test , U-statistic , HIV genetic diversity , Median test , CTL epitope , Nonparametric statistics , hypothesis testing , Correlated data , two-sample test
Journal title :
BIOMETRICS (BIOMETRIC SOCIETY)