Regional Information Center for Science and Technology - Estimation of genotype imputation accuracy using reference populations with varying degrees of relationship and marker density

Author/Authors :

Vafaye Valleh, M (Department of Animal Science - Faculty of Agriculture - University of Zabol - Zabol, Iran); Barjasteh, Sh (Department of Animal Science - Faculty of Agriculture - University of Zabol - Zabol, Iran); Dashab, G.R (Department of Animal Science - Faculty of Agriculture - University of Zabol - Zabol, Iran); Rokouei, M (Department of Animal Science - Faculty of Agriculture - University of Zabol - Zabol, Iran); Shariati, M.M (Department of Animal Science - College of Agriculture - Ferdowsi University of Mashhad - Mashhad, Iran)

Abstract :

Genotype imputation from low-density to high-density single nucleotide polymorphism (SNP) chips is an important step before applying genomic selection, because denser chips can provide more reliable genomic predictions. In this study, the accuracy of genotype imputation from low- and moderate-density panels (5K and 50K) to a high-density panel (777K) was assessed in purebred and crossbred populations. The simulated populations included two purebred populations (lines A and B) and two crossbred populations (cross and backcross). Three scenarios were assessed for selecting the subset of reference animals used to impute the ungenotyped loci of animals in the validation set: 1) animals with a high relationship to the validation set, 2) randomly selected animals, and 3) animals with high inbreeding. Validation animals genotyped at 5K and 50K were imputed to a marker density of 777K with FImpute software, using the various combinations of reference sets. Imputation accuracy was calculated with two measures: the Pearson correlation coefficient (PCC) and the concordance rate (CR). The results showed that imputation accuracy was higher in the purebred populations (lines A and B) than in the cross and backcross populations. When the reference set was selected on the basis of high relationship, imputation accuracy in lines A and B was highest, and the difference between imputation from 5K and from 50K to 777K was smaller than with the other subset selection methods. In the crossbred population, for imputation from 50K to 777K, accuracy was highest when the reference population was selected randomly (0.98 and 0.97 for PCC and CR, respectively). In the backcross population, imputation accuracy was lowest when the reference set was selected on the basis of high inbreeding, which could result from the lower homozygosity in these populations.
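
The two accuracy measures named in the abstract, PCC and CR, can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes genotypes are coded as allele dosages (0, 1, 2) and that both measures are computed over a single flat list of imputed loci; the abstract does not say whether the study computed them per animal or per SNP. The function names and the toy genotype vectors are hypothetical.

```python
import math


def concordance_rate(true_genotypes, imputed_genotypes):
    """Proportion of loci where the imputed genotype equals the true genotype."""
    if len(true_genotypes) != len(imputed_genotypes) or not true_genotypes:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(t == i for t, i in zip(true_genotypes, imputed_genotypes))
    return matches / len(true_genotypes)


def pearson_correlation(true_genotypes, imputed_genotypes):
    """Pearson correlation between true and imputed allele dosages."""
    if len(true_genotypes) != len(imputed_genotypes) or not true_genotypes:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(true_genotypes)
    mean_t = sum(true_genotypes) / n
    mean_i = sum(imputed_genotypes) / n
    cov = sum((t - mean_t) * (i - mean_i)
              for t, i in zip(true_genotypes, imputed_genotypes))
    ss_t = sum((t - mean_t) ** 2 for t in true_genotypes)
    ss_i = sum((i - mean_i) ** 2 for i in imputed_genotypes)
    if ss_t == 0 or ss_i == 0:
        # Correlation is undefined when either vector has no variation.
        raise ValueError("correlation undefined for constant genotypes")
    return cov / math.sqrt(ss_t * ss_i)


# Hypothetical toy data: 10 masked loci, coded 0/1/2, with one imputation error.
true_g = [0, 1, 2, 1, 0, 2, 1, 1, 0, 2]
imputed_g = [0, 1, 2, 1, 0, 2, 1, 2, 0, 2]

cr = concordance_rate(true_g, imputed_g)
pcc = pearson_correlation(true_g, imputed_g)
print(f"CR = {cr:.2f}, PCC = {pcc:.3f}")
```

CR counts only exact genotype matches, whereas PCC also reflects how far a wrong call is from the true dosage, so the two measures can rank scenarios slightly differently.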