DocumentCode :
2151968
Title :
Iterative PCA for population structure analysis
Author :
Limpiti, T. ; Intarapanich, A. ; Assawamakin, A. ; Wangkumhang, P. ; Tongsima, S.
Author_Institution :
Fac. of Eng., King Mongkut's Inst. of Technol. Ladkrabang, Bangkok, Thailand
fYear :
2011
fDate :
22-27 May 2011
Firstpage :
597
Lastpage :
600
Abstract :
An extension of principal component analysis called ipPCA has been proposed earlier for analyzing structure in genetic data. This non-parametric framework iteratively classifies individuals into subpopulations. However, it is prone to false positives when dealing with large datasets and mixed-type genetic markers. We address these shortcomings by introducing a unified encoding scheme and suggesting a new terminating criterion for ipPCA. To validate the improvements, simulated datasets as well as real bovine and large human genetic datasets are analyzed. It is observed that the estimation of the number of subpopulations and the individual assignment accuracy have been improved. Furthermore, the structure resolved by this approach can be used to identify subsets of individuals for further parametric population structure analysis.
Keywords :
DNA; demography; encoding; genetics; principal component analysis; signal processing; PCA; genetic marker; iterative PCA; population structure analysis; principal component analysis; unified encoding scheme; Bioinformatics; Clustering algorithms; Eigenvalues and eigenfunctions; Encoding; Genetics; Principal component analysis; Shape; PCA; SNP; Tracy-Widom; clustering; population structure;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
Conference_Location :
Prague
ISSN :
1520-6149
Print_ISBN :
978-1-4577-0538-0
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2011.5946474
Filename :
5946474