DocumentCode
2151968
Title
Iterative PCA for population structure analysis
Author
Limpiti, T. ; Intarapanich, A. ; Assawamakin, A. ; Wangkumhang, P. ; Tongsima, S.
Author_Institution
Fac. of Eng., King Mongkut's Inst. of Technol. Ladkrabang, Bangkok, Thailand
fYear
2011
fDate
22-27 May 2011
Firstpage
597
Lastpage
600
Abstract
An extension of principal component analysis called ipPCA was proposed earlier for analyzing structure in genetic data. This non-parametric framework iteratively classifies individuals into subpopulations. However, it is prone to false positives when dealing with large datasets and mixed-type genetic markers. We address these shortcomings by introducing a unified encoding scheme and suggesting a new terminating criterion for ipPCA. To validate the improvements, simulated datasets as well as real bovine and large human genetic datasets are analyzed. Both the estimate of the number of subpopulations and the individual assignment accuracy are observed to improve. Furthermore, the structure resolved by this approach can be used to identify a subset of individuals for further parametric population structure analysis.
Keywords
DNA; demography; encoding; genetics; principal component analysis; signal processing; PCA; genetic marker; iterative PCA; population structure analysis; unified encoding scheme; Bioinformatics; Clustering algorithms; Eigenvalues and eigenfunctions; Shape; SNP; Tracy-Widom; clustering; population structure
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
Conference_Location
Prague
ISSN
1520-6149
Print_ISBN
978-1-4577-0538-0
Electronic_ISBN
1520-6149
Type
conf
DOI
10.1109/ICASSP.2011.5946474
Filename
5946474