• DocumentCode
    2151968
  • Title
    Iterative PCA for population structure analysis
  • Author
    Limpiti, T. ; Intarapanich, A. ; Assawamakin, A. ; Wangkumhang, P. ; Tongsima, S.

  • Author_Institution
    Fac. of Eng., King Mongkut's Inst. of Technol. Ladkrabang, Bangkok, Thailand
  • fYear
    2011
  • fDate
    22-27 May 2011
  • Firstpage
    597
  • Lastpage
    600
  • Abstract
    An extension of principal component analysis called ipPCA has been proposed earlier for analyzing structure in genetic data. This non-parametric framework iteratively classifies individuals into subpopulations. However, it is prone to false positives when dealing with large datasets and mixed-type genetic markers. We address these shortcomings by introducing a unified encoding scheme and suggesting a new terminating criterion for ipPCA. To validate the improvements, simulated datasets as well as real bovine and large human genetic datasets are analyzed. It is observed that the estimation of the number of subpopulations and the individual assignment accuracy have been improved. Furthermore, the structure resolved by this approach can be used to identify subsets of individuals for further parametric population structure analysis.
  • Keywords
    DNA; demography; encoding; genetics; principal component analysis; signal processing; PCA; genetic marker; iterative PCA; population structure analysis; unified encoding scheme; Bioinformatics; Clustering algorithms; Eigenvalues and eigenfunctions; Shape; SNP; Tracy-Widom; clustering; population structure
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
  • Conference_Location
    Prague
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4577-0538-0
  • Electronic_ISBN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2011.5946474
  • Filename
    5946474