• Title of article

    Effective dimensionality for principal component analysis of time series expression data

  • Author/Authors

    Wahde, Mattias; Hertz, John; Hornquist, Michael

  • Issue Information
    Journal with serial number, year 2003
  • Pages
    -310
  • From page
    311
  • To page
    0
  • Abstract
    Large-scale expression data are now measured for thousands of genes simultaneously. This development has been followed by an exploration of theoretical tools for extracting as much information from these data as possible. Several groups have used principal component analysis (PCA) for this task. However, since this approach is data-driven, care must be taken not to analyze the noise instead of the data. As a strong warning against uncritical use of the output from a PCA, we employ a newly developed procedure to judge the effective dimensionality of a specific data set. Although this data set was obtained during the development of the rat central nervous system, our finding is a general property of noisy time series data. Based on knowledge of the noise level of the data, we find that the effective number of dimensions that are meaningful to use in a PCA is much lower than what could be expected from the number of measurements. We attribute this fact both to the effects of noise and to the lack of independence of the expression levels. Finally, we explore the possibility of increasing the dimensionality by performing more measurements within one time series, and conclude that this is not a fruitful approach.
  • Keywords
    PCA , Dimensionality , Expression data , Noise effects
  • Journal title
    BioSystems
  • Serial Year
    2003
  • Record number

    47727
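
The abstract's central point, that only principal components rising above the known noise level are meaningful, can be illustrated with a small sketch. This is not the procedure used in the paper, which the abstract does not describe. It is a simple noise-floor comparison chosen for illustration: the eigenvalues of the data covariance are compared with the largest eigenvalue obtained from pure Gaussian noise of the same shape and the same noise level. The function names, the synthetic data (200 genes, 9 time points, two underlying temporal patterns), and the 0.99 quantile are all assumptions.

```python
import numpy as np


def covariance_eigenvalues(data):
    """Eigenvalues of the sample covariance, in descending order.

    Rows are genes (observations), columns are time points (variables).
    np.cov removes the column means, as PCA does.
    """
    cov = np.cov(data, rowvar=False)
    return np.sort(np.linalg.eigvalsh(cov))[::-1]


def effective_dimensionality(data, noise_sigma, n_trials=200, quantile=0.99, seed=0):
    """Count components whose variance exceeds a simulated noise floor.

    Hypothetical helper: the noise floor is the given quantile of the
    largest covariance eigenvalue seen in noise-only matrices that have
    the same shape as the data and standard deviation noise_sigma.
    """
    rng = np.random.default_rng(seed)
    largest_noise = np.empty(n_trials)
    for i in range(n_trials):
        noise = rng.normal(0.0, noise_sigma, size=data.shape)
        largest_noise[i] = covariance_eigenvalues(noise)[0]
    threshold = np.quantile(largest_noise, quantile)
    return int(np.sum(covariance_eigenvalues(data) > threshold))


def make_synthetic(noise_sigma, n_genes=200, n_times=9, seed=1):
    """Synthetic expression data: two temporal patterns plus Gaussian noise."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 1.0, n_times)
    # Assumed patterns: a linear trend and one period of a sine wave.
    patterns = np.vstack([2.0 * t - 1.0, np.sin(2.0 * np.pi * t)])
    loadings = rng.normal(size=(n_genes, 2))
    noise = rng.normal(0.0, noise_sigma, size=(n_genes, n_times))
    return loadings @ patterns + noise


if __name__ == "__main__":
    # Compare a low-noise and a high-noise version of the same signal.
    for sigma in (0.2, 10.0):
        data = make_synthetic(sigma)
        print(sigma, effective_dimensionality(data, sigma))
```

In this toy setting the number of time points (9) is an upper bound on the number of components, but the count that survives the noise floor is governed by the number of independent patterns and the noise level, which is the qualitative behaviour the abstract warns about.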