• DocumentCode
    1784769
  • Title
    RMA with quantile normalization mixes biological signals between different sample groups in microarray data analysis
  • Author
    Chang Sik Kim ; Seungwoo Hwang ; Shu-Dong Zhang
  • Author_Institution
    Centre for Cancer Res. & Cell Biol. (CCRCB), Queen's Univ. Belfast (QUB), Belfast, UK
  • fYear
    2014
  • fDate
    2-5 Nov. 2014
  • Firstpage
    139
  • Lastpage
    143
  • Abstract
    Quantile normalization (QN) is a technique for microarray data processing and is the default normalization method in the Robust Multi-array Average (RMA) procedure, which was primarily designed for analysing gene expression data from Affymetrix arrays. Given the abundance of Affymetrix microarrays and the popularity of the RMA method, it is crucially important that the normalization procedure is applied appropriately. In this study we carried out simulation experiments and also analysed real microarray data to investigate the suitability of RMA when it is applied to a dataset containing different groups of biological samples. Our experiments showed that RMA with QN does not preserve the biological signal contained in each group, but instead mixes the signals between the groups. We also showed that the Median Polish method in the summarization step of RMA has a similar mixing effect. RMA is one of the most widely used methods in microarray data processing and has been applied to a vast volume of data in biomedical research. The problematic behaviour of this method suggests that previous studies employing RMA could have been misled or adversely affected. We therefore think it is crucially important that the research community recognizes the issue and starts to address it. The two core elements of the RMA method, quantile normalization and Median Polish, both have the undesirable effect of mixing biological signals between different sample groups, which can be detrimental to drawing valid biological conclusions and to any subsequent analyses. Based on the evidence presented here and in the literature, we recommend exercising caution when using RMA to process microarray gene expression data, particularly in situations where there are likely to be unknown subgroups of samples.
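The mixing effect described in the abstract can be illustrated with a toy calculation. The following is only a minimal NumPy sketch, not the authors' simulation and not the RMA implementation: the function name `quantile_normalize`, the six-probe values, and the two-group layout are all assumptions made for this example, and ties are ranked arbitrarily rather than averaged as production implementations typically do.

```python
import numpy as np

def quantile_normalize(x):
    """Quantile-normalize a probes-by-arrays matrix (hypothetical helper).

    Each column (array) is forced onto the same reference distribution:
    the mean of the column-wise sorted values.
    """
    order = np.argsort(x, axis=0)               # sorting permutation per column
    ranks = np.argsort(order, axis=0)           # rank of each probe within its array
    reference = np.sort(x, axis=0).mean(axis=1) # mean of the k-th smallest values
    return reference[ranks]

# Toy data (assumed values): 6 probes, 2 arrays per group.
# In group B only the first two probes are truly up-regulated.
group_a = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
group_b = np.array([7.0, 8.0, 3.0, 4.0, 5.0, 6.0])
x = np.column_stack([group_a, group_a, group_b, group_b])

# Normalize all arrays together, as is done when groups are pooled.
x_qn = quantile_normalize(x)

diff_before = x[:, 2:].mean(axis=1) - x[:, :2].mean(axis=1)
diff_after = x_qn[:, 2:].mean(axis=1) - x_qn[:, :2].mean(axis=1)

print(diff_before)  # [6. 6. 0. 0. 0. 0.]
print(diff_after)   # [ 4.  4. -2. -2. -2. -2.]
```

In this toy case the true group difference on the first two probes shrinks from 6 to 4, while the four unchanged probes acquire a spurious difference of -2: part of group B's signal has been redistributed across probes and groups by the shared reference distribution.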
  • Keywords
    bioinformatics; genetics; medical signal processing; Affymetrix arrays; Median Polish method; RMA procedure; Robust Multi-array Average procedure; biological signals; biomedical research; gene expression data; microarray data analysis; quantile normalization; Arrays; Biological system modeling; Breast cancer; Erbium; Gene expression; Probes; Microarray data analysis; Mixing biological signals; Quantile normalization; Robust Multi-array Average (RMA)
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Bioinformatics and Biomedicine (BIBM), 2014 IEEE International Conference on
  • Conference_Location
    Belfast
  • Type
    conf
  • DOI
    10.1109/BIBM.2014.6999142
  • Filename
    6999142