• DocumentCode
    3496083
  • Title
    Differential gene expression analysis using coexpression and RNA-Seq data
  • Author
    Ei-Wen Yang ; Thomas Girke ; Tao Jiang
  • Author_Institution
    Dept. of Comput. Sci. & Eng., Univ. of California, Riverside, Riverside, CA, USA
  • fYear
    2013
  • fDate
    12-14 June 2013
  • Firstpage
    1
  • Lastpage
    1
  • Abstract
    RNA-Seq is increasingly being used for differential gene expression analysis, which was dominated by microarray technology in the past decade. However, inferring differential gene expression based on the observed difference of RNA-Seq read counts has unique challenges that were not present in microarray-based analysis. The differential expression estimation may be biased against low read count values such that the differential expression of genes with high read counts is more easily detected. The estimation bias may further propagate in downstream analyses at the systems biology level if it is not corrected. To obtain a better inference of differential gene expression, we propose a new efficient algorithm based on a Markov random field (MRF) model, called MRFSeq, that uses additional gene coexpression data to enhance the prediction power. Our main technical contribution is the careful selection of the clique potential functions in the MRF so that its maximum a posteriori (MAP) estimation can be reduced to the well-known maximum flow problem and thus solved in polynomial time. Our extensive experiments on simulated and real RNA-Seq datasets demonstrate that MRFSeq is more accurate and less biased against genes with low read counts than the existing methods based on RNA-Seq data alone. For example, on the well-studied MAQC dataset, MRFSeq improved the sensitivity from 11.6% to 38.8% for genes with low read counts. MRFSeq is implemented in C++ and available at http://www.cs.ucr.edu/~yyang027/mrfseq.htm.
  • Keywords
    RNA; biology computing; genetics; lab-on-a-chip; maximum likelihood estimation; molecular biophysics; polynomials; C++; MRF model; MRFSeq; RNA-Seq datasets; RNA-Seq read counts; clique potential functions; differential expression estimation; differential gene expression analysis; downstream analysis; gene coexpression data; high read counts; low read count values; Markov random field model; maximum a posteriori estimation; maximum flow problem; microarray technology; microarray-based analysis; polynomial time; prediction power; systems biology level; Computer science; Educational institutions; Electronic mail; Estimation; Gene expression; Genomics; Plants (biology)
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Computational Advances in Bio and Medical Sciences (ICCABS), 2013 IEEE 3rd International Conference on
  • Conference_Location
    New Orleans, LA
  • Type
    conf
  • DOI
    10.1109/ICCABS.2013.6629222
  • Filename
    6629222