• DocumentCode
    178528
  • Title
    Variable selection for noisy data applied in proteomics
  • Author
    Dridi, N. ; Giremus, Audrey ; Giovannelli, Jean-Francois ; Truntzer, C. ; Roy, Pranab ; Gerfaut, L. ; Charrier, Jean-Philippe ; Ducoroy, P. ; Mercier, C. ; Grangeat, Pierre
  • Author_Institution
    IMS, Univ. Bordeaux, Talence, France
  • fYear
    2014
  • fDate
    4-9 May 2014
  • Firstpage
    2833
  • Lastpage
    2837
  • Abstract
    The paper proposes a variable selection method for proteomics. It aims at selecting, among a set of proteins, those (named biomarkers) that make it possible to discriminate between two groups of individuals (healthy and pathological). To this end, data are available for a cohort of individuals: the biological state and a measurement of concentrations for a list of proteins. The proposed approach is based on a Bayesian hierarchical model for the dependencies between biological and instrumental variables. The optimal selection function minimizes the Bayesian risk; that is, the selected set of variables maximizes the posterior probability. The two main contributions are: (1) we do not impose ad hoc relationships between the variables, such as a logistic regression model, and (2) we account for instrumental variability through measurement noise. We are thus dealing with indirect observations of a mixture of distributions, which results in intricate probability distributions. A closed-form expression of the posterior distributions cannot be derived. We therefore discuss several approximations and study the robustness to the noise level. Finally, the method is evaluated on both simulated and clinical data.
  • Keywords
    Bayes methods; biology computing; data handling; proteins; proteomics; statistical distributions; Bayesian hierarchical model; Bayesian risk minimization; biological state; biological variables; biomarkers; healthy individuals; instrumental variability; instrumental variables; measurement noise; noise level; noisy data; optimal selection function; pathological individuals; posterior probability; probability distributions; proteins; proteomics; variable selection method; Bayes methods; Biological system modeling; Biomarkers; Input variables; Noise; Proteins; Bayesian approach; Gaussian mixture; Model and variable selection; biological and technological variability; proteomics
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
  • Conference_Location
    Florence
  • Type
    conf
  • DOI
    10.1109/ICASSP.2014.6854117
  • Filename
    6854117