• DocumentCode
    1454030
  • Title
    Peakbin Selection in Mass Spectrometry Data Using a Consensus Approach with Estimation of Distribution Algorithms
  • Author
    Armañanzas, Rubén ; Saeys, Yvan ; Inza, Iñaki ; García-Torres, Miguel ; Bielza, Concha ; van de Peer, Y. ; Larrañaga, Pedro
  • Author_Institution
    Dept. de Intel. Artificial, Univ. Politec. de Madrid, Boadilla del Monte, Spain
  • Volume
    8
  • Issue
    3
  • fYear
    2011
  • Firstpage
    760
  • Lastpage
    774
  • Abstract
    Progress is continuously being made in the quest for stable biomarkers linked to complex diseases. Mass spectrometers are one of the devices for tackling this problem. The data profiles they produce are noisy and unstable. In these profiles, biomarkers are detected as signal regions (peaks), where control and disease samples behave differently. Mass spectrometry (MS) data generally contain a limited number of samples described by a high number of features. In this work, we present a novel class of evolutionary algorithms, estimation of distribution algorithms (EDA), as an efficient peak selector in this MS domain. There is a trade-off between the reliability of the detected biomarkers and the low number of samples for analysis. For this reason, we introduce a consensus approach, built upon the classical EDA scheme, that improves stability and robustness of the final set of relevant peaks. An entire data workflow is designed to yield unbiased results. Four publicly available MS data sets (two MALDI-TOF and another two SELDI-TOF) are analyzed. The results are compared to the original works, and a new plot (peak frequential plot) for graphically inspecting the relevant peaks is introduced. A complete online supplementary page, which can be found at http://www.sc.ehu.es/ccwbayes/members/ruben/ms, includes extended info and results, in addition to Matlab scripts and references.
  • Keywords
    MALDI mass spectra; diseases; evolutionary computation; mass spectrometers; medical diagnostic computing; EDA; MALDI-TOF; SELDI-TOF; biomarkers; complex diseases; estimation of distribution algorithms; evolutionary algorithms; mass spectrometry; peakbin selection; Bioinformatics; Biomarkers; Computational biology; Detectors; Diseases; Electronic design automation and methodology; Evolutionary computation; Mass spectroscopy; Peptides; Robust stability; EDA; Mass spectrometry; biomarker discovery; feature selection; Algorithms; Biological Markers; Carcinoma, Hepatocellular; Computational Biology; Databases, Factual; Humans; Liver Neoplasms; Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization; Stochastic Processes
  • fLanguage
    English
  • Journal_Title
    Computational Biology and Bioinformatics, IEEE/ACM Transactions on
  • Publisher
    IEEE
  • ISSN
    1545-5963
  • Type
    jour
  • DOI
    10.1109/TCBB.2010.18
  • Filename
    5438984