• DocumentCode
    3453898
  • Title
    Feature selections using AdaBoost: Application in gene-gene interaction detection
  • Author
    Assareh, A.; Volkert, L.G.; Jing Li
  • Author_Institution
    CS Dept., Kent State Univ., Kent, OH, USA
  • fYear
    2012
  • fDate
    4-7 Oct. 2012
  • Firstpage
    831
  • Lastpage
    837
  • Abstract
    One of the main goals of genome-wide association studies (GWAS) has been to detect the gene-gene interactions, also known as epistasis in a broad sense, that underlie complex diseases. The ability of decision trees and their ensembles to capture interactions among input variables has attracted the attention of computational biologists for this purpose. However, individual decision trees suffer from limitations, including data fragmentation and representational problems, that can impair the epistasis detection performance of their ensembles when not taken into account. Here we take a closer look at the feature selection capability of AdaBoost in the realm of epistasis detection and at the effect of tuning the weak classifiers on its performance. We also explore the efficacy of applying different statistical and information-theoretic strategies in tandem with AdaBoost to improve its performance. The results show that the performance of AdaBoost is more sensitive to the parameter settings of the weak learner when risk allele frequencies are low, which can be explained by the data fragmentation phenomenon. Also, depending on the model of interaction between the risk SNPs, different criteria might excel in the second stage.
  • Keywords
    biology computing; data structures; decision trees; genomics; learning (artificial intelligence); pattern classification; AdaBoost; GWAS; computational biologist; data fragmentation problem; data representational problem; ensemble learning; epistasis; feature selection; gene-gene interaction detection; genome wide association studies; weak classifier tuning; Additives; Bioinformatics; Data models; Diseases; Logistics; Mutual information
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Bioinformatics and Biomedicine Workshops (BIBMW), 2012 IEEE International Conference on
  • Conference_Location
    Philadelphia, PA
  • Print_ISBN
    978-1-4673-2746-6
  • Electronic_ISBN
    978-1-4673-2744-2
  • Type
    conf
  • DOI
    10.1109/BIBMW.2012.6470248
  • Filename
    6470248