• DocumentCode
    2257548
  • Title
    Statistical bias and variance of gene selection and cross validation methods: A case study on hypertension prediction
  • Author
    Gormez, Zeliha ; Kursun, Olcay ; Sertbas, Ahmet ; Aydin, Nizamettin ; Seker, Huseyin
  • Author_Institution
    Comput. Eng. Dept., Univ. of Istanbul, Istanbul, Turkey
  • fYear
    2012
  • fDate
    5-7 Jan. 2012
  • Firstpage
    616
  • Lastpage
    619
  • Abstract
    In exploratory association studies of genes with certain diseases, a single gene or a small number of genes (features) related to the diseases are selected from among the many thousands investigated. We investigate the statistical bias and variance of simple yet common (correlation and mutual information based) feature selection algorithms using well-known cross-validation methods (leave-one-out and k-fold) on a gene finding study for hypertension prediction. Our findings show that the selected genes differ across methods and across cross-validation runs, for both single gene selection and gene subset selection.
  • Keywords
    learning (artificial intelligence); medical computing; statistical analysis; cross validation methods; feature selection algorithms; gene subset selection; hypertension prediction; single gene selection; statistical bias; statistical variance; Prediction algorithms
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Biomedical and Health Informatics (BHI), 2012 IEEE-EMBS International Conference on
  • Conference_Location
    Hong Kong
  • Print_ISBN
    978-1-4577-2176-2
  • Electronic_ISBN
    978-1-4577-2175-5
  • Type
    conf
  • DOI
    10.1109/BHI.2012.6211658
  • Filename
    6211658