DocumentCode
2257548
Title
Statistical bias and variance of gene selection and cross validation methods: A case study on hypertension prediction
Author
Gormez, Zeliha ; Kursun, Olcay ; Sertbas, Ahmet ; Aydin, Nizamettin ; Seker, Huseyin
Author_Institution
Comput. Eng. Dept., Univ. of Istanbul, Istanbul, Turkey
fYear
2012
fDate
5-7 Jan. 2012
Firstpage
616
Lastpage
619
Abstract
In exploratory association studies of genes with certain diseases, a single gene or a small number of genes (features) related to the diseases are selected from among the many thousands investigated. We investigate the statistical bias and variance of simple yet common (correlation-based and mutual-information-based) feature selection algorithms using well-known cross-validation methods (leave-one-out and k-fold) on a gene-finding study for hypertension prediction. Our findings show that the selected genes differ across methods and across cross-validation runs, for both single-gene selection and gene-subset selection.
Keywords
learning (artificial intelligence); medical computing; statistical analysis; cross validation methods; feature selection algorithms; gene subset selection; hypertension prediction; single gene selection; statistical bias; statistical variance; Prediction algorithms
fLanguage
English
Publisher
IEEE
Conference_Titel
Biomedical and Health Informatics (BHI), 2012 IEEE-EMBS International Conference on
Conference_Location
Hong Kong
Print_ISBN
978-1-4577-2176-2
Electronic_ISBN
978-1-4577-2175-5
Type
conf
DOI
10.1109/BHI.2012.6211658
Filename
6211658