DocumentCode :
2588340
Title :
Determining the repeat number of cross-validation
Author :
Yang, Kun ; Wang, Haipeng ; Dai, Guojun ; Hu, Sanqing ; Zhang, Yanbin ; Xu, Jing
Author_Institution :
Sch. of Comput. Sci. & Technol., Hangzhou Dianzi Univ., Hangzhou, China
Volume :
3
fYear :
2011
fDate :
15-17 Oct. 2011
Firstpage :
1706
Lastpage :
1710
Abstract :
Cross-validation is probably the most popular approach for estimating the classification error rate when classifying gene expression data. To reduce the variance of the estimate, the cross-validation procedure is typically repeated and the results averaged. However, the number of repetitions is generally set to an empirical value. This paper proposes two methods (FCI and TSE) for determining the repeat number of cross-validation based on an approximate confidence interval. Experimental results on real data show that an empirically chosen repeat number is usually unreliable, and that the proposed methods can determine the repeat number needed to achieve a pre-specified precision of the error rate estimate. Furthermore, both methods automatically adapt to changes in the data, the value of k in k-fold cross-validation, and the classification model.
Keywords :
error analysis; genetics; genomics; lab-on-a-chip; classification error rate; cross-validation; empirical method; gene expression data; microarray; Bioinformatics; Cancer; Colon; Error analysis; Gene expression; Liver; Support vector machines; classification; cross-validation; error rate; microarray (gene expression) data
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Biomedical Engineering and Informatics (BMEI), 2011 4th International Conference on
Conference_Location :
Shanghai
Print_ISBN :
978-1-4244-9351-7
Type :
conf
DOI :
10.1109/BMEI.2011.6098566
Filename :
6098566