Title :
The Data Selection Criteria for HSC and SVM Algorithms
Author :
He, Qing ; Zhuang, Fuzhen ; Shi, Zhongzhi
Author_Institution :
Key Lab. of Intell. Inf. Process., Chinese Acad. of Sci., Beijing
Abstract :
This paper discusses consistent subset (CS) selection criteria for the hyper surface classification (HSC) and SVM algorithms. Consistent subsets play an important role in data selection. First, the paper proposes that the minimal consistent subset for a disjoint cover set (MCSC) plays an important role in data selection for HSC. The MCSC can be used to select a representative subset of the original sample set for HSC: it yields the same classification model as the entire sample set and fully reflects its classification ability. Second, the number of MCSCs is calculated. Third, by comparing the performance of HSC and SVM on the corresponding CS, we argue that it is not reasonable to train different classifiers on the same training data set and then test them on the same test data set. The experiments show that each algorithm can select its own proper data set for training, which ensures good performance and generalization ability. The MCSC is the best selection for HSC, and the support vector set is an effective selection for SVM.
Keywords :
pattern classification; support vector machines; HSC; SVM; consistent subsets selection criteria; data selection criteria; generalization ability; hyper surface classification; minimal consistent subset for a disjoint cover set; Computers; Databases; Helium; Information processing; Laboratories; Nearest neighbor searches; Neural networks; Support vector machine classification; Support vector machines; Testing; Consistent Subsets (CS); Data Selection Criteria; Minimal Consistent Subset for a disjoint Cover set;
Conference_Title :
Natural Computation, 2008. ICNC '08. Fourth International Conference on
Conference_Location :
Jinan
Print_ISBN :
978-0-7695-3304-9
DOI :
10.1109/ICNC.2008.334