DocumentCode :
3393890
Title :
Feature Selection and Prediction of Sub-health State Using SVM-RFE
Author :
Wang, Li-Min ; Pei, Yu ; Chen, Jia-Xu ; Zhao, Xin ; Cui, Hua-Ting ; Cui, Hai-Zhen
Author_Institution :
Beijing Univ. of Chinese Med., Beijing, China
Volume :
3
fYear :
2010
fDate :
23-24 Oct. 2010
Firstpage :
199
Lastpage :
202
Abstract :
Sub-health is a low-quality state between health and disease. The aim of this study was to determine which factors, alone or in combination, are predictive of the sub-health state. We carried out a clinical epidemiology survey and obtained two datasets, each covering 50 reported symptoms. Dataset 1 consists of 572 samples, of which 523 cases were in the sub-health state and 49 were healthy. Dataset 2 consists of 185 samples, of which 131 cases were in the sub-health state and 54 were healthy. Dataset 1 was used to select variables and to estimate the performance of the SVM classifier, while Dataset 2 was used to validate the classifier built on Dataset 1. Based on association defined by mutual information, we propose a feature selection method based on support vector machine recursive feature elimination (SVM-RFE) to predict the sub-health state from clinical data. We took the optimal operating point to be the threshold at which sensitivity and specificity were 0.82 and 0.72, respectively. The method achieved an average prediction accuracy of 80.35%. The top 8 features (symptoms) selected by SVM-RFE were: Fatigue, Degree of insomnia, Pessimism, Constipation, Dysphoria, Giddiness, Anorexia and Vexation. We therefore propose a new SVM-RFE-based method for feature selection in classification problems. The goal is to remove many features during each iteration without eliminating the important ones.
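The sensitivity, specificity and accuracy figures quoted in the abstract are standard confusion-matrix ratios. The following is a minimal Python sketch of how such figures are computed, not the authors' code. The function names and the toy labels are hypothetical and are not taken from the study's data; the sub-health class is assumed to be the positive class (label 1).

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # sub-health case correctly flagged
            else:
                fn += 1  # sub-health case missed
        else:
            if pred == positive:
                fp += 1  # healthy case wrongly flagged
            else:
                tn += 1  # healthy case correctly cleared
    return tp, fn, tn, fp


def sensitivity(tp, fn):
    """Fraction of positive (sub-health) cases that were detected."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of negative (healthy) cases that were correctly cleared."""
    return tn / (tn + fp)


def accuracy(tp, fn, tn, fp):
    """Fraction of all cases classified correctly."""
    return (tp + tn) / (tp + fn + tn + fp)


# Hypothetical toy labels (NOT the study's data): 1 = sub-health, 0 = healthy.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]

tp, fn, tn, fp = confusion_counts(y_true, y_pred)
print(sensitivity(tp, fn), specificity(tn, fp), accuracy(tp, fn, tn, fp))
```

With class counts as unbalanced as those in Dataset 1 (523 sub-health versus 49 healthy), accuracy alone can look high for a classifier that rarely predicts the minority class, which is why reporting sensitivity and specificity side by side is informative.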
Keywords :
classification; diseases; health care; medical information systems; recursive functions; support vector machines; SVM-RFE; classification problems; classifier; clinical data; clinical epidemiology survey; disease; feature prediction; feature selection; health; sub-health state; support vector machine recursive feature elimination; Accuracy; Data mining; Diseases; Fatigue; Input variables; Support vector machines; Training; Recursive Feature Elimination; SVM RFE; Sub-health state; Support Vector Machine; classifier; variable selection;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Artificial Intelligence and Computational Intelligence (AICI), 2010 International Conference on
Conference_Location :
Sanya
Print_ISBN :
978-1-4244-8432-4
Type :
conf
DOI :
10.1109/AICI.2010.280
Filename :
5655277