Title :
An improved feature selection algorithm based on Markov blanket
Author :
Zuo, Xiaohan ; Lu, Peng ; Liu, Xi ; Gao, Yibo ; Yang, Yiping ; Chen, Jianxin
Author_Institution :
Dept. of Integrated Inf. Syst., Inst. of Autom., Beijing, China
Abstract :
For decades, coronary artery heart disease (CHD) has been one of the diseases most threatening to human health. Syndrome pattern mining is one of the approaches researchers have taken to combat this disease. The main task of syndrome pattern mining is to establish the correspondence between a syndrome and its subset of symptoms, which can be done through feature selection techniques. Feature selection is a critical step in classification, here used to classify symptoms into different syndromes, and can effectively improve both speed and accuracy. In this paper, we propose a novel feature selection algorithm based on Markov blanket and information gain (MB-IGFS) for the symptom classification problem. In particular, we give a new and intuitive measure of conditional independence between features and class labels, which is more accurate and easier to calculate. For evaluation, experiments were conducted on the Breast Cancer Wisconsin (Diagnostic) Data Set. Results suggest that, compared with other feature selection methods, MB-IGFS is effective and efficient in eliminating irrelevant and redundant features. We then used MB-IGFS to derive optimal symptom subsets for both the Solid ZHENG and Virtual ZHENG syndromes. We conclude that MB-IGFS appears to be a very attractive solution for symptom classification applications.
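The abstract does not spell out the MB-IGFS procedure or its conditional independence measure, so the following is only a minimal sketch of the standard information gain quantity that such a method could build on, not the authors' algorithm. The function names (entropy, information_gain, rank_features) and the toy data are hypothetical, and features are assumed to be discrete.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy H(Y), in bits, of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(feature, labels):
    """IG(Y; X) = H(Y) - H(Y | X) for a discrete feature X."""
    n = len(labels)
    # Group the class labels by the value the feature takes.
    groups = {}
    for x, y in zip(feature, labels):
        groups.setdefault(x, []).append(y)
    # H(Y | X): entropy within each group, weighted by group size.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(columns, labels):
    """Return feature names sorted by decreasing information gain.

    `columns` maps a (hypothetical) feature name to its list of values.
    """
    return sorted(
        columns,
        key=lambda name: information_gain(columns[name], labels),
        reverse=True,
    )


if __name__ == "__main__":
    # Toy data (invented for illustration): "b" determines the label, "a" does not.
    labels = [1, 1, 0, 0]
    columns = {"a": [0, 1, 0, 1], "b": [1, 1, 0, 0]}
    for name in rank_features(columns, labels):
        print(name, information_gain(columns[name], labels))
```

Ranking by information gain alone addresses relevance only; removing redundant features, which the abstract attributes to the Markov blanket component, would require an additional conditional independence check that is not shown here.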
Keywords :
Markov processes; biological organs; cancer; cardiology; classification; data mining; diseases; gynaecology; medical computing; Markov blanket; breast cancer Wisconsin data set; coronary artery heart disease; feature selection algorithm; information gain; solid ZHENG syndrome; syndrome pattern mining; virtual ZHENG syndrome; Approximation algorithms; Decision trees; Entropy; Indexes; Medical diagnostic imaging; feature selection; syndrome pattern discovery; syndrome recognition; traditional Chinese medicine
Conference_Title :
Biomedical Engineering and Informatics (BMEI), 2011 4th International Conference on
Conference_Location :
Shanghai
Print_ISBN :
978-1-4244-9351-7
DOI :
10.1109/BMEI.2011.6098568