Title :
Improving the efficiency of biomarker identification using expert knowledge
Author :
Mahmood, Ali Mirza ; Kuppa, Mrithyumjaya Rao
Author_Institution :
Acharya Nagarjuna Univ., Guntur, India
Abstract :
In this paper, we present a practical algorithm for the data-specific classification problem that arises when datasets have different properties. We propose integrating error rate, missing values, and expert judgment as factors for determining data-specific pruning, forming Expert Knowledge Based Pruning (EKBP). We conduct an extensive experimental study on 40 openly available real-world datasets from the UCI repository. In all these experiments, the proposed approach shows a considerable reduction in tree size and achieves equal or better accuracy compared to several benchmark decision tree methods. We also conduct a case study on a heart disease dataset using our improved algorithm. This study suggests that the type of heart defect (Thal) is the most important predictor for confirming the presence of heart disease, and identifies the number of major vessels colored by fluoroscopy (MV) and the type of chest pain (Chest) as biomarkers of heart disease.
Keywords :
biology computing; cancer; decision trees; expert systems; patient treatment; pattern classification; biomarker identification; data specific classification problem; expert knowledge based pruning; heart disease dataset; Accuracy; Classification algorithms; Decision trees; Diseases; Error analysis; Heart; Machine learning algorithms; Biomarker; Decision tree; EKBP; expert knowledge; intelligent in-exact classification; pruning
Conference_Title :
Trendz in Information Sciences & Computing (TISC), 2010
Conference_Location :
Chennai
Print_ISBN :
978-1-4244-9007-3
DOI :
10.1109/TISC.2010.5714618