Title :
Robustness analysis of diversified ensemble decision tree algorithms for Microarray data classification
Author :
Hu, Hong ; Li, Jiu-yong ; Wang, Hua ; Daggard, Grant ; Wang, Li-zhen
Author_Institution :
Dept. of Math. & Comput., Univ. of Southern Queensland, Toowoomba, QLD
Abstract :
Ensemble classification methods have shown promise for achieving higher accuracy in microarray data classification. Because noise values remain in all microarray data even after the preprocessing stage, robustness is another very important criterion, in addition to accuracy, for evaluating the reliability of microarray classification algorithms. In this paper, we conduct an experimental comparison of our newly developed MDMT with C4.5, BaggingC4.5, AdaBoostingC4.5, Random Forest and CS4 on four microarray cancer data sets. We test and evaluate how well a given single or ensemble classifier tolerates noisy data in unseen test data sets, particularly at increasing levels of noise. The experimental results show that MDMT tolerates noise values in unseen test data sets better than the other compared methods do, particularly as the noise level increases. We observe that Random Forest is comparable to MDMT in terms of resistance to noise. The experimental results also show that ensemble decision tree methods tolerate noise values better than the single-tree C4.5 does. We conclude that avoiding overlapping genes among the ensemble trees is an intuitive, simple and effective way to achieve a higher degree of diversity for ensemble decision tree methods. An algorithm based on this principle deals more reliably with microarray data sets that contain a certain level of noisy data.
Keywords :
cancer; classification; data analysis; decision trees; medical computing; diversified ensemble decision tree; microarray cancer data sets; microarray data classification; robustness analysis; Algorithm design and analysis; Cancer; Classification algorithms; Classification tree analysis; Data analysis; Data preprocessing; Decision trees; Noise level; Noise robustness; Testing; Classification; Diversity; Ensemble decision trees; Microarray data; Robustness;
Conference_Title :
Machine Learning and Cybernetics, 2008 International Conference on
Conference_Location :
Kunming
Print_ISBN :
978-1-4244-2095-7
Electronic_ISBN :
978-1-4244-2096-4
DOI :
10.1109/ICMLC.2008.4620389