Title of article :
Computer-aided diagnosis of pulmonary nodules using a two-step approach for feature selection and classifier ensemble construction
Author/Authors :
Lee, Michael C. and Boroczky, Lilla and Sungur-Stasik, Kivilcim and Cann, Aaron D. and Borczuk, Alain C. and Kawut, Steven M. and Powell, Charles A.
Issue Information :
Journal with serial number, year 2010
Pages :
11
From page :
43
To page :
53
Abstract :
Objective: Accurate classification methods are critical in computer-aided diagnosis (CADx) and other clinical decision support systems. Previous research has reported on methods for combining genetic algorithm (GA) feature selection with ensemble classifier systems in an effort to increase classification accuracy. In this study, we describe a CADx system for pulmonary nodules using a two-step supervised learning system combining a GA with the random subspace method (RSM), with the aim of exploring algorithm design parameters and demonstrating improved classification performance over either the GA or RSM-based ensembles alone.
Methods and materials: We used a retrospective database of 125 pulmonary nodules (63 benign; 62 malignant) with CT volumes and clinical history. A total of 216 features were derived from the segmented image data and clinical history. Ensemble classifiers using RSM or GA-based feature selection were constructed and tested via leave-one-out validation, with feature selection and classifier training executed within each iteration. We further tested a two-step approach in which a GA ensemble first assesses the relevance of the features, and this information is then used to control feature selection during a subsequent RSM step. The base classification was performed using linear discriminant analysis (LDA).
Results: The RSM classifier alone achieved a maximum leave-one-out Az (area under the ROC curve) of 0.866 (95% confidence interval: 0.794–0.919) at a subset size of s = 36 features. The GA ensemble yielded an Az of 0.851 (0.775–0.907). The proposed two-step algorithm produced a maximum Az value of 0.889 (0.823–0.936) when the GA ensemble was used to completely remove less relevant features from the second RSM step, with similar results obtained when the GA-LDA results were used to reduce but not eliminate the occurrence of certain features. After accounting for correlations in the data, the leave-one-out Az of the two-step method was significantly higher than that of the RSM and the GA-LDA.
Conclusions: We developed a CADx system for the evaluation of pulmonary nodules based on a two-step feature selection and ensemble classifier algorithm. We have shown that by combining classifier ensemble algorithms in this two-step manner, it is possible to predict the malignancy of solitary pulmonary nodules with a performance exceeding that of either of the individual steps.
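The second step described in the abstract, in which the GA results control which features the RSM may draw, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names, the `threshold` and `penalty` parameters, and the weighting scheme are assumptions introduced here, and the training of an LDA classifier on each subspace is omitted.

```python
import random
from collections import Counter


def ga_selection_frequency(ga_subsets, n_features):
    """Fraction of GA-selected subsets in which each feature index appears."""
    counts = Counter(f for subset in ga_subsets for f in set(subset))
    return [counts[i] / len(ga_subsets) for i in range(n_features)]


def relevance_weights(freq, threshold, penalty):
    """Map GA selection frequencies to sampling weights (hypothetical scheme).

    Features selected at least `threshold` of the time keep weight 1.0.
    Less relevant features get weight `penalty`: 0.0 removes them from the
    RSM step entirely, a value between 0 and 1 only makes them rarer.
    """
    return [1.0 if f >= threshold else penalty for f in freq]


def sample_subspaces(weights, subset_size, n_subspaces, seed=0):
    """Draw random feature subsets of fixed size, without replacement.

    Uses weighted sampling via random keys u ** (1 / w): the features with
    the largest keys are kept, so higher weights are chosen more often.
    """
    rng = random.Random(seed)
    eligible = [i for i, w in enumerate(weights) if w > 0]
    if subset_size > len(eligible):
        raise ValueError("subset_size exceeds the number of eligible features")
    subspaces = []
    for _ in range(n_subspaces):
        keys = {i: rng.random() ** (1.0 / weights[i]) for i in eligible}
        chosen = sorted(eligible, key=keys.get, reverse=True)[:subset_size]
        subspaces.append(sorted(chosen))
    return subspaces


# Toy example: 6 features, 3 GA-selected subsets (made-up indices).
ga_subsets = [[0, 1, 2], [0, 2, 3], [0, 1, 5]]
freq = ga_selection_frequency(ga_subsets, n_features=6)

# Variant 1: remove less relevant features completely.
hard = sample_subspaces(relevance_weights(freq, 0.5, 0.0), 2, 10, seed=1)

# Variant 2: reduce, but do not eliminate, their occurrence.
soft = sample_subspaces(relevance_weights(freq, 0.5, 0.25), 4, 5, seed=1)
```

Each subspace returned here would then be used to train one base classifier of the ensemble. Setting `penalty` to 0.0 corresponds to the variant that removes less relevant features, and a small positive value corresponds to the variant that only reduces how often they occur.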
Keywords :
Random subspace , Pulmonary nodules , computer-aided diagnosis , Genetic algorithms , feature selection , linear discriminant analysis
Journal title :
Artificial Intelligence In Medicine
Serial Year :
2010
Record number :
1836931