Title of article
Feature Selection and Classification of Clinical Datasets Using Bioinspired Algorithms and Super Learner
Author/Authors
Murugesan, S Anna University - Chennai, India , Bhuvaneswaran, R. S Anna University - Chennai, India , Khanna Nehemiah, H Anna University - Chennai, India , Keerthana Sankari, S Department of Computer Science and Engineering - Anna University - Chennai, India , Nancy Jane, Y Department of Computer Technology - Anna University - Chennai, India
Pages
17
From page
1
To page
17
Abstract
A computer-aided diagnosis (CAD) system that employs a super learner to diagnose the presence or absence of a disease has been
developed. Each clinical dataset is preprocessed and split into a training set (60%) and a testing set (40%). A wrapper approach that
uses three bioinspired algorithms, namely, cat swarm optimization (CSO), krill herd (KH), and bacterial foraging optimization
(BFO) with the classification accuracy of support vector machine (SVM) as the fitness function has been used for feature
selection. The selected features of each bioinspired algorithm are stored in three separate databases. The features selected by
each bioinspired algorithm are used to train three back propagation neural networks (BPNN) independently using the conjugate
gradient algorithm (CGA). Classifier testing is performed by using the testing set on each trained classifier, and the diagnostic
results obtained are used to evaluate the performance of each classifier. For each instance of the testing set, the classification
results produced by the three classifiers, together with the class label of that instance, form a candidate
instance for training and testing the super learner. Of these candidate instances, 80% are used for training and 20% for testing.
Experimentation has been carried out using seven clinical datasets from the University of
California, Irvine (UCI) machine learning repository. The super learner has achieved a classification accuracy of 96.83% for the
Wisconsin diagnostic breast cancer dataset (WDBC), 86.36% for the Statlog heart disease dataset (SHD), 94.74% for the hepatocellular
carcinoma dataset (HCC), 90.48% for the hepatitis dataset (HD), 81.82% for the vertebral column dataset (VCD), 84% for the Cleveland
heart disease dataset (CHD), and 70% for the Indian liver patient dataset (ILP).
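The two-stage partitioning described in the abstract can be illustrated with a minimal data-flow sketch. It is not the authors' implementation: the 100-instance dataset, the random placeholder predictions standing in for the three trained BPNNs, and the helper names `split` and `build_meta_instances` are all assumptions made for illustration, and no feature selection or classifier training is performed.

```python
import random


def split(rows, train_fraction, seed=0):
    """Shuffle a copy of rows and split it into (train, test)."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def build_meta_instances(base_predictions, labels):
    """Pair the base classifiers' outputs for each instance with its class label.

    base_predictions holds one list of predicted labels per base classifier,
    each aligned with `labels`.
    """
    return [(preds, label) for preds, label in zip(zip(*base_predictions), labels)]


# Hypothetical dataset: 100 instances with binary labels (placeholder values).
rng = random.Random(42)
dataset = [(i, rng.randint(0, 1)) for i in range(100)]

# Stage 1: 60% training / 40% testing split of the preprocessed dataset.
train_set, test_set = split(dataset, 0.60)

# Placeholder outputs standing in for the three trained BPNNs
# (one per feature subset: CSO, KH, BFO) evaluated on the testing set.
test_labels = [label for _, label in test_set]
base_predictions = [[rng.randint(0, 1) for _ in test_set] for _ in range(3)]

# Stage 2: every testing instance becomes a candidate instance for the super learner.
meta_instances = build_meta_instances(base_predictions, test_labels)

# Stage 3: 80% / 20% split of the candidate instances.
meta_train, meta_test = split(meta_instances, 0.80)

print(len(train_set), len(test_set), len(meta_train), len(meta_test))
```

With 100 hypothetical instances, the sketch yields 60 training and 40 testing instances in the first stage, and the 40 candidate instances are then divided into 32 for training and 8 for testing the super learner.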
Keywords
Bioinspired, Algorithms, CAD, BPNN
Journal title
Computational and Mathematical Methods in Medicine
Serial Year
2021
Record number
2614999