Title :
Particle swarm optimization-based bio-network discovery method for the diagnosis of colorectal cancer
Author :
Akutekwe, Arinze ; Seker, Huseyin
Author_Institution :
Sch. of Comput. Sci. & Inf., De Montfort Univ., Leicester, UK
Abstract :
Machine learning techniques for automatic discovery of biomarkers and construction of predictive models have been applied to the diagnosis of colorectal cancer. Strategies such as Empirical Mode Decomposition (EMD) combined with Least Square Support Vector Machine (LS-SVM) have been proposed. Other methods using the Discrete Wavelet Transform (DWT) and a Support Vector Machine classifier have also been applied. However, these methods adopt filter methods of feature selection, which ignore interaction with the classifier and therefore select features poorly. They are also unable to detect temporal relationships among biomarkers, which would aid better understanding of the disease. In this paper, we apply a two-stage bio-discovery approach that combines classifier-dependent embedded feature selection methods with modelling of the temporal associations among the selected features using a Dynamic Bayesian Network. Particle Swarm Optimization is also used to tune the parameters of the feature selection algorithms for improved generalization performance. We demonstrate our method using the serum protein profiles of colorectal cancer patients. Results show that 21 features selected by Support Vector Machine Recursive Feature Elimination with a linear kernel achieved 99.09% generalization performance, which outperforms that of previous studies. In addition, the analysis identified Angiotensinogen (serpin peptidase inhibitor, clade A, member 8), which might inhibit IgA-inducing protein homolog (Bos taurus), and Gem (nuclear organelle) associated protein 2, which might play an inhibitory role against Alpha-2-HS-glycoprotein, a protein also associated with liver cancer; all of these have their cDNA sources from the bowel.
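The parameter-tuning step mentioned in the abstract can be illustrated with a minimal particle swarm sketch. This is not the authors' implementation: the function names (pso_minimize, toy_cv_error), the swarm settings (inertia, c1, c2, swarm size), and the toy objective are all assumptions chosen for illustration. In practice the objective would be something like the cross-validated error of the feature selection model as a function of its parameters.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=60,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` over a box with a basic global-best PSO.

    Returns (best_position, best_value). All settings are illustrative.
    """
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the box, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Each particle remembers its own best position and value.
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity: inertia + pull to personal best + pull to global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Move the particle and clamp it to the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_cv_error(params):
    # Hypothetical stand-in for a cross-validated error curve over log10(C);
    # its minimum is placed at log10(C) = 1.0 purely for illustration.
    (log_c,) = params
    return (log_c - 1.0) ** 2 + 0.05


best_params, best_error = pso_minimize(toy_cv_error, bounds=[(-3.0, 3.0)])
print("best log10(C):", best_params[0], "error:", best_error)
```

Searching over the logarithm of a regularization parameter, as assumed here, is a common choice because such parameters typically range over several orders of magnitude.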
Keywords :
Bayes methods; DNA; belief networks; bioinformatics; biological organs; cancer; discrete wavelet transforms; enzymes; feature selection; inhibitors; learning (artificial intelligence); least squares approximations; liver; molecular biophysics; particle swarm optimisation; patient diagnosis; pattern classification; support vector machines; Alpha-2-HS-glycoprotein; Bos taurus; DFT; Gem; IgA-inducing protein homolog; LS-SVM; automatic biomarkers discovery; bowel; cDNA sources; classifier-dependent embedded feature selection methods; colorectal cancer diagnosis; colorectal cancer patients; discrete wavelet transform; dynamic Bayesian network; empirical mode decomposition; feature selection algorithms; filter method; generalization performance; least square support vector machine; linear kernel; liver cancer; machine learning techniques; nuclear organelle; particle swarm optimization-based bionetwork discovery method; predictive models construction; serpin peptidase inhibitor; serum protein profiles; stratified angiotensinogen; support vector machine classifier; support vector machine recursive feature elimination; temporal association modelling; temporal relationships; two-stage biodiscovery approach; Accuracy; Biological system modeling; Biomarkers; Cancer; Polynomials; Proteins; Support vector machines;
Conference_Title :
Bioinformatics and Biomedicine (BIBM), 2014 IEEE International Conference on
Conference_Location :
Belfast
DOI :
10.1109/BIBM.2014.6999241