Title :
A Grammatical Swarm for protein classification
Author :
Ramstein, Gerard ; Beaume, Nicolas ; Jacques, Yannick
Author_Institution :
LINA Lab., Nantes Univ., Nantes
Abstract :
We present a grammatical swarm (GS) for the optimization of an aggregation operator. This operator combines the results of several classifiers into a single score, producing an optimal ranking of the individuals. We apply our method to the identification of new members of a protein family. Support vector machine and naive Bayes classifiers exploit complementary features to compute probability estimates. A major advantage of the GS is that it produces an understandable algorithm that reveals the usefulness of each classifier. Due to the large volume of candidate sequences, ranking quality is of crucial importance. Consequently, our fitness criterion is based on the area under the ROC curve rather than on the classification error rate. We discuss the performance obtained for a particular family, the cytokines, and show that this technique is an efficient means of ranking protein sequences.
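The fitness criterion described in the abstract can be illustrated with a minimal sketch. It assumes two lists of probability estimates (one per classifier) and binary family-membership labels, and it scores a candidate aggregation operator by the area under the ROC curve of the ranking it induces. The function names (`auc`, `fitness`), the toy data, and the hand-written operators are hypothetical and are not taken from the paper. In the paper the operators are evolved by the grammatical swarm, which is not reproduced here.

```python
from typing import Callable, Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve, computed as the Mann-Whitney statistic.

    This equals the probability that a randomly chosen positive is scored
    above a randomly chosen negative (ties count as one half).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def fitness(
    aggregate: Callable[[float, float], float],
    svm_probs: Sequence[float],
    nb_probs: Sequence[float],
    labels: Sequence[int],
) -> float:
    """Score a candidate aggregation operator by the AUC of its ranking."""
    combined = [aggregate(s, b) for s, b in zip(svm_probs, nb_probs)]
    return auc(combined, labels)


# Toy data, invented for illustration only (1 = family member, 0 = not).
svm_probs = [0.9, 0.4, 0.6, 0.2, 0.3]
nb_probs = [0.3, 0.8, 0.5, 0.4, 0.1]
labels = [1, 1, 0, 0, 0]

# Two hand-written stand-ins for operators a search procedure might propose.
auc_svm_only = fitness(lambda s, b: s, svm_probs, nb_probs, labels)
auc_mean = fitness(lambda s, b: (s + b) / 2, svm_probs, nb_probs, labels)

print(f"SVM only: {auc_svm_only:.3f}")
print(f"Mean of both: {auc_mean:.3f}")
```

On this toy data, averaging the two estimates ranks both positives above every negative, whereas the SVM score alone misorders one pair. This shows how complementary classifiers can improve a ranking even when neither is perfect, and why a ranking-based fitness is more informative here than an error rate at a fixed threshold.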
Keywords :
Bayes methods; biology computing; macromolecules; optimisation; pattern classification; proteins; support vector machines; Naive Bayes classifiers; aggregation operator; cytokines; fitness criterion; grammatical swarm; optimal ranking; protein classification; protein sequences; ranking quality; support vector machine; Application software; Bioinformatics; Data mining; Error analysis; Genetic algorithms; Genetic programming; Laboratories; Proteins; Support vector machine classification; Support vector machines
Conference_Title :
Evolutionary Computation, 2008. CEC 2008. (IEEE World Congress on Computational Intelligence). IEEE Congress on
Conference_Location :
Hong Kong
Print_ISBN :
978-1-4244-1822-0
Electronic_ISBN :
978-1-4244-1823-7
DOI :
10.1109/CEC.2008.4631142