DocumentCode :
617933
Title :
Multi-objective evolutionary algorithm for variable selection in calibration problems: A case study for protein concentration prediction
Author :
de Lucena, Daniel Vitor ; Woerle de Lima, Telma ; da Silva Soares, Anderson ; Delbem, Alexandre C. B. ; Rodrigues Galvao Filho, Arlindo ; Coelho, C.J. ; Laureano, Gustavo Teodoro
Author_Institution :
Inst. of Inf., Fed. Univ. of Goias, Goiania, Brazil
fYear :
2013
fDate :
20-23 June 2013
Firstpage :
1053
Lastpage :
1059
Abstract :
This paper presents a multi-objective formulation for variable selection in calibration problems. The protein concentration of wheat is predicted by a linear regression model built on variables measured by a spectrophotometer. This device measures hundreds of correlated variables that are related to physicochemical properties and can be used to estimate the protein concentration. The problem is to select a subset of informative, uncorrelated variables that helps minimize the prediction error. In this work we propose two objectives for this problem: the prediction error and the number of variables in the model, both of which are related to the stability of the system of linear equations. We apply the multi-objective formulation with two multi-objective algorithms, NSGA-II and SPEA-II. Additionally, we propose a final decision-maker method to choose the final subset of variables from the Pareto front. The case study uses wheat data obtained by NIR spectrometry, where the objective is to determine a subset of variables that carries information about protein concentration. Results of traditional multivariate calibration techniques, namely the Successive Projections Algorithm (SPA), Partial Least Squares (PLS), and a mono-objective genetic algorithm, are presented for comparison. For NIR spectral analysis of protein concentration in wheat, the SPEA-II algorithm reduced the number of selected variables from 775 spectral variables to just 10. The prediction error decreased from 0.2 with the classical methods to 0.09 with the proposed approach, that is, to 45% of its previous value. The model using variables selected by SPEA-II had better prediction performance than the classical algorithms and full-spectrum PLS.
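The two objectives named in the abstract, and the notion of a Pareto front over them, can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not reproduce NSGA-II or SPEA-II; it only shows how one candidate subset might be scored and how non-dominated candidates are filtered. The function names (objectives, dominates, pareto_front), the use of validation-set RMSE as the prediction error, and the intercept term are all assumptions made for illustration.

```python
import numpy as np

def objectives(X_cal, y_cal, X_val, y_val, mask):
    """Score one candidate subset: (prediction error, number of variables).

    Assumption: prediction error is the RMSE on a separate validation set
    of an ordinary least-squares model with an intercept.
    """
    mask = np.asarray(mask, dtype=bool)
    n_vars = int(mask.sum())
    if n_vars == 0:
        return float("inf"), 0  # an empty model cannot predict anything
    # Fit the linear model on the calibration set using the selected columns.
    A_cal = np.column_stack([np.ones(len(X_cal)), X_cal[:, mask]])
    coef, *_ = np.linalg.lstsq(A_cal, y_cal, rcond=None)
    # Evaluate on the validation set.
    A_val = np.column_stack([np.ones(len(X_val)), X_val[:, mask]])
    rmse = float(np.sqrt(np.mean((A_val @ coef - y_val) ** 2)))
    return rmse, n_vars

def dominates(a, b):
    """True if a is no worse than b in every objective and better in one
    (both objectives are minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Indices of the points that no other point dominates."""
    return [i for i, p in enumerate(points)
            if not any(dominates(q, p) for j, q in enumerate(points) if j != i)]
```

A multi-objective evolutionary algorithm would generate and evolve many such masks; a final decision-maker step, as mentioned in the abstract, would then pick one subset from the indices returned by pareto_front.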
Keywords :
Pareto optimisation; calibration; chemical variables measurement; crops; decision making; genetic algorithms; minimisation; proteins; regression analysis; spectrochemical analysis; NIR spectral analysis; NIR spectrometry; NSGA-II algorithm; PLS algorithm; Pareto front; SPA; SPEA-II algorithm; final decision maker method; linear equations system stability; linear regression model; mono-objective genetic algorithm; multiobjective evolutionary algorithm; multiobjective formulation; multivariate calibration techniques; partial least square algorithm; physicochemical properties; prediction error minimization; protein concentration estimation; protein concentration prediction; spectral variables; spectrophotometer device; subset informative variable selection; subset uncorrelated variable selection; successive projections algorithm; variable subgroup determination; wheat; Calibration; Equations; Genetic algorithms; Input variables; Mathematical model; Prediction algorithms; Predictive models; Multi-objective algorithms; Protein Concentration; linear regression; variable selection;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Evolutionary Computation (CEC), 2013 IEEE Congress on
Conference_Location :
Cancun
Print_ISBN :
978-1-4799-0453-2
Electronic_ISBN :
978-1-4799-0452-5
Type :
conf
DOI :
10.1109/CEC.2013.6557683
Filename :
6557683