Title :
Comparative studies of model performance based on different data sampling methods
Author :
You Lv ; Jizhen Liu ; Tingting Yang
Author_Institution :
State Key Lab. of Alternate Electr. Power Syst. with Renewable Energy Sources, North China Electr. Power Univ., Beijing, China
Abstract :
This paper presents a comparative study of the effects of different data sampling methods on the performance of data-driven models. An engineering benchmark modeling problem is investigated, for which three sampling methods (orthogonal Latin sampling, uniform design sampling and random sampling) are used to generate training data with different properties. Six typical data-driven modeling techniques, comprising artificial intelligence methods (least squares support vector machine, BP neural network and RBF neural network) and statistical methods (multiple linear regression, and linear and nonlinear partial least squares regression), are applied for the comparison. The root mean square error (RMSE), R square (R²) and mean relative error (MRE) are taken as the comparison criteria. The results reveal that data sampling and data properties play a key role in establishing an accurate data-driven model.
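The three comparison criteria named in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact formulas, so the code below assumes the common textbook definitions: RMSE as the square root of the mean squared residual, R² as one minus the ratio of residual to total sum of squares, and MRE as the mean of absolute residuals divided by the absolute true values. The function names and the sample data are hypothetical and are not taken from the paper.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error (assumed standard definition)."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_square(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mre(y_true, y_pred):
    """Mean relative error: mean of |t - p| / |t| (assumed definition).

    Requires every true value to be nonzero.
    """
    n = len(y_true)
    return sum(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)) / n


# Hypothetical measured outputs and model predictions, for illustration only.
y_true = [1.0, 2.0, 4.0]
y_pred = [1.0, 2.0, 2.0]

print(rmse(y_true, y_pred))
print(r_square(y_true, y_pred))
print(mre(y_true, y_pred))
```

Under these assumed definitions, a lower RMSE and MRE and an R² closer to 1 indicate a more accurate model, which is how such criteria are typically used to rank models trained on differently sampled data.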
Keywords :
backpropagation; benchmark testing; data models; mean square error methods; radial basis function networks; regression analysis; sampling methods; support vector machines; BP neural network; MRE values; R square values; RBF neural network; RMSE values; artificial intelligent methods; data property; data sampling methods; data-driven model performance; data-driven modeling techniques; engineering benchmark modeling problem; least squares support vector machine; mean relative error values; multiple linear regression; nonlinear partial least squares regressions; orthogonal Latin sampling; random sampling; root mean square error; statistical methods; uniform design sampling; Predictive models; Testing; Training data; Vectors; artificial neural network; data-driven model; partial least squares; uniform design
Conference_Title :
Control and Decision Conference (CCDC), 2013 25th Chinese
Conference_Location :
Guiyang
Print_ISBN :
978-1-4673-5533-9
DOI :
10.1109/CCDC.2013.6561406