English abstract:
Introduction: Sediment transport and the accurate estimation of its rate are significant issues for river
engineers and researchers. So far, various complex relationships have been proposed to predict the
suspended sediment transport rate, such as equations based on velocity and critical shear stress. However, the complex
nature of sediment transport and lack of validated models make it difficult to model the suspended sediment
concentration and suspended sediment discharge carried by rivers. Although the developed models have led to
promising results in sediment transport prediction, the importance of sediment transport and its impact on
hydraulic structures makes it necessary to use other methods with higher efficiency. In recent years,
meta-model approaches have been applied to investigate complex hydraulic and hydrologic phenomena.
Hybrid models involving signal decomposition have also been shown to be effective in improving the
accuracy of time series prediction methods. Complementary Ensemble Empirical Mode
Decomposition analysis is one of the widely used signal decomposition methods for hydrological time series
prediction. Decomposition of time series reduces the difficulty of forecasting, thereby improving forecasting
accuracy.
In this study, because sediment transport and erosion are complex and many parameters affect their
estimation, time series pre-processing methods were combined with two kernel-based approaches, support vector
machine (SVM) and Gaussian process regression (GPR), to estimate the suspended sediment load of a
natural river at two consecutive hydrometric stations. For this purpose, different models were defined based on
hydraulic and sediment particle characteristics. Moreover, the capability of integrated pre-processing and
post-processing methods was investigated in two states: within a single station and between stations. First, the
Wavelet Transform (WT) was used for data pre-processing; then the high-frequency sub-series were selected and
re-decomposed using Empirical Mode Decomposition (EMD). Finally, the most effective sub-series were
used as inputs for the kernel-based models. In addition, Monte Carlo uncertainty analysis was used to assess the
reliability of the superior model. The results showed that the GPR model had an acceptable degree of uncertainty in
modeling.
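
The first pre-processing step described above, splitting a series into a low-frequency approximation and a high-frequency detail, can be illustrated with a minimal sketch. The abstract does not name the mother wavelet, so the Haar wavelet is assumed here purely for simplicity; the function names and the sample values are hypothetical, and the second step (EMD of the detail sub-series) is not shown.

```python
import math

def haar_dwt(signal):
    """One level of the Haar discrete wavelet transform (assumed wavelet).

    Returns (approximation, detail): scaled pairwise sums (low-frequency part)
    and scaled pairwise differences (high-frequency part).
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_idwt(approx, detail):
    """Invert haar_dwt, rebuilding the original samples."""
    s = math.sqrt(2.0)
    signal = []
    for a, d in zip(approx, detail):
        signal.extend([(a + d) / s, (a - d) / s])
    return signal

# Hypothetical discharge values, for illustration only (not data from the study).
series = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 3.0]
approx, detail = haar_dwt(series)      # each sub-series has half the length
rebuilt = haar_idwt(approx, detail)    # the decomposition loses no information
```

Applying haar_dwt again to the approximation would give a deeper multi-level decomposition; in the workflow described above, it is the high-frequency (detail) sub-series that would be passed on for further decomposition.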
Materials and Methods: In this study, data from two stations on the Housatonic River were used. The distance
between the stations is approximately 50 km. The first station is located near Great Barrington, Massachusetts, and
the second station is in Connecticut. The basin areas for the stations are 282 and 634 square miles, respectively. The
flow path is from the first station to the second station. SVM and GPR models are based on the assumption that
adjacent observations should convey information about each other. Gaussian processes are a way of specifying a
prior directly over function space. This is a natural generalization of the Gaussian distribution whose mean and
covariance are a vector and matrix, respectively. Because of this prior knowledge about the data and functional
dependencies, no validation process is required for generalization, and GP regression models are able to infer
the predictive distribution corresponding to the test input. The Wavelet Transform (WT) uses a flexible window
function (mother wavelet) in signal processing. The window can be changed over time according
to the signal shape and compactness. WT decomposes the signal into an approximation (large-scale,
low-frequency) component and detail (small-scale, high-frequency) components. Ensemble EMD (EEMD) was proposed to
solve the mode-mixing issue of empirical mode decomposition (EMD); it defines the true intrinsic mode function
(IMF) as the mean of an ensemble of trials. Each trial consists of the decomposition results of the signal plus white
noise of finite amplitude. EMD can decompose any complex signal into a finite number of IMFs and a residue,
resulting in sub-series with simpler frequency components and stronger correlations that are easier to analyze and
forecast. Another important feature of empirical mode decomposition is that it can be used to denoise
noisy time series, which can improve the accuracy of model predictions. In the uncertainty
analysis, two indices are used to test the robustness of the models and to analyze their uncertainty. The first
is the percentage of the studied outputs that fall within the 95% prediction uncertainty (95PPU) band, and the second
is the average distance between the upper (XU) and lower (XL) uncertainty bands. To compute them, the model is run
many times (1000 times in this study) and the empirical cumulative probability distribution of the model outputs is
calculated. The lower and upper bands are taken at the 2.5% and 97.5% probabilities of the cumulative distribution,
respectively.
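
The two uncertainty indices can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the ensemble is synthetic (Gaussian noise around made-up observations stands in for 1000 model runs), coverage is computed on observed values, and all function names are hypothetical.

```python
import random

def percentile(sorted_values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a sorted list."""
    pos = (len(sorted_values) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (pos - lo)

def uncertainty_bands(ensemble):
    """ensemble[r][t] is the prediction of run r at time step t.

    Returns (lower, upper): the 2.5% and 97.5% percentiles at each time step.
    """
    lower, upper = [], []
    for t in range(len(ensemble[0])):
        values = sorted(run[t] for run in ensemble)
        lower.append(percentile(values, 2.5))
        upper.append(percentile(values, 97.5))
    return lower, upper

def coverage_and_width(observed, lower, upper):
    """Percentage of observations inside the band, and the mean band width."""
    inside = sum(1 for o, lo, hi in zip(observed, lower, upper) if lo <= o <= hi)
    pct = 100.0 * inside / len(observed)
    widths = [hi - lo for lo, hi in zip(lower, upper)]
    return pct, sum(widths) / len(widths)

# Synthetic stand-in for 1000 Monte Carlo runs (assumed data, not from the study).
random.seed(0)
observed = [12.0, 15.0, 9.0, 20.0, 17.0]
ensemble = [[o + random.gauss(0.0, 1.0) for o in observed] for _ in range(1000)]
lower, upper = uncertainty_bands(ensemble)
pct_inside, mean_width = coverage_and_width(observed, lower, upper)
```

A high pct_inside together with a mean_width that is small relative to the spread of the observations would indicate a narrow band that still brackets most of the data.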
Results and Discussion: To evaluate the performance of the tested models and determine
the accuracy of the selected models, three performance criteria were used: Correlation Coefficient (CC),
Determination Coefficient (DC), and Root Mean Square Error (RMSE). The results indicated that the
accuracy of the integrated models was higher than that of the single SVM and GPR models. The use of integrated
methods decreased the error criteria by 20 to 25%. The uncertainty analysis showed
that, in suspended sediment load modeling, the observed and predicted values were within the 95PPU band in most
cases. Moreover, the d-factor values for the training and test datasets were smaller than the
standard deviation of the observed data. Therefore, based on the results, it can be inferred that suspended
sediment modeling via the integrated WT-EEMD-GPR model led to an allowable degree of uncertainty.
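
For reference, the three performance criteria named in this section can be computed as in the sketch below. The abstract does not give the formulas, so the usual definitions are assumed: Pearson correlation for CC, the Nash-Sutcliffe form (one minus the ratio of squared error to the variance of the observations) for DC, and the standard RMSE. The sample values are made up.

```python
import math

def correlation_coefficient(obs, pred):
    """Pearson correlation between observed and predicted values (assumed CC)."""
    mo = sum(obs) / len(obs)
    mp = sum(pred) / len(pred)
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_p = sum((p - mp) ** 2 for p in pred)
    return cov / math.sqrt(var_o * var_p)

def determination_coefficient(obs, pred):
    """1 - SSE / SST, the Nash-Sutcliffe form (assumed definition of DC)."""
    mo = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    sst = sum((o - mo) ** 2 for o in obs)
    return 1.0 - sse / sst

def rmse(obs, pred):
    """Root mean square error, in the same units as the data."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

# Made-up values for illustration only.
obs = [1.0, 2.0, 3.0, 4.0]
pred = [1.1, 1.9, 3.2, 3.8]
cc = correlation_coefficient(obs, pred)
dc = determination_coefficient(obs, pred)
err = rmse(obs, pred)
```

CC and DC approach 1 and RMSE approaches 0 as predictions improve, so a 20 to 25% reduction in an error criterion would correspond to a proportionally smaller RMSE.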
Conclusion: Comparison of the developed models' accuracy revealed that the integrated GPR and SVM models
performed better than the single GPR and SVM models in predicting the suspended sediment
discharge. The use of these two methods decreased the error criteria by approximately 20 to 25%. According
to the results, among the models developed from single-station data, the model with the input parameters
Dwt, Dwt-1, and Dst-1 was superior; for the relationship between the stations, the model with the
input parameters Dst-2, Dwt-1, and Dst-1 was superior. Also, based on the uncertainty analysis, the
integrated GPR model had an allowable degree of uncertainty in suspended sediment modeling. However, it should
be noted that the methods used are data-sensitive. Therefore, further studies using data ranges outside those of this
study, as well as field data, should be carried out to determine the merits of the models for estimating suspended
sediment load under real flow conditions.