Authors:
Rezazadeh Joudi, Ali (author); Sattari, Mohammad Taghi (author)
Keywords:
Suspended sediment load, Sofi Chay, Gaussian process regression, sediment rating curve, support vector regression
Persian abstract:
Because sediment transport is of great importance for the optimal use of water resources and the design of dams, a sufficiently accurate method for estimating the suspended sediment load of rivers is essential. In this study, the suspended sediment load of the Sofi Chay river was estimated with modern data mining methods, namely Gaussian process regression and support vector regression, which, by using kernel functions, are highly capable of solving nonlinear problems. The estimates were then compared with values obtained from the empirical sediment rating curve and seasonal methods. Gaussian process regression, with a correlation coefficient (R) of 0.977, a Nash-Sutcliffe coefficient (N-S) of 0.794, a mean absolute error (MAE) of 77.4278 tons per day, and a root mean squared error (RMSE) of 698.7455 tons per day, had the highest accuracy and the lowest error among the methods examined in this study. The results showed that both data mining methods examined, Gaussian process regression and support vector regression, give considerably better results than the sediment rating curve and the seasonal method.
English abstract:
Introduction
Because of the importance of sediment transport for the efficient use of water resources and the design of dams, estimating the sediment load of rivers has long been of central interest to engineers, and this has led to a variety of methods, including several empirical equations for sediment transport. Errors are common in most traditional empirical methods because the process is complex and a large number of factors influence it, so a method that can estimate the amount of sediment accurately is essential. In this study, the suspended sediment load of the Sofi Chay river was estimated with modern data mining methods, namely Gaussian processes and support vector machines, which use kernel functions and are highly capable of solving nonlinear problems. The results were compared with those of empirical methods, namely the sediment rating curve and the seasonal method.
Materials and Methods
The Study Area
The Sofi Chay catchment has an area of approximately 311 km² and is located in the south of East Azerbaijan province, north of the city of Maragheh. The Sofi Chay river lies between 37°15′2″ and 37°45′3″ north latitude and between 45°56′31″ and 46°25′5″ east longitude.
Gaussian process regression
Gaussian processes are a fruitful way of defining prior distributions for flexible regression and classification models in which the regression or class probability functions are not limited to simple parametric forms. One attraction of Gaussian processes is the variety of covariance functions one can choose from, which lead to functions with different degrees of smoothness or different sorts of additive structure. When such a function defines the mean response in a regression model with Gaussian errors, inference can be carried out with matrix computations, which are feasible for data sets with more than a thousand samples. Gaussian processes are important in statistical modeling because of the properties they inherit from the normal distribution. Gaussian processes and related methods have been used in various contexts for many years. Despite this past usage, and despite the fundamental simplicity of the idea, Gaussian process models appear to have been little appreciated by most Bayesians. This may be partly due to confusion between the properties one expects of the true function being modeled and those of the best predictor for this unknown function.
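The matrix computations behind Gaussian process regression can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the function names, the scalar inputs, and the toy data are hypothetical, the kernel is assumed to have the form exp(-gamma * (a - b)^2), and the noise value is assumed to be a variance added to the diagonal of the covariance matrix. The defaults gamma = 0.5 and noise = 0.01 only mirror the values quoted in the results.

```python
import math

def rbf_kernel(a, b, gamma=0.5):
    """Assumed RBF kernel for scalar inputs: exp(-gamma * (a - b)^2)."""
    return math.exp(-gamma * (a - b) ** 2)

def cholesky(A):
    """Lower-triangular Cholesky factor L of a symmetric positive definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def solve_cholesky(L, b):
    """Solve (L L^T) x = b by forward and then backward substitution."""
    n = len(b)
    z = [0.0] * n
    for i in range(n):
        z[i] = (b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_predict(x_train, y_train, x_new, gamma=0.5, noise=0.01):
    """Posterior mean and variance of a zero-mean GP at a single point x_new."""
    n = len(x_train)
    # Covariance of the training targets: kernel matrix plus noise on the diagonal.
    K = [[rbf_kernel(x_train[i], x_train[j], gamma) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    L = cholesky(K)
    alpha = solve_cholesky(L, y_train)            # (K + noise*I)^-1 y
    k_star = [rbf_kernel(x_new, xi, gamma) for xi in x_train]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve_cholesky(L, k_star)                 # (K + noise*I)^-1 k_star
    var = rbf_kernel(x_new, x_new, gamma) - sum(k * vi for k, vi in zip(k_star, v))
    return mean, var

# Toy, made-up data (not from the study).
mean, var = gp_predict([0.0, 1.0, 2.0], [0.0, 1.0, 4.0], 1.5)
print(mean, var)
```

The predictive variance shrinks near the training points and returns to the prior variance far from them, which is one practical difference from a point estimator such as a rating curve.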
Support vector regression
Support vector machines (SVMs) are learning systems that use a hypothesis space of linear functions in a high-dimensional space called the feature space. These systems are trained with a learning algorithm based on optimization theory. The method was introduced by Vapnik in 1995. SVMs have also been employed for regression estimation, the so-called support vector regression (SVR), in which real-valued functions are estimated. In this case, the aim of the learning process is to find a function f(x) that approximates the value y(x) with minimum risk, based only on the available independent and identically distributed data. In complex nonlinear problems the original input space (the predictor variables) is often nonlinearly related to the predicted variable (here, the suspended sediment load).
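For reference, the standard ε-insensitive formulation of support vector regression can be written as follows. This is the textbook form, given here as an assumption about the general setup rather than as the exact variant used in the study; the symbols w, b, φ, C, ε, ξ_i, ξ_i^*, α_i, and α_i^* are not defined in the original text.

```latex
\begin{aligned}
\min_{w,\,b,\,\xi,\,\xi^*}\quad
  & \tfrac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{n}\bigl(\xi_i + \xi_i^{*}\bigr) \\
\text{subject to}\quad
  & y_i - f(x_i) \le \varepsilon + \xi_i, \\
  & f(x_i) - y_i \le \varepsilon + \xi_i^{*}, \\
  & \xi_i,\ \xi_i^{*} \ge 0, \qquad i = 1,\dots,n,
\end{aligned}
\qquad
f(x) = \langle w, \phi(x)\rangle + b
     = \sum_{i=1}^{n}\bigl(\alpha_i - \alpha_i^{*}\bigr)\,k(x_i, x) + b .
```

Here φ maps the inputs into the feature space, k(x_i, x) = ⟨φ(x_i), φ(x)⟩ is the kernel function, ε is the width of the tube inside which errors are not penalized, and C trades off flatness of f against deviations larger than ε.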
Results and discussions
In this study, after the required data for the Sofi Chay river were collected, they were examined with homogeneity tests, including the standard normal homogeneity, Buishand range, Pettitt, and Von Neumann ratio tests. After the data were refined, the sediment rating curve was constructed. The sediment discharge of the Sofi Chay river was then estimated using Gaussian process regression, support vector regression, the sediment rating curve, and the seasonal method. To obtain the best results from the data mining techniques, various scenarios were defined, covering different types of kernel functions and different ranges of the kernel hyperparameters. When Gaussian process regression with a radial basis function kernel, a Gaussian noise level of 0.01, and a gamma (γ) of 0.5 was used to estimate the sediment discharge of the Sofi Chay river, it had the highest accuracy and lowest error among the methods investigated in this study, with a correlation coefficient (R) of 0.977, a Nash-Sutcliffe coefficient (NS) of 0.794, a mean absolute error (MAE) of 77.4278 tons/day, and a root mean squared error (RMSE) of 698.7455 tons/day. In addition, both data mining methods were considerably more accurate than the sediment rating curve and the seasonal method.
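The four statistical indicators reported above can be computed as in the following minimal sketch, which uses their standard definitions. The function names and the sample values are hypothetical and are not taken from the study.

```python
import math

def mae(observed, simulated):
    """Mean absolute error."""
    return sum(abs(o - s) for o, s in zip(observed, simulated)) / len(observed)

def rmse(observed, simulated):
    """Root mean squared error."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(observed, simulated)) / len(observed))

def correlation(observed, simulated):
    """Pearson correlation coefficient R."""
    n = len(observed)
    mo = sum(observed) / n
    ms = sum(simulated) / n
    cov = sum((o - mo) * (s - ms) for o, s in zip(observed, simulated))
    so = math.sqrt(sum((o - mo) ** 2 for o in observed))
    ss = math.sqrt(sum((s - ms) ** 2 for s in simulated))
    return cov / (so * ss)

def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe coefficient: 1 - sum((o - s)^2) / sum((o - mean(o))^2)."""
    mo = sum(observed) / len(observed)
    num = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    den = sum((o - mo) ** 2 for o in observed)
    return 1.0 - num / den

# Made-up sediment loads in tons/day, for illustration only.
obs = [1.0, 2.0, 3.0, 4.0]
sim = [2.0, 3.0, 4.0, 5.0]
print(correlation(obs, sim), nash_sutcliffe(obs, sim), mae(obs, sim), rmse(obs, sim))
```

In this toy case the simulated series is perfectly correlated with the observations but carries a constant bias, so R stays at its maximum while the Nash-Sutcliffe coefficient drops. This is why a high R and a lower NS, as reported for the study, can occur together.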
Conclusion
In this study, the suspended sediment load was estimated with traditional methods, namely the sediment rating curve and the seasonal method, and compared with modern kernel-based data mining methods, namely Gaussian process regression and support vector regression. The results indicated that the seasonal method performed better than the sediment rating curve in this case. Overall, both data mining methods examined in this study outperformed the traditional methods. Of the two, Gaussian process regression with a radial basis function kernel showed the higher predictive ability. The use of Gaussian process regression is therefore suggested in similar cases.