Suspended sediment simulation using machine learning algorithms and CHIRPS satellite precipitation data, with emphasis on data clustering and the gamma test. Case study: Ramyan Watershed, Golestan Province

Tabatabaei, Mahmoudreza; Salehpour Jam, Amin; Mosaffaie, Jamal

doi:10.22092/ijwmse.2022.358396.1972

Article type: Research article

Authors

Associate Professor, Soil Conservation and Watershed Management Research Institute, Agricultural Research, Education and Extension Organization (AREEO), Tehran, Iran

https://doi.org/10.22092/ijwmse.2022.358396.1972

Abstract

Introduction
The soil erosion cycle of detachment, transport and deposition, which controls the sediment yield of watersheds, consists of complex and highly nonlinear processes. The factors that influence sediment yield are also very diverse, and the weight and role of each factor in sediment production differs greatly from basin to basin, depending on local climate, soils, vegetation, geology, topography and other conditions. Determining and measuring these factors accurately and establishing mathematical relationships between them is often difficult, expensive, time-consuming and error-prone. Models based on computational intelligence, by contrast, can simulate the sediment response of a watershed well using only a limited number of dynamic basin variables. Regardless of the type of intelligent model, most studies (especially domestic ones) have simulated suspended sediment mainly from the flow discharge variable, and less attention has been paid to variables such as precipitation (particularly satellite-derived precipitation) that affect basin sediment yield. In addition to precipitation, the skewness of sediment gauging data is an issue that reduces the efficiency of estimator models when it is not recognized and addressed. The present study investigates the role of daily precipitation obtained from the CHIRPS satellite product in simulating the suspended sediment of the Qarachai River.

Materials and methods
To simulate the daily suspended sediment concentration of the Qarachai River at the Ramian hydrometric station in Golestan Province, a multilayer perceptron (MLP) artificial neural network was used. The model inputs were flow discharge and antecedent flow discharge at instantaneous and daily scales, together with the basin-average daily and antecedent precipitation obtained from CHIRPS, over a 37-year record (1359-1396 in the Solar Hijri calendar, 1980-2017). A self-organizing map (SOM) neural network was used to cluster the data and thereby improve the generalization ability of the models, and the gamma test was used to find the best combination of input variables. To make network training more efficient, several activation and loss functions were tried, along with an algorithm for preventing overfitting. Different scenarios were defined to examine the effect of the activation and loss functions on suspended sediment estimation, which resulted in nine models in total. The efficiency of the models was then evaluated and compared using validation metrics, and the best model was selected.

Results and discussion
The results showed that, among the models, the neural network with the ReLU activation function and the Huber loss function was selected as the best model, with a mean absolute error of 368 mg/L, a root mean square error of 597 mg/L, a Nash-Sutcliffe coefficient of 0.87 and a percent bias of -2.2%. The results also showed that using precipitation, one of the important drivers of erosion and sediment transport in the basin, improved model efficiency. Given how easy CHIRPS satellite precipitation data are to use, they can be used alongside other predictor variables when simulating the suspended sediment of rivers.

Conclusion
In suspended sediment simulation, flow discharge is often used as the only predictor, even though in basins with rainfall-dominated or mixed rain-snow regimes precipitation is very important in generating surface runoff and soil erosion, and therefore plays a major role in producing and transporting sediment. Precipitation data from ground rain gauge stations have been effective in improving the efficiency of data-driven models for estimating suspended sediment. However, preparing hundreds of spatially distributed daily precipitation layers from point gauge data makes this variable difficult to use, because of problems such as too few or poorly distributed rain gauges, gaps in the records, unsuitable interpolation methods and time-consuming calculations. In practice, therefore, river discharge is usually used as the predictor of sediment and precipitation is used less often. One way out of this problem, addressed in the present study, is to use CHIRPS satellite data, which were examined for this purpose for the first time in this study. These data have been available since 1981 and can easily be used for suspended sediment simulation or other watershed-related applications. Another important point in suspended sediment simulation is the high skewness of sediment gauging data (suspended sediment and flow discharge). Neglecting it during training (or calibration) and testing leads to models that perform poorly and whose results are uncertain. Logarithmic transformations, or suitable activation and loss functions, should therefore be used during training; this study proposes the ReLU and Huber functions, respectively. A further important point is the generalization ability of data-driven models, which depends largely on the data used to calibrate or train them.
These data should be selected so that they are representative of the whole record and, at the same time, are similar to the other data sets (such as the validation or test sets) and share the same distribution. Based on the results of this study, and in order to improve the efficiency of artificial neural network models for estimating suspended sediment at the hydrometric stations of watersheds, it is suggested that the experience gained here also be applied at other sediment gauging stations in the country.
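The requirement that the training, validation and test sets share the same distribution can be illustrated with a cluster-stratified split. The following is a minimal sketch, not the authors' procedure: it assumes that each sample already carries a cluster label (for example from a self-organizing map, which is not implemented here), and the function name `stratified_split`, the 60/20/20 fractions and the seed are all hypothetical choices made for illustration.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.6, 0.2, 0.2), seed=0):
    """Split sample indices into train/validation/test subsets.

    labels    : one cluster label per sample (assumed to come from a prior
                clustering step such as a SOM)
    fractions : share of each cluster sent to train and validation; the
                remainder goes to test (illustrative values, not from the paper)
    Returns three lists of sample indices.
    """
    rng = random.Random(seed)
    by_cluster = defaultdict(list)
    for idx, label in enumerate(labels):
        by_cluster[label].append(idx)

    train, valid, test = [], [], []
    for label in sorted(by_cluster):
        members = by_cluster[label]
        rng.shuffle(members)  # random draw within the cluster
        n_train = round(fractions[0] * len(members))
        n_valid = round(fractions[1] * len(members))
        train += members[:n_train]
        valid += members[n_train:n_train + n_valid]
        test += members[n_train + n_valid:]  # whatever is left
    return train, valid, test


if __name__ == "__main__":
    # Ten samples in cluster 0 (e.g. low flows) and twenty in cluster 1.
    demo_labels = [0] * 10 + [1] * 20
    tr, va, te = stratified_split(demo_labels)
    print(len(tr), len(va), len(te))
```

Because every cluster contributes to every subset in the same proportion, rare high-flow or high-sediment events are less likely to end up in only one of the sets.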

Article title [English]

Suspended sediment simulation using machine learning algorithms and CHIRPS satellite precipitation data with emphasis on data clustering and gamma test, case study: Ramyan Watershed, Golestan Province

Authors [English]

Mahmoudreza Tabatabaei
Amin Salehpour Jam
Jamal Mosaffaie

Associate Professor, Soil Conservation and Watershed Management Research Institute, Agricultural Research, Education and Extension Organization (AREEO), Tehran, Iran

Abstract [English]

Introduction
The soil erosion cycle (detachment, transport and deposition), which controls the sediment yield of watersheds, comprises a set of complex and highly nonlinear processes. The factors influencing sediment yield are also very diverse and, depending on the specific climate, soil, vegetation, geology, topography and other conditions of each basin, the weight and role of each factor in sediment production differs greatly. Accurately determining and measuring these factors and building mathematical relationships between them is often difficult, expensive, time-consuming and error-prone. With models based on computational intelligence and a limited number of dynamic basin variables, however, the behavior of a watershed in producing sediment can be simulated well. Regardless of the type of intelligent model, in most studies (especially domestic ones) the simulation of suspended sediment has been based mainly on the discharge variable, and the role of variables such as precipitation (especially precipitation obtained from satellite images), which affect the sediment yield of basins, has received less attention. In addition to precipitation, the skewness of sediment measurement data is an issue that, if not recognized and addressed, reduces the efficiency of estimator models. The present study investigates the role of daily precipitation (taken from the CHIRPS satellite product) in simulating the suspended sediment of the Qarachai River.

Materials and methods
A multilayer perceptron (MLP) artificial neural network was used to simulate the daily suspended sediment concentration of the Qarachai River at the Ramian hydrometric station in Golestan Province. The model input variables were discharge and antecedent discharge (at instantaneous and daily scales) and the basin-average daily and antecedent rainfall (taken from CHIRPS), for a 37-year record (1980-2017). To increase the generalization power of the models, a self-organizing map neural network was used for data clustering, and the gamma test was used to find the best combination of input variables. To improve the efficiency of network training, a variety of activation and loss functions were used, together with an overfitting prevention algorithm. To investigate the effect of the activation and loss functions on suspended sediment estimation, different scenarios were considered, which led to the construction of nine models. The performance of the models in simulating suspended sediment was then evaluated and compared using validation metrics, and the best model was selected.
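The gamma test used for input selection can be illustrated with a small sketch. This is not the authors' implementation, which the abstract does not describe. It follows the usual near-neighbour formulation: for k = 1..p, the mean squared input distance to the k-th nearest neighbour is paired with half the mean squared difference of the corresponding outputs, and the intercept of the fitted line estimates the variance of the output that a smooth model cannot explain. The function name `gamma_test`, the default `p=10` and the synthetic data are assumptions made for illustration.

```python
import heapq
import random


def gamma_test(inputs, outputs, p=10):
    """Estimate the noise variance of `outputs` given `inputs`.

    inputs  : list of equal-length feature vectors
    outputs : list of scalar targets, one per input vector
    p       : number of near neighbours used (illustrative default)
    Returns (gamma, slope), where gamma is the regression intercept.
    """
    m = len(inputs)
    if m <= p:
        raise ValueError("need more samples than neighbours")
    deltas = [0.0] * p  # mean squared input distance to the k-th neighbour
    gammas = [0.0] * p  # half the mean squared output difference
    for i in range(m):
        # The p nearest neighbours of sample i, closest first.
        nearest = heapq.nsmallest(
            p,
            ((sum((a - b) ** 2 for a, b in zip(inputs[i], inputs[j])), j)
             for j in range(m) if j != i),
        )
        for k, (dist2, j) in enumerate(nearest):
            deltas[k] += dist2 / m
            gammas[k] += (outputs[j] - outputs[i]) ** 2 / (2 * m)
    # Ordinary least-squares line: gamma_k = intercept + slope * delta_k.
    mean_d = sum(deltas) / p
    mean_g = sum(gammas) / p
    sxx = sum((d - mean_d) ** 2 for d in deltas)
    sxy = sum((d - mean_d) * (g - mean_g) for d, g in zip(deltas, gammas))
    slope = sxy / sxx
    return mean_g - slope * mean_d, slope


if __name__ == "__main__":
    rng = random.Random(1)
    # Synthetic stand-ins for two predictors and a noisy target.
    xs = [[rng.random(), rng.random()] for _ in range(300)]
    ys = [x[0] + x[1] + rng.gauss(0.0, 0.5) for x in xs]
    print(gamma_test(xs, ys))
```

To compare candidate input combinations, the statistic would be computed once per combination; a smaller value suggests that the chosen inputs leave less unexplained variance in the target.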

Results and discussion
The results showed that, among the different models, the neural network with the ReLU activation function and the Huber loss function was selected as the best model, with a mean absolute error of 368 mg/L, a root mean square error of 597 mg/L, a Nash-Sutcliffe coefficient of 0.87 and a percent bias of -2.2%. The results also showed that using the rainfall variable (one of the important factors causing erosion and sediment transport in the basin) improved the efficiency of the models. Therefore, considering how easy CHIRPS satellite rainfall data are to use, it is suggested that these data be used along with other predictor variables when simulating the suspended sediment of rivers.
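The four validation metrics reported above have standard definitions, sketched below for reference. The function names are illustrative. The sign convention for percent bias varies between references; this sketch assumes 100 · Σ(sim − obs) / Σ obs, so that a negative value means the model underestimates on average. The abstract does not say which convention the authors used.

```python
import math


def mae(obs, sim):
    """Mean absolute error, in the units of the data."""
    return sum(abs(s - o) for o, s in zip(obs, sim)) / len(obs)


def rmse(obs, sim):
    """Root mean square error, in the units of the data."""
    return math.sqrt(sum((s - o) ** 2 for o, s in zip(obs, sim)) / len(obs))


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the mean of obs."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    spread = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / spread


def pbias(obs, sim):
    """Percent bias under the assumed convention 100 * sum(sim - obs) / sum(obs)."""
    return 100.0 * sum(s - o for o, s in zip(obs, sim)) / sum(obs)


if __name__ == "__main__":
    # Made-up concentrations in mg/L, for illustration only.
    observed = [100.0, 200.0, 300.0, 400.0]
    simulated = [110.0, 190.0, 310.0, 380.0]
    print(mae(observed, simulated), rmse(observed, simulated))
    print(nse(observed, simulated), pbias(observed, simulated))
```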

Conclusion
In the simulation of suspended sediment, the discharge variable is often used as the only predictor, while in basins with rainfall or mixed rain-snow regimes precipitation is very important in the production of surface runoff and soil erosion, and it plays an important role in the production and transport of sediment in the basin. Rainfall data obtained from ground rain gauge stations have played an effective role in increasing the efficiency of data-based models in estimating suspended sediment. However, because hundreds of spatially distributed daily rainfall layers must be prepared from the point data of ground stations, using this variable in suspended sediment simulation faces many problems (such as the shortage or unsuitable spatial distribution of rain gauge stations, gaps in the records, inappropriate interpolation methods and time-consuming calculations). Therefore, in practice, river discharge is often used as the predictor of sediment, and precipitation is used less often. One solution to this problem, addressed in the present study, is the use of CHIRPS satellite data, which was investigated for the first time in this study. These data, available since 1981, can easily be used to simulate suspended sediment or for other applications related to watersheds. Another important point in the simulation of suspended sediment is the high skewness of sediment measurement data (both suspended sediment and discharge). Neglecting it during training (or calibration) and testing leads to models that are weak in terms of efficiency and to uncertainty in the accuracy of their results.
In this regard, it is necessary to use logarithmic transformations or suitable activation and loss functions in the training process; in this research, the ReLU and Huber functions, respectively, were proposed. Another important point is the generalization power of data-based models, which depends largely on the data used in their calibration or training. These data should be selected so that, while being representative of the entire record, they are similar to the other data sets (such as the cross-validation or test sets) and have the same distribution. Based on the results of the present research, and in order to increase the efficiency of artificial neural network models in estimating suspended sediment at watershed hydrometric stations, it is suggested that the experience gained in this research be applied at other sediment measuring stations in the country.
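The two functions named in the conclusion have simple closed forms, sketched below in plain Python. ReLU passes positive values through and clips negative ones to zero. The Huber loss is quadratic for small residuals and linear for large ones, which limits the influence of the extreme values typical of skewed sediment data. The threshold `delta=1.0` and the sample values are assumptions; the abstract does not report the settings used.

```python
def relu(x):
    """Rectified linear unit activation: max(0, x)."""
    return max(0.0, x)


def huber_loss(residual, delta=1.0):
    """Huber loss for a single residual (delta is an assumed threshold)."""
    size = abs(residual)
    if size <= delta:
        return 0.5 * residual ** 2       # quadratic region
    return delta * (size - 0.5 * delta)  # linear region


def mean_huber(obs, sim, delta=1.0):
    """Average Huber loss over paired observed and simulated values."""
    return sum(huber_loss(s - o, delta) for o, s in zip(obs, sim)) / len(obs)


def mean_squared(obs, sim):
    """Mean squared error, shown only for comparison."""
    return sum((s - o) ** 2 for o, s in zip(obs, sim)) / len(obs)


if __name__ == "__main__":
    # Made-up, standardized values; the last observation is an extreme event.
    observed = [0.2, 0.4, 0.3, 6.0]
    simulated = [0.3, 0.3, 0.4, 1.0]
    print(mean_squared(observed, simulated))
    print(mean_huber(observed, simulated))
```

In the demo, the single large residual dominates the mean squared error far more than it dominates the mean Huber loss, which is the property that makes the Huber loss attractive for skewed targets.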

Keywords [English]

Daily precipitation
Model
Neural network
Sediment yield
Self-organizing map

Watershed Engineering and Management

Volume 15, Issue 3
Mehr 1402 (September-October 2023)
Pages 328-350
