The Application of Multivariate Statistics and Geospatial and Machine Learning Techniques to the Prediction of Water's Suitability for Irrigation in the Sokoto–Rima Catchment in Nigeria
1  Department of Geological Sciences, Achievers University, Owo
2  Department of Civil Engineering, Ladoke Akintola University of Technology, Ogbomosho
3  Department of Earth Sciences, Achievers University, Owo
4  Department of Computer Sciences, Achievers University, Owo
Academic Editor: ATHANASIOS LOUKAS

Abstract:

In the Sokoto–Rima catchment, over 70% of the population depends on groundwater for subsistence farming. Conventional water quality assessment is expensive because it requires several parameters, so an accurate and reliable model is essential for managing water resources in support of effective agricultural practice. This study applied multivariate statistics, geospatial analysis and machine learning (ML) models to groundwater chemistry and irrigation indices to determine their classification and spatial–temporal distribution. MS Excel was used to calculate the irrigation suitability parameters, such as SAR, KR, ESP, Na% and PI. PAST4.0 statistical software and machine learning algorithms, namely multiple linear regression (MLR), Decision Tree (DT), Random Forest (RF), Support Vector Machine (SVM) and K-Nearest Neighbors (K-NN), were then used for the model predictions. The predictions were based first on a composite index that combined all the indices (SAR, KR, ESP, Na%, MAR, PI) into a single score representing the overall irrigation water quality, and then on SAR and subsequently on other indices such as KR, ESP, Na%, MAR and PI. The IDW interpolation technique was used to generate spatial distribution maps for each parameter of the groundwater dataset. The fifty-element (50) water chemistry dataset, obtained from the archive of the Federal Ministry of Water Resources, yielded four (4) clusters in the hierarchical cluster analysis, along with two (2) principal components, PC1 and PC2, representing the major geochemical processes controlling the groundwater quality. Very strong correlations were observed for EC-Ca (0.85), KR-SAR (0.95) and Na%-ESP (0.85). For the composite index, multiple linear regression showed a low MSE of 0.00 and a high R of 1.00, while the DT and RF models showed R values of 0.6 and 0.63 and MSE values of 68.5 and 67.86, respectively.
Predicting PI as the target variable from KR, SAR, MAR, Na% and ESP with RF demonstrated notable predictive capability, with a low RMSE of 13.3 and a high R of 0.9836. K-NN performed robustly with Na% as the target variable, as did DT and RF with ESP, while MLR showed strong predictive performance for SAR.
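
As an illustration of the index calculations mentioned above, the following is a minimal sketch in Python of how several of the irrigation indices could be computed from major-ion concentrations. It assumes the commonly used textbook definitions of SAR, KR, Na%, MAR and PI with all concentrations in meq/L; the abstract does not state the exact formulas or the tool workflow used in the study (which relied on MS Excel), so the function name `irrigation_indices` and the sample values are hypothetical and not taken from the dataset. ESP is omitted because several alternative formulas are in use.

```python
import math


def irrigation_indices(na, k, ca, mg, hco3):
    """Compute common irrigation suitability indices.

    All inputs are ion concentrations in meq/L (an assumption of this
    sketch; the source abstract does not give units or formulas).
    """
    divalent = ca + mg  # combined Ca2+ and Mg2+
    return {
        # Sodium adsorption ratio
        "SAR": na / math.sqrt(divalent / 2.0),
        # Kelly's ratio
        "KR": na / divalent,
        # Sodium percentage (Na+ and K+ over total major cations)
        "Na%": 100.0 * (na + k) / (divalent + na + k),
        # Magnesium adsorption ratio
        "MAR": 100.0 * mg / divalent,
        # Permeability index (Doneen form)
        "PI": 100.0 * (na + math.sqrt(hco3)) / (divalent + na),
    }


# Hypothetical sample, for illustration only (not from the study's data)
sample = irrigation_indices(na=2.0, k=0.5, ca=3.0, mg=1.0, hco3=4.0)
for name, value in sample.items():
    print(f"{name}: {value:.2f}")
```

Indices computed this way could then serve as features or targets for the regression models named in the abstract, although the study's actual feature construction is not described here.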

Keywords: Irrigation Suitability; Geospatial analysis; Machine Learning Models; Multivariate analysis; Sokoto-Rima Catchment
