Comprehensive analysis of drinking water quality using machine learning techniques
1 , * 2
1  Department of Computer Science and Engineering, Saintgits College of Engineering, Kottayam, India
2  Department of Electronics Engineering, Saintgits College of Engineering, Kottayam, India
Academic Editor: Junye Wang

Abstract:

Ensuring the safety and quality of drinking water is crucial for public health, particularly in regions where water contamination is a significant concern. This study investigates the application of machine learning techniques for water quality analysis in the Indian state of Kerala. A total of 328 water samples were collected and analyzed for various parameters including pH, dissolved oxygen, total coliform, fecal coliform, conductivity, nitrate, and biochemical oxygen demand. These parameters were used to compute the Water Quality Index (WQI), which was subsequently classified into four categories: clean, unclean, polluted, and highly polluted. Five machine learning classifiers were employed to classify the water quality data: Support Vector Machine (SVM), Decision Tree (DT), k-Nearest Neighbors (k-NN), Logistic Regression (LR), and XGBoost. The classifiers were trained and tested on the dataset to determine their accuracy in predicting water quality classes. Among these, XGBoost emerged as the most accurate classifier, achieving a classification accuracy of 91%. The study highlights the effectiveness of machine learning in environmental monitoring and demonstrates the potential of these techniques to aid in water quality management. The high accuracy of XGBoost suggests that it can be a valuable tool for predicting water quality and identifying areas at risk of pollution. By providing reliable classifications, machine learning models can support decision-makers in implementing timely and appropriate interventions to ensure the safety and cleanliness of drinking water.

Keywords: Water quality analysis; machine learning; drinking water safety; Water Quality Index (WQI); water pollution detection
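
The step from measured parameters to a WQI class can be illustrated with a minimal sketch. It assumes a weighted arithmetic index over per-parameter sub-indices that are already rated on a 0 to 100 scale (higher meaning cleaner). The abstract does not state the formula, weights, or class boundaries used in the study, so `WEIGHTS`, `CLASS_BOUNDS`, and both function names below are hypothetical placeholders chosen only for illustration.

```python
from typing import Dict

# Hypothetical relative weights for the seven parameters named in the abstract.
# The values used in the study are not given there; these are placeholders.
WEIGHTS: Dict[str, float] = {
    "ph": 0.15,
    "dissolved_oxygen": 0.20,
    "total_coliform": 0.15,
    "fecal_coliform": 0.15,
    "conductivity": 0.10,
    "nitrate": 0.10,
    "bod": 0.15,
}

# Hypothetical lower bounds (0-100 scale, higher = cleaner) for the first
# three classes; anything below the last bound is "highly polluted".
CLASS_BOUNDS = [(75.0, "clean"), (50.0, "unclean"), (25.0, "polluted")]


def water_quality_index(sub_indices: Dict[str, float]) -> float:
    """Weighted arithmetic mean of per-parameter sub-indices (each 0-100)."""
    missing = set(WEIGHTS) - set(sub_indices)
    if missing:
        raise ValueError(f"missing sub-indices: {sorted(missing)}")
    total_weight = sum(WEIGHTS.values())
    weighted_sum = sum(WEIGHTS[name] * sub_indices[name] for name in WEIGHTS)
    return weighted_sum / total_weight


def wqi_class(wqi: float) -> str:
    """Map a WQI value to one of the four classes named in the abstract."""
    for lower_bound, label in CLASS_BOUNDS:
        if wqi >= lower_bound:
            return label
    return "highly polluted"
```

Labels produced this way would serve as the target variable for the classifiers, with the measured parameters as input features. Under the same assumptions, the weights and boundaries should be replaced by whichever WQI standard a given study adopts.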
