Soil moisture content (SMC) is a key environmental variable that influences numerous hydrological and ecological processes. However, its complex and dynamic nature makes it difficult to estimate. Moreover, invalid measurements and data gaps are common in sensor-based SMC monitoring. This study investigates the effectiveness of the Random Forest (RF) machine learning algorithm in reconstructing missing SMC time series at depths of 10 cm, 30 cm, and 50 cm at two agricultural sites in the Arta plain, Greece. Input data included the available SMC time series at the other depths and the Normalized Difference Vegetation Index (NDVI) derived from Sentinel-2 satellite data. RF models were trained on daily SMC data from 2020 to 2021 and validated against 2022 observations. Model performance was evaluated using the Nash–Sutcliffe efficiency (NSE) and the root mean square error (RMSE). The results demonstrated high predictive accuracy, with NSE values up to 0.98 and RMSE as low as 0.33 m³/m³. The best results were achieved when two SMC series were used as inputs. NDVI contributed less to model improvement, possibly because the daily NDVI time series is obtained by temporal interpolation, since NDVI values are not originally available on a daily basis. In addition, some NDVI values are discarded when the satellite image has more than 10% cloud cover. Overall, the study confirms that RF models are effective for imputing missing SMC data and can support irrigation management by reconstructing reliable soil moisture records even with limited sensor information.
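The two evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes the standard textbook definitions of NSE and RMSE; the function names `nse` and `rmse` and the sample values are hypothetical and are not taken from the study.

```python
import math


def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - (sum of squared errors) /
    (sum of squared deviations of the observations from their mean)."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    if sst == 0:
        raise ValueError("NSE is undefined when the observations are constant")
    return 1.0 - sse / sst


def rmse(observed, simulated):
    """Root mean square error, in the same units as the input series."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    mse = sum((o - s) ** 2 for o, s in zip(observed, simulated)) / len(observed)
    return math.sqrt(mse)


# Made-up SMC values for illustration only (not data from the study).
observed = [0.30, 0.28, 0.27, 0.33, 0.31]
predicted = [0.29, 0.28, 0.26, 0.32, 0.31]

print("NSE: ", nse(observed, predicted))
print("RMSE:", rmse(observed, predicted))
```

An NSE of 1 indicates a perfect match, while an NSE of 0 means the model performs no better than predicting the mean of the observations. RMSE is expressed in the units of the measured variable, so it should be read against the typical range of SMC values at the site.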
Soil Moisture Time Series Gap-Filling Using Random Forest Machine Learning Models: A Case Study in the Arta Plain
Published: 06 November 2025 by MDPI in The 9th International Electronic Conference on Water Sciences, session Remote Sensing, Artificial Intelligence and New Technologies in Water Sciences
Keywords: Soil Moisture Content; Random Forest; Machine Learning; Vegetation Index; NDVI