Paddy (Oryza sativa) is a staple food that feeds over 50% of the global population, and Sri Lanka has been producing it for centuries. Because of the rapid increase in population, forecasting paddy yield has become essential. This research aims to predict district-wise paddy yield for Sri Lankan agro zones using an XGB regressor-based stacked ensemble learning framework built on openly available data: CHIRPS 2.0, the NASA POWER APIs, the pH and salinity maps of the Sri Lankan Rice Research and Development Institute, and paddy statistics from the Department of Census and Statistics - Sri Lanka. Data from 2004 to 2024 were collected for the three major agro-climatic zones and 25 administrative districts, covering the two harvesting seasons, “Yala” and “Maha”. The target is the total paddy production per district in metric tons (MT), which ranges from 185 MT to 530,356 MT: the lowest value was recorded for the “Mannar” district in the 2006 “Yala” season and the highest for the “Anuradhapura” district in the 2019-2020 “Maha” season. As the foundation for dataset construction, the crop calendar template provided by the Department of Agriculture - Sri Lanka was used to simulate end-to-end crop harvesting patterns over the selected time span. By combining these harvesting simulations with climate variables, soil properties and historical yield records, we created 12 heterogeneous datasets. These datasets were used to train 12 base models, whose out-of-fold predictions were then fed into two intermediate meta models. Finally, the outputs of the intermediate meta models were stacked in a final meta model, which achieved an RMSE of 3,535 MT, an R² of 0.9986 and an NRMSE of 0.76% on the validation set, indicating that the model can predict district-level seasonal paddy yield with high accuracy. These results highlight the potential of integrating openly available data to support reliable, data-driven decision making for sustainable paddy production in Sri Lanka.
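The three validation metrics quoted in the abstract can be illustrated with a short sketch. The function names and the toy values below are hypothetical and are not taken from the study. The abstract does not state how the NRMSE was normalized, so the sketch exposes two common conventions (dividing the RMSE by the range or by the mean of the observed values) as an explicit assumption.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error, in the same unit as the target (e.g. MT)."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def nrmse(y_true, y_pred, normalizer="range"):
    """RMSE divided by a scale of the observed values.

    Assumption: the source does not say which scale it used, so both the
    range-based and the mean-based conventions are offered here.
    """
    if normalizer == "range":
        scale = max(y_true) - min(y_true)
    elif normalizer == "mean":
        scale = sum(y_true) / len(y_true)
    else:
        raise ValueError("normalizer must be 'range' or 'mean'")
    return rmse(y_true, y_pred) / scale


# Toy district-level production values in MT (illustrative only).
observed = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]

print(f"RMSE  = {rmse(observed, predicted):.2f} MT")
print(f"R^2   = {r_squared(observed, predicted):.4f}")
print(f"NRMSE = {100 * nrmse(observed, predicted, 'range'):.2f}% (range-normalized)")
print(f"NRMSE = {100 * nrmse(observed, predicted, 'mean'):.2f}% (mean-normalized)")
```

Because the two conventions can give noticeably different percentages for the same RMSE, an NRMSE figure is easiest to compare across studies when the normalizer is reported alongside it.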
AI Driven Paddy (Oryza sativa) Yield Forecasting Using Open Satellite Data, Weather APIs and Historical Data for Sri Lankan Agro Zones
Published: 11 December 2025
by MDPI
in The 5th International Electronic Conference on Agronomy
session Precision and Digital Agriculture
Keywords: machine learning; Oryza sativa; paddy; precision agriculture; yield forecasting
