Introduction: Wheat is a major crop in Pakistan, underpinning the country's food supply and economic stability. Accurate yield prediction is necessary to maximize output, reduce post-harvest waste, and conserve resources. Conventional approaches to yield prediction are frequently imprecise and fail to account for how climatic conditions affect crop growth. This research aims to develop an AI-driven framework for wheat yield prediction.
Methods: This research uses 23 years of historical agro-meteorological data retrieved from the Meteoblue archives, together with actual historical yield (acres) from the Pakistan Bureau of Statistics. The features are evapotranspiration (mm), mean sea level pressure (hPa), mean soil moisture (m³/m³, 7-28 cm depth), mean soil moisture available to plants (fraction, 7-28 cm depth), mean relative humidity (%), minimum temperature (°C, 2 m elevation), and mean soil temperature (°C, 7-28 cm depth). Several machine learning models were trained and tested. After the data were preprocessed and converted to a time series with lagged features, a two-layer Long Short-Term Memory (LSTM) network performed best on all evaluation measures.
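As a rough illustration of what "converting to a time series with lagged features" can look like, the sketch below builds sliding windows of past observations as model inputs, with the following step's yield as the target. It is a minimal sketch, not the authors' pipeline: the function name `make_lagged_windows`, the lag count, and the toy values are all hypothetical, and the abstract does not state how many lags or which framework was used.

```python
from typing import List, Sequence, Tuple


def make_lagged_windows(
    features: Sequence[Sequence[float]],
    target: Sequence[float],
    n_lags: int,
) -> Tuple[List[List[List[float]]], List[float]]:
    """Build (samples, n_lags, n_features) windows for a sequence model.

    Each sample holds the feature rows of the n_lags preceding time steps;
    its label is the target at the current time step.
    n_lags is an assumed hyperparameter, not a value taken from the study.
    """
    if len(features) != len(target):
        raise ValueError("features and target must have the same length")
    if not 0 < n_lags < len(features):
        raise ValueError("n_lags must be between 1 and len(features) - 1")

    windows: List[List[List[float]]] = []
    labels: List[float] = []
    for t in range(n_lags, len(features)):
        # Rows t - n_lags .. t - 1 form the input window for step t.
        windows.append([list(row) for row in features[t - n_lags:t]])
        labels.append(target[t])
    return windows, labels


# Toy data (hypothetical): 5 time steps, 2 features per step.
toy_features = [[0.1, 0.9], [0.2, 0.8], [0.3, 0.7], [0.4, 0.6], [0.5, 0.5]]
toy_yield = [1.0, 1.1, 1.2, 1.3, 1.4]

X, y = make_lagged_windows(toy_features, toy_yield, n_lags=2)
print(len(X), len(X[0]), len(X[0][0]))
```

Windows shaped as (samples, time steps, features) are the usual input layout for an LSTM layer. With only 23 years of data, the number of windows shrinks as the lag count grows, so the lag length trades temporal context against sample size.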
Results: Several of the evaluated models performed well in early tests, but the deep learning-based LSTM model was selected because of its strength on time-series data, which improved the accuracy of yield forecasts. By capturing the temporal dependencies among the features, the model achieved an R² score of 0.979, a mean squared error (MSE) of 0.0004, a root mean squared error (RMSE) of 0.0201, and a mean absolute error (MAE) of 0.0111 on the test set.
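For reference, the four reported measures follow from standard definitions, sketched below in plain Python. The function name `regression_metrics` and the toy values are hypothetical and are not the study's data. Since RMSE is the square root of MSE, the reported RMSE of 0.0201 agrees with the reported MSE of 0.0004 up to rounding.

```python
import math
from typing import Dict, Sequence


def regression_metrics(
    y_true: Sequence[float], y_pred: Sequence[float]
) -> Dict[str, float]:
    """Compute MSE, RMSE, MAE, and R² from paired observations and predictions."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]

    ss_res = sum(e * e for e in errors)        # residual sum of squares
    mse = ss_res / n
    rmse = math.sqrt(mse)
    mae = sum(abs(e) for e in errors) / n

    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R² is undefined when y_true is constant")
    r2 = 1.0 - ss_res / ss_tot

    return {"mse": mse, "rmse": rmse, "mae": mae, "r2": r2}


# Toy values (hypothetical), chosen so the arithmetic is easy to follow.
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
print(metrics)
```

MSE and RMSE penalize large errors more heavily than MAE, so an RMSE noticeably larger than the MAE, as reported here, suggests that a few test predictions carry most of the error.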
Conclusion: The results demonstrate that deep learning is a suitable approach for agricultural forecasting in environmentally sensitive regions such as Pakistan. Future research should focus on improving the generalizability of the model and applying the technique to other staple crops to broaden its agricultural relevance.