An Evaluation of Machine Learning Algorithm Performance in Crop Recognition Using Remote Sensing: A Case Study in Southern Ukraine
1  Department of Irrigated Agriculture and Decarbonization of Agroecosystems, Institute of Climate-Smart Agriculture, Odessa, 67667, Ukraine
Academic Editor: Bin Gao

Abstract:

Crop recognition from remote sensing data is vital for modern agriculture, enabling dynamic crop mapping, land use monitoring, and cropland structure analysis. Beyond identifying crops, distinguishing irrigated from rainfed croplands supports agricultural water management. This study used monthly Normalized Difference Vegetation Index (NDVI) values for fields in the Kherson and Mykolaiv regions (Ukraine) to classify irrigated and rainfed croplands and crop types via machine learning. NDVI data, sourced from the OneSoil platform, covered grain corn, wheat, sunflower, and soybeans, with equal representation of irrigated and rainfed conditions, forming eight distinct classes. Five algorithms were applied: Linear Discriminant Analysis (LDA), Multiple Logistic Regression (MLR), Support Vector Machine (SVM), Random Forest (RF), and eXtreme Gradient Boosting (XGB). Classification was performed on the original dataset, an augmented dataset (via Gaussian noise), and a normalized dataset. Performance was assessed using k-fold cross-validation, with F1 scores computed for each model in Python 3.13 with relevant libraries. Normalization had no impact on performance, suggesting it can be omitted for monthly NDVI-based crop recognition. All models excelled at separating irrigated from rainfed croplands, with the SVM achieving the highest F1 scores (0.9292 original; 0.9352 augmented); the lowest were LDA on the original dataset (0.8938) and MLR on the augmented dataset (0.8879). Crop type recognition proved more challenging: the best F1 scores were 0.5911 (XGB, original dataset) and 0.6346 (RF, augmented dataset). Two-fold data augmentation with Gaussian noise improved the F1 scores of most models and altered their relative efficacy, with the SVM performing best overall on the augmented dataset (average F1: 0.7839), while XGB led on the original dataset (0.7556). The SVM excelled at distinguishing irrigation status, but simultaneous crop type classification remains difficult, warranting further refinement. These findings highlight the potential of applying machine learning with NDVI data for irrigation classification and the need for improved approaches to crop type identification.
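
The evaluation workflow named in the abstract (Gaussian noise augmentation, k-fold cross-validation, F1 scoring) can be illustrated with a minimal sketch. This is not the authors' code. It uses only the Python standard library, and several things are assumptions made for illustration: the noise level `sigma`, the reading of "two-fold" as the original samples plus one noisy copy, macro averaging of F1, augmenting only the training folds, the synthetic NDVI profiles in `make_toy_data`, and the nearest-centroid classifier, which stands in for the five algorithms in the study.

```python
import random
from collections import defaultdict

def augment_with_gaussian_noise(X, y, sigma=0.02, copies=1, seed=0):
    """Return the original samples plus `copies` noisy duplicates of each.

    X is a list of monthly NDVI profiles; sigma is an assumed noise level.
    """
    rng = random.Random(seed)
    X_aug, y_aug = [list(row) for row in X], list(y)
    for _ in range(copies):
        for row, label in zip(X, y):
            # Add zero-mean Gaussian noise, then clip to the NDVI range [-1, 1].
            noisy = [max(-1.0, min(1.0, v + rng.gauss(0.0, sigma))) for v in row]
            X_aug.append(noisy)
            y_aug.append(label)
    return X_aug, y_aug

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (averaging mode is an assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

def k_fold_indices(n, k=5, seed=0):
    """Shuffle indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def fit_centroids(X, y):
    """Stand-in classifier: one mean NDVI profile per class."""
    groups = defaultdict(list)
    for row, label in zip(X, y):
        groups[label].append(row)
    return {label: [sum(col) / len(rows) for col in zip(*rows)]
            for label, rows in groups.items()}

def predict(centroids, X):
    """Assign each profile to the class with the nearest centroid."""
    def dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    return [min(centroids, key=lambda c: dist(centroids[c], row)) for row in X]

def cross_validated_f1(X, y, k=5, sigma=0.02):
    """Mean macro-F1 over k folds, augmenting only the training part."""
    scores = []
    for fold in k_fold_indices(len(X), k):
        held_out = set(fold)
        X_train = [X[i] for i in range(len(X)) if i not in held_out]
        y_train = [y[i] for i in range(len(X)) if i not in held_out]
        # Noisy copies are created after the split so they cannot leak
        # into the validation fold.
        X_train, y_train = augment_with_gaussian_noise(X_train, y_train, sigma)
        model = fit_centroids(X_train, y_train)
        y_pred = predict(model, [X[i] for i in fold])
        scores.append(macro_f1([y[i] for i in fold], y_pred))
    return sum(scores) / len(scores)

def make_toy_data(n_per_class=20, seed=1):
    """Synthetic, invented NDVI profiles; NOT data from the study."""
    rng = random.Random(seed)
    base = {
        "irrigated": [0.2, 0.3, 0.5, 0.7, 0.8, 0.7, 0.5, 0.3],
        "rainfed": [0.2, 0.3, 0.45, 0.55, 0.5, 0.4, 0.3, 0.2],
    }
    X, y = [], []
    for label, profile in base.items():
        for _ in range(n_per_class):
            X.append([v + rng.gauss(0.0, 0.03) for v in profile])
            y.append(label)
    return X, y

X, y = make_toy_data()
print(f"mean macro-F1 on toy data: {cross_validated_f1(X, y):.4f}")
```

With real data, the stand-in classifier would be replaced by the models under comparison. Whether augmentation is applied before or after the cross-validation split matters for the reported scores, since noisy copies of a validation sample in the training set can inflate them; the abstract does not state which order was used.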

Keywords: croplands; data augmentation; data normalization; F1 score; normalized difference vegetation index; precision.

 
 