Comparative Evaluation of CNN and ViT Architectures for Citrus Disease Detection in Field Conditions
Pine View School, Osprey, Florida, 34229, USA
Academic Editor: Sanzidur Rahman

Abstract:

The citrus industry worldwide has been devastated by widespread diseases, particularly greening, canker, and black spot, leading to significant tree losses, orchard closures, and reduced orange production. Traditional inspection methods for detecting these diseases are expensive and inefficient, which motivates a better solution. This study compares the effectiveness of different AI-powered, real-time computer vision architectures in detecting and classifying citrus diseases from imagery. Two object detection models were compared: YOLOv8, a Convolutional Neural Network (CNN), and RT-DETR, a Vision Transformer (ViT). Both models were trained and cross-validated on a custom benchmark dataset of 6,000 citrus images. The dataset comprised 1,500 original images and 4,500 augmented images; the augmented images were split into three difficulty levels to test how the models respond to simulated real-world conditions such as varying lighting and motion blur. Initial training on the original dataset revealed that YOLOv8 outperformed RT-DETR in both accuracy and real-time speed by a slight margin. The weight decay, learning rate, and batch size were fine-tuned via Bayesian optimization. Additional few-shot learning on several other datasets improved performance and speed, resulting in 92.5% mean average precision (mAP) for YOLOv8 versus 87.07% mAP for RT-DETR. While YOLOv8 performed better overall, RT-DETR performed better on the hardest set, indicating its robustness in difficult environmental conditions. The optimized models were deployed on a Raspberry Pi 5 with a camera module and several sensors. Field tests in a citrus grove confirmed successful real-time detection and accurate classification of diseased leaves and fruits, with visual explainability provided through Grad-CAM analysis.
This research showcases the viability of low-cost platforms for object detection and introduces a novel data framework for future research into AI implementation in the citrus industry, allowing for the early detection and rapid treatment of such diseases.
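
For readers unfamiliar with the headline metric, the following is a minimal sketch of how mean average precision is commonly computed for a detector. It assumes each detection has already been matched to ground truth (for example, by an IoU threshold) and uses all-point interpolation of the precision-recall curve. The function names and the toy detections are hypothetical, and this is not necessarily the evaluation protocol used in the study.

```python
from typing import Dict, List, Tuple

# A detection is (confidence, is_true_positive). Matching detections to
# ground-truth boxes (e.g. via an IoU threshold) is assumed to be done upstream.
Detection = Tuple[float, bool]


def average_precision(detections: List[Detection], num_ground_truth: int) -> float:
    """Area under the precision-recall curve for one class (all-point interpolation)."""
    if num_ground_truth == 0 or not detections:
        return 0.0

    # Rank detections from most to least confident.
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)

    precisions: List[float] = []
    recalls: List[float] = []
    true_positives = 0
    for rank, (_, is_tp) in enumerate(ranked, start=1):
        if is_tp:
            true_positives += 1
        precisions.append(true_positives / rank)
        recalls.append(true_positives / num_ground_truth)

    # Replace each precision with the maximum precision at any higher recall,
    # which removes the zig-zag in the raw curve.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Sum the rectangle areas wherever recall increases.
    area = 0.0
    previous_recall = 0.0
    for precision, recall in zip(precisions, recalls):
        area += (recall - previous_recall) * precision
        previous_recall = recall
    return area


def mean_average_precision(per_class: Dict[str, Tuple[List[Detection], int]]) -> float:
    """Unweighted mean of per-class average precision."""
    if not per_class:
        return 0.0
    scores = [average_precision(dets, n_gt) for dets, n_gt in per_class.values()]
    return sum(scores) / len(scores)


# Toy, made-up detections for two of the disease classes named in the abstract.
toy_results = {
    "greening": ([(0.9, True), (0.8, False), (0.7, True)], 2),
    "canker": ([(0.95, True), (0.6, True)], 2),
}
toy_map = mean_average_precision(toy_results)
```

In this sketch a single false positive ranked between two true positives lowers the "greening" score below 1.0, while the "canker" class, with no false positives, scores a full 1.0; the reported mAP is the average of such per-class scores.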

Keywords: artificial intelligence; vision transformers; convolutional neural network; citrus diseases

 
 