A Transformer-CNN Hybrid Autoencoder for Semi-Supervised Plant Disease Detection
1  Department of Computer Systems Technology, College of Science and Technology, North Carolina A&T State University, Greensboro, NC, 27411, United States
Academic Editor: Oscar Vicente

Abstract:

Plant diseases represent a persistent challenge to global food security, causing an estimated 16% annual loss in crop yield and quality. As global food demand is projected to rise by 70% by 2050, developing reliable and scalable detection systems has become increasingly critical. Traditional disease identification methods based on manual inspection are labor-intensive, subjective, and impractical for large-scale agricultural monitoring. Early automated systems that relied on handcrafted features and classical classifiers achieved only limited success because they were unable to adapt to complex, variable field environments. Although deep learning, particularly convolutional neural networks (CNNs), has significantly improved detection performance by learning discriminative features directly from images, these models depend heavily on large, annotated datasets collected under controlled conditions—restricting their generalization to real-world settings. Furthermore, the scarcity of labeled data for rare or emerging diseases limits the scalability of supervised approaches.

Anomaly detection offers a promising solution by training models solely on healthy plant samples and identifying diseases as deviations from normal patterns. However, CNN-based autoencoders used for this task often struggle to capture the long-range dependencies required to recognize subtle or spatially distributed disease symptoms. To address this limitation, we propose a semi-supervised anomaly detection framework built upon a hybrid autoencoder with a Vision Transformer (ViT) encoder backbone. By leveraging the ViT's self-attention mechanism, our model learns rich global representations of healthy foliage and detects anomalies through reconstruction errors. Experimental evaluations on multiple plant disease datasets demonstrate that our framework achieves an F1-score of 80%, a 7% improvement over state-of-the-art anomaly detection methods. These results highlight the framework's potential for robust, scalable, and early detection of plant diseases across diverse agricultural environments.
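To make the described pipeline concrete, the following is a minimal, hypothetical PyTorch sketch of the general idea: a ViT-style encoder over image patches feeding a CNN decoder, trained only on healthy-leaf images, with the per-image reconstruction error used as the anomaly score. All layer sizes, module names, and hyperparameters here are illustrative assumptions, not the authors' exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ViTCNNAutoencoder(nn.Module):
    """Illustrative hybrid autoencoder: transformer encoder + CNN decoder."""
    def __init__(self, img_size=224, patch=16, dim=256, depth=4, heads=8):
        super().__init__()
        self.grid = img_size // patch                         # patches per side
        # Patch embedding: a non-overlapping conv acts as the ViT tokenizer
        self.patch_embed = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)
        self.pos = nn.Parameter(torch.zeros(1, self.grid * self.grid, dim))
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads,
                                           dim_feedforward=4 * dim,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        # CNN decoder: upsample the token grid back to the input resolution
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(dim, 128, 4, stride=4), nn.ReLU(),
            nn.ConvTranspose2d(128, 64, 2, stride=2), nn.ReLU(),
            nn.ConvTranspose2d(64, 3, 2, stride=2), nn.Sigmoid(),
        )

    def forward(self, x):
        tokens = self.patch_embed(x).flatten(2).transpose(1, 2)   # (B, N, dim)
        tokens = self.encoder(tokens + self.pos)                  # global self-attention
        feat = tokens.transpose(1, 2).reshape(x.size(0), -1, self.grid, self.grid)
        return self.decoder(feat)

def anomaly_score(model, x):
    """Per-image mean squared reconstruction error; higher = more anomalous."""
    with torch.no_grad():
        recon = model(x)
    return F.mse_loss(recon, x, reduction="none").mean(dim=(1, 2, 3))
```

In this setup the model is fit with an ordinary MSE reconstruction loss on healthy samples only; at inference, images whose reconstruction error exceeds a threshold calibrated on a healthy validation set are flagged as diseased.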

Keywords: Plant disease detection, Vision Transformer, Autoencoder, Anomaly detection, Unsupervised learning, Precision agriculture.