This study presents the first navigable, georeferenced catalog of olive trees in Apulia (Italy), developed as part of the WADIT (Water Digital Twin) project. Olive canopies were detected with the YOLO11n-seg instance segmentation model, trained on 23,000 manually annotated olive trees across 250 parcels in the province of Barletta-Andria-Trani. The model achieved sensitivity and precision above 92% and a mAP@50 of approximately 95%. Inference was scaled to the entire Apulian region using over 3 TB of AGEA2019 orthophotos accessed via WMS and processed in parallel across 254 threads, covering 460,000 tiles (200 m × 200 m) in 36 hours.
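As a rough illustration of the tiling step, the sketch below generates a grid of 200 m square tiles over a rectangular extent and reproduces the coverage arithmetic implied by the figures above (460,000 tiles × 0.04 km² = 18,400 km²). It is a minimal sketch, not the project's pipeline: the function `tile_grid`, the example extent, and the assumption of a projected metric coordinate system are all hypothetical.

```python
import math
from typing import Iterator, Tuple

# (minx, miny, maxx, maxy) in metres; assumes a projected (metric) CRS
BBox = Tuple[float, float, float, float]


def tile_grid(extent: BBox, tile_size: float = 200.0) -> Iterator[BBox]:
    """Yield square tiles covering `extent`, row by row (hypothetical helper)."""
    minx, miny, maxx, maxy = extent
    # Round up so that partial tiles at the edges are still covered
    n_cols = math.ceil((maxx - minx) / tile_size)
    n_rows = math.ceil((maxy - miny) / tile_size)
    for row in range(n_rows):
        for col in range(n_cols):
            x0 = minx + col * tile_size
            y0 = miny + row * tile_size
            yield (x0, y0, x0 + tile_size, y0 + tile_size)


# Illustrative 1 km x 0.6 km extent (not real coordinates)
tiles = list(tile_grid((0.0, 0.0, 1000.0, 600.0)))

# Coverage and throughput implied by the figures reported in the text
n_tiles = 460_000
area_km2 = n_tiles * (200 * 200) / 1e6   # total tiled area in km^2
tiles_per_hour = n_tiles / 36            # aggregate throughput over 36 hours
```

Each tile's bounding box could then be passed as the `BBOX` parameter of a WMS `GetMap` request, with tiles distributed across worker threads.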
To improve generalization and reduce duplicate detections, omissions, and false positives, an active learning strategy was employed. This iterative approach guided the manual review and targeted re-annotation of ambiguous or error-prone regions, improving the model’s robustness across diverse agricultural landscapes. Post-processing included non-maximum suppression, spatial filtering with Dask-GeoPandas, and validation against the Copernicus Crop Type 2019 layer to exclude detections of non-olive tree species.
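The non-maximum suppression step can be illustrated with a generic greedy variant on axis-aligned bounding boxes, which is one common way to remove duplicate detections such as trees found twice near tile borders. This is a sketch under stated assumptions: the text does not specify the NMS variant, whether it operates on boxes or masks, or the overlap threshold, so the IoU threshold of 0.5 and the helper names `iou` and `nms` are illustrative only.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (minx, miny, maxx, maxy)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: Sequence[Box], scores: Sequence[float], iou_thr: float = 0.5) -> List[int]:
    """Greedy NMS: keep the highest-scoring box, drop boxes overlapping it too much.

    Returns the indices of the kept boxes, in descending score order.
    The default threshold is an assumption, not a value from the study.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_thr for k in keep):
            keep.append(i)
    return keep


# Two overlapping detections of the same canopy plus one separate tree
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)  # the lower-scoring duplicate (index 1) is dropped
```

At regional scale, the pairwise comparison would normally be restricted to nearby candidates through a spatial index or spatial partitioning rather than run over all detections at once.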
The final estimate of 59 million olive trees in 2019 closely aligns with official figures from before the Xylella outbreak, demonstrating the effectiveness of the proposed pipeline. This high-resolution catalog supports the integration of vegetation data into regional water modeling frameworks, contributing to sustainable water resource management. Future work will focus on expanding temporal coverage, improving detection in degraded or high-density canopies, and fully automating the monitoring pipeline.
