Introduction:
Artificial intelligence (AI) is increasingly applied to cone-beam computed tomography (CBCT) for evaluating maxillary sinus pathology; however, diagnostic performance varies across studies, and expert consensus is frequently used as the clinical reference standard. A consolidated assessment of CBCT-based AI systems benchmarked against expert-derived labels is needed to clarify current diagnostic capability.
Methods:
A systematic search of the PubMed and Springer Nature databases was conducted for English-language studies published between 2020 and 2025 using the terms “(AI OR artificial intelligence) AND CBCT AND maxillary sinus.” Following title and abstract screening, full-text articles were reviewed for eligibility. Of 212 records identified, seven studies met the inclusion criteria. Data were extracted using a predefined framework capturing study design, CBCT imaging characteristics, AI model type, target pathology, expert-consensus reference standard, and reported diagnostic performance metrics. Owing to methodological heterogeneity, findings were synthesized narratively.
Results:
Seven CBCT-based studies using expert-consensus reference standards were included. Across all studies, AI systems demonstrated high diagnostic performance for maxillary sinus pathology relative to expert consensus. Classification models accurately identified mucosal thickening, mucous retention cysts, sinusitis, chronic rhinosinusitis, fungal ball, and polypoid lesions, with studies reporting consistently strong sensitivity, specificity, and F1-scores. Segmentation-focused studies showed high agreement with expert-defined pathology boundaries, frequently achieving Dice scores >0.85. Diagnostic performance improved with CBCT denoising and generative adversarial network (GAN)-based data augmentation. In addition, 3D CBCT models accurately predicted sinus-lift approach and mucosal-perforation risk, aligning closely with expert assessments.
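For readers less familiar with the metrics named above, the following is a minimal sketch of how sensitivity, specificity, F1-score, and the Dice coefficient are conventionally computed against expert-consensus labels. It is illustrative only: the function names and the toy labels are assumptions introduced here, and it does not reproduce the evaluation code or data of any included study.

```python
def classification_metrics(y_true, y_pred):
    """Sensitivity, specificity, and F1 from binary labels (1 = pathology present).

    y_true: expert-consensus labels; y_pred: model predictions (hypothetical inputs).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    nan = float("nan")
    sensitivity = tp / (tp + fn) if (tp + fn) else nan
    specificity = tn / (tn + fp) if (tn + fp) else nan
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else nan
    return {"sensitivity": sensitivity, "specificity": specificity, "f1": f1}


def dice_score(mask_true, mask_pred):
    """Dice = 2|A ∩ B| / (|A| + |B|) over flattened binary segmentation masks."""
    intersection = sum(1 for a, b in zip(mask_true, mask_pred) if a and b)
    total = sum(1 for a in mask_true if a) + sum(1 for b in mask_pred if b)
    # Convention assumed here: two empty masks count as perfect agreement.
    return 2 * intersection / total if total else 1.0


# Toy example (made-up labels, not study data)
expert = [1, 1, 1, 0, 0, 0, 0, 1]
model = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = classification_metrics(expert, model)
dice = dice_score([1, 1, 0, 0], [1, 0, 0, 0])
```

Because F1 and Dice are algebraically the same quantity for binary outcomes (F1 over cases, Dice over voxels), the reported Dice threshold of >0.85 can be read as the voxel-level analogue of the classification F1-scores.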
Conclusions:
Deep-learning systems evaluated using expert-consensus reference standards demonstrate high diagnostic accuracy for maxillary sinus pathology on CBCT and show strong potential to support clinical decision-making. Further standardization of reference definitions, evaluation metrics, and reporting is needed to facilitate broader clinical translation.
