From Atlas to Algorithm: A Systematic Review of CNN-Based Bone Age Assessment
1  Escuela de Doctorado y Estudios de Posgrado, Universidad de La Laguna, San Cristóbal de La Laguna 38203, Spain
2  École Nationale Supérieure d’Informatique et d’Analyse des Systèmes (ENSIAS), Mohammed V University in Rabat, Rabat 10010, Morocco
3  Faculty of Health Sciences, Universidad Europea de Canarias, Santa Cruz de Tenerife 38300, Spain
Academic Editor: Emmanuel Andrès

Abstract:

Introduction: Bone age (BA) assessment is crucial in pediatric endocrinology, growth disorder evaluation, and forensic age estimation. Traditional methods, such as the Greulich–Pyle atlas and the Tanner–Whitehouse method, are widely adopted but remain time-consuming, operator-dependent, and prone to inter-observer variability of 0.5–1.0 years. Convolutional Neural Networks (CNNs) offer a promising automated alternative with the potential to improve accuracy and consistency.
Methods: This systematic review followed PRISMA guidelines and was prospectively registered in PROSPERO (CRD42024619808). A comprehensive search, covering 2019 to 2024, was conducted across eight databases: MEDLINE (PubMed), Google Scholar, Scopus (Elsevier), EBSCOhost, Cochrane Library, Web of Science (WoS), IEEE Xplore, and ProQuest. Eligible studies applied CNN-based models to posteroanterior hand–wrist radiographs for BA estimation. Architectures included VGGNet, ResNet, DenseNet, Inception-v4, Inception-ResNet-v2, Xception, MobileNetV2, and EfficientNet, with or without transfer learning. Extracted data covered model architecture, dataset characteristics, training strategy, and performance metrics, focusing on mean absolute error (MAE) and measures of variability (standard deviation [SD] or confidence intervals).
Results: Fifty-five studies met the inclusion criteria. CNN-based models achieved MAEs as low as 0.23 ± 0.02 years (≈2.75 ± 0.24 months), markedly outperforming traditional manual assessment. In large-scale datasets, CNN predictions showed 95% confidence intervals within ±0.4 years, compared with ±1.2 years for expert evaluations. Hybrid and ensemble approaches, which combine CNN outputs with atlas-based scoring, enhanced robustness. Methodological refinements, including preprocessing pipelines, automated region-of-interest detection, U-Net segmentation, and attention mechanisms, optimized feature extraction and reduced error variability.
Conclusions: CNNs provide high-precision BA estimates with substantially lower dispersion than atlas-based methods, with errors smaller than one growth stage in standard atlases. Their integration into clinical workflows could reduce diagnostic variability, accelerate reporting, and enable population-specific calibration. Future work should prioritize multimodal data integration, cross-population validation, and explainable AI to enhance clinical trust and regulatory adoption.
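
For readers unfamiliar with the headline metric, the following is a minimal sketch of how an MAE with an accompanying SD might be computed and converted from years to months. It is illustrative only: the function names `mae_with_sd` and `years_to_months` are hypothetical, the input values are made up, and the use of the sample SD of absolute errors is an assumption, since the reviewed studies may report dispersion differently (for example, SD across cross-validation folds).

```python
import math


def mae_with_sd(predicted_years, reference_years):
    """Return (MAE, sample SD of the absolute errors), both in years."""
    if len(predicted_years) != len(reference_years) or not predicted_years:
        raise ValueError("inputs must be non-empty and of equal length")
    abs_errors = [abs(p - r) for p, r in zip(predicted_years, reference_years)]
    mae = sum(abs_errors) / len(abs_errors)
    # Assumption: dispersion is the sample SD (n - 1 denominator) of the
    # per-case absolute errors; it is 0.0 when only one case is given.
    if len(abs_errors) > 1:
        variance = sum((e - mae) ** 2 for e in abs_errors) / (len(abs_errors) - 1)
    else:
        variance = 0.0
    return mae, math.sqrt(variance)


def years_to_months(value_years):
    """Convert a duration in years to months."""
    return value_years * 12.0


# Made-up illustrative values, not data from any reviewed study.
predicted = [10.2, 7.9, 13.4, 5.1]   # model-estimated bone age (years)
reference = [10.0, 8.0, 13.0, 5.0]   # reference-standard bone age (years)

mae, sd = mae_with_sd(predicted, reference)
print(f"MAE = {mae:.2f} ± {sd:.2f} years "
      f"({years_to_months(mae):.2f} ± {years_to_months(sd):.2f} months)")
```

Applying the same conversion to the best reported result, 0.23 years × 12 gives 2.76 months and 0.02 years × 12 gives 0.24 months, in line with the ≈2.75 ± 0.24 months quoted above; the small difference is consistent with rounding.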

Keywords: Bone age estimation; Convolutional neural networks; Deep learning; Pediatric radiology; Forensic age assessment; Systematic review