Al Hajaj SW, Soliman K, Zafar M, Garnham C, Al Hajaj D, Elshafie O, Alsswah A, Elwan MH. From subtle breaks to missed diagnoses: Real-world evaluation of an artificial intelligence fracture detection tool. World J Orthop 2026; 17(4): 113710 [DOI: 10.5312/wjo.v17.i4.113710]
Corresponding Author of This Article
Sari Wathiq Al Hajaj, MD, Department of Trauma and Orthopaedics, Kettering General Hospital, NHS Foundation Trust, Kettering NN16 8UZ, Northamptonshire, United Kingdom. sarialhajaj95@gmail.com
Research Domain of This Article
Orthopedics
Article-Type of This Article
Retrospective Cohort Study
Open-Access Policy of This Article
This article is an open-access article which was selected by an in-house editor and fully peer-reviewed by external reviewers. It is distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/
Journal Information of This Article
Publication Name
World Journal of Orthopedics
ISSN
2218-5836
Publisher of This Article
Baishideng Publishing Group Inc, 7041 Koll Center Parkway, Suite 160, Pleasanton, CA 94566, USA
World J Orthop. Apr 18, 2026; 17(4): 113710 Published online Apr 18, 2026. doi: 10.5312/wjo.v17.i4.113710
From subtle breaks to missed diagnoses: Real-world evaluation of an artificial intelligence fracture detection tool
Sari Wathiq Al Hajaj, Khaled Soliman, Mahira Zafar, Callum Garnham, Dawod Al Hajaj, Omar Elshafie, Ahmad Alsswah, Mohammed H Elwan
Sari Wathiq Al Hajaj, Department of Trauma and Orthopaedics, Kettering General Hospital, Kettering NN16 8UZ, Northamptonshire, United Kingdom
Khaled Soliman, Mahira Zafar, Callum Garnham, Omar Elshafie, Ahmad Alsswah, Mohammed H Elwan, Emergency Medicine, Kettering General Hospital, Kettering NN16 8UZ, Northamptonshire, United Kingdom
Dawod Al Hajaj, Altinbaş University, Istanbul 34360, Türkiye
Co-first authors: Sari Wathiq Al Hajaj and Khaled Soliman.
Author contributions: Al Hajaj SW and Soliman K were responsible for study design and conceptualization; Zafar M, Garnham C, and Al Hajaj D performed data collection and curation; Elshafie O and Alsswah A performed statistical analysis and data validation; Elwan MH and Al Hajaj SW interpreted the findings and drafted the manuscript. All authors critically revised the manuscript, approved the final version, and agreed to be accountable for all aspects of the work. Al Hajaj SW and Soliman K contributed equally to this work as co-first authors.
Institutional review board statement: The study was conducted in accordance with the Declaration of Helsinki. Ethical approval was obtained from the Research and Ethics Committee of Kettering General Hospital NHS Foundation Trust.
Informed consent statement: As this was a retrospective study using anonymised radiology data with no patient identifiers collected, informed consent was not required.
Conflict-of-interest statement: The authors declare no conflicts of interest related to this study.
STROBE statement: The authors have read the STROBE Statement-checklist of items, and the manuscript was prepared and revised according to the STROBE Statement-checklist of items.
Data sharing statement: The anonymised dataset underlying this article is available from the corresponding author upon reasonable request.
Corresponding author: Sari Wathiq Al Hajaj, MD, Department of Trauma and Orthopaedics, Kettering General Hospital, NHS Foundation Trust, Kettering NN16 8UZ, Northamptonshire, United Kingdom. sarialhajaj95@gmail.com
Received: September 1, 2025 Revised: October 2, 2025 Accepted: January 14, 2026 Published online: April 18, 2026 Processing time: 221 Days and 10.1 Hours
Abstract
BACKGROUND
Artificial intelligence (AI) shows promise in musculoskeletal imaging, particularly for detecting fractures in emergency settings. Nonetheless, doubts persist regarding its diagnostic accuracy, its common error types, and the clinical impact of false-negative (FN) and false-positive (FP) results.
AIM
To evaluate the diagnostic performance of an AI fracture detection system against radiology reports and to analyse the causes and outcomes of FP and FN cases.
METHODS
We retrospectively reviewed radiographic examinations over 3 months interpreted by both AI and radiologists. Radiology reports served as the standard of reference. Diagnostic accuracy metrics were calculated from the primary dataset. FP and FN cases were further examined using dedicated outcome sheets with explanatory notes categorised into predefined reason groups, and clinical outcomes classified into standardised categories.
RESULTS
The AI algorithm achieved an accuracy of 94.0%, a sensitivity of 89.9%, a specificity of 96.1%, a negative predictive value of 96.4%, and a positive predictive value of 89.8%. Out of 563 fractures identified in radiology reports, 54 (9.6%) were missed by AI. Most false negatives (92.6%) were due to subtle or minimally displaced fractures, primarily at the wrist and ankle. False positives (n = 58) were mainly caused by degenerative changes, growth plate variations, or healed fractures (91.4%). FN cases had greater clinical impact, often requiring fracture clinic follow-up (61.1%), additional imaging (14.8%), or hospital admission (5.6%), whereas FP cases mainly led to unnecessary follow-up or imaging.
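The metrics above follow the standard confusion-matrix definitions, with the radiology report as the reference standard. The following is a minimal sketch, not the authors' analysis code: the function names are illustrative, and only the counts stated in the abstract (563 reported fractures, 54 FN, 58 FP) are used. The abstract does not state the true-negative count, so specificity, negative predictive value, and accuracy are defined but not evaluated here. Sensitivity recomputed from these counts comes out near 90.4%, slightly above the reported 89.9%; the abstract does not give the full confusion matrix, so the sketch does not attempt to reconcile the two.

```python
# Standard diagnostic accuracy definitions; the radiology report is the
# reference standard. Function names are illustrative assumptions.

def sensitivity(tp: int, fn: int) -> float:
    """Share of reported fractures that the AI also flagged."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Share of fracture-free examinations that the AI left unflagged."""
    return tn / (tn + fp)


def ppv(tp: int, fp: int) -> float:
    """Share of AI fracture flags confirmed by the report."""
    return tp / (tp + fp)


def npv(tn: int, fn: int) -> float:
    """Share of AI-negative examinations that the report also called negative."""
    return tn / (tn + fn)


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Share of all examinations on which AI and report agree."""
    return (tp + tn) / (tp + tn + fp + fn)


# Counts taken from the abstract.
reported_fractures = 563
fn = 54
fp = 58

# Derived here, not stated in the abstract: fractures the AI did detect.
tp = reported_fractures - fn

miss_rate = fn / reported_fractures

print(f"True positives (derived): {tp}")
print(f"Miss rate: {miss_rate:.1%}")
print(f"Positive predictive value: {ppv(tp, fp):.1%}")
print(f"Sensitivity from these counts: {sensitivity(tp, fn):.1%}")
```

Keeping each metric as a separate function makes it explicit which counts it depends on: the miss rate and positive predictive value need only the three counts given in the abstract, whereas specificity, negative predictive value, and accuracy cannot be checked without the true-negative count.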
CONCLUSION
AI demonstrated strong diagnostic performance, comparable to that reported in the published literature, but remains limited in detecting subtle and anatomically complex fractures. FN errors pose risks of delayed or missed treatment, while FP errors increase resource utilisation. AI should be integrated as a triage and decision-support tool with radiologist oversight, and future refinement should target wrist and ankle injuries and better differentiation of chronic from acute findings.
Core Tip: Artificial intelligence (AI) is increasingly utilised in musculoskeletal imaging; however, its clinical utility in fracture detection remains a subject of debate. In this retrospective diagnostic accuracy study, we evaluated an AI-based fracture detection system against radiologist reports across more than 2000 limb radiographs. The AI demonstrated high accuracy but failed to identify approximately 10% of fractures, primarily subtle or minimally displaced injuries, and frequently overdiagnosed old or degenerative changes. False negatives posed a greater clinical risk than false positives, underscoring the importance of implementing AI as a triage and decision-support tool rather than a substitute for radiologists.