Observational Study
Copyright: ©Author(s) 2026. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution-NonCommercial (CC BY-NC 4.0) license. No commercial re-use. See permissions. Published by Baishideng Publishing Group Inc.
World J Orthop. Jun 18, 2026; 17(6): 118593
Published online Jun 18, 2026. doi: 10.5312/wjo.v17.i6.118593
OpenEvidence performs at similar levels compared to current and previous GPT models on orthopedic training and education questions
Kashif Javid, Alexander Driessche, Colton Clymer, Muhammad J Abbas, Annamarie Pantuso, Lindsay M Maier, Joseph Hoegler, William M Hakeos, Stuart T Guthrie
Kashif Javid, Alexander Driessche, Colton Clymer, Muhammad J Abbas, Annamarie Pantuso, Lindsay M Maier, Joseph Hoegler, William M Hakeos, Stuart T Guthrie, Department of Orthopaedic Surgery, Henry Ford Health System, Detroit, MI 48202, United States
Author contributions: All authors contributed to the study conception and design. Javid K and Driessche A contributed to material preparation, data collection and analysis; Javid K, Driessche A, and Clymer C wrote the first draft of the manuscript; Abbas M, Pantuso A, Maier LM, Hoegler J, Hakeos WM, and Guthrie ST contributed to revisions; all authors commented on previous versions of the manuscript, read and approved the final manuscript.
Institutional review board statement: This study does not involve human or animal experiments and thus does not require an ethical document.
Informed consent statement: This study does not require an informed consent form.
Conflict-of-interest statement: All the authors report no relevant conflicts of interest for this article.
STROBE statement: The authors have read the STROBE Statement checklist of items, and the manuscript was prepared and revised according to the STROBE Statement checklist of items.
Data sharing statement: Not applicable.
Corresponding author: Kashif Javid, Department of Orthopaedic Surgery, Henry Ford Health System, 2799 W. Grand Blvd, Detroit, MI 48202, United States. kjavid1@hfhs.org
Received: January 7, 2026
Revised: February 6, 2026
Accepted: March 30, 2026
Published online: June 18, 2026
Processing time: 162 Days and 3.4 Hours

Core Tip: We evaluated the performance of contemporary large language models on orthopedic board-style questions, comparing ChatGPT-5 and OpenEvidence (OE) with the established GPT-4. Using a standardized orthopedic training exam question set, we found that ChatGPT-5 achieved the highest overall accuracy and consistently outperformed prior models across subspecialties and question formats. OE performed comparably to GPT-4 across multiple fields. All models were less accurate on image-based questions, highlighting persistent limitations in visual interpretation. We assert that OE is a reputable addition to the tools available to orthopedists, and its training on peer-reviewed literature adds to its potential value.
