Observational Study
Copyright ©The Author(s) 2024. Published by Baishideng Publishing Group Inc. All rights reserved.
World J Methodol. Dec 20, 2024; 14(4): 92802
Published online Dec 20, 2024. doi: 10.5662/wjm.v14.i4.92802
Comparative evaluation of artificial intelligence systems' accuracy in providing medical drug dosages: A methodological study
Swaminathan Ramasubramanian, Sangeetha Balaji, Tejashri Kannan, Naveen Jeyaraman, Shilpa Sharma, Filippo Migliorini, Suhasini Balasubramaniam, Madhan Jeyaraman
Swaminathan Ramasubramanian, Sangeetha Balaji, Tejashri Kannan, Department of Orthopaedics, Government Medical College, Omandurar Government Estate, Chennai 600002, Tamil Nadu, India
Naveen Jeyaraman, Madhan Jeyaraman, Department of Orthopaedics, ACS Medical College and Hospital, Dr MGR Educational and Research Institute, Chennai 600077, Tamil Nadu, India
Shilpa Sharma, Department of Paediatric Surgery, All India Institute of Medical Sciences, New Delhi 110029, India
Filippo Migliorini, Department of Life Sciences, Health, Link Campus University, Rome 00165, Italy
Filippo Migliorini, Department of Orthopaedic and Trauma Surgery, Academic Hospital of Bolzano (SABES-ASDAA), Teaching Hospital of the Paracelsus Medical University, Bolzano 39100, Italy
Suhasini Balasubramaniam, Department of Radio-Diagnosis, Government Stanley Medical College and Hospital, Chennai 600001, Tamil Nadu, India
Author contributions: Ramasubramanian S and Balaji S conceived and designed the study and drafted the manuscript (original and revision); Jeyaraman M and Migliorini F supervised the study and drafted the manuscript (revision); Jeyaraman M, Migliorini F, and Sharma S drafted the manuscript (revision); Ramasubramanian S, Balaji S, and Kannan T contributed to drafting (original); Ramasubramanian S, Migliorini F, Balasubramaniam S, and Jeyaraman N conducted visualizations and data analysis; all authors have agreed to the final version to be published and agree to be accountable for all aspects of the work.
Institutional review board statement: This study did not require approval from the Institutional Review Board.
Informed consent statement: This study did not require informed consent forms due to the data collection process.
Conflict-of-interest statement: The authors declare no conflicts of interest related to this manuscript.
Data sharing statement: No additional data are available outside of the Supplemental Materials that are published with this article.
STROBE statement: The authors have read the STROBE Statement checklist of items, and the manuscript was prepared and revised according to the STROBE Statement checklist of items.
Open-Access: This article is an open-access article that was selected by an in-house editor and fully peer-reviewed by external reviewers. It is distributed in accordance with the Creative Commons Attribution NonCommercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: https://creativecommons.org/Licenses/by-nc/4.0/
Corresponding author: Madhan Jeyaraman, MS, PhD, Assistant Professor, Research Associate, Department of Orthopaedics, ACS Medical College and Hospital, Dr MGR Educational and Research Institute, Velappanchavadi, Chennai 600077, Tamil Nadu, India. madhanjeyaraman@gmail.com
Received: February 6, 2024
Revised: May 29, 2024
Accepted: June 25, 2024
Published online: December 20, 2024
Processing time: 171 Days and 4.1 Hours
Abstract
BACKGROUND

Medication errors, especially in dosage calculation, pose risks in healthcare. Artificial intelligence (AI) systems like ChatGPT and Google Bard may help reduce errors, but their accuracy in providing medication information remains to be evaluated.

AIM

To evaluate the accuracy of three AI systems (ChatGPT 3.5, ChatGPT 4, and Google Bard) in providing drug dosage information, using Harrison's Principles of Internal Medicine as the reference standard.

METHODS

A set of natural language queries mimicking real-world medical dosage inquiries was presented to the AI systems. Responses were analyzed using a 3-point Likert scale. The analysis, conducted with Python and its libraries, focused on basic statistics, overall system accuracy, and disease-specific and organ system accuracies.
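
As an illustration only, the summary statistics described above could be computed along the following lines. This is a minimal sketch, not the authors' analysis code: the rating labels, the `WEIGHTS` mapping, the function names, and the toy data are all assumptions introduced here, and the abstract does not state how the weighted accuracy was defined.

```python
from collections import Counter

# ASSUMPTION: labels and weights for the 3-point scale are hypothetical;
# the abstract does not specify the scoring or weighting scheme.
WEIGHTS = {"correct": 1.0, "partial": 0.5, "incorrect": 0.0}


def summarize(ratings):
    """Return the count, percent correct, and weighted accuracy for a list of ratings."""
    n = len(ratings)
    if n == 0:
        raise ValueError("ratings must not be empty")
    counts = Counter(ratings)
    pct_correct = 100.0 * counts["correct"] / n
    weighted = sum(WEIGHTS[label] * k for label, k in counts.items()) / n
    return {"n": n, "pct_correct": pct_correct, "weighted_accuracy": weighted}


def summarize_by_group(rows):
    """Group (group_name, rating) pairs and summarize each group separately.

    The group can be an AI system, a disease, or an organ system.
    """
    grouped = {}
    for group, rating in rows:
        grouped.setdefault(group, []).append(rating)
    return {group: summarize(ratings) for group, ratings in grouped.items()}


# Toy data for illustration only; these are not results from the study.
toy_rows = [
    ("System A", "correct"),
    ("System A", "correct"),
    ("System A", "partial"),
    ("System A", "incorrect"),
    ("System B", "correct"),
    ("System B", "incorrect"),
]

for system, stats in summarize_by_group(toy_rows).items():
    print(system, stats)
```

The same `summarize_by_group` helper covers the per-system, disease-specific, and organ-system breakdowns by changing only the grouping key passed in each row.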

RESULTS

ChatGPT 4 outperformed the other systems, with the highest rate of correct responses (83.77%) and the best overall weighted accuracy (0.6775). Disease-specific accuracy varied notably across systems: dosages for some diseases were reported accurately, while those for others showed marked discrepancies. Organ system accuracy was also variable, underscoring system-specific strengths and weaknesses.

CONCLUSION

ChatGPT 4 demonstrates superior reliability in providing medical dosage information, yet the variation across diseases emphasizes the need for ongoing improvement. These results highlight AI's potential to aid healthcare professionals and the need for continued development to achieve dependable accuracy in critical medical situations.

Keywords: Dosage calculation; Artificial intelligence; ChatGPT; Drug dosage; Healthcare; Large language models

Core Tip: This study reveals ChatGPT 4's superior accuracy in providing medical drug dosage information, highlighting the potential of artificial intelligence (AI) to aid healthcare professionals in minimizing medication errors. The analysis, based on Harrison's Principles of Internal Medicine, underscores the need for ongoing AI development to ensure reliability in critical medical situations. Variations in disease-specific and organ system accuracies suggest areas for improvement and continuous refinement of AI systems in medicine.