1
Sadowsky SJ. Can ChatGPT be trusted as a resource for a scholarly article on treatment planning implant-supported prostheses? J Prosthet Dent 2025:S0022-3913(25)00258-6. PMID: 40210509. DOI: 10.1016/j.prosdent.2025.03.025.
Abstract
STATEMENT OF PROBLEM Access to artificial intelligence is ubiquitous, but its limitations in the preparation of scholarly articles on implant restorative treatment planning have not been established. PURPOSE The purpose of this study was to determine whether ChatGPT can be a reliable resource for synthesizing the best available literature on treatment planning questions for implant-supported prostheses. MATERIAL AND METHODS Six questions on treatment planning implant-supported prostheses for partially edentulous and completely edentulous scenarios were posed to ChatGPT. Question 1: Are crown-to-implant (C/I) ratios greater than 1:1 linked to increased marginal bone loss? Question 2: Do 2-unit posterior cantilevers lead to more bone loss than 2 adjacent implants? Question 3: Should implants be splinted in the posterior maxilla in patients who require no grafting and are not bruxers? Question 4: Do patients prefer a maxillary implant overdenture to a well-made complete denture? Question 5: Do resilient and rigid anchorage systems for implant overdentures require the same maintenance? Question 6: Do denture patients prefer fixed implant prostheses to removable implant prostheses? Follow-up questions were posed to clarify the source and content of the supporting evidence for ChatGPT's responses. Additional higher-quality and more recent studies indexed on PubMed were then identified for ChatGPT to consider in a revision of its original treatment planning answer. A quantitative rating was assigned based on 4 indices: accurate and retrievable sources, representative literature, accurate interpretation of the evidence, and an original conclusion reflecting the best evidence. RESULTS ChatGPT's responses and revisions were as follows. Question 1: "Higher C/I can be associated with an increased risk of marginal bone loss." Revision: "While many clinicians believe that higher C/I ratios lead to bone loss, recent evidence suggests that this concern is less relevant for modern implants." Question 2: "The presence of cantilever extensions with short implants tend to fail at earlier time points and has been associated with a higher incidence of technical complications." Revision: "The use of implant-supported single-unit crowns with cantilever extensions in posterior regions is a viable long-term treatment option with minimal complications." Question 3: "Splinted restorations were associated with a higher implant survival rate, particularly in the posterior region." Revision: "There is no compelling evidence to suggest that splinting all implants in the posterior maxilla is necessary." Question 4: "Patients report higher satisfaction with maxillary implant-supported overdentures compared to conventional complete dentures." Revision: "For patients with adequate maxillary bone support, a conventional denture may be just as satisfactory as an implant overdenture." Question 5: "While resilient attachments may require more frequent replacement of components, rigid attachments might necessitate monitoring for implant-related complications due to increased stress." Revision: "Research indicates that rigid attachment systems, such as bar and telescopic attachments, do not necessarily lead to increased complications due to stress in implant overdentures." Question 6: "Yes, in general, denture patients tend to prefer fixed implant prostheses over removable implant prostheses due to several key advantages. However, preferences can vary based on individual needs, costs, and clinical factors." Revision: "There is no universal patient preference for fixed or removable implant prostheses. Satisfaction is generally high with both options, and preference depends on individual patient factors, including comfort, hygiene, cost, and anatomical considerations."
CONCLUSIONS ChatGPT has not demonstrated the ability to accurately cull the literature, stratify the rigor of the evidence, and extract accurate implications from the studies selected to deliver the best evidence-based answers to questions on treatment planning implant-supported prostheses.
Affiliation(s)
- Steven J Sadowsky
- Professor Emeritus, Preventive and Restorative Department, University of the Pacific Arthur A. Dugoni School of Dentistry, San Francisco, Calif.
2
Stadler RD, Sudah SY, Moverman MA, Denard PJ, Duralde XA, Garrigues GE, Klifto CS, Levy JC, Namdari S, Sanchez-Sotelo J, Menendez ME. Identification of ChatGPT-Generated Abstracts Within Shoulder and Elbow Surgery Poses a Challenge for Reviewers. Arthroscopy 2025; 41:916-924.e2. PMID: 38992513. DOI: 10.1016/j.arthro.2024.06.045.
Abstract
PURPOSE To evaluate the extent to which experienced reviewers can accurately discern between artificial intelligence (AI)-generated and original research abstracts published in the field of shoulder and elbow surgery and compare this with the performance of an AI detection tool. METHODS Twenty-five shoulder- and elbow-related articles published in high-impact journals in 2023 were randomly selected. ChatGPT was prompted with only the abstract title to create an AI-generated version of each abstract. The resulting 50 abstracts were randomly distributed to and evaluated by 8 blinded peer reviewers with at least 5 years of experience. Reviewers were tasked with distinguishing between original and AI-generated text. A Likert scale assessed reviewer confidence for each interpretation, and the primary reason guiding assessment of generated text was collected. AI output detector (0%-100%) and plagiarism (0%-100%) scores were evaluated using GPTZero. RESULTS Reviewers correctly identified 62% of AI-generated abstracts and misclassified 38% of original abstracts as being AI-generated. GPTZero reported a significantly higher probability of AI output among generated abstracts (median, 56%; interquartile range [IQR], 51%-77%) compared with original abstracts (median, 10%; IQR, 4%-37%; P < .01). Generated abstracts scored significantly lower on the plagiarism detector (median, 7%; IQR, 5%-14%) relative to original abstracts (median, 82%; IQR, 72%-92%; P < .01). Correct identification of AI-generated abstracts was predominantly attributed to the presence of unrealistic data/values. The primary reason for misidentifying original abstracts as AI was attributed to writing style. CONCLUSIONS Experienced reviewers faced difficulties in distinguishing between human and AI-generated research content within shoulder and elbow surgery.
The presence of unrealistic data facilitated correct identification of AI abstracts, whereas misidentification of original abstracts was often ascribed to writing style. CLINICAL RELEVANCE With rapidly increasing AI advancements, it is paramount that ethical standards of scientific reporting are upheld. It is therefore helpful to understand the ability of reviewers to identify AI-generated content.
Affiliation(s)
- Ryan D Stadler
- Rutgers Robert Wood Johnson Medical School, New Brunswick, New Jersey, U.S.A.
- Suleiman Y Sudah
- Department of Orthopaedic Surgery, Monmouth Medical Center, Monmouth, New Jersey, U.S.A.
- Michael A Moverman
- Department of Orthopaedics, University of Utah School of Medicine, Salt Lake City, Utah, U.S.A.
- Grant E Garrigues
- Midwest Orthopaedics at Rush University Medical Center, Chicago, Illinois, U.S.A.
- Christopher S Klifto
- Department of Orthopaedic Surgery, Duke University School of Medicine, Durham, North Carolina, U.S.A.
- Jonathan C Levy
- Levy Shoulder Center at Paley Orthopedic & Spine Institute, Boca Raton, Florida, U.S.A.
- Surena Namdari
- Rothman Orthopaedic Institute at Thomas Jefferson University Hospitals, Philadelphia, Pennsylvania, U.S.A.
- Mariano E Menendez
- Department of Orthopaedics, University of California Davis, Sacramento, California, U.S.A.
3
Raman R. Transparency in research: An analysis of ChatGPT usage acknowledgment by authors across disciplines and geographies. Account Res 2025; 32:277-298. PMID: 37877216. DOI: 10.1080/08989621.2023.2273377.
Abstract
This investigation systematically reviews the recognition of generative AI tools, particularly ChatGPT, in scholarly literature. Utilizing 1,226 publications from the Dimensions database, ranging from November 2022 to July 2023, the research scrutinizes temporal trends and distribution across disciplines and regions. U.S.-based authors lead in acknowledgments, with notable contributions from China and India. Predominantly, Biomedical and Clinical Sciences, as well as Information and Computing Sciences, are engaging with these AI tools. Publications like "The Lancet Digital Health" and platforms such as "bioRxiv" are recurrent venues for such acknowledgments, highlighting AI's growing impact on research dissemination. The analysis is confined to the Dimensions database, thus potentially overlooking other sources and grey literature. Additionally, the study abstains from examining the acknowledgments' quality or ethical considerations. Findings are beneficial for stakeholders, providing a basis for policy and scholarly discourse on ethical AI use in academia. This study represents the inaugural comprehensive empirical assessment of AI acknowledgment patterns in academic contexts, addressing a previously unexplored aspect of scholarly communication.
Affiliation(s)
- Raghu Raman
- Amrita School of Business, Amrita Vishwa Vidyapeetham, Amritapuri, Kerala, India
4
Katz G, Zloto O, Hostovsky A, Huna-Baron R, Ben-Bassat Mizrachi I, Burgansky Z, Skaat A, Vishnevskia-Dai V, Fabian ID, Sagiv O, Priel A, Glicksberg BS, Klang E. Chat GPT vs an experienced ophthalmologist: evaluating chatbot writing performance in ophthalmology. Eye (Lond) 2025. PMID: 40169887. DOI: 10.1038/s41433-025-03779-1.
Abstract
PURPOSE To examine the ability of ChatGPT to write scientific ophthalmology introductions and to compare this ability with that of experienced ophthalmologists. METHODS The OpenAI web interface was used to prompt ChatGPT-4 to generate the introductions for the selected papers. Consequently, each paper had two introductions: one drafted by ChatGPT and the other by the original author. Ten ophthalmology specialists, each with more than 15 years of experience and representing distinct subspecialties (retina, neuro-ophthalmology, oculoplastics, glaucoma, and ocular oncology), were provided with the two sets of introductions without revealing the origin (ChatGPT or human author) and were tasked with evaluating them. RESULTS Out of 45 instances for each type of introduction, specialists correctly identified the source 26 times (57.7%) and erred 19 times (42.2%). The misclassification rate was 25% when experts evaluated introductions from their own subspecialty and 44.4% when experts assessed introductions outside their subspecialty domain. In the comparative evaluation of introductions written by ChatGPT and human authors, no significant difference was identified across the assessed metrics (language, data arrangement, factual accuracy, originality, and data currency). The misclassification rate (the frequency at which reviewers incorrectly identified the authorship) was highest in oculoplastics (66.7%) and lowest in retina (11.1%). CONCLUSIONS ChatGPT represents a significant advancement in facilitating the creation of original scientific papers in ophthalmology. The introductions generated by ChatGPT showed no statistically significant difference from those written by experts in terms of language, data organization, factual accuracy, originality, and currency of information; in addition, nearly half of them were indistinguishable from the originals.
Future research endeavours should explore ChatGPT-4's utility in composing other sections of research papers and delve into the associated ethical considerations.
Affiliation(s)
- Gabriel Katz
- Faculty of Medical & Health Sciences, Tel Aviv University, Tel Aviv, Israel
- Goldschleger Eye Institute, Sheba Medical Center, Tel Hashomer, Israel
- Ofira Zloto
- Faculty of Medical & Health Sciences, Tel Aviv University, Tel Aviv, Israel
- Goldschleger Eye Institute, Sheba Medical Center, Tel Hashomer, Israel
- Avner Hostovsky
- Faculty of Medical & Health Sciences, Tel Aviv University, Tel Aviv, Israel
- Goldschleger Eye Institute, Sheba Medical Center, Tel Hashomer, Israel
- Ruth Huna-Baron
- Faculty of Medical & Health Sciences, Tel Aviv University, Tel Aviv, Israel
- Goldschleger Eye Institute, Sheba Medical Center, Tel Hashomer, Israel
- Iris Ben-Bassat Mizrachi
- Faculty of Medical & Health Sciences, Tel Aviv University, Tel Aviv, Israel
- Goldschleger Eye Institute, Sheba Medical Center, Tel Hashomer, Israel
- Zvia Burgansky
- Faculty of Medical & Health Sciences, Tel Aviv University, Tel Aviv, Israel
- Goldschleger Eye Institute, Sheba Medical Center, Tel Hashomer, Israel
- Alon Skaat
- Faculty of Medical & Health Sciences, Tel Aviv University, Tel Aviv, Israel
- Goldschleger Eye Institute, Sheba Medical Center, Tel Hashomer, Israel
- Vicktoria Vishnevskia-Dai
- Faculty of Medical & Health Sciences, Tel Aviv University, Tel Aviv, Israel
- Goldschleger Eye Institute, Sheba Medical Center, Tel Hashomer, Israel
- Ido Didi Fabian
- Faculty of Medical & Health Sciences, Tel Aviv University, Tel Aviv, Israel
- Goldschleger Eye Institute, Sheba Medical Center, Tel Hashomer, Israel
- Oded Sagiv
- Faculty of Medical & Health Sciences, Tel Aviv University, Tel Aviv, Israel
- Goldschleger Eye Institute, Sheba Medical Center, Tel Hashomer, Israel
- Section of Ophthalmology, Department of Head and Neck Surgery, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Ayelet Priel
- Faculty of Medical & Health Sciences, Tel Aviv University, Tel Aviv, Israel
- Goldschleger Eye Institute, Sheba Medical Center, Tel Hashomer, Israel
- Benjamin S Glicksberg
- The Windreich Department of Artificial Intelligence and Human Health, Mount Sinai Medical Center, New York, NY, USA
- Eyal Klang
- The Windreich Department of Artificial Intelligence and Human Health, Mount Sinai Medical Center, New York, NY, USA
- The Division of Data-Driven and Digital Medicine (D3M), Icahn School of Medicine at Mount Sinai, New York, NY, USA
5
Cooperman SR, Olaniyan A, Brandão RA. AI discernment in foot and ankle surgery research: A survey investigation. Foot Ankle Surg 2025; 31:214-219. PMID: 39426884. DOI: 10.1016/j.fas.2024.10.001.
Abstract
BACKGROUND This study evaluated the ability to differentiate between AI-generated and human-authored abstracts in foot and ankle surgery. METHODS An AI system (ChatGPT 3.0) was trained on 21 published abstracts to create six novel case abstracts. Nine foot and ankle surgeons participated in a blinded survey, tasked with distinguishing AI-generated from human-written abstracts and rating their confidence in their responses. Surveys were completed twice at two different time points to evaluate intra-/inter-observer reliability. RESULTS The overall accuracy rate for distinguishing AI-generated from human-written abstracts was 50.5% (n = 109/216), indicating no better performance than random chance. Reviewer experience and AI familiarity did not significantly affect accuracy. Inter-rater reliability was moderate initially but decreased over time, and intra-rater reliability was poor. CONCLUSIONS In their current form, AI-generated abstracts are nearly indistinguishable from human-written ones, posing challenges for consistent identification in foot and ankle surgery. LEVEL OF EVIDENCE IV.
Affiliation(s)
- Steven R Cooperman
- Orthopedic Foot and Ankle Center Advanced Foot and Ankle Reconstruction Fellowship, 350 W. Wilson Bridge Rd, Ste. 200, Worthington, OH 43085, USA.
- Roberto A Brandão
- Orthopedic Foot and Ankle Center, 350 W. Wilson Bridge Rd, Ste. 200, Worthington, OH 43085, United States; Board Certified Foot and Ankle Surgeon, 350 W. Wilson Bridge Rd, Ste. 200, Worthington, OH 43085, United States
6
De Cassai A, Dost B, Mormando G, Boscolo A, Navalesi P. Instructions for authors for large language models: Missing in action! J Clin Anesth 2025; 102:111761. PMID: 39837232. DOI: 10.1016/j.jclinane.2025.111761.
Affiliation(s)
- Alessandro De Cassai
- Department of Medicine - DIMED, University of Padua, Padua, Italy; Institute of Anesthesia and Intensive Care Unit, University Hospital of Padua, Padua, Italy
- Burhan Dost
- Department of Anaesthesiology and Reanimation, Ondokuz Mayis University Faculty of Medicine, Samsun, Türkiye
- Giulia Mormando
- Department of Medicine - DIMED, University of Padua, Padua, Italy
- Annalisa Boscolo
- Department of Medicine - DIMED, University of Padua, Padua, Italy; Institute of Anesthesia and Intensive Care Unit, University Hospital of Padua, Padua, Italy; Thoracic Surgery and Lung Transplant Unit, Department of Cardiac, Thoracic, Vascular Sciences, and Public Health, University of Padua, Padua, Italy
- Paolo Navalesi
- Department of Medicine - DIMED, University of Padua, Padua, Italy; Institute of Anesthesia and Intensive Care Unit, University Hospital of Padua, Padua, Italy
7
Wu J. The rise of DeepSeek: technology calls for the "catfish effect". J Thorac Dis 2025; 17:1106-1108. PMID: 40083508. PMCID: PMC11898396. DOI: 10.21037/jtd-2025b-02.
Affiliation(s)
- Jinlin Wu
- Department of Cardiac Surgery, Guangdong Cardiovascular Institute, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
8
Yousaf MN. Practical Considerations and Ethical Implications of Using Artificial Intelligence in Writing Scientific Manuscripts. ACG Case Rep J 2025; 12:e01629. PMID: 39974689. PMCID: PMC11838153. DOI: 10.14309/crj.0000000000001629.
Affiliation(s)
- Muhammad Nadeem Yousaf
- Department of Medicine, Division of Gastroenterology and Hepatology, University of Missouri School of Medicine, Columbia, MO
9
Al‐Qudimat AR, Fares ZE, Elaarag M, Osman M, Al‐Zoubi RM, Aboumarzouk OM. Advancing Medical Research Through Artificial Intelligence: Progressive and Transformative Strategies: A Literature Review. Health Sci Rep 2025; 8:e70200. PMID: 39980823. PMCID: PMC11839394. DOI: 10.1002/hsr2.70200.
Abstract
Background and Aims Artificial intelligence (AI) has become integral to medical research, impacting various aspects such as data analysis, writing assistance, and publishing. This paper explores the multifaceted influence of AI on the process of writing medical research papers, encompassing data analysis, ethical considerations, writing assistance, and publishing efficiency. Methods The review was conducted following the PRISMA guidelines; a comprehensive search was performed in Scopus, PubMed, EMBASE, and MEDLINE databases for research publications on artificial intelligence in medical research published up to October 2023. Results AI facilitates the writing process by generating drafts, offering grammar and style suggestions, and enhancing manuscript quality through advanced models like ChatGPT. Ethical concerns regarding content ownership and potential biases in AI-generated content underscore the need for collaborative efforts among researchers, publishers, and AI creators to establish ethical standards. Moreover, AI significantly influences data analysis in healthcare, optimizing outcomes and patient care, particularly in fields such as obstetrics and gynecology and pharmaceutical research. The application of AI in publishing, ranging from peer review to manuscript quality control and journal matching, underscores its potential to streamline and enhance the entire research and publication process. Overall, while AI presents substantial benefits, ongoing research and ethical guidelines are essential for its responsible integration into the evolving landscape of medical research and publishing. Conclusion The integration of AI in medical research has revolutionized efficiency and innovation, impacting data analysis, writing assistance, publishing, and other areas. While AI tools offer significant benefits, ethical considerations such as biases and content ownership must be addressed.
Ongoing research and collaborative efforts are crucial to ensure responsible and transparent AI implementation in the dynamic landscape of medical research and publishing.
Affiliation(s)
- Ahmad R. Al‐Qudimat
- Department of Surgery, Surgical Research Section, Hamad Medical Corporation, Doha, Qatar
- Department of Public Health, College of Health Sciences, QU‐Health, Qatar University, Doha, Qatar
- Zainab E. Fares
- Department of Surgery, Surgical Research Section, Hamad Medical Corporation, Doha, Qatar
- Mai Elaarag
- Department of Surgery, Surgical Research Section, Hamad Medical Corporation, Doha, Qatar
- Maha Osman
- Department of Public Health, College of Health Sciences, QU‐Health, Qatar University, Doha, Qatar
- Raed M. Al‐Zoubi
- Department of Surgery, Surgical Research Section, Hamad Medical Corporation, Doha, Qatar
- Department of Biomedical Sciences, College of Health Sciences, QU‐Health, Qatar University, Doha, Qatar
- Department of Chemistry, College of Science, Jordan University of Science and Technology, Irbid, Jordan
- Omar M. Aboumarzouk
- Department of Surgery, Surgical Research Section, Hamad Medical Corporation, Doha, Qatar
- School of Medicine, Dentistry and Nursing, The University of Glasgow, Glasgow, UK
10
Shiva Shankar B, Mohan S. ChatGPT-4 as an Assistant for Evidence-Based Decision-Making Among General Dentists: An Observational Feasibility Study. Cureus 2025; 17:e79556. PMID: 40012697. PMCID: PMC11859412. DOI: 10.7759/cureus.79556.
Abstract
Background Evidence-based decision-making (EBDM) is essential in contemporary dentistry. However, navigating the extensive and constantly evolving scientific literature can be challenging. Large language models (LLMs), such as ChatGPT-4, have the potential to transform EBDM by analyzing vast datasets and extracting critical information, thereby significantly reducing the time required to find evidence. This observational feasibility study investigates ChatGPT-4's potential in dental EBDM, focusing on its capabilities, strengths, and limitations. Materials and methods In this observational feasibility study, two independent examiners conducted interactive sessions with ChatGPT-4. Five clinical scenarios were explored using the Google Chrome web browser, accessing publicly available scientific evidence from Cochrane, ADA, and PubMed. This approach ensured compliance with the Cochrane guidelines for EBDM. Two independent dentists engaged with ChatGPT-4 in simulated real-life clinical scenarios to seek scientific information. The output from ChatGPT-4 for each scenario was assessed based on predetermined criteria. Its responses were evaluated for accuracy, relevance, efficiency, actionability, and ethical considerations using the ChatGPT-4 Response Scoring System (CRSS) and the ChatGPT-4 Generative Ability Matrix (C-GAM). Results ChatGPT-4 demonstrated consistent performance across all five clinical scenarios, achieving a C-GAM score of 46.4% and a CRSS score of 12 out of 28. It effectively identified relevant sources of evidence and provided concise summaries, potentially saving valuable time and enhancing access to information. No significant differences in scores were found when the responses to all clinical scenarios were analyzed independently by the two researchers. However, a notable limitation was its inability to provide specific web links directing users to relevant scientific articles. 
Additionally, while ChatGPT-4 offered suggestions for incorporating the latest scientific publications into decision-making, it could not generate direct links to these articles. Conclusion Despite its current limitations, ChatGPT-4, as a generative AI, can assist clinicians in making evidence-based decisions. It can save time compared to conventional search engines. Ethical considerations must be prioritized in training these models to ensure that clinicians make responsible, evidence-based decisions rather than relying solely on specific evidence statements provided by ChatGPT-4. This model shows its potential as an AI tool for EBDM in dentistry. Further development and training could address existing limitations and enhance its effectiveness; however, clinicians must retain ultimate responsibility for informed decisions, necessitating expertise and critical evaluation of the evidence presented.
11
Zhang Y, Qiu R, Wang Y, Ye Z. Navigating the future: unveiling new facets of nurse work engagement. BMC Nurs 2025; 24:80. PMID: 39849468. PMCID: PMC11755895. DOI: 10.1186/s12912-024-02517-4.
Abstract
OBJECTIVE This study investigates the influence of structural empowerment and psychological capital on nurse work engagement within the context of rising healthcare demands and nursing staff shortages. METHODS A cross-sectional descriptive study involving 778 registered nurses from six tertiary hospitals in Hangzhou, China, was conducted. Data were collected using multiple tools, including a demographic questionnaire, the CWEQ-II (Conditions for Work Effectiveness Questionnaire II), the PCQ (Psychological Capital Questionnaire), and the UWES-9 (Utrecht Work Engagement Scale-9). SPSS 27.0 was used for Pearson correlation and regression analyses, while structural equation modeling (SEM) in AMOS was employed to explore relationships among variables. Model fit was evaluated using chi-square, CFI, AGFI, and RMSEA indices. RESULTS Structural empowerment and psychological capital were significantly and positively correlated with nurses' work engagement. Regression analysis indicated that structural empowerment (support, resources, opportunity, and information) and psychological capital (optimism, resilience, self-efficacy, and hope) were significant positive predictors of work engagement (p < 0.01), jointly accounting for 69% of its variance. SEM analysis further revealed that structural empowerment indirectly influenced work engagement through psychological capital, with significant path coefficients (P < 0.001) and a good model fit (χ²/df = 3.727, P = 0.000, RMSEA = 0.059). CONCLUSION Structural empowerment and psychological capital are crucial factors in enhancing nurse work engagement, effectively supporting nurses' workplace performance. Management should focus on fostering psychological capital and enhancing structural empowerment to improve care quality and job satisfaction. 
This study provides empirical evidence for nursing management practice and suggests that future research should explore dynamic relationships among these variables in various populations and settings.
Affiliation(s)
- Yini Zhang
- Department of Nursing, Sir Run Run Shaw Hospital, School of Medicine, Zhejiang University, Hangzhou, 310018, China
- Ruolin Qiu
- Hangzhou Normal University School of Nursing, Hangzhou, China
- Yuezhong Wang
- Department of Nursing, Sir Run Run Shaw Hospital, School of Medicine, Zhejiang University, Hangzhou, 310018, China
- Zhihong Ye
- Department of Nursing, Sir Run Run Shaw Hospital, School of Medicine, Zhejiang University, Hangzhou, 310018, China
- Department of Nursing, Sir Run Run Shaw Hospital, School of Medicine, Zhejiang University, No.3 Qingchun East Road, Shangcheng District, Hangzhou, Zhejiang Province, 310018, China
12
De Cassai A, Dost B, Karapinar YE, Beldagli M, Yalin MSO, Turunc E, Turan EI, Sella N. Evaluating the utility of large language models in generating search strings for systematic reviews in anesthesiology: a comparative analysis of top-ranked journals. Reg Anesth Pain Med 2025:rapm-2024-106231. PMID: 39828514. DOI: 10.1136/rapm-2024-106231.
Abstract
BACKGROUND This study evaluated the effectiveness of large language models (LLMs), specifically ChatGPT 4o and a custom-designed model, Meta-Analysis Librarian, in generating accurate search strings for systematic reviews (SRs) in the field of anesthesiology. METHODS We selected 85 SRs from the top 10 anesthesiology journals, according to Web of Science rankings, and extracted their reference lists as benchmarks. Using study titles as input, we generated four search strings per SR: three with ChatGPT 4o using general prompts and one with the Meta-Analysis Librarian model, which follows a structured Population, Intervention, Comparator, Outcome (PICO)-based approach aligned with Cochrane Handbook standards. Each search string was used to query PubMed, and the retrieved results were compared with the PubMed results of the original search string in each SR to assess retrieval accuracy. Statistical analysis compared the performance of each model. RESULTS Original search strings demonstrated superior performance, with a median retrieval rate of 65% (IQR: 43%-81%), which was statistically different from both LLM groups (p=0.001). The Meta-Analysis Librarian achieved a higher median retrieval rate than ChatGPT 4o (median (IQR): 24% (13%-38%) vs 6% (0%-14%), respectively). CONCLUSION These findings highlight the significant advantage of original search strings over LLM-generated search strings for PubMed retrieval. The Meta-Analysis Librarian demonstrated notably superior retrieval performance compared with ChatGPT 4o. Further research is needed to assess the broader applicability of LLM-generated search strings, especially across multiple databases.
Affiliation(s)
- Alessandro De Cassai
- Department of Medicine (DIMED), Padua University Hospital, University of Padua, Padova, Italy
- Anesthesia and Intensive Care Unit, Padua University Hospital, University-Hospital of Padova, Padova, Italy
- Burhan Dost
- Department of Anesthesiology and Reanimation, Ondokuz Mayis University Faculty of Medicine, Samsun, Turkey
- Yunus Emre Karapinar
- Department of Anesthesiology and Reanimation, Ataturk University, Erzurum, Turkey
- Müzeyyen Beldagli
- Department of Anesthesiology and Reanimation, Samsun University Faculty of Medicine, Canik, Turkey
- Esra Turunc
- Department of Anesthesiology and Reanimation, Ondokuz Mayis University Faculty of Medicine, Samsun, Turkey
- Engin Ihsan Turan
- Department of Anesthesiology, Istanbul Health Science University Kanuni Sultan Süleyman Education and Training Hospital, Istanbul, Turkey
- Nicolò Sella
- Anesthesia and Intensive Care Unit, Padua University Hospital, University-Hospital of Padova, Padova, Italy
13
Akefe IO, Adegoke VA, Akefe E. Strategic tips to successfully undertake research: a comprehensive roadmap for medical trainees. Postgrad Med J 2025:qgaf001. [PMID: 39815988 DOI: 10.1093/postmj/qgaf001] [Received: 09/23/2024] [Revised: 12/17/2024] [Accepted: 01/02/2025] [Indexed: 01/18/2025]
Abstract
Engaging in research during medical training is crucial for fostering critical thinking, enhancing clinical skills, and deepening understanding of medical science. Despite its importance, the shortage of physician-scientists lingers, with many trainees and junior doctors encountering challenges in navigating the research process. Drawing on current literature, this article provides a comprehensive roadmap, categorising 12 actionable strategies into five themes, to help medical trainees overcome common obstacles and optimise their research experience. The strategies cover early planning, research conduct and integrity, productivity and time management, collaboration and dissemination, and personal growth and development. By implementing these evidence-based recommendations, derived from current literature and expert insights, medical trainees can refine their research skills, produce high-quality outputs, and contribute meaningfully to the scientific community, ultimately enriching their medical training and future careers.
Affiliation(s)
- Isaac Oluwatobi Akefe
- CDU Menzies School of Medicine, Charles Darwin University, Ellengowan Drive, Darwin, NT 0909, Australia
- Victoria Aderonke Adegoke
- School of Biomedical Science, Faculty of Medicine, The University of Queensland, Brisbane, QLD 4072, Australia
- Elijah Akefe
- Faculty of Law, University of Abuja, PMB 117, Gwagwalada, Abuja 902101, Nigeria
14
Saeki S. Artificial Intelligence in Academic Writing. JMA J 2025; 8:314-315. [PMID: 39926082 PMCID: PMC11799597 DOI: 10.31662/jmaj.2024-0224] [Received: 08/15/2024] [Accepted: 11/07/2024] [Indexed: 02/11/2025] Open
Affiliation(s)
- Soichiro Saeki
- Emergency Medicine and Critical Care, Center Hospital of the National Center for Global Health and Medicine, Tokyo, Japan
- Division of Public Health, Department of Social Medicine, Graduate School of Medicine, Osaka University, Osaka, Japan
15
Sequí-Sabater JM, Benavent D. Artificial intelligence in rheumatology research: what is it good for? RMD Open 2025; 11:e004309. [PMID: 39778924 PMCID: PMC11748787 DOI: 10.1136/rmdopen-2024-004309] [Received: 08/31/2024] [Accepted: 12/08/2024] [Indexed: 01/11/2025] Open
Abstract
Artificial intelligence (AI) is transforming rheumatology research, with a myriad of studies aiming to improve diagnosis, prognosis and treatment prediction, while also showing potential to optimise the research workflow, improve drug discovery and accelerate clinical trials. Machine learning, a key element of discriminative AI, has demonstrated the ability to accurately classify rheumatic diseases and predict therapeutic outcomes by using diverse data types, including structured databases, imaging and text. In parallel, generative AI, driven by large language models, is becoming a powerful tool for optimising the research workflow by supporting content generation, literature review automation and clinical decision support. This review explores the current applications and future potential of both discriminative and generative AI in rheumatology. It also highlights the challenges posed by these technologies, such as ethical concerns and the need for rigorous validation and regulatory oversight. The integration of AI in rheumatology promises substantial advancements but requires a balanced approach to optimise benefits and minimise potential downsides.
Affiliation(s)
- José Miguel Sequí-Sabater
- Rheumatology Department, La Ribera University Hospital, Alzira, Spain
- Rheumatology Department, La Fe University and Polytechnic Hospital, Valencia, Spain
- Division of Rheumatology, Department of Medicine Solna, Karolinska Institutet and Karolinska University Hospital, Stockholm, Sweden
- Diego Benavent
- Rheumatology Department, Hospital Universitari de Bellvitge, L'Hospitalet de Llobregat, Barcelona, Spain
16
Han Z, Yang Y, Rushlow J, Huo J, Liu Z, Hsu YC, Yin R, Wang M, Liang R, Wang KY, Zhou HC. Development of the design and synthesis of metal-organic frameworks (MOFs) - from large scale attempts, functional oriented modifications, to artificial intelligence (AI) predictions. Chem Soc Rev 2025; 54:367-395. [PMID: 39582426 DOI: 10.1039/d4cs00432a] [Indexed: 11/26/2024]
Abstract
Owing to the exceptional porous properties of metal-organic frameworks (MOFs), there has recently been a surge of interest, evidenced by a plethora of research into their design, synthesis, properties, and applications. This expanding research landscape has driven significant advancements in the precise regulation of MOF design and synthesis. Initially dominated by large-scale synthesis approaches, this field has evolved towards more targeted functional modifications. Recently, the integration of computational science, particularly through artificial intelligence predictions, has ushered in a new era of innovation, enabling more precise and efficient MOF design and synthesis methodologies. The objective of this review is to provide readers with an extensive overview of the development process of MOF design and synthesis, and to present visions for future developments.
Affiliation(s)
- Zongsu Han
- Department of Chemistry, Texas A&M University, College Station, Texas 77843, USA
- Yihao Yang
- Department of Chemistry, Texas A&M University, College Station, Texas 77843, USA
- Joshua Rushlow
- Department of Chemistry, Texas A&M University, College Station, Texas 77843, USA
- Jiatong Huo
- Department of Chemistry, Texas A&M University, College Station, Texas 77843, USA
- Zhaoyi Liu
- Department of Chemistry, Texas A&M University, College Station, Texas 77843, USA
- Yu-Chuan Hsu
- Department of Chemistry, Texas A&M University, College Station, Texas 77843, USA
- Rujie Yin
- Department of Electrical and Computer Engineering, Texas A&M University, College Station, Texas 77843, USA
- Mengmeng Wang
- Institute of Condensed Matter and Nanosciences, Molecular Chemistry, Materials and Catalysis (IMCN/MOST), Université catholique de Louvain, 1348 Louvain-la-Neuve, Belgium
- Rongran Liang
- Department of Chemistry, Texas A&M University, College Station, Texas 77843, USA
- Kun-Yu Wang
- Department of Chemistry, Texas A&M University, College Station, Texas 77843, USA
- Hong-Cai Zhou
- Department of Chemistry, Texas A&M University, College Station, Texas 77843, USA
17
Dardara EA, Al-Makhalid KA. Development and Psychometric Validation of the Cyber-Self Scale (CSS) in Saudi Arabia. Actas Esp Psiquiatr 2025; 53:62-70. [PMID: 39801410 PMCID: PMC11726205 DOI: 10.62641/aep.v53i1.1758] [Received: 07/05/2024] [Revised: 09/14/2024] [Accepted: 09/20/2024] [Indexed: 01/16/2025]
Abstract
BACKGROUND Measuring adolescents' and youths' perception of their Cyber-Self can enhance the understanding of how digital technology influences identity formation. While psychological literature offers numerous measures of the self, there is a notable lack of studies addressing the measurement of the Cyber-Self. This study aims to evaluate the reliability, factorial- and criterion-related validity, and measurement invariance of the Cyber-Self Scale (CSS) across age and gender among the youth and adolescents in Saudi Arabia. METHODS The Cyber Relationship Motives (CRM) and E-Emotional Questionnaire (EEQ) were administered to students at Umm Al-Qura University (N = 335), aged 17-31 years (39.7% male, 60.3% female; mean (M) = 21.75, standard deviation (SD) = 2.17). RESULTS The results indicated significant positive correlations between the sub-components of the CRM and EEQ. One item was selected based on two criteria: the highest correlation with other items and the highest correlation with the general factor. A total of 12 items were identified as the final form of the CSS, which demonstrated acceptable internal consistency for both male and female participants. Confirmatory factor analysis (CFA) revealed that the CSS model fit the data well, with all 12 items meeting the fit criteria for chi-square and root mean square error of approximation (RMSEA). CONCLUSION The Arabic version of the CSS is sufficiently reliable and valid for use among Arabic-speaking adolescents and youth. Further research is recommended to examine its measurement invariance over extended periods.
Affiliation(s)
- Elsaeed A. Dardara
- Psychology Department, Faculty of Arts, Minia University, 61519 Minia, Egypt
- Khalid A. Al-Makhalid
- Psychology Department, College of Education, Umm Al-Qura University, 24381 Mecca, Saudi Arabia
18
Ng JY, Maduranayagam SG, Suthakar N, Li A, Lokker C, Iorio A, Haynes RB, Moher D. Attitudes and perceptions of medical researchers towards the use of artificial intelligence chatbots in the scientific process: an international cross-sectional survey. Lancet Digit Health 2025; 7:e94-e102. [PMID: 39550312 DOI: 10.1016/s2589-7500(24)00202-4] [Received: 03/25/2024] [Revised: 07/16/2024] [Accepted: 09/12/2024] [Indexed: 11/18/2024]
Abstract
Chatbots are artificial intelligence (AI) programs designed to simulate conversations with humans that present opportunities and challenges in scientific research. Despite growing clarity from publishing organisations on the use of AI chatbots, researchers' perceptions remain less understood. In this international cross-sectional survey, we aimed to assess researchers' attitudes, familiarity, perceived benefits, and limitations related to AI chatbots. Our online survey was open from July 9 to Aug 11, 2023, with 61 560 corresponding authors identified from 122 323 articles indexed in PubMed. 2452 (4·0%) provided responses and 2165 (94·5%) of 2292 who met eligibility criteria completed the survey. 1161 (54·0%) of 2149 respondents were male and 959 (44·6%) were female. 1294 (60·5%) of 2138 respondents were familiar with AI chatbots, and 945 (44·5%) of 2125 had previously used AI chatbots in research. Only 244 (11·4%) of 2137 reported institutional training on AI tools, and 211 (9·9%) of 2131 noted institutional policies on AI chatbot use. Despite mixed opinions on the benefits, 1428 (69·7%) of 2048 expressed interest in further training. Although many valued AI chatbots for reducing administrative workload (1299 [66·9%] of 1941), there was insufficient understanding of the decision making process (1484 [77·2%] of 1923). Overall, this study highlights substantial interest in AI chatbots among researchers, but also points to the need for more formal training and clarity on their use.
Affiliation(s)
- Jeremy Y Ng
- Centre for Journalology, Methods Centre, Ottawa Hospital Research Institute, Ottawa, ON, Canada
- Sharleen G Maduranayagam
- Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, ON, Canada
- Nirekah Suthakar
- Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, ON, Canada
- Amy Li
- Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, ON, Canada
- Cynthia Lokker
- Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, ON, Canada
- Alfonso Iorio
- Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, ON, Canada; Department of Medicine, McMaster University, Hamilton, ON, Canada
- R Brian Haynes
- Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, ON, Canada
- David Moher
- Centre for Journalology, Methods Centre, Ottawa Hospital Research Institute, Ottawa, ON, Canada; School of Epidemiology, Public Health, and Preventive Medicine, University of Ottawa, Ottawa, ON, Canada
19
Ansari N, Pitts L, Wilson C. An online approach to teaching graduate nursing students to write integrative literature reviews: Examples from two published cases. Nurse Educ Pract 2025; 82:104241. [PMID: 39732116 DOI: 10.1016/j.nepr.2024.104241] [Received: 07/13/2024] [Revised: 11/26/2024] [Accepted: 12/16/2024] [Indexed: 12/30/2024]
Abstract
AIM/OBJECTIVE To empower nursing graduate students, in master's or doctoral programs, through distance-accessible methods for conducting integrative reviews, enhancing their ability to transition from clinical to publication-oriented writing. BACKGROUND Mastering literature review methods is vital for advancing evidence-based practice. Integrative reviews, inclusive of multiple research methodologies, offer a comprehensive approach suited for nursing students. However, transitioning from clinical to publication-oriented writing poses challenges, necessitating innovative solutions. DESIGN This discussion presents strategic writing methods, including technologies for distance-accessible collaboration and data organization, framework use and critical analysis skills to support nursing graduate students in conducting integrative reviews. METHODS Approaches for preparing and developing an integrative review are outlined, including integrating online tools and collaborative platforms and structured frameworks to guide data organization and critical analysis. Two published integrative reviews exemplifying these approaches are presented. RESULTS Distance-accessible and strategic approaches have significantly improved the quality of integrative literature reviews conducted by nursing graduate students. These innovations have equipped students with essential skills for navigating the contemporary academic landscape. CONCLUSIONS Embracing innovative approaches and staying informed about technological advancements empowers nursing graduate students to excel in their research pursuits. This contributes to evidence-based practice and nursing scholarship in the evolving healthcare landscape.
Affiliation(s)
- Natasha Ansari
- University of Utah College of Nursing, Salt Lake City, UT, United States
- Leslie Pitts
- University of Alabama Birmingham, Birmingham, AL, United States
- Christina Wilson
- University of Utah College of Nursing, Salt Lake City, UT, United States; University of Alabama Birmingham, Birmingham, AL, United States
20
Joseph G, Bhatti N, Mittal R, Bhatti A. Current Application and Future Prospects of Artificial Intelligence in Healthcare and Medical Education: A Review of Literature. Cureus 2025; 17:e77313. [PMID: 39935913 PMCID: PMC11812282 DOI: 10.7759/cureus.77313] [Accepted: 01/12/2025] [Indexed: 02/13/2025] Open
Abstract
Artificial Intelligence (AI) is being used in every aspect of life today. It has found great application in the healthcare sector, with the technology now used by medical schools all over the globe. AI has multiple applications in fields such as diagnostics, medicine, surgery, oncology, radiology, ophthalmology, and medical education, among numerous others. It has assisted in diagnosing conditions more quickly and efficiently, and the use of AI chatbots has greatly enhanced the learning process. Despite the benefits that AI applications provide, such as saving precious time for healthcare providers, there are also concerns, mainly ethical ones, along with the fear that AI might render many workers unemployed. Despite these concerns, many innovations are being made using AI applications, which suggest a very bright prospect for this technology. Although people use AI in every part of their daily lives, some remain opposed to its use because they believe it could eventually replace them. In this review of the literature, a detailed analysis of the use of AI in the healthcare industry and medical education is presented, along with its shortcomings and future prospects.
Affiliation(s)
- Girish Joseph
- Pharmacology, Christian Medical College & Hospital, Ludhiana, IND
- Neena Bhatti
- Pharmacology, Christian Medical College & Hospital, Ludhiana, IND
- Rithik Mittal
- Neurosciences, Oakland Community College, Michigan, USA
- Arun Bhatti
- Ophthalmology, M. S. Ramaiah Medical College, Bangalore, IND
21
Li J, Gao X, Dou T, Gao Y, Li X, Zhu W. Quantitative evaluation of GPT-4's performance on US and Chinese osteoarthritis treatment guideline interpretation and orthopaedic case consultation. BMJ Open 2024; 14:e082344. [PMID: 39806703 PMCID: PMC11749315 DOI: 10.1136/bmjopen-2023-082344] [Received: 11/21/2023] [Accepted: 11/28/2024] [Indexed: 01/16/2025] Open
Abstract
OBJECTIVES To evaluate GPT-4's performance in interpreting osteoarthritis (OA) treatment guidelines from the USA and China, and to assess its ability to diagnose and manage orthopaedic cases. SETTING The study was conducted using publicly available OA treatment guidelines and simulated orthopaedic case scenarios. PARTICIPANTS No human participants were involved. The evaluation focused on GPT-4's responses to clinical guidelines and case questions, assessed by two orthopaedic specialists. OUTCOMES Primary outcomes included the accuracy and completeness of GPT-4's responses to guideline-based queries and case scenarios. Metrics included the correct match rate, completeness score and stratification of case responses into predefined tiers of correctness. RESULTS In interpreting the American Academy of Orthopaedic Surgeons and Chinese OA guidelines, GPT-4 achieved a correct match rate of 46.4% and complete agreement with all score-2 recommendations. The accuracy score for guideline interpretation was 4.3±1.6 (95% CI 3.9 to 4.7), and the completeness score was 2.8±0.6 (95% CI 2.5 to 3.1). For case-based questions, GPT-4 demonstrated high performance, with over 88% of responses rated as comprehensive. CONCLUSIONS GPT-4 demonstrates promising capabilities as an auxiliary tool in orthopaedic clinical practice and patient education, with high levels of accuracy and completeness in guideline interpretation and clinical case analysis. However, further validation is necessary to establish its utility in real-world clinical settings.
Affiliation(s)
- Juntan Li
- Jinzhou Medical University, Jinzhou, Liaoning, China
- The First Affiliated Hospital of China Medical University, Shenyang, Liaoning, China
- Xiang Gao
- Department of Orthopedics, Fourth Affiliated Hospital of China Medical University, Shenyang, Liaoning, China
- Tianxu Dou
- Department of Orthopedics, The First Hospital of China Medical University, Shenyang, China
- Yuyang Gao
- Department of Orthopedics, The First Hospital of China Medical University, Shenyang, China
- Xu Li
- Department of Orthopedics, Fourth Affiliated Hospital of China Medical University, Shenyang, Liaoning, China
- Wannan Zhu
- Jinzhou Medical University, Jinzhou, Liaoning, China
22
Matalon J, Spurzem A, Ahsan S, White E, Kothari R, Varma M. Reader's digest version of scientific writing: comparative evaluation of summarization capacity between large language models and medical students in analyzing scientific writing in sleep medicine. Front Artif Intell 2024; 7:1477535. [PMID: 39777163 PMCID: PMC11704966 DOI: 10.3389/frai.2024.1477535] [Received: 08/08/2024] [Accepted: 11/28/2024] [Indexed: 01/11/2025] Open
Abstract
Introduction As artificial intelligence systems like large language models (LLMs) and natural language processing advance, the need to evaluate their utility within medicine and medical education grows. As medical research publications continue to grow exponentially, AI systems offer valuable opportunities to condense and synthesize information, especially in underrepresented areas such as sleep medicine. The present study aims to compare the summarization capacity of LLM-generated summaries of sleep medicine research article abstracts with that of summaries generated by medical students (humans), and to evaluate whether research content and literary readability are retained comparably. Methods A collection of three AI-generated and human-generated summaries of sleep medicine research article abstracts was shared with 19 study participants (medical students) attending a sleep medicine conference. Participants were blinded as to which summary was human- or LLM-generated. After reading both human- and AI-generated research summaries, participants completed a 1-5 Likert scale survey on the readability of the extracted writings. Participants also answered article-specific multiple-choice questions evaluating their comprehension of the summaries, as a representation of the quality of content retained by the AI-generated summaries. Results An independent-samples t-test of participants' comprehension of the AI-generated and human-generated summaries revealed no significant difference in Likert readability ratings (p = 0.702). A chi-squared test of proportions revealed no significant association (χ2 = 1.485, p = 0.223), and a McNemar test revealed no significant association between summary type and the proportion of correct responses to the comprehension multiple-choice questions (p = 0.289). Discussion Limitations of this study included the small number of participants and potential user bias: participants were attending a sleep medicine conference, and the study summaries were all drawn from sleep medicine journals. Lastly, the summaries did not include graphs, numbers, and pictures, and thus were limited in the material that could be extracted. While the present analysis did not demonstrate a significant difference in readability and content quality between the AI- and human-generated summaries, limitations of the present study indicate that more research is needed to objectively measure, and further define, the strengths and weaknesses of AI models in condensing medical literature into efficient and accurate summaries.
Affiliation(s)
- Jacob Matalon
- Medical school, California University of Science and Medicine, Colton, CA, United States
- August Spurzem
- Medical school, California University of Science and Medicine, Colton, CA, United States
- Sana Ahsan
- Medical school, California University of Science and Medicine, Colton, CA, United States
- Elizabeth White
- Medical school, California University of Science and Medicine, Colton, CA, United States
- Ronik Kothari
- Medical school, California University of Science and Medicine, Colton, CA, United States
- Madhu Varma
- Department of Medical Education and Clinical Skills, California University of Science and Medicine, Colton, CA, United States
23
Heisinger S, Salzmann SN, Senker W, Aspalter S, Oberndorfer J, Matzner MP, Stienen MN, Motov S, Huber D, Grohs JG. ChatGPT's Performance in Spinal Metastasis Cases-Can We Discuss Our Complex Cases with ChatGPT? J Clin Med 2024; 13:7864. [PMID: 39768787 PMCID: PMC11727723 DOI: 10.3390/jcm13247864] [Received: 11/27/2024] [Revised: 12/11/2024] [Accepted: 12/19/2024] [Indexed: 01/06/2025] Open
Abstract
Background: The integration of artificial intelligence (AI), particularly large language models (LLMs) like ChatGPT-4, is transforming healthcare. ChatGPT's potential to assist in decision-making for complex cases, such as spinal metastasis treatment, is promising but largely untested. Especially in cancer patients who develop spinal metastases, precise and personalized treatment is essential. This study examines ChatGPT-4's performance in treatment planning for spinal metastasis cases compared to experienced spine surgeons. Materials and Methods: Five spine metastasis cases were randomly selected from recent literature. Subsequently, five spine surgeons and ChatGPT-4 were tasked with providing treatment recommendations for each case in a standardized manner. Responses were analyzed for frequency distribution, agreement, and subjective rater opinions. Results: ChatGPT's treatment recommendations aligned with the majority of human raters in 73% of treatment choices, with moderate to substantial agreement on systemic therapy, pain management, and supportive care. However, ChatGPT tended towards generalized statements, a tendency the raters noted. Agreement among raters improved in sensitivity analyses excluding ChatGPT, particularly for controversial areas like surgical intervention and palliative care. Conclusions: ChatGPT shows potential in aligning with experienced surgeons on certain treatment aspects of spinal metastasis. However, its generalized approach highlights limitations, suggesting that training with specific clinical guidelines could enhance its utility in complex case management. Further studies are necessary to refine AI applications in personalized healthcare decision-making.
Affiliation(s)
- Stephan Heisinger
- Department of Orthopedics and Trauma Surgery, Medical University of Vienna, 1090 Vienna, Austria
- Stephan N. Salzmann
- Department of Orthopedics and Trauma Surgery, Medical University of Vienna, 1090 Vienna, Austria
- Wolfgang Senker
- Department of Neurosurgery, Kepler University Hospital, 4020 Linz, Austria
- Stefan Aspalter
- Department of Neurosurgery, Kepler University Hospital, 4020 Linz, Austria
- Johannes Oberndorfer
- Department of Neurosurgery, Kepler University Hospital, 4020 Linz, Austria
- Michael P. Matzner
- Department of Orthopedics and Trauma Surgery, Medical University of Vienna, 1090 Vienna, Austria
- Martin N. Stienen
- Spine Center of Eastern Switzerland & Department of Neurosurgery, Kantonsspital St. Gallen, Medical School of St. Gallen, University of St. Gallen, 9000 St. Gallen, Switzerland
- Stefan Motov
- Spine Center of Eastern Switzerland & Department of Neurosurgery, Kantonsspital St. Gallen, Medical School of St. Gallen, University of St. Gallen, 9000 St. Gallen, Switzerland
- Dominikus Huber
- Division of Oncology, Department of Medicine I, Medical University of Vienna, 1090 Vienna, Austria
- Josef Georg Grohs
- Department of Orthopedics and Trauma Surgery, Medical University of Vienna, 1090 Vienna, Austria
24
Van Norman GA. Writing the Roadmap for Medical Practice: Ethics of Medical Authorship. Anesthesiol Clin 2024; 42:617-630. [PMID: 39443034 DOI: 10.1016/j.anclin.2024.02.006] [Indexed: 10/25/2024]
Abstract
The medical literature guides ethical clinical care by providing information on medical innovations, clinical care, the history of medical advances, explanations for past mistakes and inspiration for future discoveries. Ethical authorship practices are thus imperative to preserving the integrity of medical publications and fulfilling our obligations to ethical patient care. Unethical authorship practices such as plagiarism, guest authorship, and ghost authorship are increasing and pose serious threats to the medical literature. The rise of artificial intelligence in assisting scholarly work poses particular concerns. Authors may face severe and career-changing penalties for engaging in unethical authorship.
Affiliation(s)
- Gail A Van Norman
- Anesthesiology and Pain Medicine, University of Washington, UWMC 1959 NE Pacific Street, Seattle, WA 98195, USA
25
Wang J, Liao Y, Liu S, Zhang D, Wang N, Shu J, Wang R. The impact of using ChatGPT on academic writing among medical undergraduates. Ann Med 2024; 56:2426760. [PMID: 39555617 PMCID: PMC11574940 DOI: 10.1080/07853890.2024.2426760] [Received: 01/28/2024] [Revised: 08/12/2024] [Accepted: 10/08/2024] [Indexed: 11/19/2024] Open
Abstract
BACKGROUND ChatGPT is widely used for writing tasks, yet its effects on medical students' academic writing remain underexplored. This study aims to elucidate ChatGPT's impact on academic writing efficiency and quality among medical students, while also evaluating students' attitudes towards its use in academic writing. METHODS We collected systematic reviews from 130 third-year medical students and administered a questionnaire to assess ChatGPT usage and student attitudes. Three independent reviewers graded the papers using EASE guidelines, and statistical analysis compared articles generated with or without ChatGPT assistance across various parameters, with rigorous quality control ensuring survey reliability and validity. RESULTS In this study, 33 students (25.8%) utilized ChatGPT for writing (ChatGPT group) and 95 (74.2%) did not (Control group). The ChatGPT group exhibited significantly higher daily technology use and prior experience with ChatGPT (p < 0.05). Writing time was significantly reduced in the ChatGPT group (p = 0.04), with 69.7% completing tasks within 2-3 days compared to 48.4% in the control group. They also achieved higher article quality scores (p < 0.0001) with improvements in completeness, credibility, and scientific content. Self-assessment indicated enhanced writing skills (p < 0.01), confidence (p < 0.001), satisfaction (p < 0.001) and a positive attitude toward its future use in the ChatGPT group. CONCLUSIONS Integrating ChatGPT in medical academic writing, with proper guidance, improves efficiency and quality, illustrating artificial intelligence's potential in shaping medical education methodologies.
Affiliation(s)
- Jingyu Wang: Department of Gastroenterology, The Third Xiangya Hospital of Central South University, Changsha, Hunan Province, China; Xiangya School of Medicine, Central South University, Changsha, Hunan Province, China
- Yuxuan Liao: National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences Peking Union Medical College, Beijing, China; Graduate School of Peking Union Medical College, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Shaojun Liu: Department of Gastroenterology, The Third Xiangya Hospital of Central South University, Changsha, Hunan Province, China; Hunan Key Laboratory of Nonresolving Inflammation and Cancer, Changsha, Hunan Province, China
- Decai Zhang: Department of Gastroenterology, The Third Xiangya Hospital of Central South University, Changsha, Hunan Province, China; Hunan Key Laboratory of Nonresolving Inflammation and Cancer, Changsha, Hunan Province, China
- Na Wang: The First People's Hospital of Foshan, Foshan, China
- Jiankun Shu: The First People's Hospital of Foshan, Foshan, China; The First School of Clinical Medicine, Southern Medical University, Guangzhou, China
- Rui Wang: Department of Gastroenterology, The Third Xiangya Hospital of Central South University, Changsha, Hunan Province, China; Hunan Key Laboratory of Nonresolving Inflammation and Cancer, Changsha, Hunan Province, China
26.
Graña M, Badiola-Zabala G, Cano-Escalera G. Comment on Uzun Ozsahin et al. COVID-19 Prediction Using Black-Box Based Pearson Correlation Approach. Diagnostics 2023, 13, 1264. Diagnostics (Basel) 2024;14:2528. PMID: 39594194; PMCID: PMC11592728; DOI: 10.3390/diagnostics14222528.
Abstract
The declaration of the COVID-19 pandemic by the World Health Organization (WHO) in March 2020 has triggered the publication of thousands of papers covering a plethora of aspects of the pandemic, from epidemiology models [...].
27.
Malik MA, Amjad AI, Aslam S, Fakhrou A. Global insights: ChatGPT's influence on academic and research writing, creativity, and plagiarism policies. Front Res Metr Anal 2024;9:1486832. PMID: 39583913; PMCID: PMC11582041; DOI: 10.3389/frma.2024.1486832.
Abstract
Introduction The current study explored the influence of Chat Generative Pre-Trained Transformer (ChatGPT) on the concepts, parameters, policies, and practices of creativity and plagiarism in academic and research writing. Methods Data were collected from 10 researchers from 10 different countries (Australia, China, the UK, Brazil, Pakistan, Bangladesh, Iran, Nigeria, Trinidad and Tobago, and Turkiye) using semi-structured interviews. NVivo was employed for data analysis. Results Based on the responses, five themes about the influence of ChatGPT on academic and research writing were generated, i.e., opportunity, human assistance, thought-provoking, time-saving, and negative attitude. Although the researchers were mostly positive about it, some feared it would degrade their writing skills and lead to plagiarism. Many of them believed that ChatGPT would redefine the concepts, parameters, and practices of creativity and plagiarism. Discussion Creativity may no longer be restricted to the ability to write, but also to use ChatGPT or other large language models (LLMs) to write creatively. Some suggested that machine-generated text might be accepted as the new norm; however, using it without proper acknowledgment would be considered plagiarism. The researchers recommended allowing ChatGPT for academic and research writing; however, they strongly advised it to be regulated with limited use and proper acknowledgment.
Affiliation(s)
- Sarfraz Aslam: Doctoral Studies Department, Faculty of Education and Humanities, UNITAR International University, Petaling Jaya, Malaysia
- Abdulnaser Fakhrou: Department of Psychological Sciences, College of Education, Qatar University, Doha, Qatar
28.
Hayat J, Lari M, AlHerz M, Lari A. The Utility and Limitations of Artificial Intelligence-Powered Chatbots in Healthcare. Cureus 2024;16:e73127. PMID: 39650926; PMCID: PMC11624039; DOI: 10.7759/cureus.73127.
Abstract
At the intersection of artificial intelligence (AI) and healthcare, it is essential that clinicians grasp the capabilities of chatbots. AI-powered chatbots such as ChatGPT are being explored for their potential benefits by both individuals and institutions. The utility of ChatGPT (OpenAI) in various scenarios was explored through a series of recorded prompts and responses. In the clinical domain, the chatbot facilitated tasks such as triage, patient consultation, diagnosis, and administrative responsibilities. Its capacity to translate and simplify intricate medical topics was also evaluated. For research purposes, the chatbot's abilities to suggest ideas, prepare protocols, assist in manuscript writing, guide statistical analyses, and recommend suitable journals were assessed. In the educational domain, chatbots were tested for simplifying complex subjects, reviewing procedural steps, generating clinical scenarios, and formulating multiple-choice questions. A comprehensive literature review was also conducted across Medline, Embase, and Web of Science. Chatbots, when optimally employed, can serve as invaluable resources in healthcare, spanning the clinical, research, and educational domains. Their potential lies in enhancing efficiency, guiding decision-making, and facilitating patient care and education. However, their application requires a nuanced understanding of, and caution regarding, their limitations.
Affiliation(s)
- Jafar Hayat: Department of Surgery, Jaber Al-Ahmad Al-Sabah Hospital, Kuwait City, KWT
- Mohammad Lari: Department of Orthopedic Surgery, Al-Razi National Orthopedic Hospital, Kuwait City, KWT
- Ali Lari: Department of Orthopedic Surgery, Al-Razi National Orthopedic Hospital, Kuwait City, KWT
29.
Judge CS, Krewer F, O'Donnell MJ, Kiely L, Sexton D, Taylor GW, Skorburg JA, Tripp B. Multimodal Artificial Intelligence in Medicine. Kidney360 2024;5:1771-1779. PMID: 39167446; DOI: 10.34067/kid.0000000000000556.
Abstract
Traditional medical artificial intelligence models that are approved for clinical use restrict themselves to single-modal data (e.g., images only), limiting their applicability in the complex, multimodal environment of medical diagnosis and treatment. Multimodal transformer models in health care can effectively process and interpret diverse data forms, such as text, images, and structured data. They have demonstrated impressive performance on standard benchmarks, like United States Medical Licensing Examination question banks, and continue to improve with scale. However, the adoption of these advanced artificial intelligence models is not without challenges. While multimodal deep learning models like transformers offer promising advancements in health care, their integration requires careful consideration of the accompanying ethical and environmental challenges.
Affiliation(s)
- Conor S Judge: HRB-Clinical Research Facility, University of Galway, Galway, Ireland; Insight Data Analytics, University of Galway, Galway, Ireland
- Finn Krewer: HRB-Clinical Research Facility, University of Galway, Galway, Ireland
- Lisa Kiely: HRB-Clinical Research Facility, University of Galway, Galway, Ireland
- Donal Sexton: Department of Medicine, Trinity College Dublin, Dublin, Ireland
- Graham W Taylor: University of Guelph, Guelph, Ontario, Canada; Vector Institute, Toronto, Ontario, Canada
- Bryan Tripp: Department of Systems Design Engineering, University of Waterloo, Waterloo, Ontario, Canada
30.
Hsu TW, Tseng PT, Tsai SJ, Ko CH, Thompson T, Hsu CW, Yang FC, Tsai CK, Tu YK, Yang SN, Liang CS, Su KP. Quality and correctness of AI-generated versus human-written abstracts in psychiatric research papers. Psychiatry Res 2024;341:116145. PMID: 39213714; DOI: 10.1016/j.psychres.2024.116145.
Abstract
This study aimed to assess the ability of an artificial intelligence (AI)-based chatbot to generate abstracts from academic psychiatric articles. We provided 30 full-text psychiatric papers to ChatPDF (based on ChatGPT) and prompted it to generate a structured or unstructured abstract in a similar style. We further used 10 papers from Psychiatry Research as active comparators (unstructured format). We compared the quality of the ChatPDF-generated abstracts with the original human-written abstracts and examined the similarity, plagiarism, detected AI content, and correctness of the AI-generated abstracts. Five experts evaluated the quality of the abstracts using a blinded approach. They also attempted to identify the abstracts written by the original authors and validated the conclusions produced by ChatPDF. We found that similarity and plagiarism were relatively low (only 14.07% and 8.34%, respectively). The detected AI content was 31.48% for generated structured abstracts, 75.58% for generated unstructured abstracts, and 66.48% for active-comparator abstracts. For quality, generated structured abstracts were rated similarly to the originals, but unstructured ones received significantly lower scores. Experts identified the original author-written abstract with 40% accuracy for structured abstracts, 73% for unstructured ones, and 77% for active comparators. However, 30% of AI-generated abstract conclusions were incorrect. In conclusion, the data-organization capabilities of AI language models hold significant potential for summarizing information in clinical psychiatry. However, using ChatPDF to summarize psychiatric papers requires caution concerning accuracy.
Affiliation(s)
- Tien-Wei Hsu: Department of Psychiatry, E-DA Dachang Hospital, I-Shou University, Kaohsiung, Taiwan; Department of Psychiatry, E-DA Hospital, I-Shou University, Kaohsiung, Taiwan
- Ping-Tao Tseng: Institute of Biomedical Sciences, National Sun Yat-sen University, Kaohsiung, Taiwan; Department of Psychology, College of Medical and Health Science, Asia University, Taichung, Taiwan; Prospect Clinic for Otorhinolaryngology & Neurology, Kaohsiung, Taiwan; Institute of Precision Medicine, National Sun Yat-sen University, Kaohsiung, Taiwan
- Shih-Jen Tsai: Department of Psychiatry, Taipei Veterans General Hospital, Taipei, Taiwan
- Chih-Hung Ko: Department of Psychiatry, Faculty of Medicine, College of Medicine, Kaohsiung Medical University, Kaohsiung, Taiwan; Department of Psychiatry, Kaohsiung Medical University Hospital, Kaohsiung, Taiwan; Department of Psychiatry, Kaohsiung Municipal Siaogang Hospital, Kaohsiung Medical University, Kaohsiung, Taiwan
- Trevor Thompson: Centre for Chronic Illness and Ageing, University of Greenwich, London, UK
- Chih-Wei Hsu: Department of Psychiatry, Kaohsiung Chang Gung Memorial Hospital and Chang Gung University College of Medicine, Kaohsiung, Taiwan
- Fu-Chi Yang: Department of Neurology, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan
- Chia-Kuang Tsai: Department of Neurology, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan
- Yu-Kang Tu: Institute of Epidemiology & Preventive Medicine, College of Public Health, National Taiwan University, Taipei, Taiwan; Department of Dentistry, National Taiwan University Hospital, Taipei, Taiwan
- Szu-Nian Yang: Department of Psychiatry, Beitou Branch, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan; Department of Psychiatry, Armed Forces Taoyuan General Hospital, Taoyuan, Taiwan; Graduate Institute of Health and Welfare Policy, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Chih-Sung Liang: Department of Psychiatry, Beitou Branch, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan; Department of Psychiatry, National Defense Medical Center, Taipei, Taiwan
- Kuan-Pin Su: College of Medicine, China Medical University, Taichung, Taiwan; Mind-Body Interface Laboratory (MBI-Lab), China Medical University and Hospital, Taichung, Taiwan; An-Nan Hospital, China Medical University, Tainan, Taiwan
31.
Solmonovich RL, Kouba I, Quezada O, Rodriguez-Ayala G, Rojas V, Bonilla K, Espino K, Bracero LA. Artificial intelligence generates proficient Spanish obstetrics and gynecology counseling templates. AJOG Glob Rep 2024;4:100400. PMID: 39507462; PMCID: PMC11539139; DOI: 10.1016/j.xagr.2024.100400.
Abstract
Background Effective patient counseling in obstetrics and gynecology is vital. Existing language barriers between Spanish-speaking patients and English-speaking providers may negatively impact patient understanding and adherence to medical recommendations, as language discordance between provider and patient has been associated with medication noncompliance, adverse drug events, and underuse of preventative care. Artificial intelligence large language models may be a helpful adjunct to patient care by generating counseling templates in Spanish. Objectives The primary objective was to determine if large language models can generate proficient counseling templates in Spanish on obstetrics and gynecology topics. Secondary objectives were to (1) compare the content, quality, and comprehensiveness of generated templates between different large language models, (2) compare the proficiency ratings among the large language model generated templates, and (3) assess which generated templates had potential for integration into clinical practice. Study design Cross-sectional study using free open-access large language models to generate counseling templates in Spanish on select obstetrics and gynecology topics. Native Spanish-speaking practicing obstetricians and gynecologists, who were blinded to the source large language model for each template, reviewed and subjectively scored each template on its content, quality, and comprehensiveness and considered it for integration into clinical practice. Proficiency ratings were calculated as a composite score of content, quality, and comprehensiveness. A score of >4 was considered proficient. Basic inferential statistics were performed. Results All artificial intelligence large language models generated proficient obstetrics and gynecology counseling templates in Spanish, with Google Bard generating the most proficient template (P<.0001) and outperforming the others in comprehensiveness (P=.03), quality (P=.04), and content (P=.01). Microsoft Bing received the lowest scores in these domains. Physicians were likely to be willing to incorporate the templates into clinical practice, with no significant discrepancy in the likelihood of integration based on the source large language model (P=.45). Conclusions Large language models have the potential to generate proficient obstetrics and gynecology counseling templates in Spanish, which physicians would integrate into their clinical practice. Google Bard scored the highest across all attributes. There is an opportunity to use large language models to help mitigate language barriers in health care. Future studies should assess patient satisfaction, understanding, and adherence to clinical plans following receipt of these counseling templates.
Affiliation(s)
- Rachel L. Solmonovich: Northwell, New Hyde Park, NY; Department of Obstetrics and Gynecology, South Shore University Hospital, Bay Shore, NY; Zucker School of Medicine at Hofstra/Northwell, Hempstead, NY
- Insaf Kouba: Northwell, New Hyde Park, NY; Department of Obstetrics and Gynecology, South Shore University Hospital, Bay Shore, NY
- Oscar Quezada: Northwell, New Hyde Park, NY; Department of Obstetrics and Gynecology, Peconic Bay Medical Center, Riverhead, NY
- Gianni Rodriguez-Ayala: Northwell, New Hyde Park, NY; Zucker School of Medicine at Hofstra/Northwell, Hempstead, NY; Department of Obstetrics and Gynecology, Huntington Hospital, Huntington, NY
- Veronica Rojas: Northwell, New Hyde Park, NY; Department of Obstetrics and Gynecology, South Shore University Hospital, Bay Shore, NY; Zucker School of Medicine at Hofstra/Northwell, Hempstead, NY
- Kevin Bonilla: Northwell, New Hyde Park, NY; Department of Obstetrics and Gynecology, South Shore University Hospital, Bay Shore, NY
- Kevin Espino: Northwell, New Hyde Park, NY; Department of Obstetrics and Gynecology, South Shore University Hospital, Bay Shore, NY
- Luis A. Bracero: Northwell, New Hyde Park, NY; Department of Obstetrics and Gynecology, South Shore University Hospital, Bay Shore, NY; Zucker School of Medicine at Hofstra/Northwell, Hempstead, NY
32.
Pandya S, Alessandri Bonetti M, Liu HY, Jeong T, Ziembicki JA, Egro FM. Concordance of ChatGPT With American Burn Association Guidelines on Acute Burns. Ann Plast Surg 2024;93:564-574. PMID: 39445876; DOI: 10.1097/sap.0000000000004128.
Abstract
ABSTRACT Burn injuries often require immediate assistance and specialized care for optimal management and outcomes. Accessible artificial intelligence technology has only recently begun to be applied to healthcare decision making and patient education, and its role in clinical recommendations is still under scrutiny. This study aims to evaluate ChatGPT's outputs and the appropriateness of its responses to commonly asked questions regarding acute burn care when compared with the American Burn Association guidelines. Twelve commonly asked questions were formulated by a fellowship-trained burn surgeon to address the American Burn Association's recommendations on burn injuries, management, and patient referral. These questions were prompted into ChatGPT, and each response was compared with the aforementioned guidelines, the gold standard for accurate and evidence-based burn care recommendations. Three burn surgeons independently evaluated the appropriateness and comprehensiveness of each ChatGPT response against the guidelines according to the modified Global Quality Score scale. The average score for ChatGPT-generated responses was 4.56 ± 0.65, indicating the responses were of exceptional quality, covered the most important topics, and were in high concordance with the guidelines. This initial comparison of ChatGPT-generated responses and the American Burn Association guidelines demonstrates that ChatGPT can accurately and comprehensibly describe appropriate treatment and management plans for acute burn injuries. We foresee that ChatGPT may play a role as a complementary tool in medical decision making and patient education, with a profound impact on clinical practice, research, and education.
Affiliation(s)
- Sumaarg Pandya: Department of Plastic Surgery, University of Pittsburgh Medical Center, Pittsburgh, PA
- Hilary Y Liu: Department of Plastic Surgery, University of Pittsburgh Medical Center, Pittsburgh, PA
- Tiffany Jeong: Department of Plastic Surgery, University of Pittsburgh Medical Center, Pittsburgh, PA
- Jenny A Ziembicki: Department of Surgery, University of Pittsburgh Medical Center, Pittsburgh, PA
33.
Rough K, Rashidi ES, Tai CG, Lucia RM, Mack CD, Largent JA. Core Concepts in Pharmacoepidemiology: Principled Use of Artificial Intelligence and Machine Learning in Pharmacoepidemiology and Healthcare Research. Pharmacoepidemiol Drug Saf 2024;33:e70041. PMID: 39500844; DOI: 10.1002/pds.70041.
Abstract
Artificial intelligence (AI) and machine learning (ML) are important tools across many fields of health and medical research. Pharmacoepidemiologists can bring essential methodological rigor and study design expertise to the design and use of these technologies within healthcare settings. AI/ML-based tools also play a role in pharmacoepidemiology research, as we may apply them to answer our own research questions, take responsibility for evaluating medical devices with AI/ML components, or participate in interdisciplinary research to create new AI/ML algorithms. While epidemiologic expertise is essential to deploying AI/ML responsibly and ethically, the rapid advancement of these technologies in the past decade has resulted in a knowledge gap for many in the field. This article provides a brief overview of core AI/ML concepts, followed by a discussion of potential applications of AI/ML in pharmacoepidemiology research, and closes with a review of important concepts across application areas, including interpretability and fairness. This review is intended to provide an accessible, practical overview of AI/ML for pharmacoepidemiology research, with references to further, more detailed resources on fundamental topics.
Affiliation(s)
- Caroline G Tai: Real World Solutions, IQVIA, Durham, North Carolina, USA
- Rachel M Lucia: Real World Solutions, IQVIA, Durham, North Carolina, USA
- Joan A Largent: Real World Solutions, IQVIA, Durham, North Carolina, USA
34.
Li W, Shi HY, Chen XL, Lan JZ, Rehman AU, Ge MW, Shen LT, Hu FH, Jia YJ, Li XM, Chen HL. Application of artificial intelligence in medical education: A meta-ethnographic synthesis. Med Teach 2024:1-14. PMID: 39480998; DOI: 10.1080/0142159x.2024.2418936.
Abstract
The advancement of artificial intelligence (AI) has had a profound impact on medical education. Understanding the advantages and issues of AI in medical education, providing guidance for educators, and overcoming challenges in the implementation process are particularly important. The objective of this study is to explore the current state of AI applications in medical education. A systematic search was conducted across databases such as PsycINFO, CINAHL, Scopus, PubMed, and Web of Science to identify relevant studies. The Critical Appraisal Skills Programme (CASP) was employed for the quality assessment of these studies, followed by thematic synthesis to analyze the themes from the included research. Ultimately, 21 studies were identified, establishing four themes: (1) Shaping the Future: Current Trends in AI within Medical Education; (2) Advancing Medical Instruction: The Transformative Power of AI; (3) Navigating the Ethical Landscape of AI in Medical Education; (4) Fostering Synergy: Integrating Artificial Intelligence in the Medical Curriculum. Artificial intelligence's role in medical education, while not yet extensive, is impactful and promising. Despite challenges, including ethical concerns over privacy, responsibility, and humanistic care, future efforts should focus on integrating AI through targeted courses to improve educational quality.
Affiliation(s)
- Wei Li: School of Nursing and Rehabilitation, Nantong University, Nantong, Jiangsu, China
- Hai-Yan Shi: Nantong University Affiliated Rugao Hospital, Rugao People's Hospital, Nantong, Jiangsu, China
- Xiao-Ling Chen: Department of Respiratory Medicine, Dongtai People's Hospital, Yancheng, Jiangsu, China
- Jian-Zeng Lan: School of Nursing and Rehabilitation, Nantong University, Nantong, Jiangsu, China
- Attiq-Ur Rehman: School of Nursing and Rehabilitation, Nantong University, Nantong, Jiangsu, China; Gulfreen Nursing College Avicenna Hospital Bedian, Lahore, Pakistan
- Meng-Wei Ge: School of Nursing and Rehabilitation, Nantong University, Nantong, Jiangsu, China
- Lu-Ting Shen: School of Nursing and Rehabilitation, Nantong University, Nantong, Jiangsu, China
- Fei-Hong Hu: School of Nursing and Rehabilitation, Nantong University, Nantong, Jiangsu, China
- Yi-Jie Jia: School of Nursing and Rehabilitation, Nantong University, Nantong, Jiangsu, China
- Xiao-Min Li: Nantong First People's Hospital, The Second Affiliated Hospital of Nantong University, Nantong, Jiangsu, China
- Hong-Lin Chen: School of Nursing and Rehabilitation, Nantong University, Nantong, Jiangsu, China
35.
Liu XQ, Wang X, Zhang HR. Large multimodal models assist in psychiatry disorders prevention and diagnosis of students. World J Psychiatry 2024;14:1415-1421. DOI: 10.5498/wjp.v14.i10.1415.
Abstract
Students are considered one of the groups most affected by psychological problems. Given the highly dangerous nature of mental illnesses and the increasingly serious state of global mental health, it is imperative for us to explore new methods and approaches concerning the prevention and treatment of mental illnesses. Large multimodal models (LMMs), as the most advanced artificial intelligence models (i.e. ChatGPT-4), have brought new hope to the accurate prevention, diagnosis, and treatment of psychiatric disorders. The assistance of these models in the promotion of mental health is critical, as the latter necessitates a strong foundation of medical knowledge and professional skills, emotional support, stigma mitigation, the encouragement of more honest patient self-disclosure, reduced health care costs, improved medical efficiency, and greater mental health service coverage. However, these models must address challenges related to health, safety, hallucinations, and ethics simultaneously. In the future, we should address these challenges by developing relevant usage manuals, accountability rules, and legal regulations; implementing a human-centered approach; and intelligently upgrading LMMs through the deep optimization of such models, their algorithms, and other means. This effort will thus substantially contribute not only to the maintenance of students’ health but also to the achievement of global sustainable development goals.
Affiliation(s)
- Xin-Qiao Liu: School of Education, Tianjin University, Tianjin 300350, China
- Xin Wang: School of Education, Tianjin University, Tianjin 300350, China
- Hui-Rui Zhang: Faculty of Education, The Open University of China, Beijing 100039, China
36.
Hashemzadeh M, Rahimi A, Allahbakhsh M, Adibi P, Beigi-Harchegani H. Information Capsule: A New Approach for Summarizing Medical Information. Int J Prev Med 2024;15:52. PMID: 39539581; PMCID: PMC11559705; DOI: 10.4103/ijpvm.ijpvm_254_23.
Abstract
Background In recent years, the diversity of medical information resources and health information needs has given rise to a new form of summarization alongside the various types of abstracts: the information capsule (IC). The present study was conducted to analyze current ICs, propose a unified definition and a standard structure for developing an IC, and describe how an IC can be represented and implemented. Methods This qualitative study was conducted in three phases. In the first phase, a library review was performed on relevant websites and international databases, such as PubMed, Science Direct, Web of Sciences, Google Scholar, ProQuest, and Embase. In the second phase, the results of the previous stage were discussed with a panel of experts. In the third phase, a suggested frame for an IC was proposed. Results A specific structure was suggested for the IC so that, in addition to the parts found in similar formats, it contains additional parts. The suggested frame includes title, names of the IC writers and reviewers, question or goals, design or methods, setting, patient or community of the study, result, commentary, citation, topic, picture, and tag, and can be used in different fields. Conclusions Given the importance of ICs in summarizing information, our suggested structure should be applied in other fields and refined through trial and error.
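The suggested IC frame amounts to a simple record type. The sketch below is illustrative only: the field names paraphrase the frame listed in the abstract and are not part of the authors' specification, and the sample values are invented.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class InformationCapsule:
    """Illustrative record for the suggested IC frame (field names paraphrased)."""
    title: str
    writers: List[str]      # names of the IC writers
    reviewers: List[str]    # names of the IC reviewers
    question: str           # question or goals
    design: str             # design or methods
    setting: str
    population: str         # patient or community of the study
    result: str
    commentary: str
    citation: str
    topic: str
    picture: str = ""       # optional image reference
    tags: List[str] = field(default_factory=list)

# Invented example instance.
ic = InformationCapsule(
    title="Example IC",
    writers=["A. Writer"],
    reviewers=["B. Reviewer"],
    question="Does X improve Y?",
    design="Summary of a randomized trial",
    setting="Outpatient clinic",
    population="Adults with condition Z",
    result="X improved Y by a clinically meaningful margin.",
    commentary="Applicable to primary care.",
    citation="Doe J, et al. J Example 2024.",
    topic="Therapy",
    tags=["therapy", "summary"],
)
```

A fixed record like this makes every IC carry the same parts, which is the point of the proposed standard structure.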
Collapse
Affiliation(s)
- Mozhdeh Hashemzadeh: Department of Medical Library and Information Science, School of Management and Medical Information Sciences, Clinical Informationist Research Group, Isfahan University of Medical Sciences, Isfahan, Iran
- Alireza Rahimi: Clinical Informationist Research Group, Health Information Technology Research Center, Isfahan University of Medical Sciences, Isfahan, Iran
- Peyman Adibi: Gastroenterology and Hepatology Research Center, Isfahan University of Medical Sciences, Isfahan, Iran
- Hossein Beigi-Harchegani: Clinical Informationist Research Group, Health Information Technology Research Center, Isfahan University of Medical Sciences, Isfahan, Iran
37
Liu W, Kan H, Jiang Y, Geng Y, Nie Y, Yang M. MED-ChatGPT CoPilot: a ChatGPT medical assistant for case mining and adjunctive therapy. Front Med (Lausanne) 2024;11:1460553. [PMID: 39478827; PMCID: PMC11521861; DOI: 10.3389/fmed.2024.1460553]
Abstract
Background: The large-scale language model, GPT-4-1106-preview, supports text of up to 128 k characters, which has enhanced the capability of processing vast quantities of text. This model can perform efficient and accurate text data mining without the need for retraining, aided by prompt engineering. Method: The research approach includes prompt engineering and text vectorization processing. In this study, prompt engineering is applied to assist ChatGPT in text mining. Subsequently, the mined results are vectorized and incorporated into a local knowledge base. After cleansing 306 medical papers, data extraction was performed using ChatGPT. Following a validation and filtering process, 241 medical case data entries were obtained, leading to the construction of a local medical knowledge base. Additionally, drawing upon the Langchain framework and utilizing the local knowledge base in conjunction with ChatGPT, we successfully developed a fast and reliable chatbot. This chatbot is capable of providing recommended diagnostic and treatment information for various diseases. Results: The performance of the designed ChatGPT model, which was enhanced by data from the local knowledge base, exceeded that of the original model by 7.90% on a set of medical questions. Conclusion: ChatGPT, assisted by prompt engineering, demonstrates effective data mining capabilities for large-scale medical texts. In the future, we plan to incorporate a richer array of medical case data, expand the scale of the knowledge base, and enhance ChatGPT's performance in the medical field.
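The pipeline the authors describe (mine cases with prompted ChatGPT, vectorize them into a local knowledge base, then retrieve context for answers) can be illustrated with a toy retrieval step. This is a hedged sketch only: it uses plain term-frequency cosine similarity in place of the embedding models and Langchain components the study used, and the knowledge-base entries are invented:

```python
import math
from collections import Counter

def vectorize(text):
    # Simple term-frequency vector; real pipelines use embedding models.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical local knowledge base of mined case entries (invented).
knowledge_base = [
    "case: type 2 diabetes managed with metformin and lifestyle changes",
    "case: hypertension treated with ACE inhibitors",
    "case: chronic migraine responding to beta blockers",
]

def retrieve(query, kb, k=1):
    # Rank knowledge-base entries by similarity to the query.
    qv = vectorize(query)
    ranked = sorted(kb, key=lambda doc: cosine(qv, vectorize(doc)), reverse=True)
    return ranked[:k]

def build_prompt(query, kb):
    # Assemble retrieved context plus the question, RAG-style.
    context = "\n".join(retrieve(query, kb))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer using only the context."

prompt = build_prompt("how was hypertension treated", knowledge_base)
```

In the real system the assembled prompt would be sent to the ChatGPT API; here it is only constructed.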
Affiliation(s)
- Wei Liu: School of Medical Information Engineering, Anhui University of Traditional Chinese Medicine, Hefei, Anhui, China; Anhui Computer Application Research Institute of Chinese Medicine, China Academy of Chinese Medical Sciences, Hefei, Anhui, China
- Hongxing Kan: School of Medical Information Engineering, Anhui University of Traditional Chinese Medicine, Hefei, Anhui, China; Anhui Computer Application Research Institute of Chinese Medicine, China Academy of Chinese Medical Sciences, Hefei, Anhui, China
- Yanfei Jiang: School of Medical Information Engineering, Anhui University of Traditional Chinese Medicine, Hefei, Anhui, China
- Yingbao Geng: School of Medical Information Engineering, Anhui University of Traditional Chinese Medicine, Hefei, Anhui, China
- Yiqi Nie: School of Medical Information Engineering, Anhui University of Traditional Chinese Medicine, Hefei, Anhui, China; Anhui Computer Application Research Institute of Chinese Medicine, China Academy of Chinese Medical Sciences, Hefei, Anhui, China
- Mingguang Yang: School of Medical Information Engineering, Anhui University of Traditional Chinese Medicine, Hefei, Anhui, China
38
Kim HJ, Yoon PW, Yoon JY, Kim H, Choi YJ, Park S, Moon JK. Discrepancies in ChatGPT's Hip Fracture Recommendations in Older Adults for 2021 AAOS Evidence-Based Guidelines. J Clin Med 2024;13:5971. [PMID: 39408030; PMCID: PMC11477870; DOI: 10.3390/jcm13195971]
Abstract
Background: This study aimed to assess the reproducibility and reliability of Chat Generative Pre-trained Transformer (ChatGPT)'s responses to 19 statements regarding the management of hip fractures in older adults, as adopted by the American Academy of Orthopaedic Surgeons' (AAOS) evidence-based clinical practice guidelines. Methods: Nineteen statements were obtained from the 2021 AAOS evidence-based clinical practice guidelines. After generating questions based on these 19 statements, we set a prompt for both the GPT-4o and GPT-4 models. We repeated this process three times at 24 h intervals for both models, producing outputs A, B, and C. ChatGPT's performance, the intra-ChatGPT reliability, and the accuracy rates were assessed to evaluate the reproducibility and reliability of the hip fracture-related guidelines. Results: Regarding the strengths of the recommendation compared with the 2021 AAOS guidelines, we observed accuracy of 0.684, 0.579, and 0.632 for outputs A, B, and C, respectively. The precision was 0.740, 0.737, and 0.718 in outputs A, B, and C, respectively. For the reliability of the strengths of the recommendation, the Fleiss kappa was 0.409, indicating a moderate level of agreement. No statistical differences in the strengths of the recommendation were observed in outputs A, B, and C between the GPT-4o and GPT-4 versions. Conclusion: ChatGPT may be useful in providing guidelines for hip fractures but performs poorly in terms of accuracy and precision. However, hallucinations remain an unresolved limitation of using ChatGPT to search for hip fracture guidelines. The effective use of ChatGPT as a patient education tool for the management of hip fractures should be addressed in the future.
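The Fleiss kappa reported above (0.409 across repeated outputs A, B, and C) is computed from a ratings-count matrix. A minimal sketch of the standard calculation, with an invented toy matrix standing in for the study's data:

```python
def fleiss_kappa(counts):
    """Fleiss' kappa. counts[i][j] = number of raters assigning subject i
    to category j; every row must sum to the same rater count n."""
    N = len(counts)
    n = sum(counts[0])
    k = len(counts[0])
    # Marginal proportion of all ratings falling in each category.
    p_j = [sum(row[j] for row in counts) / (N * n) for j in range(k)]
    # Per-subject observed agreement.
    P_i = [(sum(c * c for c in row) - n) / (n * (n - 1)) for row in counts]
    P_bar = sum(P_i) / N
    P_e = sum(p * p for p in p_j)
    return (P_bar - P_e) / (1 - P_e)

# Invented toy data: three repeated outputs (treated as raters) classifying
# four statements into strong / moderate / limited recommendation strength.
ratings = [
    [3, 0, 0],  # all three outputs agree: strong
    [2, 1, 0],
    [0, 3, 0],
    [1, 1, 1],  # complete disagreement
]
kappa = fleiss_kappa(ratings)
```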
Affiliation(s)
- Hong Jin Kim: Department of Orthopaedic Surgery, Kyung-in Regional Military Manpower Administration, Suwon 16440, Republic of Korea; Department of Orthopedic Surgery, Inje University Sanggye Paik Hospital, Seoul 01757, Republic of Korea
- Pil Whan Yoon: Department of Orthopaedic Surgery, Seoul Now Hospital, Anyang-si 14058, Republic of Korea
- Jae Youn Yoon: Department of Orthopaedic Surgery, Seoul Now Hospital, Anyang-si 14058, Republic of Korea
- Hyungtae Kim: Department of Orthopedic Surgery, Inje University Sanggye Paik Hospital, Seoul 01757, Republic of Korea
- Young Jin Choi: Department of Orthopedic Surgery, Chung Goo Sung Sim Hospital, Seoul 03330, Republic of Korea
- Sangyoon Park: Department of Orthopedic Surgery, Inje University Sanggye Paik Hospital, Seoul 01757, Republic of Korea
- Jun-Ki Moon: Department of Orthopaedic Surgery, Chung-Ang University Hospital, Seoul 06973, Republic of Korea
39
Lechien JR. Generative AI and Otolaryngology-Head & Neck Surgery. Otolaryngol Clin North Am 2024;57:753-765. [PMID: 38839556; DOI: 10.1016/j.otc.2024.04.006]
Abstract
The increasing development of artificial intelligence (AI) generative models in otolaryngology-head and neck surgery will progressively change our practice. Practitioners and patients have access to AI resources, improving information, knowledge, and practice of patient care. This article summarizes the currently investigated applications of AI generative models, particularly Chatbot Generative Pre-trained Transformer, in otolaryngology-head and neck surgery.
Affiliation(s)
- Jérôme R Lechien
- Research Committee of Young Otolaryngologists of the International Federation of Otorhinolaryngological Societies (IFOS), Paris, France; Division of Laryngology and Broncho-esophagology, Department of Otolaryngology-Head Neck Surgery, EpiCURA Hospital, UMONS Research Institute for Health Sciences and Technology, University of Mons (UMons), Mons, Belgium; Department of Otorhinolaryngology and Head and Neck Surgery, Foch Hospital, Paris Saclay University, Phonetics and Phonology Laboratory (UMR 7018 CNRS, Université Sorbonne Nouvelle/Paris 3), Paris, France; Department of Otorhinolaryngology and Head and Neck Surgery, CHU Saint-Pierre, Brussels, Belgium.
40
Bhattaru A, Yanamala N, Sengupta PP. Revolutionizing Cardiology With Words: Unveiling the Impact of Large Language Models in Medical Science Writing. Can J Cardiol 2024;40:1950-1958. [PMID: 38823633; DOI: 10.1016/j.cjca.2024.05.022]
Abstract
Large language models (LLMs) are a unique form of machine learning that accepts unstructured text and numerical inputs for meaningful interpretation and prediction. Recently, LLMs have become commercialized, allowing the average person to access these incredibly powerful tools. Early adopters focused on using LLMs to perform logical tasks, including, but not limited to, generating titles, identifying key words, summarizing text, initial editing of scientific work, improving statistical protocols, and performing statistical analysis. More recently, LLM use has expanded into clinical practice and academia to perform higher cognitive and creative tasks. LLMs provide personalized assistance in learning, facilitate the management of electronic medical records, and offer valuable insights into clinical decision making in cardiology. They enhance patient education by explaining intricate medical conditions in lay terms, draw on a vast library of knowledge to help clinicians expedite administrative tasks, provide useful feedback on the content of scientific writing, and assist in the peer-review process. Despite their impressive capabilities, LLMs are not without limitations. They are susceptible to generating incorrect or plagiarized content, face challenges in handling tasks without detailed prompts, and lack originality. These limitations underscore the importance of human oversight when using LLMs in medical science and clinical practice. As LLMs continue to evolve, addressing these challenges will be crucial to maximizing their potential benefits while mitigating risks. This review explores the functions, opportunities, and constraints of LLMs, with a focus on their impact on cardiology, illustrating both the transformative power and the boundaries of current technology in medicine.
Affiliation(s)
- Abhijit Bhattaru: Department of Cardiology, Rutgers Robert Wood Johnson Medical School and Robert Wood Johnson University Hospital, New Brunswick, New Jersey, USA; Department of Medicine, Rutgers New Jersey Medical School, Newark, New Jersey, USA
- Naveena Yanamala: Department of Cardiology, Rutgers Robert Wood Johnson Medical School and Robert Wood Johnson University Hospital, New Brunswick, New Jersey, USA
- Partho P Sengupta: Department of Cardiology, Rutgers Robert Wood Johnson Medical School and Robert Wood Johnson University Hospital, New Brunswick, New Jersey, USA
41
Buvat I, Weber WA. Is ChatGPT a Reliable Ghostwriter? J Nucl Med 2024;65:1499-1502. [PMID: 39168521; DOI: 10.2967/jnumed.124.268341]
Affiliation(s)
- Irène Buvat: Laboratoire d'Imagerie Translationnelle en Oncologie, Institut Curie, INSERM U1288, PSL Research University, Orsay, France
- Wolfgang A Weber: Technical University of Munich, Munich, Germany; Bavarian Cancer Research Center, Erlangen, Germany
42
Ahaley SS, Pandey A, Juneja SK, Gupta TS, Vijayakumar S. ChatGPT in medical writing: A game-changer or a gimmick? Perspect Clin Res 2024;15:165-171. [PMID: 39583920; PMCID: PMC11584153; DOI: 10.4103/picr.picr_167_23]
Abstract
OpenAI's ChatGPT (Generative Pre-trained Transformer) is a chatbot that answers questions and performs writing tasks in a conversational tone. Within months of its release, multiple sectors began contemplating the varied applications of this chatbot, including medicine, education, and research, all of which are involved in medical communication and scientific publishing. Medical writers and academics use several artificial intelligence (AI) tools and software programs for research, literature surveys, data analyses, referencing, and writing. There are benefits to using different AI tools in medical writing. However, using chatbots for medical communications poses major concerns, such as potential inaccuracies, data bias, security, and ethical issues. Perceived incorrect notions also limit their use. Moreover, ChatGPT can be problematic if used incorrectly or for irrelevant tasks. If used appropriately, ChatGPT will not only upgrade the knowledge of the medical writer but also save time and energy that can be directed toward more creative and analytical areas requiring expert skill sets. This review introduces chatbots, outlines the progress in ChatGPT research, elaborates the potential uses of ChatGPT in medical communications along with its challenges and limitations, and proposes future research perspectives. It aims to provide guidance for doctors, researchers, and medical writers on the uses of ChatGPT in medical communications.
Affiliation(s)
- Shital Sarah Ahaley: Hashtag Medical Writing Solutions Private Limited, Chennai, Tamil Nadu, India
- Ankita Pandey: Hashtag Medical Writing Solutions Private Limited, Chennai, Tamil Nadu, India
- Simran Kaur Juneja: Hashtag Medical Writing Solutions Private Limited, Chennai, Tamil Nadu, India
- Tanvi Suhane Gupta: Hashtag Medical Writing Solutions Private Limited, Chennai, Tamil Nadu, India
- Sujatha Vijayakumar: Hashtag Medical Writing Solutions Private Limited, Chennai, Tamil Nadu, India
43
Park KW, Diop M, Willens SH, Pepper JP. Artificial Intelligence in Facial Plastics and Reconstructive Surgery. Otolaryngol Clin North Am 2024;57:843-852. [PMID: 38971626; DOI: 10.1016/j.otc.2024.05.002]
Abstract
Artificial intelligence (AI), particularly computer vision and large language models, will impact facial plastic and reconstructive surgery (FPRS) by enhancing diagnostic accuracy, refining surgical planning, and improving post-operative evaluations. These advancements can address subjective limitations of aesthetic surgery by providing objective tools for patient evaluation. Despite these advancements, AI in FPRS has yet to be fully integrated in the clinic setting and faces numerous challenges including algorithmic bias, ethical considerations, and need for validation. This article discusses current and emerging AI technologies in FPRS for the clinic setting, providing a glimpse of its future potential.
Affiliation(s)
- Ki Wan Park: Department of Otolaryngology-Head and Neck Surgery, Stanford University School of Medicine, 801 Welch Road, Palo Alto, CA 94305, USA
- Mohamed Diop: Department of Otolaryngology-Head and Neck Surgery, Stanford University School of Medicine, 801 Welch Road, Palo Alto, CA 94305, USA
- Sierra Hewett Willens: Department of Otolaryngology-Head and Neck Surgery, Stanford University School of Medicine, 801 Welch Road, Palo Alto, CA 94305, USA
- Jon-Paul Pepper: Department of Otolaryngology-Head and Neck Surgery, Stanford University School of Medicine, 801 Welch Road, Palo Alto, CA 94305, USA
44
Albuck AL, Becnel CM, Sirna DJ, Turner J. Precision of Chatbot Generative Pretrained Transformer Version 4-Generated References for Colon and Rectal Surgical Literature. J Surg Res 2024;302:324-328. [PMID: 39121800; DOI: 10.1016/j.jss.2024.07.021]
Abstract
INTRODUCTION The objective is to assess the precision of references generated by Chatbot Generative Pretrained Transformer version 4 (ChatGPT-4) in scientific literature pertaining to colon and rectal surgery. METHODS Ten frequently studied keywords pertaining to colon and rectal surgery were chosen: colon cancer, rectal cancer, anal cancer, total neoadjuvant therapy, diverticulitis, low anterior resection, transanal minimally invasive surgery, ileal pouch anal anastomosis, abdominoperineal resection, and hemorrhoidectomy. ChatGPT-4 was prompted to search for the most representative citations for all keywords. Two separate evaluators then meticulously examined the outcomes for each key element, awarding full accuracy to generated citations with no discrepancies in any field when cross-referenced with the Scopus, Google, and PubMed databases. References from ChatGPT-4 underwent a thorough review process involving careful examination of key elements such as the article title, authors, journal name, publication year, and Digital Object Identifier (DOI). RESULTS Forty-one of the 100 generated references were fully accurate; however, none included a DOI. Partial accuracy was observed in 67 of the references, which were identifiable by title and journal. Performance varied across specific keywords; for example, references for colon and rectal cancer were 100% identifiable by title and journal, but no term had 100% accuracy across all categories. Notably, none of the generated references correctly listed all authors. The study was conducted within a short timeframe during which ChatGPT-4 was rapidly evolving and updating its knowledge base. CONCLUSIONS While ChatGPT-4 offers improvements over its predecessors and shows potential for use in academic literature, its inconsistent performance across categories, lack of DOIs, and irregularities in authorship listings raise concerns about its readiness for application in colon and rectal surgery research.
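The full/partial accuracy grading described above can be sketched as a field-by-field comparison against an indexed record. The field names, the matching rule, and the example records below are illustrative assumptions, not the authors' exact protocol:

```python
FIELDS = ("title", "authors", "journal", "year", "doi")

def grade_citation(generated, ground_truth):
    """Grade a generated citation against a database record.
    'full' = every field matches; 'partial' = at least title and journal
    match (the identifiability criterion used here is an assumption)."""
    matches = {f: generated.get(f) == ground_truth.get(f) for f in FIELDS}
    if all(matches.values()):
        return "full"
    if matches["title"] and matches["journal"]:
        return "partial"
    return "inaccurate"

# Hypothetical records: a ChatGPT-4 output vs. the indexed citation.
generated = {"title": "Outcomes after low anterior resection",
             "authors": "Smith J", "journal": "J Surg Res",
             "year": 2024, "doi": None}          # no DOI produced
indexed = {"title": "Outcomes after low anterior resection",
           "authors": "Smith J, Lee K", "journal": "J Surg Res",
           "year": 2024, "doi": "10.1000/example.123"}
```

Here the generated record matches on title and journal but not authors or DOI, so it grades as partial.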
Affiliation(s)
- Aaron L Albuck: School of Medicine, Tulane University, New Orleans, Louisiana
- Chad M Becnel: School of Medicine, Tulane University, New Orleans, Louisiana; Ochsner Clinic Foundation, New Orleans, Louisiana
- Daniel J Sirna: School of Medicine, Tulane University, New Orleans, Louisiana
- Jacquelyn Turner: Division of Endocrine and Oncologic Surgery, Department of Surgery, Tulane University School of Medicine, New Orleans, Louisiana
45
Alyasiri OM, Salman AM, Akhtom D, Salisu S. ChatGPT revisited: Using ChatGPT-4 for finding references and editing language in medical scientific articles. J Stomatol Oral Maxillofac Surg 2024;125:101842. [PMID: 38521243; DOI: 10.1016/j.jormas.2024.101842]
Abstract
The attainment of academic superiority relies heavily upon the accessibility of scholarly resources and the expression of research findings through faultless language usage. Although modern tools, such as the Publish or Perish software program, are proficient in sourcing academic papers based on specific keywords, they often fall short of extracting comprehensive content, including crucial references. The challenge of linguistic precision remains a prominent issue, particularly for research papers composed by non-native English speakers who may encounter word usage errors. This manuscript serves a twofold purpose: firstly, it reassesses the effectiveness of ChatGPT-4 in the context of retrieving pertinent references tailored to specific research topics. Secondly, it introduces a suite of language editing services that are skilled in rectifying word usage errors, ensuring the refined presentation of research outcomes. The article also provides practical guidelines for formulating precise queries to mitigate the risks of erroneous language usage and the inclusion of spurious references. In the ever-evolving realm of academic discourse, leveraging the potential of advanced AI, such as ChatGPT-4, can significantly enhance the quality and impact of scientific publications.
Affiliation(s)
- Osamah Mohammed Alyasiri: Karbala Technical Institute, Al-Furat Al-Awsat Technical University, Karbala 56001, Iraq; School of Computer Sciences, Universiti Sains Malaysia, Penang 11800, Malaysia
- Amer M Salman: School of Mathematical Sciences, Universiti Sains Malaysia, Penang 11800, Malaysia
- Dua'a Akhtom: School of Computer Sciences, Universiti Sains Malaysia, Penang 11800, Malaysia
- Sani Salisu: School of Computer Sciences, Universiti Sains Malaysia, Penang 11800, Malaysia; Department of Information Technology, Federal University Dutse, Dutse 720101, Nigeria
46
Hirosawa T, Shimizu T. Enhancing English Presentation Skills with Generative Artificial Intelligence: A Guide for Non-native Researchers. Med Sci Educ 2024;34:1179-1184. [PMID: 39450042; PMCID: PMC11496412; DOI: 10.1007/s40670-024-02078-w]
Abstract
This commentary explores the use of generative artificial intelligence (AI), particularly Google Gemini (previously Bard), in enhancing English presentation skills among non-native researchers. We present a step-by-step methodology for using Google Gemini's Speech-to-Text and Text-to-Speech features. Our findings suggest that Google Gemini effectively aids in drafting presentations, practicing pronunciation, and verifying content, tapping into an often unexplored area: using AI for presentation skills in scientific research. Despite its potential, users must exercise caution because of the experimental nature of this AI technology. Adapting to such technologies is timely and beneficial for the global scientific community.
Affiliation(s)
- Takanobu Hirosawa: Department of Diagnostic and Generalist Medicine, Dokkyo Medical University, 880 Kitakobayashi, Simotsuga-gun, Mibu-cho, Tochigi 321-0293, Japan
- Taro Shimizu: Department of Diagnostic and Generalist Medicine, Dokkyo Medical University, 880 Kitakobayashi, Simotsuga-gun, Mibu-cho, Tochigi 321-0293, Japan
47
Mayol Martínez J. Impacto de la Inteligencia Artificial generativa en la publicación científica [Impact of generative artificial intelligence on scientific publishing]. Enfermería Nefrológica 2024;27:187-188. [DOI: 10.37551/s2254-28842024019]
Abstract
Artificial intelligence (AI), defined as the capacity of machines to simulate the cognitive processes characteristic of the human species, has become, in little more than two years, a disruptive force across multiple knowledge-management sectors, and especially in scientific publishing1. Natural language processing, the core capability of generative AI, which has developed exponentially in less than a decade since the description of the transformer architecture2, allows machines to understand, interpret, and generate human-like text fluently and plausibly3. This facilitates the automated writing of documents, the synthesis of previous studies, and the production of new content, and it profoundly affects how information is produced, shared, accessed, and even evaluated. Generative AI applications will increase the efficiency and accessibility of research, but they also pose ethical and security challenges that require careful consideration.
48
Alami K, Willemse E, Quiriny M, Lipski S, Laurent C, Donquier V, Digonnet A. Evaluation of ChatGPT-4's Performance in Therapeutic Decision-Making During Multidisciplinary Oncology Meetings for Head and Neck Squamous Cell Carcinoma. Cureus 2024;16:e68808. [PMID: 39376890; PMCID: PMC11456411; DOI: 10.7759/cureus.68808]
Abstract
Objectives: First reports suggest that artificial intelligence (AI) tools such as ChatGPT-4 (OpenAI, San Francisco, USA) might represent reliable aids for therapeutic decisions in some medical conditions. This study aims to assess the decisional capacity of ChatGPT-4 in patients with head and neck carcinomas, using the multidisciplinary oncology meeting (MOM) and the National Comprehensive Cancer Network (NCCN) decisions as references. Methods: This retrospective study included 263 patients with squamous cell carcinoma of the oral cavity, oropharynx, hypopharynx, and larynx who were followed at our institution between January 1, 2016, and December 31, 2021. The recommendation of ChatGPT-4 for the first- and second-line treatments was compared to the MOM decision and NCCN guidelines. The degrees of agreement were calculated using the kappa method, which measures agreement between two evaluators. Results: ChatGPT-4 demonstrated moderate agreement in first-line treatment recommendations (kappa = 0.48) and substantial agreement (kappa = 0.78) in second-line treatment recommendations compared with the decisions from the MOM. Substantial agreement with the NCCN guidelines was observed for both first- and second-line treatments (kappa = 0.72 and 0.66, respectively). The degree of agreement decreased when the decision included gastrostomy, patients over 70, and those with comorbidities. Conclusions: The study illustrates that while ChatGPT-4 can significantly support clinical decision-making in oncology by aligning closely with expert recommendations and established guidelines, ongoing enhancement and training are crucial. The findings advocate for the continued evolution of AI tools to better handle the nuanced aspects of patient health profiles, thus broadening their applicability and reliability in clinical practice.
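The two-rater agreement measure used above can be computed as a standard Cohen's kappa (observed agreement corrected for chance agreement from the raters' marginal label frequencies). A minimal sketch; the recommendation lists are invented toy data, not the study's:

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labeling the same cases (nominal labels)."""
    assert len(rater_a) == len(rater_b) and rater_a
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    ca, cb = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of each rater's marginal label frequencies.
    expected = sum((ca[label] / n) * (cb[label] / n) for label in set(ca) | set(cb))
    return (observed - expected) / (1 - expected)

# Invented first-line recommendations for six cases: tumor board (MOM)
# versus a ChatGPT-4 output.
mom_decisions = ["surgery", "chemoradiation", "surgery",
                 "chemoradiation", "surgery", "palliative"]
gpt_decisions = ["surgery", "chemoradiation", "chemoradiation",
                 "chemoradiation", "surgery", "surgery"]
kappa = cohen_kappa(mom_decisions, gpt_decisions)
```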
Affiliation(s)
- Kenza Alami: Otolaryngology, Jules Bordet Institute, Bruxelles, BEL
- Marie Quiriny: Surgical Oncology, Jules Bordet Institute, Bruxelles, BEL
- Samuel Lipski: Surgical Oncology, Jules Bordet Institute, Bruxelles, BEL
- Celine Laurent: Otolaryngology - Head and Neck Surgery, Hôpital Ambroise-Paré, Mons, BEL; Otolaryngology - Head and Neck Surgery, Hôpital Universitaire de Bruxelles (HUB) Erasme Hospital, Bruxelles, BEL
49
Lechien JR, Rameau A. Applications of ChatGPT in Otolaryngology-Head Neck Surgery: A State of the Art Review. Otolaryngol Head Neck Surg 2024;171:667-677. [PMID: 38716790; DOI: 10.1002/ohn.807]
Abstract
OBJECTIVE To review the current literature on the application, accuracy, and performance of Chatbot Generative Pre-Trained Transformer (ChatGPT) in Otolaryngology-Head and Neck Surgery. DATA SOURCES PubMed, Cochrane Library, and Scopus. REVIEW METHODS A comprehensive review of the literature on the applications of ChatGPT in otolaryngology was conducted according to the Preferred Reporting Items for Systematic Reviews and Meta-analyses statement. CONCLUSIONS ChatGPT provides imperfect patient information and general knowledge related to diseases encountered in Otolaryngology-Head and Neck Surgery. In clinical practice, despite suboptimal performance, studies report that the model is more accurate in providing diagnoses than in suggesting the most adequate additional examinations and treatments for clinical vignettes or real clinical cases. ChatGPT has been used as an adjunct tool to improve scientific reports (referencing, spelling correction), to elaborate study protocols, and to take student or resident exams, with several levels of accuracy reported. The stability of ChatGPT's responses across repeated questions appears high, but many studies reported hallucination events, particularly in providing scientific references. IMPLICATIONS FOR PRACTICE To date, most applications of ChatGPT are limited to generating disease or treatment information and improving the management of clinical cases. The lack of comparison of ChatGPT's performance with other large language models is the main limitation of the current research. Its ability to analyze clinical images has not yet been investigated in otolaryngology, although upper airway tract and ear images are an important step in diagnosing most common ear, nose, and throat conditions. This review may help otolaryngologists conceive new applications in further research.
Affiliation(s)
- Jérôme R Lechien: Research Committee of Young Otolaryngologists of the International Federation of Otorhinolaryngological Societies (IFOS), Paris, France; Division of Laryngology and Broncho-Esophagology, Department of Otolaryngology-Head Neck Surgery, EpiCURA Hospital, UMONS Research Institute for Health Sciences and Technology, University of Mons (UMons), Mons, Belgium; Department of Otorhinolaryngology and Head and Neck Surgery, Foch Hospital, Phonetics and Phonology Laboratory (UMR 7018 CNRS, Université Sorbonne Nouvelle/Paris 3), Paris Saclay University, Paris, France; Department of Otorhinolaryngology and Head and Neck Surgery, CHU Saint-Pierre, Brussels, Belgium
- Anais Rameau: Department of Otolaryngology-Head and Neck Surgery, Sean Parker Institute for the Voice, Weill Cornell Medicine, New York City, New York, USA
50
Salvagno M, Cassai AD, Zorzi S, Zaccarelli M, Pasetto M, Sterchele ED, Chumachenko D, Gerli AG, Azamfirei R, Taccone FS. The state of artificial intelligence in medical research: A survey of corresponding authors from top medical journals. PLoS One 2024;19:e0309208. [PMID: 39178224; PMCID: PMC11343420; DOI: 10.1371/journal.pone.0309208]
Abstract
Natural Language Processing (NLP) is a subset of artificial intelligence that enables machines to understand and respond to human language through Large Language Models (LLMs). These models have diverse applications in fields such as medical research, scientific writing, and publishing, but concerns such as hallucination, ethical issues, bias, and cybersecurity need to be addressed. To understand the scientific community's understanding and perspective on the role of Artificial Intelligence (AI) in research and authorship, a survey was designed for corresponding authors in top medical journals. An online survey was conducted from July 13th, 2023, to September 1st, 2023, using the SurveyMonkey web instrument; the population of interest was corresponding authors who published in 2022 in the 15 highest-impact medical journals, as ranked by the Journal Citation Report. The survey link was sent to all identified corresponding authors by mail. A total of 266 authors answered, and 236 entered the final analysis. Most of the researchers (40.6%) reported having moderate familiarity with artificial intelligence, while a minority (4.4%) had no associated knowledge. Furthermore, the vast majority (79.0%) believe that artificial intelligence will play a major role in the future of research. Of note, no correlation between academic metrics and artificial intelligence knowledge or confidence was found. The results indicate that although researchers have varying degrees of familiarity with artificial intelligence, its use in scientific research is still in its early phases. Despite lacking formal AI training, many scholars publishing in high-impact journals have started integrating such technologies into their projects, including rephrasing, translation, and proofreading tasks. Efforts should focus on providing training for their effective use, establishing guidelines by journal editors, and creating software applications that bundle multiple integrated tools into a single platform.
Affiliation(s)
- Michele Salvagno: Department of Intensive Care, Hôpital Universitaire de Bruxelles (HUB), Brussels, Belgium
- Alessandro De Cassai: Sant'Antonio Anesthesia and Intensive Care Unit, University Hospital of Padua, Padua, Italy
- Stefano Zorzi: Department of Intensive Care, Hôpital Universitaire de Bruxelles (HUB), Brussels, Belgium
- Mario Zaccarelli: Department of Intensive Care, Hôpital Universitaire de Bruxelles (HUB), Brussels, Belgium
- Marco Pasetto: Department of Intensive Care, Hôpital Universitaire de Bruxelles (HUB), Brussels, Belgium
- Elda Diletta Sterchele: Department of Intensive Care, Hôpital Universitaire de Bruxelles (HUB), Brussels, Belgium
- Dmytro Chumachenko: Department of Mathematical Modelling and Artificial Intelligence, National Aerospace University "Kharkiv Aviation Institute", Kharkiv, Ukraine; Ubiquitous Health Technologies Lab, University of Waterloo, Waterloo, Canada
- Alberto Giovanni Gerli: Department of Clinical Sciences and Community Health, Università degli Studi di Milano, Milan, Italy
- Razvan Azamfirei: Department of Anesthesiology and Critical Care Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, United States of America
- Fabio Silvio Taccone: Department of Intensive Care, Hôpital Universitaire de Bruxelles (HUB), Brussels, Belgium