Published online May 28, 2025. doi: 10.3748/wjg.v31.i20.105285
Revised: March 20, 2025
Accepted: April 7, 2025
Published online: May 28, 2025
Processing time: 131 Days and 16.3 Hours
This article evaluates the transformative potential of large language models (LLMs) as patient education tools for managing inflammatory bowel disease. The discussion highlights their ability to deliver nuanced and personalized infor
Core Tip: Large language models offer a groundbreaking approach to patient education for inflammatory bowel disease by providing accurate, personalized, and nuanced information. This article emphasizes the need for domain-specific fine-tuning of large language models, robust evaluation metrics, and their integration into clinical workflows. Ethical concerns, such as algorithmic bias and patient data privacy, and accessibility barriers, including digital literacy gaps, are critical to address. Interdisciplinary collaboration is essential for optimizing these tools to enhance patient engagement and improve health outcomes.
- Citation: Ardila CM, González-Arroyave D, Ramírez-Arbeláez J. Advancing large language models as patient education tools for inflammatory bowel disease. World J Gastroenterol 2025; 31(20): 105285
- URL: https://www.wjgnet.com/1007-9327/full/v31/i20/105285.htm
- DOI: https://dx.doi.org/10.3748/wjg.v31.i20.105285
We commend Zhang et al[1] for their insightful article published in the World Journal of Gastroenterology. This study is timely and of significant importance, as it evaluates the transformative potential of large language models (LLMs) in addressing critical gaps in patient education. The increasing prevalence of inflammatory bowel disease (IBD) worldwide[2], coupled with its complex and multifactorial etiology, necessitates innovative approaches to enhance patient under
The study’s findings demonstrate that while general-purpose LLMs provide satisfactory responses to common IBD-related queries, their performance on complex or nuanced topics, such as rare medication side effects or individualized dietary guidance, remains inconsistent[1]. This underscores the importance of domain-specific fine-tuning to enhance their reliability and relevance. To address this, LLMs should be fine-tuned on curated datasets encompassing peer-reviewed literature, clinical guidelines, and real-world patient interactions[4,5]. Such fine-tuning would enable the models to deliver precise, evidence-based responses tailored to the multifaceted nature of IBD. For example, incorporating datasets from established gastroenterology networks like the Crohn’s and Colitis Foundation could significantly enhance content accuracy[6]. Furthermore, integrating case-based scenarios into the training process would help models adapt to atypical presentations and rare complications, thereby improving their utility in diverse clinical contexts.
The authors evaluate LLM-generated responses using Likert scales and readability scores. These metrics are useful but provide only a surface-level assessment of educational value. Additional quantitative and qualitative metrics are essential to comprehensively evaluate the effectiveness of LLMs in patient education[7,8]. We suggest incorporating the Flesch-Kincaid Grade Level for readability assessments[9], which provides an objective measure of the text’s accessibility to patients with varying literacy levels. Specificity scores, which evaluate the accuracy and detail of the information provided, could further validate the reliability of LLM-generated responses. Moreover, qualitative methods such as patient focus groups and clinician feedback could provide valuable insights into the perceived utility and relevance of the content. These expanded evaluation criteria would not only improve the study’s robustness but also establish a standardized framework for future investigations in this domain.
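To make the readability suggestion concrete, the following is a minimal sketch of the Flesch-Kincaid Grade Level, computed as 0.39 × (words per sentence) + 11.8 × (syllables per word) − 15.59. The function names and the vowel-group syllable counter are illustrative assumptions, not the method used by Zhang et al[1]; validated readability tools use more careful syllable and sentence detection, so scores from this sketch are approximate.

```python
import re


def count_syllables(word: str) -> int:
    """Approximate syllables as runs of consecutive vowels (heuristic, assumed)."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing 'e' as silent (e.g. "disease"), but keep "-le" endings.
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Return the Flesch-Kincaid Grade Level of an English text."""
    # Split on sentence-ending punctuation; drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (
        0.39 * len(words) / len(sentences)
        + 11.8 * syllables / len(words)
        - 15.59
    )


if __name__ == "__main__":
    plain = "Take your pill each day. Call us if you feel sick."
    dense = ("Immunosuppressive therapy necessitates continuous laboratory "
             "monitoring for potential hepatotoxicity.")
    # A lower grade level indicates text that is easier to read.
    print(round(flesch_kincaid_grade(plain), 1))
    print(round(flesch_kincaid_grade(dense), 1))
```

Applied to LLM-generated answers, a score of this kind could be reported alongside Likert ratings so that accuracy and accessibility are assessed separately.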
The study primarily evaluates LLMs as standalone patient education tools[1]. However, their integration into clinical decision support systems (CDSS) represents an untapped opportunity to enhance their utility. By embedding LLMs into CDSS, clinicians could leverage these tools to generate personalized educational content during consultations, thereby addressing patient queries in real-time[10]. For example, LLMs could assist in explaining complex treatment plans, providing tailored dietary advice, or outlining potential medication interactions based on the patient’s medical history. This collaborative approach could mitigate concerns about information accuracy, as the clinician would remain the final arbiter of the content delivered to patients. Future research should explore pilot studies that assess the feasibility and impact of such integrations, focusing on metrics such as patient satisfaction, adherence to treatment plans, and clinical outcomes.
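The clinician-as-final-arbiter workflow described above can be sketched as a simple approval gate. This is a hypothetical illustration only: the names `EducationDraft`, `clinician_review`, and `release_to_patient` are assumptions introduced here and do not correspond to any existing CDSS or electronic health record API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class EducationDraft:
    """LLM-generated patient education text awaiting review (hypothetical type)."""
    patient_question: str
    llm_text: str
    status: Status = Status.PENDING
    final_text: Optional[str] = None
    reviewer: Optional[str] = None


def clinician_review(draft: EducationDraft, reviewer: str, approve: bool,
                     edited_text: Optional[str] = None) -> EducationDraft:
    """Record the clinician's decision; an approval may include edits."""
    draft.reviewer = reviewer
    if approve:
        draft.status = Status.APPROVED
        # Use the clinician's edited wording when provided, else the LLM draft.
        draft.final_text = edited_text if edited_text is not None else draft.llm_text
    else:
        draft.status = Status.REJECTED
        draft.final_text = None
    return draft


def release_to_patient(draft: EducationDraft) -> str:
    """Return the text for the patient only after clinician approval."""
    if draft.status is not Status.APPROVED or draft.final_text is None:
        raise PermissionError("draft has not been approved by a clinician")
    return draft.final_text


if __name__ == "__main__":
    draft = EducationDraft(
        patient_question="Can I take ibuprofen with my IBD medication?",
        llm_text="(LLM-generated draft answer would appear here)",
    )
    clinician_review(draft, reviewer="Dr. Example", approve=True,
                     edited_text="(clinician-edited answer)")
    print(release_to_patient(draft))
```

The design choice here is that unreviewed or rejected drafts cannot reach the patient at all, which mirrors the argument that clinician oversight mitigates concerns about information accuracy.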
As LLMs become increasingly integrated into patient education, ethical considerations must take center stage[11,12]. The authors touch upon the potential for misinformation but do not delve into broader issues such as algorithmic bias, patient confidentiality, and equitable access. Algorithmic bias, stemming from imbalances in training data, could result in disparities in the quality of information provided to different demographic groups. Developers must prioritize diversity and inclusivity in dataset curation to ensure equitable performance across various patient populations. Additionally, safeguards should be implemented to protect patient confidentiality, particularly when LLMs are integrated into electronic health record systems[11,12]. Accessibility is another critical consideration. The digital divide poses a significant barrier to the widespread adoption of LLM-based tools, particularly in low-resource settings[12]. Efforts should be made to develop multilingual and offline-compatible versions of these tools to ensure that they are accessible to a broader audience. Policymakers and healthcare organizations must also invest in digital literacy programs to empower patients to effectively utilize these tools.
Zhang et al[1] have laid a solid foundation for exploring the potential of LLMs as patient education tools for IBD. However, realizing their full potential requires addressing the challenges outlined above. Domain-specific fine-tuning, expanded evaluation metrics, integration with CDSS, and a focus on ethical and accessibility concerns are essential to optimizing these systems for real-world use. We envision a future where LLMs, seamlessly integrated into clinical workflows, empower patients with accurate, personalized, and accessible information, thereby enhancing their engagement and improving clinical outcomes. Collaborative efforts between researchers, clinicians, and technology developers will be key to achieving this vision.
1. Zhang Y, Wan XH, Kong QZ, Liu H, Liu J, Guo J, Yang XY, Zuo XL, Li YQ. Evaluating large language models as patient education tools for inflammatory bowel disease: A comparative study. World J Gastroenterol. 2025;31:102090.
2. Caron B, Honap S, Peyrin-Biroulet L. Epidemiology of Inflammatory Bowel Disease across the Ages in the Era of Advanced Therapies. J Crohns Colitis. 2024;18:ii3-ii15.
3. Gordon M, Sinopoulou V, Ibrahim U, Abdulshafea M, Bracewell K, Akobeng AK. Patient education interventions for the management of inflammatory bowel disease. Cochrane Database Syst Rev. 2023;5:CD013854.
4. Shah K, Xu AY, Sharma Y, Daher M, McDonald C, Diebo BG, Daniels AH. Large Language Model Prompting Techniques for Advancement in Clinical Medicine. J Clin Med. 2024;13:5101.
5. Abd-Alrazaq A, AlSaad R, Alhuwail D, Ahmed A, Healy PM, Latifi S, Aziz S, Damseh R, Alabed Alrazak S, Sheikh J. Large Language Models in Medical Education: Opportunities, Challenges, and Future Directions. JMIR Med Educ. 2023;9:e48291.
6. Rubin DT, Feld LD, Goeppinger SR, Margolese J, Rosh J, Rubin M, Kim S, Rodriquez DM, Wingate L. The Crohn's and Colitis Foundation of America Survey of Inflammatory Bowel Disease Patient Health Care Access. Inflamm Bowel Dis. 2017;23:224-232.
7. Ho CN, Tian T, Ayers AT, Aaron RE, Phillips V, Wolf RM, Mathioudakis N, Dai T, Klonoff DC. Qualitative metrics from the biomedical literature for evaluating large language models in clinical decision-making: a narrative review. BMC Med Inform Decis Mak. 2024;24:357.
8. Aydin S, Karabacak M, Vlachos V, Margetis K. Large language models in patient education: a scoping review of applications in medicine. Front Med (Lausanne). 2024;11:1477898.
9. Jindal P, MacDermid JC. Assessing reading levels of health information: uses and limitations of flesch formula. Educ Health (Abingdon). 2017;30:84-88.
10. Kresevic S, Giuffrè M, Ajcevic M, Accardo A, Crocè LS, Shung DL. Optimization of hepatological clinical guidelines interpretation by large language models: a retrieval augmented generation-based framework. NPJ Digit Med. 2024;7:102.
11. Ardila CM, Yadalam PK. ChatGPT's Influence on Dental Education: Methodological Challenges and Ethical Considerations. Int Dent J. 2025;75:379-380.
12. Yadalam PK, Anegundi RV, Ardila CM. Integrating Artificial Intelligence Into Orthodontic Education and Practice. Int Dent J. 2024;74:1463.