Wang SS, Gao H, Lin PY, Qian TC, Du Y, Xu L. Evaluating chat generative pretrained transformer in answering questions on endoscopic mucosal resection and endoscopic submucosal dissection. World J Gastrointest Oncol 2025; 17(10): 109792 [PMID: 41114106 DOI: 10.4251/wjgo.v17.i10.109792]
Shi-Song Wang, Hui Gao, Tian-Chen Qian, Ying Du, Lei Xu, Department of Gastroenterology, The First Affiliated Hospital of Ningbo University, Ningbo 315010, Zhejiang Province, China
Shi-Song Wang, Peng-Yao Lin, Ying Du, Health Science Center, Ningbo University, Ningbo 315010, Zhejiang Province, China
Tian-Chen Qian, Department of Gastroenterology, The First Affiliated Hospital, College of Medicine, Zhejiang University, Hangzhou 310003, Zhejiang Province, China
Author contributions: Xu L conceived the study design; Wang SS and Gao H performed the statistical analysis; Wang SS and Du Y wrote the manuscript; Qian TC and Lin PY reviewed the manuscript; All authors approved the submitted draft.
Supported by Ningbo Top Medical and Health Research Program, No. 2023020612; the Ningbo Leading Medical & Healthy Discipline Project, No. 2022-S04; the Medical Health Science and Technology Project of Zhejiang Provincial Health Commission, No. 2022KY315; and Ningbo Science and Technology Public Welfare Project, No. 2023S133.
Institutional review board statement: Since the study did not involve human or animal data and all ChatGPT answers were public, there was no need for Ethics Committee approval.
Informed consent statement: As this study does not involve human or animal data and all ChatGPT responses are publicly accessible, informed consent was not required.
Conflict-of-interest statement: All the authors report no relevant conflicts of interest for this article.
STROBE statement: The authors have read the STROBE Statement—checklist of items, and the manuscript was prepared and revised according to the STROBE Statement—checklist of items.
Data sharing statement: Technical appendix, statistical code, and dataset available from the corresponding author.
Open Access: This article is an open-access article that was selected by an in-house editor and fully peer-reviewed by external reviewers. It is distributed in accordance with the Creative Commons Attribution NonCommercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: https://creativecommons.org/licenses/by-nc/4.0/
Corresponding author: Lei Xu, MD, PhD, Department of Gastroenterology, The First Affiliated Hospital of Ningbo University, No. 59 Liuting Street, Ningbo 315010, Zhejiang Province, China. xulei22@163.com
Received: May 22, 2025 Revised: June 17, 2025 Accepted: August 27, 2025 Published online: October 15, 2025 Processing time: 145 Days and 19.4 Hours
Abstract
BACKGROUND
With the rising use of endoscopic submucosal dissection (ESD) and endoscopic mucosal resection (EMR), patients are increasingly questioning various aspects of these endoscopic procedures. At the same time, conversational artificial intelligence (AI) tools like chat generative pretrained transformer (ChatGPT) are rapidly emerging as sources of medical information.
AIM
To evaluate ChatGPT’s reliability and usefulness regarding ESD and EMR for patients and healthcare professionals.
METHODS
In this study, 30 specific questions related to ESD and EMR were identified. These questions were repeatedly entered into ChatGPT, with two independent answers generated for each question. A Likert scale was used to rate the accuracy, completeness, and comprehensibility of the responses. In addition, binary categories (high/low) were used to evaluate each aspect of the two ChatGPT responses and the corresponding response retrieved from Google.
RESULTS
Analysis of the average scores from the three raters indicated that the responses generated by ChatGPT received high ratings for accuracy (mean score of 5.14 out of 6), completeness (mean score of 2.34 out of 3), and comprehensibility (mean score of 2.96 out of 3). Kendall’s coefficients of concordance indicated good agreement among raters (all P < 0.05). For the responses retrieved from Google, more than half were classified by the experts as having low accuracy and low completeness.
CONCLUSION
ChatGPT provided accurate and reliable answers to questions about ESD and EMR. Future studies should address ChatGPT’s current limitations by incorporating more detailed and up-to-date medical information. This could establish AI chatbots as a significant resource for both patients and healthcare professionals.
Core Tip: This study evaluated the reliability and usefulness of chat generative pretrained transformer in addressing questions related to endoscopic submucosal dissection and endoscopic mucosal resection. A set of thirty targeted questions was repeatedly entered, and the responses were independently rated for accuracy, completeness, and comprehensibility. Compared with Google, chat generative pretrained transformer produced more accurate, more detailed, and easier-to-understand answers, with consistent agreement among evaluators. The findings indicate that chat generative pretrained transformer may serve as a valuable and accessible source of medical information for both patients and healthcare professionals.
Citation: Wang SS, Gao H, Lin PY, Qian TC, Du Y, Xu L. Evaluating chat generative pretrained transformer in answering questions on endoscopic mucosal resection and endoscopic submucosal dissection. World J Gastrointest Oncol 2025; 17(10): 109792
Colorectal cancer is the third most prevalent cancer worldwide, with most cases developing from colorectal polyps[1]. Endoscopic mucosal resection (EMR) is an established treatment for these polyps[2]. The effectiveness of endoscopic screening and therapy depends not only on the accurate detection of adenomas but also on their complete removal[3]. However, a key limitation of EMR is that it is suitable only for small lesions of approximately 20 mm in diameter or less, which limits the likelihood of complete resection for larger lesions. In contrast, endoscopic submucosal dissection (ESD) allows for the removal of a wider range of lesions[4], but the technique is technically demanding, time-consuming, and costly[5]. Despite these challenges, ESD has gradually gained popularity over EMR for the endoscopic treatment of early gastric cancer[6]. During pre-procedure counseling for ESD or EMR, patients frequently ask numerous questions about the procedure itself, as well as about postoperative care and lifestyle considerations.
One promising tool for addressing patient questions is the use of artificial intelligence (AI)-driven chatbots. In recent years, such chatbots have proven effective in providing personalized support and patient education, indicating their potential as a supplementary resource in healthcare[7]. In particular, advancements in natural language processing have enabled large language models, such as chat generative pretrained transformer (ChatGPT)[8], to perform well across various fields, including medicine[9]. These models draw on extensive knowledge bases and a deep understanding of complex language patterns to deliver customized and informative responses[10]. Given the rapid rise in ChatGPT's popularity, it is likely that more patients will turn to this tool for information about ESD and EMR. Therefore, it is essential to evaluate whether these AI tools provide information that is accurate, complete, and easily understood.
This study aimed to evaluate ChatGPT 4.0’s responses to questions related to ESD and EMR across three domains: Accuracy, completeness, and comprehensibility. Specifically, we aimed to evaluate the tool’s ability to answer common patient questions regarding the preoperative, intraoperative, and postoperative aspects of surgery, and to explore its potential role in patient education.
MATERIALS AND METHODS
We conducted our queries using ChatGPT 4.0, an updated model reportedly offering more advanced reasoning capabilities and a broader knowledge base than ChatGPT 3.5[9,11,12]. In this study, we compiled common patient questions regarding ESD and EMR encountered in clinical practice. An expert independent of the study excluded questions with overlapping meanings or duplicates and revised the wording and grammar of certain items to ensure clarity and precision. Questions were posed in English, and to eliminate potential bias from previous conversations and ensure the relevance of responses, the “New Chat” reset function was used before every query. To evaluate the temporal accuracy and reproducibility of ChatGPT, each question was re-entered one week after the initial output, and both responses were documented for comparison. The first set of answers to the 30 questions was designated the “discovery phase”, and the second set was referred to as the “replication phase” for analysis.
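The study queried ChatGPT 4.0 through its web interface, using the “New Chat” reset so that each question was answered without conversational carry-over. For readers who wish to reproduce a comparable workflow programmatically, the sketch below uses the OpenAI Python SDK; the model identifier, output file name, and question list are illustrative assumptions rather than the study’s actual setup.

```python
# Illustrative sketch only: the study used the ChatGPT 4.0 web interface with the
# "New Chat" reset before every question. This script reproduces the spirit of that
# protocol through the OpenAI Python SDK; the model name, file name, and question
# list are assumptions for demonstration, not the study's actual configuration.
import csv

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

QUESTIONS = [
    "What is the anatomy of the gastrointestinal tract?",
    "What should patients do before ESD/EMR?",
    # ... the remaining questions from Table 1
]


def ask_in_fresh_session(question: str) -> str:
    """Send a single question with no prior context, analogous to starting a New Chat."""
    response = client.chat.completions.create(
        model="gpt-4",  # assumed stand-in for "ChatGPT 4.0"
        messages=[{"role": "user", "content": question}],
    )
    return response.choices[0].message.content


if __name__ == "__main__":
    # Collect one answer per question; rerunning the script a week later under the
    # same conditions yields the replication-phase answers.
    with open("discovery_phase.csv", "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["question", "answer"])
        for q in QUESTIONS:
            writer.writerow([q, ask_in_fresh_session(q)])
```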
Subsequently, each ChatGPT response was evaluated by three gastroenterologists, three non-expert reviewers, and three patients. Each expert had over 20 years of professional experience in gastrointestinal endoscopy and had published extensively in the field. The non-expert reviewers possessed a fundamental understanding of endoscopic procedures, and the three patients, aged between 40 and 50 years, were scheduled to undergo ESD or EMR. Table 1 presents the research questions. The experts independently assessed each answer using a Likert scale (Supplementary Table 1) across three dimensions: Accuracy was rated from 1 to 6 (with 6 being the most accurate), and completeness and comprehensibility from 1 to 3 (with 3 indicating the most complete or easiest to understand)[13]. Additionally, to further evaluate the quality of ChatGPT responses relative to traditional web searches and to minimize the potential impact of subtle rating differences, the experts also reassessed and compared the quality of ChatGPT and Google responses using binary categories (low/high). The non-expert reviewers and patients evaluated the comprehensibility of each ChatGPT response using binary categorical ratings only. All reviewers were blinded to each other's scores to avoid potential bias.
Table 1 Questions posed to chat generative pretrained transformer.
1. What is the anatomy of the gastrointestinal tract?
2. What are the surgical indications/indications for ESD/EMR?
3. What is the specific surgical process for ESD/EMR?
4. What are the contraindications for ESD/EMR?
5. What should patients do before ESD/EMR?
6. What preoperative measures for ESD/EMR can help reduce surgical risks?
7. What are the possible problems and solutions that may be encountered during the ESD/EMR process?
8. Will sedation or anesthesia be used during ESD/EMR, and will the procedure cause pain or discomfort? How long does the procedure usually take?
9. What cooperation and precautions are required from the patient during an ESD/EMR procedure?
10. What are the factors influencing the safety and success rate of ESD/EMR procedures?
11. How is the resected tissue handled after an ESD/EMR procedure?
12. What is the expected timeframe and method for accessing pathology results after an ESD/EMR procedure?
13. Which terms in the pathology report after an ESD/EMR procedure should be given special attention?
14. What are the relevant definition standards and classifications for postoperative complications of ESD/EMR?
15. What are the common postoperative complications and related treatments of ESD/EMR?
16. What are the influencing factors of postoperative complications in ESD/EMR?
17. What are the observation indicators for the therapeutic effect of ESD/EMR?
18. What postoperative symptoms are considered normal after an ESD/EMR procedure?
19. Do patients need family accompaniment after an ESD/EMR procedure, and for how long is it recommended?
20. What are the postoperative care precautions for ESD/EMR?
21. What are the application and precautions of drugs and food after ESD/EMR surgery?
22. What are the daily life precautions for postoperative ESD/EMR?
23. How soon after an ESD/EMR procedure can a patient return to work, engage in physical activity, or take a shower?
24. What are the follow-up appointments and response methods for postoperative adverse events in ESD/EMR?
25. What is the likelihood of recurrence after an ESD/EMR procedure, and what are the related influencing factors?
26. What postoperative signs may indicate incomplete lesion removal or potential recurrence after an ESD/EMR procedure?
27. If follow-up after ESD/EMR suggests recurrence, what should be done next?
28. In endoscopic therapy, how are ESD and EMR selected?
29. What is the typical cost of an ESD/EMR procedure, and how is it covered by medical insurance?
30. How can the psychological state of patients be managed after an ESD/EMR procedure to reduce anxiety and concerns?
The study adhered to the ethical standards outlined in the Helsinki Declaration. Since the study did not involve human or animal data and all ChatGPT answers were public, there was no need for ethics committee approval.
Statistical analysis
In this study, we used the mean, standard deviation, and median for descriptive statistical analysis. Expert ratings for each question were visualized as radar charts: the closer the reviewers' curves lay to one another within the circles, the greater their agreement, and the farther a curve extended toward the outer edge of the chart, the higher the score given by that reviewer. To evaluate the reproducibility of ChatGPT’s answers, we dichotomized the accuracy ratings into two categories: Scores 1-3 vs scores 4-6. If the two responses to the same question fell into different categories, they were classified as significantly different, indicating low reproducibility for that item. To evaluate the reliability and consistency of the rating process, Kendall’s coefficients of concordance were used[14]. This nonparametric statistic quantifies the level of agreement among evaluators, where a coefficient of 1 indicates perfect agreement and a coefficient of 0 reflects agreement no better than chance. The coefficient of variation was calculated to assess the variability among the three experts’ ratings for each response. Each rater’s set of scores was treated as an independent sample for this analysis. Data analysis was performed using IBM Statistical Package for the Social Sciences software (version 29), with statistical significance set at P < 0.05.
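For illustration, Kendall’s W and the per-question coefficient of variation described above can be computed as in the minimal Python sketch below. The ratings matrix is synthetic (random scores standing in for the actual expert scores), and the calculation omits the tie correction that SPSS applies, so values may differ slightly from those reported in this study.

```python
# Minimal sketch of the agreement statistics described above, using synthetic ratings.
# Assumes a 30 questions x 3 raters matrix; the real expert scores are not reproduced here.
import numpy as np
from scipy.stats import chi2, rankdata


def kendalls_w(ratings: np.ndarray) -> tuple[float, float]:
    """Kendall's coefficient of concordance (W) for an items x raters matrix,
    with the usual chi-square significance test (no tie correction)."""
    n_items, m_raters = ratings.shape
    ranks = np.apply_along_axis(rankdata, 0, ratings)  # rank each rater's scores over the items
    rank_sums = ranks.sum(axis=1)
    s = ((rank_sums - rank_sums.mean()) ** 2).sum()
    w = 12 * s / (m_raters**2 * (n_items**3 - n_items))
    chi_square = m_raters * (n_items - 1) * w
    p_value = chi2.sf(chi_square, df=n_items - 1)
    return w, p_value


rng = np.random.default_rng(0)
accuracy = rng.integers(4, 7, size=(30, 3)).astype(float)  # synthetic scores on the 1-6 Likert scale

w, p = kendalls_w(accuracy)
cv_per_question = accuracy.std(axis=1) / accuracy.mean(axis=1)  # coefficient of variation per response
print(f"Kendall's W = {w:.3f}, P = {p:.4f}; mean CV = {cv_per_question.mean():.3f}")
```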
RESULTS
Initially, 71 questions were included. After excluding 41 similar or duplicate items, a final set of 30 relevant questions was retained (Figure 1). Each of the 30 distinct questions was submitted to both ChatGPT and Google. For each question, ChatGPT generated two responses (Supplementary Table 2), while Google provided a single response (Supplementary Table 3). Three experts evaluated each response in terms of accuracy, completeness, and comprehensibility (Table 2). Overall, the two responses for each question were largely consistent, indicating good reproducibility of ChatGPT’s answers.
In terms of accuracy, ChatGPT’s responses were evaluated by the three experts using a 6-point Likert scale, yielding a mean score of 5.14 ± 0.54 and a median score of 5.00 (Figure 2). Multiple experts assigned a score of 4 to both responses for questions 2, 7, and 29. The lowest average score was observed for question 2 in the discovery phase, at 4.00 ± 0, with no variance among raters. Additionally, questions 3 and 28 in the discovery phase each received a score of 4 from one of the experts. Most of the ratings provided by the experts were 5 (almost all correct) or 6 (correct) (Table 2). Questions 5, 12, and 19 achieved the highest mean accuracy score, with an average of 5.67 ± 0.47 (Figure 3). Kendall’s coefficient of concordance was 0.538, which was statistically significant (P = 0.002), indicating moderate agreement among the experts (Supplementary Table 4). The coefficients of variation for the expert-assigned Likert scores of the ChatGPT-generated responses are reported in Supplementary Table 5. When the scores were converted into binary categories, multiple experts classified both responses to questions 2, 7, and 29 as low-level. In the discovery phase, one expert also categorized the responses to questions 3 and 28 as low-level, while all other responses were classified as high-level. In the binary classification of Google responses, a majority were rated as low-level by one or more experts (Supplementary Table 6).
Figure 3 Distribution of expert ratings by topic for each question during the discovery and replication phases, presented as a radar chart.
A-C: Discovery phase; D-F: Replication phase. The closer the curves representing the ratings of different reviewers lay to one another within the circles, the greater the degree of agreement. The farther a curve extended toward the outer edge of the graph, the higher the score given by the reviewer. Q: Question; RQ: Replication question.
In terms of completeness, expert ratings of ChatGPT’s responses on a 3-point Likert scale yielded a mean score of 2.34 ± 0.47 and a median of 2.00. All questions received scores of 2 or 3. Further analysis revealed that question 19 was the only item for which all experts assigned a score of 3 to both responses (Table 2). Kendall's coefficient of concordance was 0.602, which was statistically significant (P < 0.001), indicating strong agreement among the experts. In the binary classification of ChatGPT responses, only both responses to question 7 and the discovery-phase response to question 2 were classified as low-level by one expert; all other responses were categorized as high-level. In contrast, more than half of the Google responses were classified as low-level by at least one expert (Supplementary Table 6).
In terms of comprehensibility, ChatGPT’s responses received relatively high scores, with a mean of 2.96 ± 0.19 and a median of 3.00 on a 3-point Likert scale. Two of the three experts gave a score of 2 for questions 7 and 24. Additionally, for question 14 (discovery phase), one expert gave a score of 2 while the others gave 3, and all experts rated the remaining questions as 3 (easy to understand) (Figure 3). Kendall's coefficient of concordance was 0.617, which was statistically significant (P < 0.001), indicating relatively strong agreement among the experts. In the binary classification, all responses from both ChatGPT and Google were categorized as high-level by all three experts. The three non-expert reviewers and three patients also evaluated the comprehensibility of ChatGPT’s responses using the binary classification method. All responses were classified as high-level by the three non-expert reviewers. From the patient perspective, only one patient classified both responses to questions 2, 13, 26, and 29 as low-level (Supplementary Table 7).
DISCUSSION
In this study, we evaluated the effectiveness of ChatGPT in responding to 30 questions related to ESD and EMR. Our findings indicate that OpenAI's ChatGPT chatbot could answer ESD- and EMR-related questions with high accuracy (mean score 5.14/6), substantial completeness (2.34/3), and strong comprehensibility (2.96/3). Moreover, it outperformed traditional search engines such as Google and presented information in a format that was more comprehensible to patients.
Before undergoing any diagnostic or therapeutic endoscopic procedure, patients should have a clear understanding of the expected benefits, potential risks, and available alternatives. Presenting this information in clear and accessible language is essential to support informed decision-making regarding perioperative management. The European Society of Gastrointestinal Endoscopy also emphasizes that patient preferences should be central to the informed consent process[15]. Evidence suggests a strong association between patients’ awareness of their condition and adherence to prescribed treatment plans[16,17]. Furthermore, structured surveillance following ESD and EMR has been indicated to facilitate early detection of recurrence and improve long-term survival outcomes[18].
Despite the recognized importance of health education, patients often encounter barriers to accessing accurate, individualized information pertinent to their clinical circumstances. One study reported that approximately 70%-80% of internet users seek health information online[19]. This enables patients to quickly access up-to-date information on disease prevention, evaluation, and treatment. As a convenient and cost-effective tool, the internet plays a key role in enhancing patients' health literacy. However, because of the complexity and variability of online content, it is not always a reliable source of information[20-22]. ChatGPT offers a potential way to improve this situation. It produces human-like responses optimized through reinforcement learning with feedback loops[23]. In our study, ChatGPT demonstrated superior performance compared with traditional internet search engines such as Google. In particular, the ChatGPT 4.0 model features enhanced reasoning capabilities and a broader knowledge base, allowing it to solve complex problems more accurately[24]. Its readability exceeds the fifth-to-sixth-grade level recommended by the American Medical Association, though it does not reach the average college reading level[25]. The model’s training process incorporates human feedback to guide the generation of clear, relevant, and user-aligned responses[26]. Moreover, previous studies have demonstrated that ChatGPT’s responses to cardiology-related questions were appropriate in most cases[27], and the model also performed well in addressing questions related to cirrhosis and hepatocellular carcinoma[28].
Additionally, ChatGPT can generate a structured framework in response to questions posed by patients and healthcare providers, facilitating better understanding and problem-solving. While many of its responses are comprehensive or accurate, they are occasionally insufficient. However, given the expected ongoing improvements in the model, physicians can enhance patient communication by simply refining ChatGPT’s initial responses[28]. Importantly, it demonstrated high reproducibility for certain questions, suggesting that its ability to generate clinically appropriate responses is not heavily dependent on the initial prompt. This approach not only increases physician efficiency but also reduces the overall cost and burden on the healthcare system. Moreover, ChatGPT empowers patients to better understand their care, promoting patient-centered approaches and supporting effective, shared decision-making by providing an additional source of reliable information.
Notably, the answers to question 5 (preoperative preparation), question 12 (postoperative pathology results), and question 19 (postoperative family support) received the highest accuracy scores among all items, suggesting that ChatGPT's responses on these topics may be directly useful for patient instruction. However, there may be shortcomings in the answers to more complex issues, such as indications for endoscopic surgery, intraoperative precautions, and detailed insurance-related policies. Several factors may account for this. ChatGPT is primarily trained on large volumes of historical text data (books, articles, and websites)[29], and its knowledge may not reflect the most up-to-date medical guidelines. Consequently, some answers lagged behind current clinical standards, which likely contributed to the lower scores observed for specific questions. In addition, substantial disagreement remained among experts regarding certain responses, which may be attributable to linguistic and cultural differences across regions or countries. ChatGPT currently lacks the ability to tailor its responses to the user's geographic context, a challenge that has also been noted in prior studies[28,30]. Future research may address this limitation by involving experts from multiple regions and conducting cross-cultural evaluations, thereby exploring the potential for developing regionally adaptive capabilities in ChatGPT.
ChatGPT’s responses can vary depending on its training data, contextual differences, and linguistic nuances. The same question posed at different times or in other situations may yield different responses, potentially affecting the accuracy and completeness of the information provided. In our study, each question was submitted to ChatGPT in its native language, English, within separate chat sessions. To assess the model’s stability, the same questions were resubmitted under identical conditions after a defined time interval. This approach aimed to minimize variability. However, the potential inconsistency of ChatGPT-generated responses should still be acknowledged. Therefore, extra caution is warranted when using ChatGPT as a stand-alone tool for patient counseling. While it can offer helpful information and guidance, it is not a substitute for the clinical expertise of a well-trained physician[31,32].
In addition, the application of AI to medical decision-making requires careful consideration of a variety of safety, legal, and ethical issues. ChatGPT, in the absence of effective fact-checking mechanisms, is prone to generating inaccurate or misleading information. In medical contexts, its “hallucinations” can exacerbate public health misinformation, contributing to an AI-driven infodemic[33]. Moreover, the lack of data minimization and protection measures raises concerns over sensitive information leakage[34]. Misuse of ChatGPT also poses legal risks. When users rely on AI-generated medical or legal advice, accountability becomes unclear[33]. In unregulated settings, generating legal documents may constitute unauthorized practice of law, violating professional ethics. Additionally, unsupervised use may reinforce biases, leading to discriminatory or harmful outputs[35]. To mitigate these risks, integrating real-time knowledge verification, promoting human-AI collaboration, and establishing clear ethical and legal guidelines for AI deployment are essential.
Overall, our findings suggest that relying solely on AI is insufficient. Medical decisions and patient counseling should always involve qualified healthcare professionals who can offer personalized advice based on an individual patient’s condition and needs. This ensures that patients receive accurate and comprehensive information, thereby improving their prognosis.
This study has several key strengths. First, to ensure the overall quality of ChatGPT’s responses, three independent gastroenterology experts reviewed and evaluated the answers. Moreover, to the best of our knowledge, this is the first study to assess the accuracy, reliability, and comprehensibility of ChatGPT in addressing questions related to ESD and EMR.
We must consider the limitations of this study. This study focused on evaluating the performance of ChatGPT 4.0; therefore, the results may not be applicable to other AI models, particularly in medical training. Additionally, our evaluation was based on a small group of experts. Their subjective judgments, although informed, may be biased and may not accurately reflect the broader range of opinions in the medical community or among patients. Furthermore, the design of the study and the choice of questions may be influenced by human moral concepts, social influences, and personal beliefs, which are often difficult to quantify and depend on subjective assessment. Our research questions were grounded in clinical practice, reviewed for comprehensibility by patients, and showed strong inter-expert agreement with low variability, which helped minimize potential bias. Moreover, we focused only on ChatGPT’s performance in answering questions related to ESD/EMR, without addressing its potential in other gastrointestinal or surgical domains. This limitation is also consistent with those reported in previous studies[26,36,37]. Future research should aim to improve multi-source integration and knowledge transfer for broader clinical application.
Furthermore, patient care around ESD and EMR requires not only the transmission of information but also a wide range of skills, including understanding the patient's health and lifestyle, identifying personal needs, building and maintaining good relationships and trust, motivating patients, and supporting their rehabilitation. These duties require empathy, emotional intelligence, and interpersonal skills, none of which are currently available in ChatGPT or other AI technologies. Moreover, AI models often experience performance drift or degradation after deployment owing to updates in training data, architectural optimizations, or changes in the usage environment. Therefore, establishing a long-term evaluation and adaptive updating framework is essential[38-40]. Future research should consider dynamically comparing the performance of different model versions (e.g., ChatGPT 4.0 vs 4.5) to capture improvements introduced by updates. Additionally, constructing a behavioral timeline of model outputs, combined with online performance tracking and periodic assessments, can enable continuous monitoring. Furthermore, the use of multicenter datasets for cross-validation may help assess model generalizability and support external validation.
CONCLUSION
Although ChatGPT indicates potential in providing information related to ESD and EMR management, it should be used with caution and not relied upon as a universal tool for patient counseling. Future studies should aim to address current limitations and improve the reliability and practical application of AI models in delivering accurate and comprehensive medical information.
ACKNOWLEDGEMENTS
The authors would like to thank all participants and their families.
Footnotes
Provenance and peer review: Unsolicited article; Externally peer reviewed.
Peer-review model: Single blind
Specialty type: Gastroenterology and hepatology
Country of origin: China
Peer-review report’s classification
Scientific Quality: Grade B, Grade B, Grade C
Novelty: Grade B, Grade C, Grade C
Creativity or Innovation: Grade B, Grade C, Grade C
Scientific Significance: Grade B, Grade B, Grade C
P-Reviewer: Mukundan A, PhD, Assistant Professor, Taiwan; Ren S, MD, PhD, Assistant Professor, Chief Physician, China S-Editor: Li L L-Editor: A P-Editor: Zhao S
Tanaka S, Kashida H, Saito Y, Yahagi N, Yamano H, Saito S, Hisabe T, Yao T, Watanabe M, Yoshida M, Kudo SE, Tsuruta O, Sugihara KI, Watanabe T, Saitoh Y, Igarashi M, Toyonaga T, Ajioka Y, Ichinose M, Matsui T, Sugita A, Sugano K, Fujimoto K, Tajiri H. JGES guidelines for colorectal endoscopic submucosal dissection/endoscopic mucosal resection. Dig Endosc. 2015;27:417-434.
Martínez ME, Baron JA, Lieberman DA, Schatzkin A, Lanza E, Winawer SJ, Zauber AG, Jiang R, Ahnen DJ, Bond JH, Church TR, Robertson DJ, Smith-Warner SA, Jacobs ET, Alberts DS, Greenberg ER. A pooled analysis of advanced colorectal neoplasia diagnoses after colonoscopic polypectomy. Gastroenterology. 2009;136:832-841.
Nakamoto S, Sakai Y, Kasanuki J, Kondo F, Ooka Y, Kato K, Arai M, Suzuki T, Matsumura T, Bekku D, Ito K, Tanaka T, Yokosuka O. Indications for the use of endoscopic mucosal resection for early gastric cancer in Japan: a comparative study with endoscopic submucosal dissection. Endoscopy. 2009;41:746-750.
Everett SM, Triantafyllou K, Hassan C, Mergener K, Tham TC, Almeida N, Antonelli G, Axon A, Bisschops R, Bretthauer M, Costil V, Foroutan F, Gauci J, Hritz I, Messmann H, Pellisé M, Roelandt P, Seicean A, Tziatzios G, Voiosu A, Gralnek IM. Informed consent for endoscopic procedures: European Society of Gastrointestinal Endoscopy (ESGE) Position Statement. Endoscopy. 2023;55:952-966.
Heydari A, Ziaee ES, Gazrani A. Relationship between Awareness of Disease and Adherence to Therapeutic Regimen among Cardiac Patients. Int J Community Based Nurs Midwifery. 2015;3:23-30.
Pimentel-Nunes P, Dinis-Ribeiro M, Ponchon T, Repici A, Vieth M, De Ceglie A, Amato A, Berr F, Bhandari P, Bialek A, Conio M, Haringsma J, Langner C, Meisner S, Messmann H, Morino M, Neuhaus H, Piessevaux H, Rugge M, Saunders BP, Robaszkiewicz M, Seewald S, Kashin S, Dumonceau JM, Hassan C, Deprez PH. Endoscopic submucosal dissection: European Society of Gastrointestinal Endoscopy (ESGE) Guideline. Endoscopy. 2015;47:829-854.