Gong EJ, Woo J, Lee JJ, Bang CS. Role of artificial intelligence in gastric diseases. World J Gastroenterol 2025; 31(37): 111327 [DOI: 10.3748/wjg.v31.i37.111327]
Corresponding Author of This Article
Chang Seok Bang, MD, PhD, Professor, Department of Internal Medicine, Hallym University College of Medicine, Sakju-ro 77, Chuncheon 24253, Gangwon-do, South Korea. csbang@hallym.ac.kr
Research Domain of This Article
Gastroenterology & Hepatology
Article-Type of This Article
Minireviews
Open-Access Policy of This Article
This article is an open-access article which was selected by an in-house editor and fully peer-reviewed by external reviewers. It is distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/
Author contributions: Gong EJ, Lee JJ, and Bang CS contributed to conceptualization; Gong EJ, Woo J, and Bang CS contributed to methodology; Gong EJ wrote the original draft; Bang CS reviewed and edited the draft; Bang CS contributed to supervision; all authors contributed to investigation and agreed to the published version of the manuscript.
Supported by Hallym University Medical Center Research Fund.
Conflict-of-interest statement: All the authors report no relevant conflicts of interest for this article.
Received: June 30, 2025 Revised: July 29, 2025 Accepted: August 29, 2025 Published online: October 7, 2025 Processing time: 89 Days and 21 Hours
Abstract
The integration of artificial intelligence (AI) in gastroenterology has evolved from basic computer-aided detection to sophisticated multimodal frameworks that enable real-time clinical decision support. This study presents AI applications in gastric disease diagnosis and management, highlighting the transition from domain-specific deep learning to general-purpose large language models. Our research reveals a key finding: AI effectiveness demonstrates an inverse relationship with user expertise, with moderate-expertise practitioners benefiting the most, whereas experts and novices show limited performance gains. We developed a clinical decision support system achieving a 96% lesion detection rate internally and 82%-87% classification accuracy in external validation. Multimodal integration, which combines endoscopic images, clinical histories, laboratory results, and genomic data, enables comprehensive disease assessment and personalized treatment. The emergence of large language models with expanding context windows and multiagent architectures represents a paradigm shift in medical AI. Furthermore, emerging technologies are expanding AI's potential applications, and feasibility studies on smart glasses in endoscopy training suggest opportunities for hands-free assistance, although clinical implementation challenges persist. This minireview addresses persistent limitations, including geographic bias in training data, regulatory hurdles, ethical considerations regarding patient privacy and AI accountability, and the concentration of AI development among technology giants. Successful integration requires balancing innovation with patient safety while preserving the irreplaceable role of human clinical judgment.
Core Tip: This minireview demonstrates that artificial intelligence (AI) in gastric disease diagnosis has reached clinical maturity, with systems achieving expert-level performance in cancer detection, precancerous lesion identification, and clinical outcome prediction. The key insight is that AI effectiveness is inversely correlated with user expertise, providing the greatest benefit to practitioners with moderate expertise. The emergence of general-purpose large language models (LLMs) represents a paradigm shift from developing custom AI models that require years of specialized training to leveraging pre-trained systems that clinicians can adapt within weeks without coding expertise. This democratization of AI technology through LLMs enables all medical professionals, regardless of their technical background, to access sophisticated AI capabilities, fundamentally changing how we integrate AI into practice.
Citation: Gong EJ, Woo J, Lee JJ, Bang CS. Role of artificial intelligence in gastric diseases. World J Gastroenterol 2025; 31(37): 111327
Artificial intelligence (AI), particularly deep learning via convolutional neural networks (CNNs), has rapidly gained importance in gastroenterology. In gastric diseases, AI systems have evolved from basic computer-aided detection and characterization systems for early gastric neoplasms and Helicobacter pylori (H. pylori) identification[1,2] to sophisticated multimodal frameworks capable of real-time clinical decision support[3,4], with early systems using CNNs to identify abnormalities that might escape human observation (Table 1)[5,6].
Table 1 Glossary of technical terms used in artificial intelligence applications for gastric diseases.
CADe: AI systems that automatically identify and highlight abnormal areas during endoscopy
CADx: AI systems that classify detected lesions into diagnostic categories (e.g., benign vs malignant)
CNN: AI architecture designed to analyze visual information from endoscopic images
Edge computing: Processing AI calculations directly on local devices rather than remote servers, enabling real-time analysis
LLM: General-purpose AI systems like GPT-4 that can understand and generate human-like text
Multi-agent architectures: Systems where multiple specialized AI components work together to solve complex clinical problems
Context window: The amount of information (text, images) an AI model can analyze simultaneously
One-shot learning: AI's ability to learn from a single example, reducing the need for large training datasets
A critical insight emerging from our research is that AI’s utility in endoscopy is inversely correlated with the endoscopist’s expertise level[7]. This represents a fundamental limitation of image-based AI systems: They can only detect what is visible on the screen, making their effectiveness dependent on the technical skill of the operator in obtaining adequate visualization[8]. The key point here is that optimal visualization depends on the endoscopic technique rather than on AI. A better technique implies fewer benefits from AI assistance. While moderate-expertise practitioners benefit significantly from AI assistance, experts derive minimal advantages, and novices may struggle to effectively integrate AI recommendations into their clinical workflow. This understanding has profound implications for the design, implementation, and evaluation of AI systems in clinical practice.
Before examining the current state and future directions of AI in gastroenterology, it is essential to define the key concepts underlying this rapidly evolving field (Table 1). Current technical achievements include deep learning models that classify gastric lesions with high accuracy, real-time systems that provide procedural feedback, and sophisticated algorithms that predict invasion depth and histopathological characteristics[9-11]. Multimodal frameworks represent a significant advancement in integrating diverse medical data types, such as endoscopic images, clinical histories, laboratory results, and genomic information, to enable a comprehensive disease assessment that surpasses single-modality analysis. Understanding these foundational concepts is crucial for recognizing the opportunities and constraints shaping AI integration in clinical practice.
Recent developments in general-purpose AI, particularly large language models (LLMs), AI systems trained on extensive datasets and capable of complex reasoning and adaptation to various medical tasks without specialized training, combined with expanding context windows and multi-agent capabilities, suggest a paradigm shift in medical AI that democratizes access to advanced functionalities[12-14]. Unlike the traditional approach of developing increasingly specialized domain-specific models, this shift suggests that the future lies in creatively applying powerful general-purpose systems to clinical challenges[15]. This transformation has profound implications: The emergence of an AI oligopoly, where development of powerful AI models is concentrated among a handful of technology giants with massive computational resources, fundamentally shifts the medical community’s role from model builders to creative clinical implementers. Clinicians can now leverage these tools without extensive technical expertise or computational resources, fundamentally changing the approach to AI integration in clinical practice. This study examines this evolution comprehensively, analyzing not only the achievements of domain-specific models but also their limitations, and highlights how the emergence of general-purpose AI systems promises to address these constraints while opening new possibilities for clinical practice (Figure 1).
Figure 1
Schematic overview of the current application of artificial intelligence in gastric diseases.
FROM NOVICE TO EXPERT: UNDERSTANDING THE EXPERTISE-DEPENDENT BENEFITS OF AI IN UPPER GASTROINTESTINAL ENDOSCOPY
The fundamental principle of AI-assisted endoscopy
The effectiveness of AI in endoscopic practice is constrained by a key principle: It can only assist in detecting lesions that are clearly visible on the screen, a major limitation of current image-based AI technologies[8]. This underscores the fact that the quality of AI assistance depends directly on the quality of the endoscopic technique, and its benefits are inversely correlated with endoscopist expertise. This creates a distinctive benefit curve in which moderate-expertise practitioners gain the most value; they possess sufficient foundational skills to utilize AI suggestions effectively, with room for improvement in lesion detection[7]. In contrast, experts already operating at a high level gain minimal benefits, while novices cannot compensate for poor visualization or technique, even with advanced AI assistance. Superior visualization through optimal technique reduces the marginal benefit of AI, whereas poor technique cannot be compensated for by even the most advanced AI systems. Additionally, AI serves as an excellent training tool for physicians transitioning from novice to expert level[16]. This principle profoundly shapes our understanding of AI in endoscopy, not as a replacement for clinical skills but as a tool that amplifies existing capabilities within specific expertise ranges[17].
The inverse correlation between expertise and AI benefit
Research has consistently demonstrated a clear inverse relationship between endoscopist experience and AI-assistance benefits in gastric endoscopy (Figure 2). A landmark Gastrointestinal AI Diagnostic System (GRAIDS) study involving 84424 patients established performance stratification in gastric cancer detection: Expert endoscopists achieved 94.5% sensitivity, moderate-expertise practitioners reached 85.8%, and novices lagged at 72.2%[18]. The GRAIDS AI system achieved 94.2% sensitivity, which is equivalent to expert performance yet significantly superior to novices. Most notably, when novices used AI assistance, their sensitivity increased from 72.2% [95% confidence interval (CI): 69.1%-75.2%] to 96.4% (94.9%-97.5%), approaching the specialist-level performance of 97.4%[18]. Bang et al’s deep learning study on the prediction of the depth of invasion of gastric neoplasms provided crucial insights, demonstrating that novices gained the most benefit from AI support, whereas expert endoscopists’ decisions remained uninfluenced by AI assistance[7].
Figure 2 Inverse relationship between endoscopist experience and artificial intelligence assistance benefits in gastric endoscopy.
The curve demonstrates the differential benefit of artificial intelligence assistance across expertise levels based on our analysis of endoscopic images in the prediction of submucosal invasion[7]. Error bars represent 95% confidence intervals (CI) for each performance measurement. Maximum benefit occurs at moderate expertise levels (trainees: +13.2%, 95%CI: 10.1%-16.3%), while general physicians show decreased performance (-2.9%, 95%CI: -5.2% to -0.6%) and experts show minimal change (-0.5%, 95%CI: -2.1% to 1.1%). Data derived from Bang et al[7].
Quantification of differential benefits across expertise levels
Multiple studies have revealed consistent patterns in how the benefits of AI vary according to endoscopy experience. In a multicenter validation of the ENDOANGEL system, three conditions were compared: (1) AI system alone achieved 90.32% accuracy; (2) Senior endoscopists without AI achieved 70.16% ± 8.78% accuracy; and (3) Senior endoscopists with AI assistance achieved 85.64% accuracy[19]. Note that the baseline accuracy varies between studies owing to different endoscopist populations and task difficulties. H. pylori detection studies demonstrated similar expertise-dependent benefits, with AI systems achieving 87.7% accuracy compared to 82.4% for human endoscopists, showing the most pronounced improvements in less-experienced practitioners[20]. For gastric intestinal metaplasia detection, AI achieved 95% sensitivity vs 79% for endoscopists, with performance gaps being most evident among novice practitioners[21].
Mechanistic understanding and clinical implications
This expertise-dependent benefit stems from fundamental differences in visual information processing during endoscopic examinations[22]. Expert endoscopists, having developed sophisticated pattern-recognition skills over thousands of procedures, already operate near the saturation point of expertise, leaving limited room for further gains from AI[23]. Novices are more susceptible to perceptual errors that AI effectively addresses, particularly when recognizing early gastric neoplasms[24]. Bang et al[7] found that while expert decisions were not influenced by AI, and moderately experienced practitioners became confused by changing AI outputs, novices with foundational knowledge showed significant improvements with optimal AI support. This creates an optimal benefit zone for practitioners who have foundational knowledge but have not yet reached expert level, allowing them to integrate AI recommendations effectively while maintaining their clinical judgment[25].
Training and implementation considerations
This understanding profoundly impacts gastric endoscopy practice and training[16,26]. AI is an excellent training tool for researchers, potentially reducing the number of supervised cases required for proficiency in lesion detection[27]. These expertise-dependent benefits suggest that AI deployment is the most cost-effective in settings where endoscopists have less experience or limited expert supervision[28]. By narrowing the expertise gap, AI helps ensure consistent gastric cancer screening quality across practice settings[29]. However, implementation must consider AI-dependency risks to ensure that training programs develop independent diagnostic skills while leveraging the democratization effects of AI[30]. Similar patterns observed in lower gastrointestinal endoscopy support the broader applicability of expertise-dependent AI benefits across endoscopic specialties[31].
CURRENT STATE OF AI IN GASTRIC NEOPLASMS DETECTION AND CHARACTERIZATION
AI in endoscopic detection of gastric neoplasms
By leveraging the pattern-recognition capabilities of deep learning, particularly CNNs, AI systems are increasingly being developed and validated to assist endoscopists in detecting and diagnosing a wide array of gastric and other gastrointestinal lesions. Early and accurate identification of gastric cancer and its precursor lesions, including early gastric cancer and dysplastic lesions, is critical for improving patient outcomes, and AI plays an increasingly vital role in achieving this goal. High-definition endoscopy with image enhancement has improved visualization; however, diagnostic accuracy still depends heavily on the endoscopist's experience and vigilance. AI-powered image analysis offers a second set of "eyes" on endoscopic images or video frames to aid lesion detection and characterization.
The technical foundation of these AI systems involves training CNNs on annotated datasets of 10000-100000 endoscopic images, in which expert endoscopists mark lesion boundaries and classify pathology types. During training, networks learn hierarchical features through millions of parameter adjustments, and validation occurs in three phases: Internal (using 20% held-out data), external (testing on different institutional datasets), and prospective[1]. For deployment, video streams undergo frame-by-frame analysis (< 100 milliseconds per frame) with results displayed as visual overlays, whereas multimodal systems integrate endoscopic images with clinical data through fusion networks to enhance diagnostic accuracy[7]. Successful clinical implementation requires either edge computing for local processing or a secure cloud infrastructure with continuous performance monitoring and regular model updates to maintain accuracy[1].
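To make the deployment stage concrete, the following minimal sketch shows how a pre-trained classification network could be applied frame by frame to an endoscopic video stream with an on-screen overlay. The model file, input size, class convention, and thresholds are illustrative assumptions, not the production systems described above.

```python
# Minimal sketch of frame-by-frame video inference with a visual overlay.
# The model file, input size, and class convention are hypothetical.
import time

import cv2
import torch
import torchvision.transforms as T

model = torch.jit.load("gastric_cnn.pt").eval()  # assumed pre-trained, scripted CNN
preprocess = T.Compose([T.ToTensor(), T.Resize((224, 224), antialias=True)])

cap = cv2.VideoCapture("endoscopy_feed.mp4")  # or a live capture-card index, e.g. 0
with torch.no_grad():
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        start = time.perf_counter()
        rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        x = preprocess(rgb).unsqueeze(0)            # (1, 3, 224, 224)
        probs = torch.softmax(model(x), dim=1)[0]   # per-class probabilities
        latency_ms = (time.perf_counter() - start) * 1000
        if probs.argmax().item() != 0:              # class 0 assumed non-neoplastic
            cv2.putText(frame, f"suspected lesion ({latency_ms:.0f} ms)",
                        (20, 40), cv2.FONT_HERSHEY_SIMPLEX, 1.0, (0, 0, 255), 2)
        cv2.imshow("AI overlay", frame)
        if cv2.waitKey(1) == 27:                    # Esc exits
            break
cap.release()
cv2.destroyAllWindows()
```

Timing the preprocessing and forward pass together inside the loop is what makes a per-frame latency budget, such as the < 100 milliseconds cited above, verifiable in practice.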
Multiple studies have demonstrated that CNN-based algorithms can accurately identify gastric cancer in endoscopic images. Hirasawa et al[9] reported a deep CNN that achieved 92% sensitivity in detecting gastric-cancer lesions in a test set. Notably, AI detected some small early cancers that were missed by human endoscopists, albeit at the cost of false positives (e.g., areas of gastritis misidentified as cancer). This highlights an important aspect of AI diagnostics: High sensitivity can occur at the expense of specificity, and clinical implementation must balance these factors to avoid alarm fatigue. Another study by Ikenoyama et al[32] compared CNN performance with that of endoscopists and found higher sensitivity for CNN in detecting early gastric cancers (80% vs 53%), with a shorter interpretation time. However, AI still missed some lesions and occasionally misclassified benign changes as malignant, indicating that while AI is a powerful assistive tool, it may best function in tandem with, rather than replace, endoscopists. The latest deep-learning models have demonstrated remarkable accuracy improvements. Zhang et al’s improved mask region-based CNN achieved 93.9% accuracy and 95.3% recall for early gastric cancer detection[33]. The GRAIDS study, the largest multicenter validation study analyzing over 1 million endoscopic images from 84424 patients, showed an AI performance equivalent to that of expert endoscopists and superior to that of novices[18].
Real-world implementations have yielded promising results. In a randomized study conducted in China, approximately 2000 upper endoscopies were randomly assigned to either AI-assisted or routine screening endoscopy. AI-assisted examinations had a neoplasm miss rate of approximately 6%, whereas routine endoscopy without AI assistance had a high miss rate of approximately 25%. Although the high baseline miss rate of 25.6% in the control group suggests that endoscopists at certain expertise levels may derive particular benefit from AI assistance, this study demonstrates AI's potential as a valuable complementary tool for enhancing endoscopic performance[19]. Gong et al[34] reported an AI-assisted lesion detection rate of 95.6% in internal testing. In their randomized study, the AI-assisted group showed a higher lesion detection rate than the conventional screening endoscopy group, although the difference was not statistically significant (2.0% vs 1.3%, P = 0.21). Notably, all examinations were conducted by expert endoscopists, which may have influenced the outcomes. This study also exemplifies how the benefits of AI assistance vary according to the level of endoscopist expertise.
AI in endoscopic detection of gastric precancerous lesions (performance in gastric atrophy and intestinal metaplasia detection)
A recent study by Liu et al[35], using deep learning models trained on 29013 gastric images, demonstrated superior performance in detecting precancerous gastric lesions according to the Kyoto gastritis score. The AI system achieved 78.70% accuracy in identifying atrophy and intestinal metaplasia, significantly outperforming both experts (72.6%) and novices (66.6%) (P < 0.05), highlighting AI's potential for standardized precancerous lesion detection.
The ENDOANGEL system demonstrated high accuracy in detecting precancerous gastric conditions via image-enhanced endoscopy across multicenter validations, achieving 90.1% and 90.8% accuracy for gastric atrophy and intestinal metaplasia, respectively, in internal testing. The system matched expert endoscopist performance (P > 0.05), while significantly outperforming non-experts (P < 0.05), reinforcing that AI assistance particularly benefits less-experienced practitioners in precancerous lesion detection[36]. A recent randomized study demonstrated that AI-assisted endoscopy significantly improved the detection rates of intestinal metaplasia (14.23% vs 9.15%, P = 0.013), atrophy (22.76% vs 17.28%, P = 0.031), and intestinal adenomas (48.52% vs 24.58%, P < 0.001) compared with traditional endoscopy. The benefit was most pronounced among novices, confirming that AI assistance provides greater value to less-experienced endoscopists[37].
AI in endoscopic characterization of gastric neoplasms
Beyond pure detection, AI can also assist in characterization, such as differentiating an adenoma from an adenocarcinoma or flagging features suggestive of submucosal invasive cancer. Recent CNNs applied to magnifying endoscopy with narrowband imaging have shown the ability to categorize gastric lesions according to their depth of invasion based on microvascular and microsurface patterns. In practice, this could help endoscopists decide whether an early cancer is likely confined to the mucosa, and thus amenable to endoscopic submucosal dissection, or whether it shows features suggesting deeper invasion and thus requiring surgery[1].
A comprehensive meta-analysis by Xie et al[38] evaluated the CNN performance in gastric cancer diagnosis and depth of invasion prediction across 17 studies comprising 51446 images and 174 videos from 5539 patients. For gastric cancer diagnosis, the CNN achieved a pooled sensitivity of 89%, specificity of 93%, and an area under the curve (AUC) of 0.94. Notably, the CNN performance was comparable to that of expert endoscopists (AUC: 0.95 vs 0.90, P > 0.05) and superior to that of the combined group of expert and non-expert endoscopists (AUC: 0.95 vs 0.87, P < 0.05). To predict the invasion depth, the CNN demonstrated a pooled sensitivity of 82%, specificity of 90%, and AUC of 0.90. These findings establish that AI can match specialist-equivalent performance in detecting and characterizing gastric neoplasms, offering particular value in determining the invasion depth, which is a critical factor in treatment planning[38].
Importantly, AI assistance in the endoscopy room has moved from concept to reality. Currently, AI systems perform immediate analyses of endoscopic videos, primarily for colorectal polyp detection; however, similar principles are being extended to upper gastrointestinal endoscopy. A gastric AI system that provides instantaneous procedural feedback analyzes each video frame for abnormal patterns and alerts the endoscopist with a visual marker or sound when a suspected lesion is found. Early prototypes of such systems for gastric cancer detection have been developed. One challenge specific to the stomach is its complex mucosal landscape (with widespread gastritis changes and mucus, which can cause false alarms)[16]. Nonetheless, as computing power allows the analysis of high-definition videos at > 30 frames per second, real-time gastric lesion detection by AI is expected to become a practical tool. The benefit is a safety net that ensures that subtle lesions, especially in blind spots or behind the folds, are not overlooked. Consistency is another advantage. AI is not subject to fatigue, distraction, or varying skill levels; therefore, it can provide a more uniform quality of inspection throughout the procedure.
Evidence from systematic reviews of AI in gastric disease
Systematic reviews and meta-analyses from 2019 to 2025 have provided robust evidence that AI has achieved remarkable diagnostic accuracy for detecting and characterizing gastric diseases (Table 2). An analysis of 14 major systematic reviews encompassing hundreds of thousands of patients and millions of endoscopic images demonstrated that AI systems consistently achieved sensitivity and specificity exceeding 90% across multiple applications. The most compelling evidence has emerged from three primary domains: Gastric cancer detection, H. pylori identification, and precancerous lesion diagnosis, with AI performance matching or surpassing expert endoscopists in most comparative studies[2,38-50].
Table 2 Representative systematic reviews and meta-analyses of artificial intelligence applications in gastric diseases (2019-2025).
Five major meta-analyses confirmed exceptional AI performance in early gastric cancer detection. Xie et al’s comprehensive analysis of 17 studies (5539 patients, 51446 images) reported a sensitivity of 89%, a specificity of 93%, and an AUC of 0.94, with AI demonstrating comparable performance to that of expert endoscopists while significantly outperforming non-experts[38]. Similarly, AI applications for precancerous lesions showed even higher accuracy, and Li et al[47] reported a sensitivity of 94%, specificity of 93%, and AUC of 0.97 for gastric intestinal metaplasia detection across 12 studies involving 11173 patients. H. pylori detection studies demonstrate consistent performance, with sensitivity ranging from 87%-92% and specificity from 86%-89%, suggesting AI’s particular strength in identifying subtle mucosal changes[2,42,43]. Performance varied significantly by lesion type: AI systems showed the highest accuracy for H. pylori detection (87%-92% sensitivity), followed by precancerous lesions (94% sensitivity for intestinal metaplasia) and gastric cancer (89% sensitivity). Notably, expert endoscopists’ performance also varied by lesion type, with the greatest AI-human performance gaps observed in detecting subtle mucosal changes associated with H. pylori infection and early atrophic changes.
However, these results were tempered by critical limitations. Geographic bias represents the most significant concern, with over 90% of studies originating from Asia, particularly Japan, South Korea, and China, raising questions about generalizability to Western populations with diverse gastric cancer epidemiologies[40]. Furthermore, the predominance of retrospective case-control designs, lack of external validation, and high statistical heterogeneity (I2 frequently exceeding 90%) across meta-analyses highlight the need for prospective multicenter trials before widespread clinical implementation.
Despite these limitations, the consistency of high-performance metrics across multiple systematic reviews establishes AI as a transformative technology for gastric disease management, particularly in settings where expert endoscopists are unavailable[51]. However, these excellent performance metrics should be interpreted with caution. Direct comparisons among studies are complicated by heterogeneous study designs, varying definitions provided by “expert endoscopists”, different imaging protocols, and inconsistent outcome measures. Furthermore, most studies have reported performance under idealized conditions, which may not reflect real-world clinical practice.
DEVELOPMENT AND VALIDATION OF CLINICAL DECISION SUPPORT SYSTEMS
Evolution of clinical decision support system development
Our journey to develop a clinical decision support system (CDSS) for gastric disease management began in 2018 by establishing deep learning models to predict histopathology and invasion depth using endoscopic images[52,53]. This foundational study demonstrated the feasibility of extracting clinically relevant information from visual data alone, achieving accuracy comparable to that of expert endoscopists through external validation[7].
Building on this foundation, we developed a comprehensive deep learning-based CDSS for the automated detection and classification of gastric lesions in real-time endoscopy, published in 2023[34,54,55]. This system represents a significant advancement in clinical applicability, moving from retrospective analysis to immediate clinical decision support. External validation demonstrated robust performance across multiple institutions: Four-class classification (advanced gastric cancer, early gastric cancer, dysplasia, and non-neoplastic) achieved 81.5% accuracy (external test, 95%CI: 80.3%-82.7%)[34], whereas binary classification for invasion depth prediction reached 86.4% accuracy (external test, 95%CI: 85.3%-87.5%)[34]. Subsequently, we expanded the system to classify all stages of gastric carcinogenesis including preneoplastic conditions, achieving 85.3% accuracy (external test, 95%CI: 83.4%-97.2%) for six-class classification[55]. Notably, the system demonstrated excellent performance for detecting preneoplastic lesions, with atrophy detection accuracy of 95.3% (external test, 95%CI: 92.6%-98%) and intestinal metaplasia detection accuracy of 89.3% (external test, 95%CI: 85.4%-93.2%)[55]. Video 1 presents a representative case of the detection and diagnosis of gastric precancerous lesions, and Video 2 presents a case of mucosa-confined early gastric cancer detected and diagnosed during endoscopy using our AI model.
Technical architecture and clinical performance
Our CDSS development followed systematic stages: Lesion classification, invasion depth prediction, clinical validation through randomized controlled trials, and expansion to six-class diagnosis including atrophy and intestinal metaplasia[34,54,55]. This progression reflects both technological advancement and a deepening clinical understanding of gastric pathology. The system operates through a sophisticated pipeline: When a lesion is detected during endoscopy, frames are automatically captured and examined by our classification model, which distinguishes six gastric pathology classes. For atrophy or intestinal metaplasia, segmentation results are displayed on the monitor as color-coded overlays that provide immediate visual feedback[34,54,55]. Performance metrics demonstrated clinical viability: We achieved a 96% lesion detection rate internally, with 82%-87% classification accuracy in external validation[34]. These results compare favorably with published benchmarks and demonstrate robustness across different clinical settings and patient populations.
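The routing logic of such a pipeline can be expressed compactly. The sketch below, with hypothetical model handles and class names taken from the six categories listed above, shows classification followed by conditional segmentation for the two preneoplastic classes; it illustrates the decision flow, not the approved device's internal code.

```python
# Illustrative two-stage decision flow: classify every captured frame, then
# segment only frames labeled with a preneoplastic class. The classifier and
# segmenter are placeholders for any trained models with these interfaces.
import numpy as np
import torch

CLASSES = ["advanced gastric cancer", "early gastric cancer", "dysplasia",
           "atrophy", "intestinal metaplasia", "non-neoplastic"]

def analyze_frame(frame: torch.Tensor, classifier, segmenter):
    """Return the predicted class and, for atrophy/metaplasia, a binary mask."""
    with torch.no_grad():
        label = CLASSES[classifier(frame).argmax(dim=1).item()]
        overlay = None
        if label in ("atrophy", "intestinal metaplasia"):
            mask = torch.sigmoid(segmenter(frame)) > 0.5
            # Rendered on the endoscopy monitor as a color-coded overlay
            overlay = mask.squeeze().cpu().numpy().astype(np.uint8) * 255
    return label, overlay
```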
Real-world clinical validation
The definitive test of a medical AI system is a randomized controlled trial. Our randomized study showed an improvement in diagnostic accuracy, although the results did not reach statistical significance[34]. Nonetheless, this outcome provides valuable insights into the dynamics of AI assistance. The lack of statistical significance likely resulted from our study population, which consisted primarily of expert endoscopists who derived minimal benefit from AI assistance, consistent with our expertise-to-benefit relationship. In addition to randomized trials, prospective clinical validation has yielded encouraging results. In 522 consecutive screening endoscopies, CDSS-assisted diagnosis achieved accuracies of 92.1% (95%CI: 88.8%-95.4%) for atrophy and 95.5% (95%CI: 92%-99%) for intestinal metaplasia detection, with no significant difference compared to expert endoscopists (P = 0.23)[54]. The system performed consistently across all expertise levels, thus validating its clinical reliability.
External validation using 1427 novel images from multiple institutions demonstrated robust generalizability with an overall accuracy of 82.3%. Notably, the system maintained high performance for preneoplastic lesions, achieving a per-class AUC of 93.4% for atrophy and 91.3% for intestinal metaplasia[55]. The system’s immediate clinical decision support capabilities, with mean response time of 0.3 seconds and motion freeze functionality allowing on-demand analysis, enable seamless integration into clinical workflow. Additionally, the lesion segmentation feature provides immediate visual feedback, enhancing the endoscopists’ spatial awareness of the pathological boundaries during procedures[54,55]. Our CDSS received medical device approval from the Korean Ministry of Food and Drug Safety in July 2023, demonstrating regulatory compliance with standards comparable to the United States Food and Drug Administration (FDA) requirements. The approval process included a comprehensive evaluation of our multicenter clinical validation data, safety protocols, and performance metrics, as described in this manuscript.
Edge computing solutions
Considering that conventional AI systems require significant computational resources that are not available in all settings, we developed a laptop-based edge device for gastric neoplasm classification[56]. This portable solution addresses several critical deployment challenges: Offline operation that eliminates network dependency, local processing that ensures data privacy, and universal endoscope compatibility. Our edge device achieved remarkable inference speeds of 2-3 milliseconds in GPU (graphics processing unit) mode and 5-6 milliseconds in CPU (central processing unit) mode, demonstrating immediate clinical decision support capability without compromising diagnostic accuracy (93.3% in prospective validation).
In clinical workflow terms, the edge device processes endoscopic video at 30-35 frames per second, which matches or exceeds the standard endoscopy frame rates, effectively eliminating any perceptible lag during real-time procedures. With latency maintained below 100 milliseconds, endoscopists experience seamless visual feedback, which is indistinguishable from direct endoscopic viewing. This allows true real-time integration into standard endoscopic procedures without extending the examination time.
For clinical workflow integration, the system runs independently on a laptop placed in the endoscopy suite, requiring only a single-cable connection to existing endoscopy processors via standard video output. This plug-and-play design eliminates the need for modifications to the current equipment or workflow protocols. The AI overlay appeared instantaneously on the endoscopy monitor alongside the live feed, allowing endoscopists to maintain their natural procedural rhythm while benefiting from AI assistance.
During 50 consecutive procedures, the endoscopists reported no workflow disruptions, with 96% stating that the system enhanced their diagnostic confidence without slowing their examination pace. The portability of edge devices also enables their use in mobile screening units and resource-limited settings where cloud connectivity is unreliable. Furthermore, the integration of optimization techniques including parallel computing, automatic mixed precision, and dynamic tensor memory management ensures efficient resource utilization across diverse hardware configurations[56]. Edge computing is a pragmatic AI deployment solution in the field of gastroenterology. Bringing computational power directly to care points eliminates latency and ensures consistent performance, regardless of connectivity or cloud availability.
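Of the optimization techniques just listed, automatic mixed precision is the simplest to illustrate. The sketch below is a generic PyTorch pattern rather than the edge device's actual firmware: It runs a single forward pass under autocast while measuring wall-clock latency.

```python
# Generic mixed-precision inference with latency measurement (illustrative;
# not the edge device's production code).
import time
import torch

def timed_inference(model: torch.nn.Module, x: torch.Tensor, device: str = "cuda"):
    """One forward pass under autocast; returns logits and latency in ms."""
    model = model.to(device).eval()
    x = x.to(device)
    # Half precision on GPU, bfloat16 on CPU (float16 autocast is GPU-only)
    amp_dtype = torch.float16 if device == "cuda" else torch.bfloat16
    if device == "cuda":
        torch.cuda.synchronize()  # exclude queued GPU work from the timing window
    start = time.perf_counter()
    with torch.no_grad(), torch.autocast(device_type=device, dtype=amp_dtype):
        logits = model(x)
    if device == "cuda":
        torch.cuda.synchronize()  # wait for the kernel to finish before stopping the clock
    return logits, (time.perf_counter() - start) * 1000.0
```

Synchronizing before and after the forward pass matters on GPU because kernel launches are asynchronous; without it, millisecond-scale latency figures such as those quoted above could not be measured reliably.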
AI TRANSFORMS UPPER GASTROINTESTINAL ENDOSCOPY QUALITY CONTROL
The evolution of AI-powered endoscopy systems
The landscape of AI-powered quality control in upper gastrointestinal endoscopy has evolved significantly. The ENDOANGEL/WISENSE system is the most extensively validated platform and utilizes CNNs and reinforcement learning to monitor 26 gastric anatomical sites during endoscopy. This system achieved 90.4% accuracy in blind spot detection and reduced the unexamined areas from 22.46% to 5.86% in randomized controlled trials[57]. The cerebro AI system, which recently received European Union Medical Device Regulation certification, demonstrated significant improvements in examination completeness (92.6% vs 71.2% in controls, P < 0.001) in a study of 466 patients[58]. Other notable systems include the AI-enhanced recovery after surgery, which employs 20 AI algorithm modules for automatic report generation and achieves significant improvements in report precision and completeness across 44 procedures[59], and the photo documentation quality control system, which ensures standardized anatomical site recognition across 8-9 locations[60].
These AI systems have demonstrated measurable improvements across multiple quality metrics that are essential for comprehensive upper gastrointestinal examinations. Examination time increased by 15%-25% with AI assistance, but this reflected more thorough inspection rather than inefficiency, with the ENDOANGEL multicenter trial showing inspection time rising from 4.38 to 5.40 minutes while achieving a 45% reduction in blind spots[61]. The extent of mucosal visualization has improved dramatically, with AI-assisted procedures achieving 90%-95% completeness compared with 70%-80% in conventional endoscopy. AI systems excel at detecting commonly missed areas, with one multi-institutional study revealing that the vocal cords and epiglottis were missed in 99.28% and 93.14% of procedures, respectively, and AI guidance significantly reduced these omissions[58]. Photo documentation quality was improved through the automated capture of representative images from each anatomical site, filtering of blurry or inadequate images, and standardized documentation protocols, with AI-enhanced recovery after surgery demonstrating automatic lesion detection and classification using colored box annotations[62].
Clinical impact on miss rates and patient outcomes
The clinical impact of AI-assisted upper gastrointestinal endoscopy on miss rates and patient outcomes has been substantial. A landmark tandem randomized controlled trial of 1812 patients demonstrated that AI assistance reduced the gastric neoplasm miss rate from 27.3% to 6.1%, representing a 77.6% relative risk reduction (risk ratio = 0.22, 95%CI: 0.07-0.74, P = 0.02), with a number needed to treat of approximately five, meaning that AI assistance in five patients prevents one missed gastric neoplasm[19]. Real-world implementation across multiple centers has validated these findings, with the ENDOANGEL-LD system achieving 96.9% sensitivity for gastric lesion detection in over 10000 patients across six hospitals and demonstrating 100% sensitivity vs 85.5% sensitivity for experts in video-based testing (P = 0.003)[63]. However, implementation challenges remain: Actual AI adoption rates average 52% when left to clinician discretion, and concerns about false positives necessitate careful calibration and user training[64].
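These headline figures follow directly from the two miss rates; the short calculation below makes the relationship explicit.

```latex
% Absolute risk reduction, relative risk reduction, and number needed to
% treat, computed from the miss rates quoted above
\mathrm{ARR} = 27.3\% - 6.1\% = 21.2\%
\qquad
\mathrm{RRR} = \frac{21.2\%}{27.3\%} \approx 77.6\%
\qquad
\mathrm{NNT} = \frac{1}{0.212} \approx 4.7 \approx 5
```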
QUANTITATIVE COMPARISON OF AI-ENABLED VERSUS TRADITIONAL WORKFLOWS
The integration of AI into gastroenterology has demonstrated measurable improvements over traditional diagnostic workflows across multiple performance metrics. In conventional endoscopy practice, the miss rate for gastric neoplasms ranges from 9.4% to 25.6%, with significant inter-observer variability in lesion detection[19,40]. AI-assisted systems have reduced these miss rates to 6.1%, representing a 77.6% relative risk reduction[19]. Examination completeness improves from 70%-80% in traditional workflows to 90%-95% with AI guidance, albeit with a 15%-25% increase in procedure time, a trade-off that enhances diagnostic yield[61]. Training efficiency shows marked improvement, with AI-assisted novices achieving 91.8% accuracy (95%CI: 91.4%-92.2%; P < 0.001 vs baseline), compared to the longer learning curves required in traditional training programs[65]. These quantitative benchmarks underscore the transformative impact of AI on clinical practice, enabling performance comparable to that of experienced endoscopists while maintaining quality standards across diverse practice settings.
AI-driven training enhancement and procedural standardization
AI has fundamentally transformed endoscopic training and procedural standardization, particularly benefiting novice endoscopists. The GEADS multicenter study across seven hospitals demonstrated that AI assistance enabled novices to achieve 91.8% accuracy (95%CI: 91.4%-92.2%), approaching specialist-grade diagnostic capabilities[65]. The learning curves accelerated significantly, with cumulative sum analysis demonstrating faster competency attainment and more stable performance across accumulated cases[66]. AI systems provide objective performance metrics for novices, including the automated assessment of competency in endoscopy scores, blind spot monitoring, and standardized evaluation criteria. Most importantly, AI has markedly reduced interoperator variability, with external validation studies showing 84.1%-94.9% consistency across different hospitals and operators[65], thereby democratizing performance comparable to experienced endoscopists and ensuring quality standards regardless of individual experience[67].
THE PARADIGM SHIFT TO GENERAL-PURPOSE AI
Emergence of LLMs in medicine
The medical AI landscape has undergone a fundamental transformation with the emergence of general-purpose LLMs. These systems exhibit sophisticated reasoning and problem solving arising from training on vast and diverse datasets rather than domain-specific fine-tuning[14,68]. This emergent intelligence challenges traditional approaches to medical AI development. A pivotal study involving 92 physicians randomized participants to an LLM plus conventional resources or conventional resources alone for working through clinical cases[69]. Physicians using the LLM scored significantly higher, yet surprisingly, no difference existed between AI-assisted physicians and the AI alone, raising profound questions about the nature of medical expertise and decision-making[69].
One-shot learning and democratization of AI
Our early experience with general-purpose models suggests transformative potential. Recent studies have demonstrated the theoretical potential of LLMs, such as GPT-4, for medical image interpretation. For example, our group explored the conceptual feasibility of one-shot learning using single representative endoscopic images. However, we must emphasize that this work remains purely experimental and has not been clinically validated. No sensitivity, specificity, or accuracy metrics were established, and no comparison with validated deep learning approaches was performed[14].
It is critical to acknowledge that these one-shot learning experiments using endoscopic images are preliminary proof-of-concept studies. They completely lack the rigorous multicenter validation, regulatory approval, and real-world performance data required for clinical deployment. Clinicians should not consider implementing these approaches in clinical practice until comprehensive clinical validation studies have been completed. Future studies should focus on systematic evaluations including sensitivity, specificity, and comparative performance against traditional deep learning approaches across diverse patient populations. Despite these limitations, the accessibility of LLM technology offers remarkable advantages: No coding is required, making AI accessible to all clinicians, regardless of technical expertise. This represents a fundamental shift in the conceptualization of medical AI development. Instead of requiring specialized engineers and months of training, clinicians can now leverage powerful pre-trained models with minimal overhead[70]. This accessibility could accelerate the adoption of clinical AI and enable the rapid prototyping of new applications.
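For readers unfamiliar with the term, the sketch below illustrates what "one-shot" means in this setting: A single labeled reference image accompanies the query image in one prompt. The message schema is a generic multimodal chat-completion format, and the workflow is, as stressed above, experimental and not for clinical use.

```python
# Generic one-shot prompt construction for a multimodal LLM (illustrative
# schema only; an experimental pattern, not a clinically validated workflow).
import base64

def one_shot_messages(example_png: bytes, example_label: str, query_png: bytes):
    """Build a chat prompt holding one labeled reference image plus one query."""
    def data_url(png: bytes) -> str:
        return "data:image/png;base64," + base64.b64encode(png).decode()

    return [
        {"role": "system",
         "content": "You classify endoscopic images. Reply with a label only."},
        {"role": "user", "content": [
            {"type": "text", "text": f"Reference image, label: {example_label}"},
            {"type": "image_url", "image_url": {"url": data_url(example_png)}},
            {"type": "text", "text": "Classify this new image:"},
            {"type": "image_url", "image_url": {"url": data_url(query_png)}},
        ]},
    ]
```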
Multi-agent systems and expanding context windows
The future of medical AI lies in coordinated multi-agent architectures rather than monolithic systems (Figure 3). We developed a personal AI, an LLM fine-tuned on clinical recordings, that learns individual patterns, reasoning, and recommendations, mirroring the clinician's approach[14]. Patients can consult this digital version at any time, extending expert judgment beyond the constraints of physical presence (Figure 4). The evolution of LLM context window size enables a qualitative transformation. We have entered the 100-million-token era (Magic.dev's LTM-2-mini), in which entire patient histories can be processed simultaneously. This capability fundamentally transforms AI clinical decision support, shifting from isolated data analysis to the interpretation of complete clinical narratives. A coming billion-token era could enable population-level analyses to guide national health policies.
Figure 4
Personal artificial intelligence model: Large language model fine-tuned on clinical recordings learning individual patterns, reasoning, and recommendations mirroring the clinician’s approach.
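The coordination pattern behind such multi-agent architectures can be outlined in a few lines. In the hypothetical sketch below, each agent wraps one specialized role around a shared general-purpose LLM, and a final agent synthesizes the accumulated context; the roles and the query_llm() helper are illustrative assumptions, not a deployed system.

```python
# Hypothetical multi-agent coordination pattern: specialized roles share one
# growing context, and a final synthesizer produces the recommendation.
from dataclasses import dataclass

def query_llm(system_prompt: str, context: str) -> str:
    """Placeholder for a call to any general-purpose LLM provider."""
    raise NotImplementedError("connect to an LLM API of your choice")

@dataclass
class Agent:
    role: str            # e.g., "endoscopy findings summarizer"
    system_prompt: str

    def run(self, context: str) -> str:
        return query_llm(self.system_prompt, context)

def coordinate(agents: list[Agent], patient_record: str) -> str:
    """All but the last agent append analyses; the last agent synthesizes."""
    context = patient_record
    for agent in agents[:-1]:
        context += f"\n[{agent.role}]\n{agent.run(context)}"
    return agents[-1].run(context)  # synthesizer sees all intermediate outputs
```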
LIMITATIONS, CHALLENGES, AND FUTURE DIRECTIONS
Current technical and clinical limitations
Despite these advances, current AI systems have significant limitations; LLMs in particular present unique challenges in clinical gastroenterology. Hallucinations, the generation of plausible-sounding but factually incorrect information, pose serious risks in medical contexts. For example, an LLM might confidently state that "gastric lesions with a yellowish hue and radial striations are pathognomonic for early signet ring cell carcinoma", when no such diagnostic criteria exist. A clinician unfamiliar with this specific pathology may accept this fabricated information, leading to the misdiagnosis of benign gastritis as a malignancy and potentially resulting in unnecessary endoscopic resection or even gastrectomy. Other dangerous scenarios include LLMs inventing non-existent classification systems (e.g., "modified Tokyo classification grade IVb lesions require immediate surgical intervention"), citing fictional clinical trials ("The GASTRO-AI trial demonstrated 98% accuracy in predicting lymph node metastasis based on surface color alone"), or recommending inappropriate biopsy protocols ("single-bite biopsies are sufficient for suspected linitis plastica"). Such hallucinations are particularly insidious because they often incorporate legitimate medical terminology and plausible-sounding logic, making it difficult for busy clinicians to immediately identify them as false. Studies have indicated that LLMs overgeneralize scientific findings by up to 73%, potentially leading to inappropriate recommendations[71]. Access limitations present a further challenge: LLMs access only open-access research, creating knowledge blind spots given that much cutting-edge research remains paywalled. These limitations translate into potential misdiagnosis risks that require maintained human oversight, without negating AI's efficiency benefits. Although multiagent architectures and careful prompt engineering can help, human oversight remains irreplaceable for now[68].
Bias and generalizability issues
Bias and generalizability are critical limitations. Most gastric AI systems, including ours, are trained predominantly on data from Asian populations[40], potentially limiting their performance in other ethnic groups with different anatomical features, H. pylori prevalence, and gastric pathology patterns. The endoscopic appearance of the gastric mucosa varies across populations, and equipment differences between healthcare systems further challenge generalizability[51]. Addressing these limitations requires the establishment of international validation consortia (e.g., collaboration between Asian, American, and European gastroenterology societies), mandatory reporting of performance metrics stratified by ethnicity, development of region-specific validation datasets, and adoption of federated learning approaches that enable global validation while maintaining data privacy, as sketched below. Regulatory bodies should mandate multiethnic validation across at least three distinct populations prior to approval. Until such comprehensive validation is achieved, AI systems should include explicit warnings about their training population limitations[40].
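As a toy-scale illustration of the federated learning approach mentioned above: Each site trains locally and shares only parameter arrays, which a coordinator averages in proportion to local dataset size (the classic FedAvg step), so no endoscopic images leave any hospital. Everything here, including the three-site example, is a schematic assumption.

```python
# Toy FedAvg aggregation step: sites contribute weights, never patient data.
import numpy as np

def federated_average(site_weights, site_sizes):
    """Average per-layer parameters, weighted by each site's dataset size."""
    total = float(sum(site_sizes))
    n_layers = len(site_weights[0])
    return [
        sum(w[k] * (n / total) for w, n in zip(site_weights, site_sizes))
        for k in range(n_layers)
    ]

# Example: three hospitals, each sharing two locally trained weight arrays.
local = [[np.full(4, s), np.full(2, s)] for s in (1.0, 2.0, 3.0)]
global_weights = federated_average(local, site_sizes=[100, 300, 600])
```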
Regulatory landscape and validation challenges
The regulatory landscape remains complex. Of the 903 United States FDA-approved AI medical devices, only half have been evaluated in clinical performance studies[72]. For instance, while GI Genius™ (Medtronic, MN, United States) received FDA clearance as the first AI-powered colonoscopy system in 2021, demonstrating the feasibility of regulatory approval for endoscopic AI, similar approvals for upper gastrointestinal applications remain limited. Approximately 40% of the clinical utility studies employed retrospective designs, with only 2.4% utilizing randomized controlled trials[72]. This paucity of high-quality evidence creates uncertainty about the true clinical value of many AI systems. The disparity between AI systems achieving regulatory clearance and those demonstrating robust clinical evidence highlights the need for more stringent post-market surveillance and real-world effectiveness studies.
The establishment of clinical utility extends beyond performance metrics. Cost-effectiveness studies on reimbursement remain scarce, creating adoption barriers even for clinically efficacious systems[73]. Furthermore, the rapid evolution of AI technology often outpaces regulatory frameworks, with newer general-purpose models, such as LLMs, presenting novel regulatory challenges that existing pathways may not adequately address. Balanced collaborations between industry, scientists, and clinicians have become critical in navigating regulatory challenges[73]. Although our CDSS has successfully navigated regulatory approval in Korea, achieving similar milestones in multiple jurisdictions remains challenging owing to varying regional requirements and the need for location-specific clinical validation studies.
Ethical considerations in AI-enabled gastroenterology
The integration of AI into gastroenterology raises critical ethical concerns and requires careful consideration. Patient privacy remains paramount because AI systems require vast endoscopic image datasets that potentially contain identifiable features. Despite anonymization protocols, re-identification risks through sophisticated image analyses necessitate robust data governance and federated learning approaches. Data bias threatens equitable healthcare delivery, with over 90% of AI studies originating from Asian populations[40], potentially limiting its effectiveness in Western populations with different disease patterns. This geographic bias, compounded by socioeconomic disparities, where high-quality images predominantly come from well-resourced centers, could compromise AI performance precisely where most needed.
Accountability in AI decision-making presents complex medicolegal challenges. When AI systems miss diagnoses or mischaracterize lesions, liability determination among endoscopists, developers, and institutions is problematic. Current frameworks inadequately address AI clinical judgment conflicts, particularly given that the effectiveness of AI varies inversely with expertise[7]. Clear protocols for overriding authority, mandatory documentation of disagreements, and transparent disclosure of limitations are essential to maintain accountability and patient trust. These considerations demand governance frameworks that balance innovation with patient safety, ensuring that AI augments rather than compromises the fundamental physician-patient relationship.
Strategic directions for the future
The era of building domain-specific medical AI models may have ended. Rapid improvements in general-purpose LLMs, combined with their accessibility and flexibility, suggest that future efforts should focus on innovative applications rather than on model development[13,14,68]. Although specialized models still outperform general LLMs in speed and accuracy on specific tasks, this gap is narrowing rapidly. The oligopolistic structure of AI development shapes the approaches available to medical professionals. Creating LLMs requires massive resources possessed only by technology giants. Our strength lies not in competing at model development but in understanding clinical needs and creatively applying existing tools to solve real medical problems.
Multimodal integration: The next evolution
Massachusetts Institute of Technology’s QoQ-Med recently demonstrated true multimodal integration by combining electrocardiography signals, medical images, and clinical text in a single model[74]. This represents the direction we need: AI that sees the complete clinical picture, not just fragments. Beyond the current achievements, the integration of wearable sensor data and continuous monitoring signals promises to enable on-site comprehensive health assessment that adapts dynamically to available modalities[75]. Future systems will likely incorporate genomic data, environmental factors, and longitudinal health records to create truly holistic AI that functions as a comprehensive clinical partner rather than a narrow diagnostic tool.
Wearable AI devices: The next frontier
The emergence of smart glasses with embedded vision models represents the next frontier in AI-assisted endoscopy. Although current AI systems have demonstrated significant clinical benefits in gastric disease detection and characterization, the integration of wearable devices promises hands-free operation during upper gastrointestinal procedures (Figure 5). However, it is crucial to distinguish between the current reality and future aspirations. All existing studies on wearable devices in endoscopy are limited to educational feasibility trials and not to clinical applications.
Figure 5
Smart glasses with embedded vision models for the lesion detection and characterization during upper gastrointestinal procedures.
Current evidence is restricted to training environments: Lee et al[76] conducted an observational study in which 14 novices used wearable display glasses (Google Glass) solely as passive observers watching endoscopy procedures for educational purposes; they were not operators performing clinical procedures. In this educational context, 71.4% (10/14) of the novices reported improved learning outcomes, although 25% (4/14) experienced eye fatigue during extended observation periods. This represents the full extent of current evidence: Passive observation for training, not active clinical use.
The gap between the current feasibility studies and envisioned clinical applications remains substantial. Although we envision future scenarios in which endoscopists wear smart glasses that provide real-time AI-generated lesion detection alerts and diagnostic suggestions during procedures, no such clinical implementation exists. Recent feasibility studies have exclusively focused on educational settings and explored head-mounted displays for endoscopy training and remote consultation, but none have demonstrated hands-free AI assistance during actual patient procedures[76].
Theoretically, these devices could fundamentally transform endoscopy by providing a direct visual overlay of AI-generated information within the endoscopist's field of view. Voice-activated controls and gesture recognition promise hands-free operation, but these remain conceptual rather than clinically validated. Current limitations include device weight, battery life, visual fatigue during extended procedures, and, most critically, the complete absence of clinical trials demonstrating safety and efficacy in patient care settings.
The convergence of AI and wearable technologies in gastric endoscopy faces both opportunities and challenges. A systematic review identified hardware limitations, user comfort, and workflow integration as the primary barriers to smart glass adoption in medical settings[77]. Until rigorous clinical trials have demonstrated that wearable AI devices can safely and effectively assist in actual endoscopic procedures, their role remains confined to educational applications. However, with rapid technological advancements and growing clinical evidence supporting the role of AI in improving gastric cancer and precancerous lesion detection rates, the integration of hands-free, on-site AI assistance through wearable devices appears inevitable, although this future remains years away from clinical reality. As these technologies mature, they promise to democratize access to expert-level diagnostic capabilities for gastric diseases while maintaining the nuanced clinical judgment essential for optimal patient care[73].
AI’s transformative impact beyond endoscopy
The transformative potential of AI in gastroenterology extends far beyond diagnostic imaging. Saudi Arabia's launch of the world's first AI-led doctor clinic represents a fundamental transformation in healthcare delivery, in which AI systems conduct initial patient consultations under human oversight, demonstrating the evolution of AI from diagnostic tool to primary care interface. Even more revolutionary is rentosertib, the world's first AI-discovered drug to show efficacy in humans. By analyzing integrated genomic, proteomic, and clinical data, AI identified TRAF2 and NCK-interacting kinase (TNIK) as a novel target for the treatment of lung fibrosis, and the resulting inhibitor achieved a 98-mL improvement in lung function while compressing drug discovery from the typical 5 years to just 18 months[78]. These breakthroughs illustrate that we have entered an era in which AI not only assists medical practice but also fundamentally transforms drug discovery and healthcare delivery models.
CONCLUSION
The AI journey in gastric disease management exemplifies the broader transformation of medicine from computer-aided detection to sophisticated multiagent systems. Our research reveals five key insights: (1) AI effectiveness is inversely correlated with user expertise, maximally benefiting moderate-experience practitioners; (2) AI's future lies in creative applications of general-purpose systems rather than in developing specialized models; (3) AI's evolution toward truly multimodal systems reinforces the need to integrate diverse clinical data streams rather than rely on single-modality analysis; (4) AI integration requires careful workflow design in addition to technical excellence; and (5) AI should augment rather than replace clinical judgment, with human oversight remaining essential for complex cases.
Future efforts should prioritize innovative clinical applications rather than competing in model development, leveraging clinicians' unique understanding of medical practice while benefiting from the AI infrastructure provided by technology companies. Gastroenterology's AI transformation is a current reality rather than a distant possibility. However, realizing its full potential requires thoughtful integration and a commitment to preserving irreplaceable human elements. Ultimately, current decisions regarding the development, implementation, and regulation of medical AI will shape healthcare for future generations.
Footnotes
Provenance and peer review: Invited article; Externally peer reviewed.
Peer-review model: Single blind
Specialty type: Gastroenterology and hepatology
Country of origin: South Korea
Peer-review report’s classification
Scientific Quality: Grade A, Grade A, Grade B, Grade B
Novelty: Grade A, Grade A, Grade B, Grade B
Creativity or Innovation: Grade A, Grade A, Grade B, Grade B
Scientific Significance: Grade A, Grade B, Grade B, Grade B
P-Reviewer: Khan S, Research Fellow, Pakistan; Rizwan M, PhD, Pakistan. S-Editor: Wu S. L-Editor: A. P-Editor: Wang CH.
REFERENCES
Mori Y, Kudo SE, Misawa M, Saito Y, Ikematsu H, Hotta K, Ohtsuka K, Urushibara F, Kataoka S, Ogawa Y, Maeda Y, Takeda K, Nakamura H, Ichimasa K, Kudo T, Hayashi T, Wakamura K, Ishida F, Inoue H, Itoh H, Oda M, Mori K. Real-Time Use of Artificial Intelligence in Identification of Diminutive Polyps During Colonoscopy: A Prospective Study. Ann Intern Med. 2018;169:357-366.
Singhal K, Azizi S, Tu T, Mahdavi SS, Wei J, Chung HW, Scales N, Tanwani A, Cole-Lewis H, Pfohl S, Payne P, Seneviratne M, Gamble P, Kelly C, Babiker A, Schärli N, Chowdhery A, Mansfield P, Demner-Fushman D, Agüera Y Arcas B, Webster D, Corrado GS, Matias Y, Chou K, Gottweis J, Tomasev N, Liu Y, Rajkomar A, Barral J, Semturs C, Karthikesalingam A, Natarajan V. Large language models encode clinical knowledge. Nature. 2023;620:172-180.
Hassan C, Spadaccini M, Iannone A, Maselli R, Jovani M, Chandrasekar VT, Antonelli G, Yu H, Areia M, Dinis-Ribeiro M, Bhandari P, Sharma P, Rex DK, Rösch T, Wallace M, Repici A. Performance of artificial intelligence in colonoscopy for adenoma and polyp detection: a systematic review and meta-analysis. Gastrointest Endosc. 2021;93:77-85.e6.
Luo H, Xu G, Li C, He L, Luo L, Wang Z, Jing B, Deng Y, Jin Y, Li Y, Li B, Tan W, He C, Seeruttun SR, Wu Q, Huang J, Huang DW, Chen B, Lin SB, Chen QM, Yuan CM, Chen HX, Pu HY, Zhou F, He Y, Xu RH. Real-time artificial intelligence for detection of upper gastrointestinal cancer by endoscopy: a multicentre, case-control, diagnostic study. Lancet Oncol. 2019;20:1645-1654.
Wu L, Shang R, Sharma P, Zhou W, Liu J, Yao L, Dong Z, Yuan J, Zeng Z, Yu Y, He C, Xiong Q, Li Y, Deng Y, Cao Z, Huang C, Zhou R, Li H, Hu G, Chen Y, Wang Y, He X, Zhu Y, Yu H. Effect of a deep learning-based system on the miss rate of gastric neoplasms during upper gastrointestinal endoscopy: a single-centre, tandem, randomised controlled trial. Lancet Gastroenterol Hepatol. 2021;6:700-708.
Yamaguchi D, Shimoda R, Miyahara K, Yukimoto T, Sakata Y, Takamori A, Mizuta Y, Fujimura Y, Inoue S, Tomonaga M, Ogino Y, Eguchi K, Ikeda K, Tanaka Y, Takedomi H, Hidaka H, Akutagawa T, Tsuruoka N, Noda T, Tsunada S, Esaki M. Impact of an artificial intelligence-aided endoscopic diagnosis system on improving endoscopy quality for trainees in colonoscopy: Prospective, randomized, multicenter study. Dig Endosc. 2024;36:40-48.
Areia M, Mori Y, Correale L, Repici A, Bretthauer M, Sharma P, Taveira F, Spadaccini M, Antonelli G, Ebigbo A, Kudo SE, Arribas J, Barua I, Kaminski MF, Messmann H, Rex DK, Dinis-Ribeiro M, Hassan C. Cost-effectiveness of artificial intelligence for screening colonoscopy: a modelling study. Lancet Digit Health. 2022;4:e436-e444.
Parasa S, Repici A, Berzin T, Leggett C, Gross SA, Sharma P. Framework and metrics for the clinical use and implementation of artificial intelligence algorithms into endoscopy practice: recommendations from the American Society for Gastrointestinal Endoscopy Artificial Intelligence Task Force. Gastrointest Endosc. 2023;97:815-824.e1.
Gong EJ, Bang CS, Lee JJ, Baik GH, Lim H, Jeong JH, Choi SW, Cho J, Kim DY, Lee KB, Shin SI, Sigmund D, Moon BI, Park SC, Lee SH, Bang KB, Son DS. Deep learning-based clinical decision support system for gastric neoplasms in real-time endoscopy: development and validation study. Endoscopy. 2023;55:701-708.
Xie F, Zhang K, Li F, Ma G, Ni Y, Zhang W, Wang J, Li Y. Diagnostic accuracy of convolutional neural network-based endoscopic image analysis in diagnosing gastric cancer and predicting its invasion depth: a systematic review and meta-analysis. Gastrointest Endosc. 2022;95:599-609.e7.
Liu X, Faes L, Kale AU, Wagner SK, Fu DJ, Bruynseels A, Mahendiran T, Moraes G, Shamdas M, Kern C, Ledsam JR, Schmid MK, Balaskas K, Topol EJ, Bachmann LM, Keane PA, Denniston AK. A comparison of deep learning performance against health-care professionals in detecting diseases from medical imaging: a systematic review and meta-analysis. Lancet Digit Health. 2019;1:e271-e297.
Xu HL, Gong TT, Song XJ, Chen Q, Bao Q, Yao W, Xie MM, Li C, Grzegorzek M, Shi Y, Sun HZ, Li XH, Zhao YH, Gao S, Wu QJ. Artificial Intelligence Performance in Image-Based Cancer Identification: Umbrella Review of Systematic Reviews. J Med Internet Res. 2025;27:e53567.
Wu L, Zhang J, Zhou W, An P, Shen L, Liu J, Jiang X, Huang X, Mu G, Wan X, Lv X, Gao J, Cui N, Hu S, Chen Y, Hu X, Li J, Chen D, Gong D, He X, Ding Q, Zhu X, Li S, Wei X, Li X, Wang X, Zhou J, Zhang M, Yu HG. Randomised controlled trial of WISENSE, a real-time quality improving system for monitoring blind spots during esophagogastroduodenoscopy. Gut. 2019;68:2161-2169.
Dong Z, Wu L, Mu G, Zhou W, Li Y, Shi Z, Tian X, Liu S, Zhu Q, Shang R, Zhang M, Zhang L, Xu M, Zhu Y, Tao X, Chen T, Li X, Zhang C, He X, Wang J, Luo R, Du H, Bai Y, Ye L, Yu H. A deep learning-based system for real-time image reporting during esophagogastroduodenoscopy: a multicenter study. Endoscopy. 2022;54:771-777.
Wu L, He X, Liu M, Xie H, An P, Zhang J, Zhang H, Ai Y, Tong Q, Guo M, Huang M, Ge C, Yang Z, Yuan J, Liu J, Zhou W, Jiang X, Huang X, Mu G, Wan X, Li Y, Wang H, Wang Y, Zhang H, Chen D, Gong D, Wang J, Huang L, Li J, Yao L, Zhu Y, Yu H. Evaluation of the effects of an artificial intelligence system on endoscopy quality and preliminary testing of its performance in detecting early gastric cancer: a randomized controlled trial. Endoscopy. 2021;53:1199-1207.
Yang H, Wu Y, Yang B, Wu M, Zhou J, Liu Q, Lin Y, Li S, Li X, Zhang J, Wang R, Xie Q, Li J, Luo Y, Tu M, Wang X, Lan H, Bai X, Wu H, Zeng F, Zhao H, Yi Z, Zeng F. Identification of upper GI diseases during screening gastroscopy using a deep convolutional neural network algorithm. Gastrointest Endosc. 2022;96:787-795.e6.
Goh E, Gallo RJ, Strong E, Weng Y, Kerman H, Freed JA, Cool JA, Kanjee Z, Lane KP, Parsons AS, Ahuja N, Horvitz E, Yang D, Milstein A, Olson APJ, Hom J, Chen JH, Rodman A. GPT-4 assistance for improvement of physician performance on patient care tasks: a randomized controlled trial. Nat Med. 2025;31:1233-1238.
ASGE AI Task Force, Parasa S, Berzin T, Leggett C, Gross S, Repici A, Ahmad OF, Chiang A, Coelho-Prabhu N, Cohen J, Dekker E, Keswani RN, Kahn CE, Hassan C, Petrick N, Mountney P, Ng J, Riegler M, Mori Y, Saito Y, Thakkar S, Waxman I, Wallace MB, Sharma P. Consensus statements on the current landscape of artificial intelligence applications in endoscopy, addressing roadblocks, and advancing artificial intelligence in gastroenterology. Gastrointest Endosc. 2025;101:2-9.e1.
Dai W, Chen P, Ekbote C, Liang PP. QoQ-Med: Building Multimodal Clinical Foundation Models with Domain-Aware GRPO Training. 2025 Preprint. Available from: arXiv:2506.00711.
Xu Z, Ren F, Wang P, Cao J, Tan C, Ma D, Zhao L, Dai J, Ding Y, Fang H, Li H, Liu H, Luo F, Meng Y, Pan P, Xiang P, Xiao Z, Rao S, Satler C, Liu S, Lv Y, Zhao H, Chen S, Cui H, Korzinkin M, Gennert D, Zhavoronkov A. A generative AI-discovered TNIK inhibitor for idiopathic pulmonary fibrosis: a randomized phase 2a trial. Nat Med. 2025;31:2602-2610.