Minireviews Open Access
Copyright ©The Author(s) 2025. Published by Baishideng Publishing Group Inc. All rights reserved.
Artif Intell Gastrointest Endosc. Sep 8, 2025; 6(3): 108281
Published online Sep 8, 2025. doi: 10.37126/aige.v6.i3.108281
Endoscopic image analysis assisted by machine learning: Algorithmic advancements and clinical uses
Jiang-Cheng Ding, Jun Zhang, Department of Gastroenterology, Nanjing First Hospital, Nanjing 210006, Jiangsu Province, China
ORCID number: Jiang-Cheng Ding (0009-0000-3705-3059); Jun Zhang (0000-0001-7051-4590).
Author contributions: Ding JC performed the research; Zhang J designed the research study; all of the authors read and approved the final version of the manuscript to be published.
Conflict-of-interest statement: The authors declare no financial or non-financial conflicts of interest related to this work.
Open Access: This article is an open-access article that was selected by an in-house editor and fully peer-reviewed by external reviewers. It is distributed in accordance with the Creative Commons Attribution NonCommercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: https://creativecommons.org/Licenses/by-nc/4.0/
Corresponding author: Jun Zhang, PhD, Adjunct Associate Professor, Chief Physician, Department of Gastroenterology, Nanjing First Hospital, No. 68 Changle Road, Qinhuai District, Nanjing 210006, Jiangsu Province, China. zhangjun711028@126.com
Received: April 10, 2025
Revised: May 20, 2025
Accepted: July 21, 2025
Published online: September 8, 2025
Processing time: 146 Days and 20.6 Hours

Abstract

Clinical gastrointestinal endoscopy has advanced significantly owing to machine learning techniques, which have produced novel instruments and approaches for early-stage disease diagnosis, categorization, and therapy. This minireview examines machine learning applications in gastrointestinal endoscopy, including image identification, lesion detection, pathological categorization, and surgical assistance. By evaluating previous research, we examine the potential of machine learning to improve treatment regimens, lower misdiagnosis rates, and increase diagnostic accuracy. In addition, this study discusses current issues such as clinical applicability, model generalization, and data privacy, and suggests future research directions to help clinicians and researchers in the field of gastrointestinal endoscopy.

Key Words: Machine learning; Artificial intelligence; Endoscopy; Image recognition; Gastroenterology

Core Tip: This article systematically reviews recent research progress and developmental trends in machine learning applications for gastrointestinal endoscopic imaging. Focusing on tumor and non-tumor lesion analysis, it elaborates on convolutional neural networks' dual mechanisms: Enhancing image clarity through deep feature extraction and reconstruction algorithms, and enabling quantitative image analysis via multi-dimensional feature interpretation. The study further highlights their clinical value in developing artificial intelligence-assisted diagnostic models and achieving precision differential diagnosis in digestive diseases.



INTRODUCTION

Gastrointestinal endoscopy is an essential instrument for diagnosing digestive diseases and is used to screen for lesions in the esophagus, stomach, and colorectum. It allows clinicians to observe the intricate details of mucosal surfaces, which helps them uncover hidden disorders such as inflammation, ulcers, polyps, and even early-stage cancers. Gastroscopy and colonoscopy are valued as effective tools for cancer screening[1], and when used together they enhance the detection of gastric and colorectal tumors with remarkable efficacy. Endoscopy not only allows diagnosis, but also enables minimally invasive treatment: techniques such as endoscopic mucosal resection (EMR) and endoscopic submucosal dissection reduce surgical trauma for patients[2]. According to the World Health Organization, gastrointestinal cancers account for nearly a quarter of global cancer cases. Endoscopy has revolutionized the clinical management of these challenging diseases, which also explains its rising popularity[3].

However, the diagnostic efficacy of endoscopy depends more on the operator’s experience than on the technology itself. Studies have revealed a staggering 20%–30% rate of missed early-stage gastrointestinal tumors, such as flat lesions[4]. This is particularly troubling in primary care, where variances in physician expertise result in diagnostic disparities, especially where technical conditions are uncertain. A single colonoscopy session generates an overwhelming number of images, and this wealth of data can lead to fatigue-related oversights. As a result, small lesions may be overlooked, and subtle features such as vascular patterns and mucosal color shifts may be difficult to decipher[5]. Furthermore, distinguishing between different lesions, such as hyperplastic and adenomatous polyps, is highly complicated and often necessitates pathological biopsy, which can prolong diagnosis and treatment and create more stress for already burdened patients.

Deep learning is a subset of machine learning that learns multilevel feature representations from data using neural networks with multiple layers. Unlike its traditional counterpart, deep learning does not rely on hand-crafted features; instead, it distills abstract features through stacked hidden layers, enabling seamless end-to-end learning[6]. The convolutional neural network (CNN) is a representative deep learning algorithm[7], designed specifically for processing grid-structured data such as images and videos. Its hierarchical architecture enables the automatic extraction of semantic features from raw pixels: with each layer, it discerns subtler details, such as morphological differences in lesion areas[8].
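
To make this concrete, the following sketch (assuming PyTorch; the layer sizes and the two-class lesion/normal setup are illustrative rather than drawn from any cited study) shows how stacked convolutional layers turn raw endoscopic pixels into increasingly abstract features before a final classification layer.

```python
import torch
import torch.nn as nn

class SmallLesionCNN(nn.Module):
    """Minimal CNN sketch: stacked convolutions extract progressively more
    abstract features from raw pixels; a linear head classifies the frame."""
    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # edges, color
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # local texture
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),                                      # lesion-level summary
        )
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x).flatten(1))

# One illustrative 224 x 224 RGB frame; outputs per-class probabilities.
logits = SmallLesionCNN()(torch.randn(1, 3, 224, 224))
probs = torch.softmax(logits, dim=1)
```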

Machine learning has emerged as a significant advancement in medical imaging, and deep learning in particular has shown remarkable potential in gastrointestinal endoscopy. CNNs have been used in challenging tasks such as polyp detection and tumor classification. For example, the U-Net architecture has been used to achieve pixel-level segmentation of lesions in endoscopic images of early-stage gastric cancer. By training a CNN, the algorithm can be taught to autonomously pinpoint lesions, guide physicians to swiftly identify suspicious areas, such as those related to Barrett’s esophagus and early-stage gastric cancer, and quantitatively evaluate the malignancy risk of each lesion[1]. In a notable multicenter study, artificial intelligence (AI) systems achieved a remarkable 94% sensitivity when detecting colorectal polyps in real time, far surpassing the performance of primary endoscopists[9]. However, because the technology has not yet been deployed in large-scale clinical trials and routine usage, such results mainly serve as a reference direction for the future development of related machine learning technologies.
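
A pixel-level segmentation network of the kind mentioned above can be sketched as a small encoder-decoder with a skip connection; the following is a one-level, U-Net-style illustration in PyTorch, not the architecture of any published gastric cancer model.

```python
import torch
import torch.nn as nn

def conv_block(in_ch: int, out_ch: int) -> nn.Sequential:
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
    )

class TinyUNet(nn.Module):
    """One-level U-Net sketch: the encoder downsamples, the decoder upsamples,
    and the skip connection restores fine spatial detail for per-pixel masks."""
    def __init__(self):
        super().__init__()
        self.enc = conv_block(3, 32)
        self.down = nn.MaxPool2d(2)
        self.bottleneck = conv_block(32, 64)
        self.up = nn.ConvTranspose2d(64, 32, kernel_size=2, stride=2)
        self.dec = conv_block(64, 32)                 # 32 skip + 32 upsampled channels
        self.head = nn.Conv2d(32, 1, kernel_size=1)   # single-channel lesion mask

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        e = self.enc(x)
        b = self.bottleneck(self.down(e))
        d = self.dec(torch.cat([self.up(b), e], dim=1))
        return torch.sigmoid(self.head(d))            # per-pixel lesion probability

mask = TinyUNet()(torch.randn(1, 3, 256, 256))        # shape (1, 1, 256, 256)
```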

The following are some of the most common machine learning models in clinical applications: (1) Supervised learning models (such as CNN and U-Net): These models have been successfully used for clinical translation. They are particularly useful in lesion detection (such as colon polyp recognition) and image enhancement [such as generative adversarial networks (GANs) for super-resolution reconstruction]. They rely on high-quality labeled data, which allows for precise adaptation to specific clinical tasks; (2) Hybrid models [combining two-dimensional (2D)/three-dimensional (3D) CNNs]: Although these models can be used for dynamic endoscopic video analysis, they have high computational complexity and therefore have not yet been deployed on a large scale; and (3) Unsupervised/weakly supervised models: These models can be employed in scenarios with insufficient data labeling (such as endoscopic image denoising). However, they lack clinical validation and their generalization capability needs to be improved (Table 1).

Table 1 Comparison of two machine learning models.

| Aspect | Supervised model | Unsupervised model |
|---|---|---|
| Data dependency | High-quality annotations | Label-efficient |
| Clinical applicability | Task-specific optimization | Exploratory applications |
| Interpretability | Moderate (via feature mapping) | Low (abstract feature hierarchy) |
| Deployment | Device-specific optimization | Generalizable frameworks |

Given the rapid development of AI technology and growing clinical needs, this paper provides an in-depth view of the current state of AI in gastrointestinal endoscopy and, in light of future trends, aims to offer practical guidance and directions for the development of clinical gastrointestinal endoscopy. This study focuses on the impact of machine learning in revolutionizing endoscopic image analysis and disease diagnosis.

MACHINE LEARNING-ASSISTED ENDOSCOPIC IMAGE ENHANCEMENT

Machine learning techniques can efficiently enhance endoscopic image quality and are particularly useful for improving image clarity, supporting downstream computer vision tasks, and enabling 2D-3D transformations. Traditional endoscopic images are often degraded by motion blur, uneven lighting, and low resolution; deep learning super-resolution algorithms, including GAN-based methods, can help resolve these issues. For instance, Fang et al[10] introduced a spatiotemporal super-resolution model that amplifies the visibility of mucosal textures by seamlessly combining temporal and spatial features, and this transformative model can also enhance image analysis, including endoscopic captures. Moreover, denoising algorithms such as U-Net variants and DnCNN are key machine learning models for handling noise in endoscopic images, especially in dimly lit settings; these methods provide clarity that once seemed difficult to achieve[11]. Daher et al[12] advanced endoscopy by tackling specular highlights: harnessing the power of GANs, they inpainted the bright reflections to recover the intricate anatomical structures hidden beneath them. Their approach aids image interpretation by deftly weaving in spatial data and adjacent frames, and automated detection of specular highlights underpins its success. This technique yields significant enhancements in gastroscopy, and system evaluations, fortified by direct comparisons and ablation studies, reveal striking improvements over conventional endoscopic methods[12].
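
As a concrete illustration of the residual denoising idea popularized by DnCNN, the sketch below (PyTorch; depth and width are arbitrary choices, not those of the cited work) has the network predict the noise map and subtract it from the degraded frame.

```python
import torch
import torch.nn as nn

class ResidualDenoiser(nn.Module):
    """DnCNN-style sketch: predicting the noise component and subtracting it
    is easier to learn than predicting the clean frame directly."""
    def __init__(self, channels: int = 3, depth: int = 5, width: int = 32):
        super().__init__()
        layers = [nn.Conv2d(channels, width, 3, padding=1), nn.ReLU(inplace=True)]
        for _ in range(depth - 2):
            layers += [nn.Conv2d(width, width, 3, padding=1),
                       nn.BatchNorm2d(width), nn.ReLU(inplace=True)]
        layers += [nn.Conv2d(width, channels, 3, padding=1)]   # predicted noise map
        self.body = nn.Sequential(*layers)

    def forward(self, noisy: torch.Tensor) -> torch.Tensor:
        return noisy - self.body(noisy)    # residual learning: clean = noisy - predicted noise

# Illustrative dim, noisy frame; in practice the input is a decoded video frame.
noisy = torch.rand(1, 3, 256, 256) * 0.3 + 0.05 * torch.randn(1, 3, 256, 256)
restored = ResidualDenoiser()(noisy)
```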

The leap from 2D to 3D reconstruction technology has recently captivated researchers worldwide. This approach creates detailed 3D models from monocular and multi-view endoscopic images, giving doctors an in-depth view of the spatial architecture of lesions. For instance, Bobrow et al[13] released the Colonoscopy 3D Video Dataset, setting a benchmark for computer vision techniques; they also introduced a multimodal 2D-3D registration method that aligns optical video sequences with renders of known 3D models, and their reconstructed 3D views reduce conversion errors across multiple dimensions[13]. Similarly, Schulte et al[14] engineered a machine learning-based megahertz optical coherence tomography (OCT) rectoscopy model to differentiate rectal wall layers and key tissue features in the postmortem human colon; the real-time 3D-OCT tool is expected to become an invaluable aid for on-site evaluation of disease status and treatment response in rectal disorders. González-Bueno Puyal et al[15] reported a hybrid 2D/3D CNN, an architecture that paves the way for polyp segmentation and enhances clinical diagnosis. Blending machine learning with dynamic 3D imaging affords superior results compared with traditional static images alone, and incorporating real-time video improves the stability of temporal predictions. The results are promising: automated polyp detection thrives when hybrid algorithms and temporal insights are employed together[15].
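
The hybrid 2D/3D idea can be illustrated as follows: a shared 2D CNN encodes each frame of a short clip, and 3D convolutions then aggregate the per-frame features over time so that detections remain temporally consistent. The sketch below is a minimal PyTorch illustration of this general pattern, with illustrative shapes and depths, not the published model.

```python
import torch
import torch.nn as nn

class Hybrid2D3DPolypNet(nn.Module):
    """Hybrid sketch: a shared 2D CNN encodes each frame, then 3D convolutions
    fuse the clip over time before a clip-level classification head."""
    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.frame_encoder = nn.Sequential(       # applied per frame (2D)
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.temporal = nn.Sequential(             # fuses frames (3D)
            nn.Conv3d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),
        )
        self.head = nn.Linear(64, num_classes)

    def forward(self, clip: torch.Tensor) -> torch.Tensor:
        # clip: (batch, time, 3, H, W)
        b, t, c, h, w = clip.shape
        feats = self.frame_encoder(clip.reshape(b * t, c, h, w))          # (B*T, 32, H/4, W/4)
        feats = feats.reshape(b, t, 32, h // 4, w // 4).permute(0, 2, 1, 3, 4)  # (B, 32, T, H/4, W/4)
        return self.head(self.temporal(feats).flatten(1))

logits = Hybrid2D3DPolypNet()(torch.randn(1, 8, 3, 128, 128))   # 8-frame clip
```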

Although various advancements have been made in this field, several hurdles remain. Super-resolution models lean heavily on their training data and consequently struggle to generalize across devices. Machine learning-based 3D reconstruction suffers from high computational complexity, and its real-time performance has yet to be optimized. Future work should therefore combine physical imaging models with data-driven techniques to obtain clearer endoscopic images and sharper 3D visuals.

MACHINE LEARNING-ASSISTED ENDOSCOPIC IMAGE ANALYSIS FOR TUMOR DIAGNOSIS AND PROGNOSIS

Although gastrointestinal tumors are prime targets for endoscopic exploration, two major issues must be addressed first: (1) A high rate of missed diagnoses; and (2) A poor ability to spot early-stage lesions. Additionally, evaluating treatment response for certain tumors is fraught with difficulties. Research reveals that conventional endoscopy often misses small, flat lesions, and even some advanced tumors evade detection altogether owing to varying operator expertise and/or blind spots; navigating these obstacles requires sharper focus and refined techniques[5]. Subtle color changes in early-stage gastrointestinal tumors can be missed, and, worryingly, intramucosal carcinoma and high-grade intraepithelial neoplasia are often mistaken for benign inflammatory or proliferative lesions, leading to diagnostic delays. The statistics are stark: more than 90% of patients with early-stage gastric cancer survive five years after diagnosis, whereas fewer than 30% of those with advanced gastric cancer do[16]. This glaring gap underscores the pressing need for early diagnosis.

Machine learning is a highly useful tool that may help improve the diagnosis of gastric cancer. Yoon et al[17] developed a gastrointestinal image stitching method using an enhanced unsupervised algorithm, aiming to broaden the field of view of gastrointestinal examination and reduce the chance of missed detections. Chen et al[6] reported an innovative system for detecting and classifying colorectal polyps using deep learning and grayscale images. The data gathered included 1000 colorectal polyp images from Chang Gung Hospital (Taiwan) and the CVC-ClinicDB (Colorectal Cancer-Clinic Dataset). The system converts RGB images into 0-255 grayscale images. Previous studies have shown that, with CNN models, the polyp detection accuracy of grayscale images can rival, or even surpass, that of RGB images, delivering efficiency without compromising quality[18]. Shi et al[19] evaluated the accuracy of machine learning in diagnosing early-stage gastric cancer through a meta-analysis using a bivariate mixed-effect model after thorough screening. The results showed promise: the machine learning models demonstrated remarkable sensitivity and specificity, outperforming non-specialist doctors (0.64/0.84) and even specialist doctors[19]. Arai et al[20] explored gastric cancer prediction using machine learning, designing a model that combines endoscopic and histological features from the initial esophagogastroduodenoscopy (EGD). The model not only predicts gastric cancer incidence, but also seamlessly integrates risk factors, making it possible to develop effective, personalized follow-up strategies post-EGD[20]. Chen et al[21] likewise reported improved endoscopic identification techniques for gastric cancer. Li et al[22] took a different route, constructing a population cohort to assess early-stage esophageal cancer lesions. They compared the detection of high-risk esophageal lesions (HrEL) between traditional endoscopy and machine learning-assisted endoscopy and found that machine learning dramatically enhanced the HrEL detection rate during endoscopy while ensuring safety[22].
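
The grayscale preprocessing described above amounts to a weighted sum of the color channels. The sketch below (NumPy, standard BT.601 luminance weights, which may differ from the exact pipeline used in the cited study) converts an RGB frame to a 0-255 single-channel image suitable for a CNN.

```python
import numpy as np

def rgb_to_grayscale(rgb: np.ndarray) -> np.ndarray:
    """Convert an RGB endoscopic frame (H, W, 3, uint8) to a 0-255 grayscale
    image using the ITU-R BT.601 luminance weights."""
    weights = np.array([0.299, 0.587, 0.114])        # R, G, B contributions
    gray = rgb.astype(np.float32) @ weights           # weighted sum per pixel
    return np.clip(gray, 0, 255).astype(np.uint8)     # back to the 0-255 range

# Illustrative 480 x 480 frame; in practice this is a decoded endoscopy frame.
frame = np.random.randint(0, 256, size=(480, 480, 3), dtype=np.uint8)
gray = rgb_to_grayscale(frame)                        # shape (480, 480), dtype uint8
```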

Deep learning is revolutionizing the early detection and treatment of esophageal cancer and is proving to be a valuable screening tool. Wang et al[23] used machine learning to analyze imaging and pathological data and evaluate the complexities of esophageal cancer, generating deep learning predictions of prognosis after neoadjuvant chemoradiation from various angles. Their multidimensional approach, combining endoscopy-based analysis with imaging insights, showed that machine learning can be employed to assess post-treatment outcomes[23]. Fockens et al[24] applied machine learning to Barrett’s esophagus and proposed that a computer-aided detection (CADe) system could significantly improve tumor detection. They refined the Barrett CADe system specifically for neoplasia and evaluated its accuracy against that of experienced endoscopists. Their findings indicate that machine learning can significantly enhance endoscopic tumor detection; remarkably, the CADe system not only detects tumors but also achieves a markedly higher sensitivity than a substantial cohort of endoscopists[24].

In a groundbreaking study, Li et al[25] developed an endoscopic system that combines photoacoustic microscopy with ultrasound imaging. Using a deep learning CNN, their system can differentiate pathologically complete remission from incomplete remission with considerable precision. Hence, deep learning-enabled endoscopic image analysis could improve the management of rectal cancer by improving treatment outcomes and reducing medical costs[25]. Koseoglu et al[26] pursued label-free indirect diagnosis of colorectal cancer (CRC) with machine learning and a plasma needle endoscopy system. By coating a stainless-steel needle with a dopamine polymer layer, they grew gold nanopolyhedra, forming a highly sensitive surface-enhanced Raman scattering sensor. Coupled with endoscopy, this non-invasive approach offers a fresh avenue for early screening and diagnosis of CRC[26]. Another innovative machine learning model for colon cancer detection comes from Jheng et al[27], who constructed a CNN-based algorithm, named GUTAID, for detecting colon abnormalities and anatomical landmarks with remarkable precision. The GUTAID system not only excels in identification, but also enhances polyp characterization for optical diagnosis. These results suggest that AI classification methods can effectively differentiate among various colon diseases[27].

Machine learning-based 3D reconstruction of tumors from endoscopic images has also seen consistent progress in recent times. Rowan et al[28] participated in the SimCol 3D challenge, which aimed to facilitate data-driven navigation in colonoscopy. The challenge employed an array of advanced techniques and innovative strategies, leading to significantly improved outcomes in depth prediction and estimation of colon cancer stage. These efforts helped build a robust foundation for creating 3D models from colonoscopy images[26,28].

MACHINE LEARNING-ENHANCED ENDOSCOPIC IMAGE REVIEW FOR THE DIAGNOSIS AND FUTURE OUTLOOK OF GASTROINTESTINAL NON-MALIGNANT DISEASES

Currently, the landscape of intelligent diagnosis of gastrointestinal diseases is growing rapidly. No longer content with merely tackling malignant tumors such as esophageal and gastric cancers, researchers are now embarking on multidisease exploration. This evolving paradigm embraces diversity, with fresh studies focusing on enhancing the efficacy of machine learning in image identification; the morphological analysis of gastrointestinal polyps and the early endoscopic diagnosis of Helicobacter pylori (H. pylori) infection are also attracting increasing attention. In a notable comparison, Namikawa et al[29] evaluated the capabilities of AI against human experts in detecting and classifying colon polyps. AI showed superior sensitivity, outperforming experts, albeit with slightly lower specificity; it was also more sensitive than non-experts but less specific. Thus, while AI can be a vital tool for boosting polyp-detection rates, it is essential to be aware of task-specific effects on model performance[29]. Yoon et al[17] were the first to employ machine learning with hyperspectral endoscopy (HySE) technology, developing a compact HySE system. Merging line-scan hyperspectral imaging with standard white-light endoscopy yields a rich spectrum of data during examination, and this machine learning-based approach efficiently identifies and segments various tissue types. The approach still faces challenges, such as image registration and tissue annotation, yet HySE technology is considered clinically promising[30]; larger clinical trials are needed to validate its success. Grosu et al[31] examined machine learning-assisted imaging for the early detection of colon polyps and showed that machine learning can differentiate between benign and precancerous polyps. Bang et al[32] investigated gastric lesions linked with H. pylori infection, using AI to predict the presence of H. pylori from endoscopy images. Employing meticulous meta-analysis methods, they evaluated the sensitivity, specificity, and area under the curve of the AI models, drawing on a rich database of endoscopic images from infected and control patients. The CNN-based models showed superior performance, with robust results on white-light endoscopy images. These findings support the clinical translation of AI for endoscopic diagnosis[32]. In a parallel study, Dhali et al[33] examined how AI enhances endoscopic ultrasound analysis. AI can clearly improve diagnostic accuracy in endoscopic ultrasound: employing deep learning for image classification, object detection, and semantic segmentation, it accurately delineates pancreatic anatomy and detects lesions. Previous studies have reported an accuracy of 93%-95% for identifying pancreatic tumors using support vector machines[33].

Stidham et al[34] carried out a groundbreaking study on inflammatory bowel disease (IBD), meticulously examining the endoscopic severity grading of ulcerative colitis by comparing the accuracy of a deep learning model against that of skilled physicians. Their findings revealed that the deep learning model held its own against experienced reviewers in grading endoscopic severity, an advance that could significantly enhance the role of colonoscopy in both research and everyday practice[34]. Jahagirdar et al[35] undertook a systematic review and meta-analysis of CNN-based machine learning algorithms for predicting ulcerative colitis severity on endoscopy. The CNN algorithms showed formidable diagnostic accuracy in assessing endoscopic severity, and the authors reported that the Mayo endoscopic score may yield superior results when paired with the Ulcerative Colitis Endoscopic Index of Severity during CNN training; however, they prudently noted that further clinical research is essential to solidify these conclusions[35]. Lu et al[36] explored the differential diagnosis between Crohn’s disease and intestinal tuberculosis. Using sophisticated machine learning techniques, they extracted pathological features and assessed the diagnostic performance of the model with a series of rigorous scoring criteria, concluding that the machine learning model effectively differentiated intestinal tuberculosis from Crohn’s disease. This integration of multiple modalities, including endoscopy, marks a significant leap in diagnostic accuracy[36].

CHALLENGES OF ENDOSCOPY AND MACHINE LEARNING APPLICATIONS IN NON-MALIGNANT DISEASES

Despite the remarkable progress, the application of machine learning to non-malignant diseases still faces bottlenecks of data heterogeneity and insufficient clinical validation. For example, the annotation of IBD endoscopy images relies on histology or long-term follow-up results, and the rarity of dysplasia creates a scarcity of training data. Future studies should therefore integrate multicenter data through federated learning and develop algorithms that require less supervision (such as semi-supervised models based on attention mechanisms) to reduce the cost of annotation. We believe real-time dynamic scoring systems and multimodal data fusion will be key directions for improving clinical practicability. Machine learning can not only assist in the objective grading of IBD, but can also reshape the monitoring paradigm of chronic intestinal diseases. By embedding time-series analysis and risk-prediction models into endoscopy workflows, the transition from "passive screening" to "active early warning" is expected to be realized, ultimately reducing the mortality rate of IBD-related cancers. However, technology transfer needs to be closely integrated with clinical needs, for example by developing edge-computing models for portable endoscopes or designing visualization tools that help doctors understand the decision-making logic of AI (e.g., Grad-CAM heat maps of lesion areas). Only in this way can diagnostic efficiency be improved and acceptance at the clinical end be increased.
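
As a minimal illustration of the federated learning idea mentioned above, the sketch below performs FedAvg-style weight averaging: each center trains locally on its own endoscopy data, and only model parameters are shared, never images. The tiny linear model and the equal weighting (which assumes similarly sized site datasets) are illustrative assumptions, not a production setup.

```python
import copy
import torch
import torch.nn as nn

def federated_average(site_models: list) -> nn.Module:
    """FedAvg sketch: average the parameters of locally trained models.
    Only weights leave each site; patient images never do."""
    global_model = copy.deepcopy(site_models[0])
    avg_state = global_model.state_dict()
    for key in avg_state:
        stacked = torch.stack([m.state_dict()[key].float() for m in site_models])
        avg_state[key] = stacked.mean(dim=0)   # equal weights: assumes comparable site sizes
    global_model.load_state_dict(avg_state)
    return global_model

# Three hypothetical centers sharing one architecture but trained on local data.
sites = [nn.Linear(64, 2) for _ in range(3)]
global_model = federated_average(sites)
```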

DISCUSSION

Machine learning has made major advances in recent years, and deep learning in particular has revolutionized gastrointestinal endoscopy. Its efficacy lies in its ability to handle vast amounts of image data and extract intricate features from these visuals. For instance, algorithms have shown remarkable success in polyp detection, have enhanced early-stage tumor identification, and have streamlined the diagnosis of H. pylori infection. As a result, diagnostic sensitivity and specificity have received a significant boost. Innovations in 3D reconstruction technology present a panoramic view during colonoscopy, which dramatically reduces missed diagnoses caused by blind spots[13]. The application of these technologies can revitalize traditional endoscopy (Figure 1): not only do they mitigate existing limitations, but they also deliver multidimensional data that empower clinical decision-making. The benefits are obvious in gastrointestinal endoscopy, with dual boosts in diagnostic efficiency and accuracy. For example, a real-time assistance system can rapidly process thousands of images, swiftly cutting risks that arise from manual fatigue. Such algorithms are particularly useful for detecting complex lesions[5], non-invasively differentiating adenomatous from hyperplastic polyps solely by analyzing subtle texture differences in endoscopic images[31]. Machine learning is also used in innovative surgical navigation systems: lesion boundary annotation based on real-time segmentation algorithms helps doctors pinpoint the precise lesion area during EMR, significantly lowering the risk of surgical complications[30].

Figure 1 The way machine learning assists in improving the performance of endoscopy. CADe: Computer-aided detection.

However, machine learning still has limitations in clinical use, two of which are data privacy and annotation quality, both of which hinder model training. Notably, endoscopy images hold sensitive patient information, presenting legal and ethical minefields for cross-institutional data sharing. In addition, securing high-quality annotations requires professional physician input, which is costly and prone to subjective bias. Moreover, the models struggle with generalization, which prevents their widespread deployment: variations in optical parameters across different endoscope brands may skew image feature distributions. To worsen the issue, most existing endoscopic equipment lacks a dedicated computing unit, leading to processing delays.

Practical barriers and solutions for clinical integration

At present, the main factors influencing machine learning-based endoscopic image recognition differ by disease; as noted above, the principal challenges are mucosal texture complexity for early esophageal cancer, and lesion size and image resolution for early gastric cancer (Table 2).

Table 2 Comparison of the effects of machine learning-based endoscopic image recognition in different diseases.

| Disease | Sensitivity (range) | Specificity (range) | Key influencing factors |
|---|---|---|---|
| Barrett’s esophagus | 85%-92% | 78%-88% | Mucosal texture complexity, annotation consistency |
| Early gastric cancer | 89%-95% | 82%-91% | Lesion size, endoscopic image resolution |
| Ulcerative colitis | 76%-84% | 88%-93% | Subjectivity of inflammation activity scoring, data heterogeneity |
| Colorectal polyps | 92%-97% | 90%-95% | Polyp morphological diversity, real-time detection algorithm efficiency |

In addition, the clinical use of AI-assisted endoscopy also faces regulatory challenges and ethical issues.

Regulatory approval and compliance are significant hurdles in clinical practice and complicate application and integration. For example, AI software must be certified as a medical device by the Food and Drug Administration (FDA) in the United States, through Conformite Europeenne (CE) marking in Europe, or by the National Medical Products Administration in China, which involves strict performance validation, traceability, and ethical review. A real-time polyp detection system must demonstrate its generalization ability across different populations (e.g., racial differences and variations in equipment models). There are, of course, successful examples: GI Genius (Medtronic), the first FDA-approved AI-assisted colonoscopy system, is based on a CNN polyp-detection model and was validated through a multicenter randomized controlled trial (98.4% sensitivity); EndoBRAIN (Olympus), an AI diagnostic system for colorectal lesions certified by Japan’s Pharmaceuticals and Medical Devices Agency, requires specific endoscope models to be used. Moreover, there are challenges regarding the attribution of responsibility for AI misdiagnosis (doctor, developer, or algorithm), especially in high-risk scenarios (such as missed detection of early-stage cancer).

Future research on machine learning is poised for multidimensional breakthroughs. Multimodal data fusion will enhance diagnostic efficiency; for instance, combining endoscopy images with the chemical insights gained from Raman spectroscopy promises non-invasive tools for detecting molecular markers of early-stage cancers. To strengthen model generalization, adaptive algorithm designs are needed. Federated learning harnesses the power of distributed data while safeguarding privacy, an approach that has shown immense promise in the classification of pancreatic tumors. It is also essential to improve algorithm interpretability: tools such as gradient-weighted class activation mapping (Grad-CAM) expose the decision-making process of AI, making it easier for physicians to comprehend and trust these systems and paving the way for greater clinical acceptance. On the technology side, the fusion of embedded AI chips with edge computing is a significant development: low-latency, real-time assistance with lightweight models embedded in endoscopic devices could automatically identify suspicious lesions and suggest biopsy sites. Lastly, it is paramount to establish unified standards for image acquisition, annotation, and model validation; designing a transparent AI medical responsibility framework will ensure that technology applications remain both safe and reliable.
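
To illustrate the interpretability tooling mentioned above, the sketch below computes a Grad-CAM heat map for a toy CNN classifier in PyTorch; the model, layer choice, and input are illustrative placeholders rather than any system described in the cited studies.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyClassifier(nn.Module):
    """Toy two-class CNN standing in for an endoscopic lesion classifier."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
            nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(),
        )
        self.head = nn.Linear(16, 2)

    def forward(self, x):
        feats = self.conv(x)
        return self.head(F.adaptive_avg_pool2d(feats, 1).flatten(1))

def grad_cam(model: nn.Module, conv_layer: nn.Module,
             image: torch.Tensor, target_class: int) -> torch.Tensor:
    """Weight the final convolutional feature maps by the gradient of the
    target class score, yielding a heat map of the regions that drove it."""
    acts, grads = [], []
    h1 = conv_layer.register_forward_hook(lambda m, i, o: acts.append(o))
    h2 = conv_layer.register_full_backward_hook(lambda m, gi, go: grads.append(go[0]))
    try:
        model.zero_grad()
        model(image)[0, target_class].backward()
    finally:
        h1.remove(); h2.remove()
    a, g = acts[0].detach(), grads[0].detach()            # (1, C, h, w)
    weights = g.mean(dim=(2, 3), keepdim=True)            # per-channel importance
    cam = F.relu((weights * a).sum(dim=1, keepdim=True))  # (1, 1, h, w)
    cam = F.interpolate(cam, size=image.shape[-2:], mode="bilinear", align_corners=False)
    return (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)

model = TinyClassifier()
heatmap = grad_cam(model, model.conv, torch.randn(1, 3, 224, 224), target_class=1)
```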

The application of machine learning should also be expanded to non-oncological diseases. For example, time-series analysis of endoscopy images can determine disease activity and recurrence risk in IBD; a previous study predicted Crohn’s disease recurrence within six months by decoding mucosal texture traits. For too long, the diagnosis of functional gastrointestinal disorders has hinged on subjective symptom reports. Machine learning, combined with high-resolution endoscopy images and physiological signals, now promises an objective evaluation tool.

CONCLUSION

The advent of machine learning has transformed gastrointestinal endoscopy. However, its clinical translation must still navigate a labyrinth of challenges, including data dilemmas, algorithmic intricacies, and ethical quandaries. Through interdisciplinary teamwork, technological innovation, and standardization, we foresee an optimized journey from diagnosis to personalized treatment. Ultimately, this will enhance precision and inclusivity in navigating digestive tract disease management.

ACKNOWLEDGEMENTS

The authors sincerely acknowledge the Department of Gastroenterology at Nanjing First Hospital, Nanjing Medical University, for providing the clinical resources and infrastructure essential to this research. We extend our gratitude to the medical and technical teams for their invaluable support in data collection and validation.

Footnotes

Provenance and peer review: Invited article; Externally peer reviewed.

Peer-review model: Single blind

Specialty type: Gastroenterology and hepatology

Country of origin: China

Peer-review report’s classification

Scientific Quality: Grade C

Novelty: Grade C

Creativity or Innovation: Grade C

Scientific Significance: Grade C

P-Reviewer: Dell'Anna G S-Editor: Luo ML L-Editor: A P-Editor: Xu ZH

References
1.  Sung H, Ferlay J, Siegel RL, Laversanne M, Soerjomataram I, Jemal A, Bray F. Global Cancer Statistics 2020: GLOBOCAN Estimates of Incidence and Mortality Worldwide for 36 Cancers in 185 Countries. CA Cancer J Clin. 2021;71:209-249.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Cited by in Crossref: 75126]  [Cited by in RCA: 64686]  [Article Influence: 16171.5]  [Reference Citation Analysis (177)]
2.  Hashimoto R, Requa J, Dao T, Ninh A, Tran E, Mai D, Lugo M, El-Hage Chehade N, Chang KJ, Karnes WE, Samarasena JB. Artificial intelligence using convolutional neural networks for real-time detection of early esophageal neoplasia in Barrett's esophagus (with video). Gastrointest Endosc. 2020;91:1264-1271.e1.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Cited by in Crossref: 172]  [Cited by in RCA: 144]  [Article Influence: 28.8]  [Reference Citation Analysis (0)]
3.  Ben-Aharon I, van Laarhoven HWM, Fontana E, Obermannova R, Nilsson M, Lordick F. Early-Onset Cancer in the Gastrointestinal Tract Is on the Rise-Evidence and Implications. Cancer Discov. 2023;13:538-551.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Cited by in RCA: 82]  [Reference Citation Analysis (0)]
4.  Mori Y, Wang P, Løberg M, Misawa M, Repici A, Spadaccini M, Correale L, Antonelli G, Yu H, Gong D, Ishiyama M, Kudo SE, Kamba S, Sumiyama K, Saito Y, Nishino H, Liu P, Glissen Brown JR, Mansour NM, Gross SA, Kalager M, Bretthauer M, Rex DK, Sharma P, Berzin TM, Hassan C. Impact of Artificial Intelligence on Colonoscopy Surveillance After Polyp Removal: A Pooled Analysis of Randomized Trials. Clin Gastroenterol Hepatol. 2023;21:949-959.e2.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Cited by in Crossref: 32]  [Cited by in RCA: 32]  [Article Influence: 16.0]  [Reference Citation Analysis (0)]
5.  Rex DK, Boland CR, Dominitz JA, Giardiello FM, Johnson DA, Kaltenbach T, Levin TR, Lieberman D, Robertson DJ. Colorectal Cancer Screening: Recommendations for Physicians and Patients From the U.S. Multi-Society Task Force on Colorectal Cancer. Gastroenterology. 2017;153:307-323.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Cited by in Crossref: 392]  [Cited by in RCA: 514]  [Article Influence: 64.3]  [Reference Citation Analysis (0)]
6.  Chen X, Wang X, Zhang K, Fung KM, Thai TC, Moore K, Mannel RS, Liu H, Zheng B, Qiu Y. Recent advances and clinical applications of deep learning in medical image analysis. Med Image Anal. 2022;79:102444.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Cited by in Crossref: 334]  [Cited by in RCA: 326]  [Article Influence: 108.7]  [Reference Citation Analysis (0)]
7.  Li Z, Liu F, Yang W, Peng S, Zhou J. A Survey of Convolutional Neural Networks: Analysis, Applications, and Prospects. IEEE Trans Neural Netw Learn Syst. 2022;33:6999-7019.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Cited by in Crossref: 276]  [Cited by in RCA: 495]  [Article Influence: 165.0]  [Reference Citation Analysis (0)]
8.  Litjens G, Kooi T, Bejnordi BE, Setio AAA, Ciompi F, Ghafoorian M, van der Laak JAWM, van Ginneken B, Sánchez CI. A survey on deep learning in medical image analysis. Med Image Anal. 2017;42:60-88.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Cited by in Crossref: 5573]  [Cited by in RCA: 4938]  [Article Influence: 617.3]  [Reference Citation Analysis (0)]
9.  Wang P, Berzin TM, Glissen Brown JR, Bharadwaj S, Becq A, Xiao X, Liu P, Li L, Song Y, Zhang D, Li Y, Xu G, Tu M, Liu X. Real-time automatic detection system increases colonoscopic polyp and adenoma detection rates: a prospective randomised controlled study. Gut. 2019;68:1813-1819.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Full Text (PDF)]  [Cited by in Crossref: 398]  [Cited by in RCA: 547]  [Article Influence: 91.2]  [Reference Citation Analysis (0)]
10.  Fang L, Monroe F, Novak SW, Kirk L, Schiavon CR, Yu SB, Zhang T, Wu M, Kastner K, Latif AA, Lin Z, Shaw A, Kubota Y, Mendenhall J, Zhang Z, Pekkurnaz G, Harris K, Howard J, Manor U. Deep learning-based point-scanning super-resolution imaging. Nat Methods. 2021;18:406-416.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Full Text (PDF)]  [Cited by in Crossref: 96]  [Cited by in RCA: 63]  [Article Influence: 15.8]  [Reference Citation Analysis (0)]
11.  Mou E, Wang H, Chen X, Li Z, Cao E, Chen Y, Huang Z, Pang Y. Retinex theory-based nonlinear luminance enhancement and denoising for low-light endoscopic images. BMC Med Imaging. 2024;24:207.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Reference Citation Analysis (0)]
12.  Daher R, Vasconcelos F, Stoyanov D. A Temporal Learning Approach to Inpainting Endoscopic Specularities and Its Effect on Image Correspondence. Med Image Anal. 2023;90:102994.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Full Text (PDF)]  [Cited by in RCA: 6]  [Reference Citation Analysis (0)]
13.  Bobrow TL, Golhar M, Vijayan R, Akshintala VS, Garcia JR, Durr NJ. Colonoscopy 3D video dataset with paired depth from 2D-3D registration. Med Image Anal. 2023;90:102956.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Cited by in Crossref: 20]  [Cited by in RCA: 17]  [Article Influence: 8.5]  [Reference Citation Analysis (0)]
14.  Schulte B, Göb M, Singh AP, Lotz S, Draxinger W, Heimke M, Pieper M, Heinze T, Wedel T, Rahlves M, Huber R, Ellrichmann M. High-resolution rectoscopy using MHz optical coherence tomography: a step towards real time 3D endoscopy. Sci Rep. 2024;14:4672.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Cited by in Crossref: 1]  [Cited by in RCA: 1]  [Article Influence: 1.0]  [Reference Citation Analysis (0)]
15.  González-Bueno Puyal J, Brandao P, Ahmad OF, Bhatia KK, Toth D, Kader R, Lovat L, Mountney P, Stoyanov D. Polyp detection on video colonoscopy using a hybrid 2D/3D CNN. Med Image Anal. 2022;82:102625.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Cited by in Crossref: 23]  [Reference Citation Analysis (0)]
16.  Smyth EC, Nilsson M, Grabsch HI, van Grieken NC, Lordick F. Gastric cancer. Lancet. 2020;396:635-648.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Cited by in Crossref: 1150]  [Cited by in RCA: 2860]  [Article Influence: 572.0]  [Reference Citation Analysis (5)]
17.  Yoon J, Joseph J, Waterhouse DJ, Borzy C, Siemens K, Diamond S, Tsikitis VL, Bohndiek SE. First experience in clinical application of hyperspectral endoscopy for evaluation of colonic polyps. J Biophotonics. 2021;14:e202100078.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Cited by in Crossref: 13]  [Cited by in RCA: 14]  [Article Influence: 3.5]  [Reference Citation Analysis (0)]
18.  Hsu CM, Hsu CC, Hsu ZM, Shih FY, Chang ML, Chen TH. Colorectal Polyp Image Detection and Classification through Grayscale Images and Deep Learning. Sensors (Basel). 2021;21:5995.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Full Text (PDF)]  [Cited by in Crossref: 3]  [Cited by in RCA: 10]  [Article Influence: 2.5]  [Reference Citation Analysis (2)]
19.  Shi Y, Fan H, Li L, Hou Y, Qian F, Zhuang M, Miao B, Fei S. The value of machine learning approaches in the diagnosis of early gastric cancer: a systematic review and meta-analysis. World J Surg Oncol. 2024;22:40.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Cited by in RCA: 7]  [Reference Citation Analysis (0)]
20.  Arai J, Aoki T, Sato M, Niikura R, Suzuki N, Ishibashi R, Tsuji Y, Yamada A, Hirata Y, Ushiku T, Hayakawa Y, Fujishiro M. Machine learning-based personalized prediction of gastric cancer incidence using the endoscopic and histologic findings at the initial endoscopy. Gastrointest Endosc. 2022;95:864-872.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Cited by in Crossref: 41]  [Cited by in RCA: 34]  [Article Influence: 11.3]  [Reference Citation Analysis (0)]
21.  Chen K, Wang Y, Lang Y, Yang L, Guo Z, Wu W, Zhang J, Ding S. Machine learning models to predict submucosal invasion in early gastric cancer based on endoscopy features and standardized color metrics. Sci Rep. 2024;14:10445.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Reference Citation Analysis (0)]
22.  Li SW, Zhang LH, Cai Y, Zhou XB, Fu XY, Song YQ, Xu SW, Tang SP, Luo RQ, Huang Q, Yan LL, He SQ, Zhang Y, Wang J, Ge SQ, Gu BB, Peng JB, Wang Y, Fang LN, Wu WD, Ye WG, Zhu M, Luo DH, Jin XX, Yang HD, Zhou JJ, Wang ZZ, Wu JF, Qin QQ, Lu YD, Wang F, Chen YH, Chen X, Xu SJ, Tung TH, Luo CW, Ye LP, Yu HG, Mao XL. Deep learning assists detection of esophageal cancer and precursor lesions in a prospective, randomized controlled study. Sci Transl Med. 2024;16:eadk5395.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Cited by in Crossref: 15]  [Reference Citation Analysis (0)]
23.  Wang J, Zhu X, Zeng J, Liu C, Shen W, Sun X, Lin Q, Fang J, Chen Q, Ji Y. Using clinical and radiomic feature-based machine learning models to predict pathological complete response in patients with esophageal squamous cell carcinoma receiving neoadjuvant chemoradiation. Eur Radiol. 2023;33:8554-8563.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Cited by in RCA: 12]  [Reference Citation Analysis (0)]
24.  Fockens KN, Jukema JB, Boers T, Jong MR, van der Putten JA, Pouw RE, Weusten BLAM, Alvarez Herrero L, Houben MHMG, Nagengast WB, Westerhof J, Alkhalaf A, Mallant R, Ragunath K, Seewald S, Elbe P, Barret M, Ortiz Fernández-Sordo J, Pech O, Beyna T, van der Sommen F, de With PH, de Groof AJ, Bergman JJ. Towards a robust and compact deep learning system for primary detection of early Barrett's neoplasia: Initial image-based results of training on a multi-center retrospectively collected data set. United European Gastroenterol J. 2023;11:324-336.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Full Text (PDF)]  [Cited by in Crossref: 24]  [Cited by in RCA: 23]  [Article Influence: 11.5]  [Reference Citation Analysis (0)]
25.  Li MD, Huang ZR, Shan QY, Chen SL, Zhang N, Hu HT, Wang W. Performance and comparison of artificial intelligence and human experts in the detection and classification of colonic polyps. BMC Gastroenterol. 2022;22:517.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Cited by in RCA: 11]  [Reference Citation Analysis (0)]
26.  Koseoglu FD, Alıcı IO, Er O. Machine learning approaches in the interpretation of endobronchial ultrasound images: a comparative analysis. Surg Endosc. 2023;37:9339-9346.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Cited by in RCA: 6]  [Reference Citation Analysis (0)]
27.  Jheng YC, Wang YP, Lin HE, Sung KY, Chu YC, Wang HS, Jiang JK, Hou MC, Lee FY, Lu CL. A novel machine learning-based algorithm to identify and classify lesions and anatomical landmarks in colonoscopy images. Surg Endosc. 2022;36:640-650.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Cited by in Crossref: 4]  [Cited by in RCA: 4]  [Article Influence: 1.0]  [Reference Citation Analysis (0)]
28.  Rowan NJ, Kremer T, McDonnell G. A review of Spaulding's classification system for effective cleaning, disinfection and sterilization of reusable medical devices: Viewed through a modern-day lens that will inform and enable future sustainability. Sci Total Environ. 2023;878:162976.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Cited by in Crossref: 45]  [Cited by in RCA: 28]  [Article Influence: 14.0]  [Reference Citation Analysis (0)]
29.  Namikawa K, Hirasawa T, Yoshio T, Fujisaki J, Ozawa T, Ishihara S, Aoki T, Yamada A, Koike K, Suzuki H, Tada T. Utilizing artificial intelligence in endoscopy: a clinician's guide. Expert Rev Gastroenterol Hepatol. 2020;14:689-706.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Cited by in Crossref: 15]  [Cited by in RCA: 22]  [Article Influence: 4.4]  [Reference Citation Analysis (0)]
30.  Yu C, Helwig EJ. Artificial intelligence in gastric cancer: a translational narrative review. Ann Transl Med. 2021;9:269.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Full Text (PDF)]  [Cited by in Crossref: 2]  [Cited by in RCA: 2]  [Article Influence: 0.5]  [Reference Citation Analysis (0)]
31.  Grosu S, Wesp P, Graser A, Maurus S, Schulz C, Knösel T, Cyran CC, Ricke J, Ingrisch M, Kazmierczak PM. Machine Learning-based Differentiation of Benign and Premalignant Colorectal Polyps Detected with CT Colonography in an Asymptomatic Screening Population: A Proof-of-Concept Study. Radiology. 2021;299:326-335.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Cited by in Crossref: 11]  [Cited by in RCA: 33]  [Article Influence: 8.3]  [Reference Citation Analysis (0)]
32.  Bang CS, Lee JJ, Baik GH. Artificial Intelligence for the Prediction of Helicobacter Pylori Infection in Endoscopic Images: Systematic Review and Meta-Analysis Of Diagnostic Test Accuracy. J Med Internet Res. 2020;22:e21983.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Cited by in Crossref: 76]  [Cited by in RCA: 64]  [Article Influence: 12.8]  [Reference Citation Analysis (0)]
33.  Dhali A, Kipkorir V, Srichawla BS, Kumar H, Rathna RB, Ongidi I, Chaudhry T, Morara G, Nurani K, Cheruto D, Biswas J, Chieng LR, Dhali GK. Artificial intelligence assisted endoscopic ultrasound for detection of pancreatic space-occupying lesion: a systematic review and meta-analysis. Int J Surg. 2023;109:4298-4308.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Full Text (PDF)]  [Cited by in Crossref: 7]  [Cited by in RCA: 8]  [Article Influence: 4.0]  [Reference Citation Analysis (0)]
34.  Stidham RW, Liu W, Bishu S, Rice MD, Higgins PDR, Zhu J, Nallamothu BK, Waljee AK. Performance of a Deep Learning Model vs Human Reviewers in Grading Endoscopic Disease Severity of Patients With Ulcerative Colitis. JAMA Netw Open. 2019;2:e193963.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Full Text (PDF)]  [Cited by in Crossref: 116]  [Cited by in RCA: 175]  [Article Influence: 29.2]  [Reference Citation Analysis (0)]
35.  Jahagirdar V, Bapaye J, Chandan S, Ponnada S, Kochhar GS, Navaneethan U, Mohan BP. Diagnostic accuracy of convolutional neural network-based machine learning algorithms in endoscopic severity prediction of ulcerative colitis: a systematic review and meta-analysis. Gastrointest Endosc. 2023;98:145-154.e8.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Cited by in Crossref: 18]  [Cited by in RCA: 18]  [Article Influence: 9.0]  [Reference Citation Analysis (0)]
36.  Lu B, Huang Z, Lin J, Zhang R, Shen X, Huang L, Wang X, He W, Huang Q, Fang J, Mao R, Li Z, Huang B, Feng ST, Ye Z, Zhang J, Wang Y. A novel multidisciplinary machine learning approach based on clinical, imaging, colonoscopy, and pathology features for distinguishing intestinal tuberculosis from Crohn's disease. Abdom Radiol (NY). 2024;49:2187-2197.  [RCA]  [PubMed]  [DOI]  [Full Text]  [Cited by in Crossref: 1]  [Cited by in RCA: 2]  [Article Influence: 2.0]  [Reference Citation Analysis (0)]