Meta-Analysis Open Access
Copyright ©The Author(s) 2025. Published by Baishideng Publishing Group Inc. All rights reserved.
World J Gastrointest Surg. Nov 27, 2025; 17(11): 109991
Published online Nov 27, 2025. doi: 10.4240/wjgs.v17.i11.109991
Diagnostic value of real-time computer-aided detection for precancerous lesion during esophagogastroduodenoscopy: A meta-analysis
Zong-Yang Li, Ya-Hui Liu, Hong-Qiao Cai, Department of Hepatobiliary and Pancreatic Surgery, General Surgery Center, The First Hospital of Jilin University, Changchun 130021, Jilin Province, China
ORCID number: Zong-Yang Li (0009-0002-5932-9559); Ya-Hui Liu (0000-0003-3081-8156); Hong-Qiao Cai (0000-0002-7022-3512).
Author contributions: Cai HQ designed the overall concept and outline of the manuscript; Liu YH contributed to the discussion and design of the manuscript; Li ZY contributed to the writing, and editing the manuscript, illustrations, and review of literature.
Conflict-of-interest statement: All the authors report no relevant conflicts of interest for this article.
PRISMA 2009 Checklist statement: The authors have read the PRISMA 2009 Checklist, and the manuscript was prepared and revised according to the PRISMA 2009 Checklist.
Open Access: This article is an open-access article that was selected by an in-house editor and fully peer-reviewed by external reviewers. It is distributed in accordance with the Creative Commons Attribution NonCommercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: https://creativecommons.org/Licenses/by-nc/4.0/
Corresponding author: Hong-Qiao Cai, MD, PhD, Department of Hepatobiliary and Pancreatic Surgery, General Surgery Center, The First Hospital of Jilin University, No. 1 Xinmin Street, Changchun 130021, Jilin Province, China. hongqiaocai@jlu.edu.cn
Received: May 28, 2025
Revised: August 16, 2025
Accepted: September 15, 2025
Published online: November 27, 2025
Processing time: 182 Days and 2.4 Hours

Abstract
BACKGROUND

Early detection of precancerous lesions is vital for reducing the incidence and mortality of upper gastrointestinal (UGI) tract cancer. However, conventional endoscopy has limitations in detecting precancerous lesions. In contrast, real-time computer-aided detection (CAD) systems enhanced by artificial intelligence (AI), although they may increase unnecessary medical procedures, provide immediate feedback during examination, thereby improving the accuracy of lesion detection. This article conducts a meta-analysis of the diagnostic performance of CAD systems in identifying precancerous lesions of the UGI tract during esophagogastroduodenoscopy (EGD), evaluates their potential clinical value, and identifies directions for further research.

AIM

To investigate whether a real-time, AI-enabled computer-aided detection (AI-CAD) system improves the diagnostic efficiency of EGD examination.

METHODS

The PubMed, EMBASE, Web of Science and Cochrane Library databases were searched by two independent reviewers for studies reporting per-patient analyses published up to April 2025. The meta-analysis was performed with R software (version 4.5.0). A random-effects model was used, and subgroup analysis was carried out to identify possible sources of heterogeneity.

RESULTS

The initial search identified 802 articles. According to the inclusion criteria, 2113 patients from 10 studies were included in this meta-analysis. For detecting precancerous lesions, the pooled accuracy difference between the AI and endoscopist groups was 0.16 (95%CI: 0.12-0.20), and the pooled difference in the logarithm of the diagnostic odds ratio was -0.19 (95%CI: -0.75 to 0.37). Pooled sensitivity was 0.89 (95%CI: 0.85-0.92) for the AI group vs 0.67 (95%CI: 0.63-0.71) for the endoscopist group; pooled specificity was 0.89 (95%CI: 0.84-0.93) vs 0.77 (95%CI: 0.70-0.83); and the area under the summary receiver operating characteristic curve was 0.928 (95%CI: 0.841-0.948) vs 0.722 (95%CI: 0.677-0.821).

CONCLUSION

The present study provides further evidence that AI-CAD is a reliable endoscopic diagnostic tool for assisting endoscopists in the detection of precancerous lesions in the UGI tract. It may be introduced on a large scale in clinical practice to enhance the accuracy of detecting these lesions.

Key Words: Artificial intelligence; Real-time computer-aided detection system; Precancerous lesion; Esophagogastroduodenoscopy; Endoscopy; Upper gastrointestinal tract; Diagnostic performance; Meta-analysis

Core Tip: This meta-analysis indicates that the artificial intelligence-enabled real-time computer-aided detection (AI-CAD) system is superior to endoscopists in detecting precancerous lesions of the upper gastrointestinal (UGI) tract. Its sensitivity, specificity, and diagnostic accuracy are higher, which helps improve lesion recognition and may reduce the rate of missed diagnoses. These findings support the clinical potential of integrating AI-CAD into routine endoscopy practice to enhance the early detection and prevention of UGI cancers.



INTRODUCTION

Upper gastrointestinal (UGI) malignancies [including esophageal squamous cell carcinoma (ESCC), Barrett-associated adenocarcinoma, and gastric cancer] have become a major global public health challenge. According to the latest data from the International Agency for Research on Cancer, there were over 1.7 million new cases of UGI cancers worldwide in 2022, with esophageal cancer and gastric cancer accounting for 3.1% and 5.6%, respectively, and the 5-year survival rates of both are generally below 30%[1]. More than 40% of patients with ESCC are diagnosed after the disease has metastasized, and the 5-year survival rate is less than 20%[2]. The 5-year survival rate for early-stage gastric cancer can reach 92.6%[3], while that for advanced-stage gastric cancer is only 25%[4]. These data indicate that early diagnosis of UGI cancers is of vital importance. Precancerous lesions of the UGI tract [such as atypical hyperplasia, Barrett-related lesions, intestinal metaplasia (IM), and chronic atrophic gastritis (CAG)] represent the key reversible stages in the development of cancer, and their early detection and treatment can significantly reduce mortality from advanced cancer.

Currently, white light endoscopy (WLE) remains the conventional method for screening precancerous lesions, but its diagnostic efficacy is highly dependent on the experience of the endoscopist, and missed diagnoses remain relatively common. A systematic review incorporating 22 studies demonstrated that the proportion of gastric cancer diagnoses missed by endoscopists was 9.4% (95%CI: 5.7%-13.1%). This indicates that approximately 1 in every 10 gastric cancer cases may be missed during the initial endoscopy, and the majority (69%) of these missed cancers were early-stage gastric cancers[5]. In addition, a prospective study evaluated the consistency and accuracy of 5 experienced and 5 less-experienced endoscopists in diagnosing IM using high-definition (HD) endoscopy. The experienced endoscopist group had a κ value of 0.38 (95%CI: 0.25-0.52), while the less-experienced group had a κ value of 0.33 (95%CI: 0.20-0.47). Both values indicate poor inter-observer agreement, highlighting the significant impact on the diagnosis of UGI precancerous lesions of the technical divide caused by the unequal distribution of medical resources[6].

Artificial intelligence-enhanced real-time computer-aided detection systems (AI-CAD) have enabled real-time analysis of endoscopic images through deep learning algorithms, such as convolutional neural networks. A prospective study involving 450 patients demonstrated that AI-CAD significantly enhanced the detection capability of non-expert endoscopists for Barrett's esophagus (BE), increasing their neoplasia detection rate by an average of approximately 12%[7]. This indicates that AI-CAD has the potential to both improve the accuracy of detecting precancerous lesions in the UGI tract and reduce variability among operators.

Although some studies have discussed the diagnostic role of AI-CAD in early cancer detection, they mostly focus on single target organs or offline image analysis, lacking a systematic evaluation of the diagnostic performance of AI-CAD in detecting precancerous lesions of the UGI tract in real-time scenarios[8-10]. Additionally, there are limited direct comparisons between artificial intelligence (AI) and endoscopists, and the sources of heterogeneity have not yet been clarified, making it difficult to provide a solid evidence base for the widespread clinical application of AI-CAD in detecting precancerous lesions of the UGI tract. Based on the above limitations, this study aims to systematically review prospective real-time AI-CAD studies published up to April 2025. Through meta-analysis, we first quantitatively evaluate the pooled sensitivity, specificity, and area under the summary receiver operating characteristic (SROC) curve (AUC) of AI-CAD in detecting precancerous lesions during routine esophagogastroduodenoscopy (EGD). We then directly compare AI-CAD with endoscopists alone using differences in diagnostic accuracy and in the logarithm of the diagnostic odds ratio [log(DOR)]. In addition, subgroup analyses and meta-regression are conducted to explore sources of heterogeneity, providing evidence-based support for the clinical application of AI-CAD and guiding future research directions. It should be noted that the present study did not involve the design, training, or deployment of any AI-CAD system. Instead, we performed a meta-analysis to synthesize their diagnostic performance in detecting UGI precancerous lesions.

MATERIALS AND METHODS
Search strategy and study selection

To retrieve all relevant publications up to April 30, 2025, two independent reviewers (Li ZY and Cai HQ) searched four databases, including PubMed, EMBASE, Web of Science, and Cochrane Library. The search strategy logic was defined as follows: “Artificial intelligence” or “Machine learning” or “Deep learning”, and “Computer-aided detection” or “CAD”, and “Precancerous conditions” or “Precancerous lesion”, and “Stomach neoplasms” or “Esophageal neoplasms”. The detailed search strategy is presented in Supplementary Table 1. To identify potential studies, the reference lists of all relevant publications were checked and we also followed up on the clinical trials registered in the Cochrane Library database.

All articles were independently screened by two reviewers (Li ZY and Cai HQ) based on predefined inclusion and exclusion criteria. The inclusion criteria were as follows: (1) Participants: Patients undergoing EGD, with lesions involving precancerous conditions of the UGI tract, including BE, esophageal squamous intraepithelial neoplasia, gastric intraepithelial neoplasia, IM, and CAG; (2) Intervention: Endoscopic examination assisted by real-time AI-CAD systems; (3) Comparison: Conventional endoscopic examination performed by endoscopists without AI assistance; (4) Diagnostic criteria: Histopathological confirmation as the gold standard; (5) Study design: Original studies, including randomized controlled trials, prospective studies, retrospective cohort studies, and case-control studies; (6) Outcomes: Studies reporting or allowing the derivation of diagnostic performance metrics for both AI-assisted and endoscopist-alone groups (e.g., sensitivity, specificity, and true positive, false positive, false negative, and true negative values); (7) Language: Articles published in English; and (8) Publication type: Full-text articles published in peer-reviewed journals. 
The exclusion criteria were as follows: (1) Irrelevant participants: Studies not involving the UGI tract or not focused on precancerous lesions; (2) Ineligible intervention: Studies not using real-time AI assistance or using AI only for image classification, pathological analysis, or other non-endoscopic applications; (3) Lack of comparison: Absence of a comparator group or inability to compare outcomes with endoscopist performance; (4) Insufficient data: Studies lacking or not allowing derivation of diagnostic performance outcomes for either the AI or endoscopist group; (5) Inappropriate research type: Reviews, commentaries, conference abstracts, letters, technical reports, meta-analyses, and case reports; (6) Duplicate data: Studies with overlapping data, in which case the more comprehensive or recently published article was included; (7) Low-quality studies: Studies with poor methodological quality or unclear design, presenting a high risk of bias; and (8) Non-English publications.

Data extraction and quality assessment

For each study meeting the inclusion criteria, two independent reviewers (Li ZY and Cai HQ) extracted the following data: First author, year of publication, country, ethnicity, study design, total sample size, type of lesion, EGD imaging modality, and the AI model used for lesion segmentation. Diagnostic parameters, including true positive, false positive, false negative, and true negative values, were recorded in 2 × 2 contingency tables. In cases of discrepancies between the reviewers, a third investigator (Liu YH) assessed the disagreement and resolved it through consensus. To detect possible publication bias, a Deeks' funnel plot was drawn based on the available data. To evaluate the quality of the included literature, the two reviewers (Li ZY and Cai HQ) conducted quality assessment using the Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) tool. QUADAS-2 assesses the risk of bias and applicability concerns across four domains: Patient selection, index test, reference standard, and flow and timing.

Statistical analysis

First, the presence of a threshold effect was assessed. A P-value greater than 0.05 for Spearman’s correlation coefficient was interpreted as indicating no significant threshold effect, whereas a P-value less than or equal to 0.05 suggested that heterogeneity might be attributable to a threshold effect. If no significant threshold effect was identified, heterogeneity due to non-threshold effects was further evaluated. Heterogeneity was assessed using the P-value of Cochran’s Q test and the I² statistic. If the P-value was less than 0.05 and I² exceeded 50%, significant heterogeneity was considered present, and a random-effects model was applied. Conversely, if the P-value exceeded 0.05 and I² was less than 50%, studies were considered to have acceptable consistency, and a fixed-effects model was used.
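The heterogeneity decision rule described above can be sketched as follows (a minimal illustrative Python implementation; the effect estimates and variances below are hypothetical, not data from the included studies):

```python
def cochran_q_i2(effects, variances):
    """Cochran's Q and the I^2 statistic for per-study effect sizes
    (e.g., log diagnostic odds ratios) with known within-study variances."""
    weights = [1.0 / v for v in variances]  # inverse-variance weights
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0  # I^2 in percent
    return q, i2

# Hypothetical example: three studies with equal within-study variance
q, i2 = cochran_q_i2([1.0, 1.2, 2.0], [0.1, 0.1, 0.1])
# Decision rule from the text: I^2 > 50% (with Q-test P < 0.05)
# favors a random-effects model
use_random_effects = i2 > 50.0
```

With these illustrative inputs, Q = 5.6 on 2 degrees of freedom and I² ≈ 64%, so a random-effects model would be chosen under the stated rule.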

A random-effects model for single proportions was ultimately employed to calculate the pooled sensitivity and specificity of the AI-CAD group and the endoscopist group. A univariable random-effects model was applied to estimate the pooled difference in diagnostic accuracy and log(DOR) between the two groups. Forest plots were used to display the pooled effect sizes along with their 95% confidence intervals (CIs). A SROC curve was constructed to evaluate the diagnostic superiority of the AI-CAD group compared with the endoscopist group, and the AUC was calculated; a higher AUC indicates greater diagnostic performance.
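As a concrete illustration of the per-study quantities entering these pooled estimates, the sketch below derives sensitivity, specificity, and log(DOR) with its standard error from a 2 × 2 contingency table (the 0.5 continuity correction applied only when a zero cell occurs is one common convention, assumed here for illustration). The input counts are the case-arm values reported for van der Putten et al[11] in Table 1:

```python
import math

def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, log(DOR) and its standard error
    from a 2x2 contingency table. Adds 0.5 to every cell if any
    cell is zero (continuity correction)."""
    if 0 in (tp, fp, fn, tn):
        tp, fp, fn, tn = (x + 0.5 for x in (tp, fp, fn, tn))
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    log_dor = math.log((tp * tn) / (fp * fn))
    se_log_dor = math.sqrt(1 / tp + 1 / fp + 1 / fn + 1 / tn)
    return sens, spec, log_dor, se_log_dor

# Counts from the case arm of van der Putten et al[11] (Table 1):
# TP = 37, FP = 7, FN = 3, TN = 33
sens, spec, log_dor, se = diagnostic_metrics(37, 7, 3, 33)
```

For this study the per-study sensitivity is 0.925 and specificity 0.825; log(DOR) and its standard error are the inputs that an inverse-variance pooling step would weight.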

Given the presence of substantial heterogeneity, subgroup analyses and meta-regression were performed to explore potential sources. In addition, sensitivity analyses were conducted to assess the robustness of the findings. All meta-analyses were conducted using R 4.5.0 and Stata 18.0.
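The leave-one-out sensitivity analysis used to check robustness can be sketched as follows (illustrative only: simple inverse-variance pooling of hypothetical effect estimates, not the actual study data or the exact model fitted in R/Stata):

```python
def pool_fixed(effects, variances):
    """Inverse-variance pooled estimate."""
    weights = [1.0 / v for v in variances]
    return sum(w * e for w, e in zip(weights, effects)) / sum(weights)

def leave_one_out(effects, variances):
    """Re-pool after omitting each study in turn; large shifts in the
    pooled estimate flag influential studies."""
    pooled = []
    for i in range(len(effects)):
        eff = effects[:i] + effects[i + 1:]
        var = variances[:i] + variances[i + 1:]
        pooled.append(pool_fixed(eff, var))
    return pooled

# Hypothetical accuracy differences with equal variances
loo = leave_one_out([0.16, 0.20, 0.05], [0.01, 0.01, 0.01])
```

Each entry of `loo` is the pooled estimate with one study removed; comparing these against the all-studies estimate mirrors the LOO plots reported in Figure 8.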

RESULTS
Study selection

Following the methodology described above, we present here the results of the systematic literature search, study selection process, and quality appraisal of the included studies. According to the search strategy, a total of 798 articles were retrieved from four databases: PubMed, EMBASE, Web of Science, and the Cochrane Library. Additionally, 4 studies listed as ongoing in the Cochrane Library were subsequently confirmed and included after follow-up. No additional records were identified through manual screening of reference lists of the included literature. Among the retrieved articles, 283 were excluded due to duplication. A further 509 articles were excluded for not meeting the inclusion criteria: 416 were excluded based on title and abstract screening due to reasons such as irrelevant target disease (non-UGI precancerous lesions), non-original research, or non-English language; 93 articles were excluded after full-text review, including 13 without a comparison group of conventional endoscopists, 45 due to insufficient data for constructing a 2 × 2 contingency table, and 35 due to use of non-real-time AI-CAD interventions. Ultimately, 10 studies were included in this meta-analysis (Figure 1).

Figure 1
Figure 1  Flow diagram of the study selection process.
Study characteristics

These 10 eligible studies[11-20] enrolled a cumulative total of 2113 patients, with a mean sample size of 211 (range 61-676), and were included in the present meta-analysis. All investigations adopted prospective designs: 8 conventional cohort studies and 2 paired cohort studies. The studies were conducted in China (n = 5), the Netherlands (n = 2), the United Kingdom (n = 2), and Germany (n = 1), yielding a balanced ethnic distribution (5 Asian and 5 Caucasian cohorts). Regarding the target condition, Barrett’s-related neoplasia was examined in 5 studies, CAG in 4 studies, and gastric IM (GIM) in one study. Most datasets were acquired with HD-WLE (4/10) or narrow-band imaging (NBI, 3/10); the remainder used blue-light imaging (BLI), volumetric laser endomicroscopy, or multimodal combinations. A variety of deep-learning architectures were implemented for real-time AI-CAD: U-Net and its variants were the most common (3/10), while the remaining studies employed ResNet, EfficientNet-B4, VGG-16, FCN-ResNet-50, SegNet, and DeepLab-V3+. Across all trials, histopathology served as the reference standard for lesion confirmation. Collectively, these studies provide a geographically and methodologically diverse evidence base for assessing the diagnostic performance of real-time AI-assisted endoscopy in detecting precancerous lesions of the UGI tract (Table 1). To facilitate a clearer understanding of methodological variability, we summarize the essential technical characteristics of the AI-CAD systems used in the included studies (Table 2).

Table 1 Main characteristics of 10 studies included in meta-analysis.

| Ref. | Year | Location | Ethnicity | Study design | Sample size (case/control) | Type of lesion | Imaging modality | AI model for segmentation | Diagnostic power, case (TP/FP/FN/TN) | Diagnostic power, control (TP/FP/FN/TN) |
|---|---|---|---|---|---|---|---|---|---|---|
| van der Putten et al[11] | 2020 | Netherlands | Caucasian | Cohort | 80/80 | Barrett’s neoplasia | BLI | ResNet | 37/7/3/33 | 29/10/11/30 |
| Yan et al[12] | 2020 | China | Asian | Cohort | 80/80 | GIM | NBI | EfficientNet B4 | 34/6/3/37 | 32/8/5/35 |
| Struyvenberg et al[13] | 2021 | Netherlands | Caucasian | Cohort | 134/134 | Barrett’s neoplasia | VLE | VGG16 | 36/17/3/78 | 27/14/12/81 |
| Hussein et al[14] | 2022 | United Kingdom | Caucasian | Cohort | 61/61 | Barrett’s neoplasia | WLE | FCN ResNet-50 | 27/4/1/29 | 22/17/6/16 |
| Zhao et al[15] | 2022 | China | Asian | Cohort | 268/268 | CAG | NBI | U-Net | 96/10/13/149 | 66/27/43/132 |
| Zhao et al[16] | 2022 | China | Asian | Paired cohort study | 676/676 | CAG | WLE | U-Net | 284/10/54/328 | 212/61/126/277 |
| Abdelrahim et al[17] | 2023 | United Kingdom | Caucasian | Cohort | 75/75 | Barrett’s neoplasia | WLE | SegNet | 30/4/2/39 | 20/10/12/33 |
| Zhao et al[18] | 2023 | China | Asian | Paired cohort study | 524/524 | CAG | NBI | U-Net | 234/25/28/237 | 177/78/85/184 |
| Meinikheim et al[19] | 2024 | Germany | Caucasian | Cohort | 96/96 | Barrett’s neoplasia | Multi-modal | DeepLab V3+ | 40/12/11/33 | 36/15/15/30 |
| Tao et al[20] | 2024 | China | Asian | Cohort | 119/119 | CAG | WLE | U-Net++ | 74/3/4/38 | 55/4/23/37 |

TP: True positive; FP: False positive; FN: False negative; TN: True negative; BLI: Blue-light imaging; NBI: Narrow-band imaging; VLE: Volumetric laser endomicroscopy; WLE: White light endoscopy; GIM: Gastric intestinal metaplasia; CAG: Chronic atrophic gastritis; AI: Artificial intelligence.
Table 2 Technical characteristics of artificial intelligence-enhanced real-time computer-aided detection systems utilized in included studies.

| Ref. | Year | AI model type | Hardware configuration | Processing speed | Dataset size | Risk stratification tool |
|---|---|---|---|---|---|---|
| van der Putten et al[11] | 2020 | Integrated U-Net + transfer learning | Titan Xp 12GB GPU | < 2 sec/frame | 500000 frames | None |
| Yan et al[12] | 2020 | EfficientNetB4 | Not specified | 20 fps | 11000 frames | None |
| Struyvenberg et al[13] | 2021 | VGG16 | Not specified | 56 fps | 318 video cases | None |
| Hussein et al[14] | 2022 | ResNet101 + FCN ResNet50 | Not specified | 48-56 fps | 150000 frames | None |
| Zhao et al[15] | 2022 | U-Net | Not specified | Not specified | Not specified | None |
| Zhao et al[16] | 2022 | U-Net | GeForce RTX 3090 | 30 fps | Not specified | None |
| Abdelrahim et al[17] | 2022 | VGG16 + SegNet hybrid model | GeForce RTX 2080 Ti | 30 fps | Training: 109071 frames; testing: 75 video cases | None |
| Zhao et al[18] | 2023 | U-Net extended model | GeForce RTX 3090 | 30 fps | Not specified | OLGA |
| Meinikheim et al[19] | 2024 | DeepLabV3+ + ResNet50 + mean-teacher | Not specified | 30 fps | Training: 51273 frames; testing: 96 video cases | None |
| Tao et al[20] | 2024 | UNet++ + ResNet50 dual-model system | Not specified | Not specified | Training: 119 video cases; testing: 102 video cases | Kimura-Takemoto |
Risk of bias and quality assessment

The Deeks’ funnel plot for publication bias demonstrated clear symmetry for the AI-CAD group (Figure 2A). Egger’s test further indicated no statistical evidence of publication bias in the AI-CAD group (P = 0.799). Similarly, the Deeks’ funnel plot for the endoscopist group also showed a symmetrical distribution (Figure 2B), and Egger’s test revealed no statistical evidence supporting the presence of publication bias in the endoscopist group (P = 0.320). In summary, no significant publication bias was detected in either the AI-CAD or endoscopist groups. The results of the QUADAS-2 assessment (Figure 3) indicated a generally low risk of bias, with the overall methodological quality of the included studies rated as moderate to high.

Figure 2
Figure 2 Funnel graph for the assessment of potential publication bias of the included studies. A: Funnel plot for the artificial intelligence-enabled real-time computer-aided detection system group, showing the distribution of log diagnostic odds ratio [log(DOR)] values against their standard errors; B: Funnel plot for the endoscopist group, showing the distribution of log(DOR) values against their standard errors. DOR: Diagnostic odds ratio.
Figure 3
Figure 3 Summary quality assessment of studies using Quality Assessment of Diagnostic Accuracy Studies-2 criteria. A: Overall risk of bias and applicability concerns across all included studies; B: Risk of bias and applicability concerns for each individual study.
Diagnostic effect

The pooled sensitivity and specificity of the AI-CAD group were 0.89 (95%CI: 0.85-0.92) and 0.89 (95%CI: 0.84-0.93), respectively (Figure 4A and B). In comparison, the pooled sensitivity and specificity of the endoscopist group were 0.67 (95%CI: 0.63-0.71) and 0.77 (95%CI: 0.70-0.83), respectively (Figure 4C and D). The pooled area under the curve (AUC) for the AI-CAD and endoscopist groups was 0.921 (95%CI: 0.841-0.948) and 0.733 (95%CI: 0.678-0.818), respectively (Figure 5). The pooled difference in overall accuracy between the AI-CAD group and the endoscopist group was 0.16 (95%CI: 0.12-0.20) (Figure 6). The pooled difference in log(DOR) between the two groups was -0.19 (95%CI: -0.75 to 0.37) (Figure 7).

Figure 4
Figure 4 Forest plots of the pooled sensitivity and specificity for artificial intelligence-enhanced real-time computer-aided detection systems and endoscopist diagnosis. A: Pooled sensitivity for artificial intelligence-enhanced real-time computer-aided detection systems (AI-CAD) group; B: Pooled specificity for AI-CAD group; C: Pooled sensitivity for endoscopist group; D: Pooled specificity for endoscopist group.
Figure 5
Figure 5 Combined summary receiver operating characteristic plot of artificial intelligence-assisted and endoscopist-only diagnosis. SROC: Summary receiver operating characteristic; AUC: Area under the curve.
Figure 6
Figure 6 Forest plots of pooled accuracy difference between artificial intelligence-enhanced real-time computer-aided detection systems and conventional endoscopists across all included studies. Each horizontal line represents the 95% confidence interval (CI) for the accuracy difference in an individual study. A positive value indicates that the artificial intelligence-enhanced real-time computer-aided detection system had higher diagnostic accuracy than endoscopists. The size of the square reflects the weight of each study in the meta-analysis, and the diamond at the bottom represents the pooled effect estimate with its 95%CI, calculated using a random-effects model. CI: Confidence interval.
Figure 7
Figure 7 Forest plots of pooled logarithm of diagnostic odds ratios difference between artificial intelligence-enhanced real-time computer-aided detection systems and conventional endoscopists across all included studies. Each study is represented by a point estimate and a 95% confidence interval (CI). A logarithm of diagnostic odds ratios difference greater than zero indicates higher diagnostic discriminative ability of the artificial intelligence-enhanced real-time computer-aided detection system compared to endoscopists. The diamond at the bottom shows the pooled estimate with 95%CI, calculated using a random-effects model. Notably, the wide confidence interval suggests substantial between-study heterogeneity and potential uncertainty in effect direction. CI: Confidence interval; DOR: Diagnostic odds ratio.
Heterogeneity and subgroup analysis

In the threshold effect analysis, the Spearman correlation coefficient for the AI-CAD group was -0.006 with a P-value of 0.987, indicating an almost negligible correlation and a P-value far greater than 0.05. This suggests that no significant threshold effect was present in the AI-CAD group. For the endoscopist group, the Spearman correlation coefficient was 0.382 with a P-value of 0.279. Although the coefficient was positive, the P-value exceeded 0.05, and thus there was no statistically significant evidence of a threshold effect in this group either. While threshold variation may have had some impact on diagnostic performance measures, the effect did not reach statistical significance. Therefore, threshold effects are not considered a significant source of heterogeneity in either group.

However, as shown in Figures 6 and 7, the pooled diagnostic accuracy difference and log(DOR) difference between the AI-CAD and endoscopist groups exhibited I² values of 62.9% and 43.8%, respectively, with P-values close to or below 0.05. In addition, Figure 4 shows that the I² values for pooled sensitivity and specificity in both groups mostly exceeded 50%. These findings indicate that although the threshold effect does not contribute significantly to heterogeneity, non-threshold-related heterogeneity remains substantial. Therefore, a random-effects model was adopted. Furthermore, subgroup analyses were conducted to identify potential sources of heterogeneity.

The results of subgroup analyses based on six variables (Table 3) suggest that the current variables were insufficient to explain the heterogeneity observed in the sensitivity of the AI-CAD group. Unquantified objective characteristics, such as patient cohorts and lesion conspicuity, may contribute more. The heterogeneity of sensitivity in the endoscopist group was already low (Figure 4), so it was unnecessary to investigate further sources of heterogeneity there. In contrast, the heterogeneity of specificity and accuracy difference could be partially attributed to imaging modality and the AI model for segmentation, but a portion of the heterogeneity remained unexplained. The heterogeneity of the log(DOR) difference could not be determined by the existing subgroup analyses. In the subsequent meta-regression, imaging modality and the AI model for segmentation were taken as candidate covariates to further verify their independent contributions to the heterogeneity of specificity and accuracy difference.

Table 3 Subgroup analysis of diagnostic effect. Each cell shows the pooled value (95%CI), the I² statistic, and the heterogeneity P value.

| Subgroup | No. studies | No. patients | Sensitivity (AI-CAD) | Specificity (AI-CAD) | Sensitivity (endoscopist) | Specificity (endoscopist) | Difference of accuracy | Difference of log(DOR) |
|---|---|---|---|---|---|---|---|---|
| Race: Asian | 5 | 1667 | 0.890 (95%CI: 0.837-0.928); I² = 53.0%; P = 0.070 | 0.930 (95%CI: 0.891-0.956); I² = 70.8%; P = 0.009 | 0.658 (95%CI: 0.618-0.696); I² = 43.9%; P = 0.039 | 0.810 (95%CI: 0.723-0.874); I² = 75.3%; P = 0.001 | 0.167 (95%CI: 0.112-0.221); I² = 22.9%; P = 0.118 | -0.355 (95%CI: -1.110 to 0.401); I² = 9.3%; P = 0.296 |
| Race: Others | 5 | 446 | 0.896 (95%CI: 0.821-0.942); I² = 49.9%; P = 0.083 | 0.831 (95%CI: 0.746-0.891); I² = 16.3%; P = 0.276 | 0.703 (95%CI: 0.632-0.766); I² = 0.0%; P = 0.746 | 0.721 (95%CI: 0.608-0.812); I² = 76.1%; P = 0.002 | 0.142 (95%CI: 0.072-0.211); I² = 64.8%; P = 0.022 | 0.151 (95%CI: -0.805 to 1.107); I² = 62.6%; P = 0.032 |
| Number of patients: < 100 | 5 | 392 | 0.895 (95%CI: 0.819-0.941); I² = 49.2%; P = 0.089 | 0.842 (95%CI: 0.741-0.909); I² = 31.4%; P = 0.242 | 0.731 (95%CI: 0.660-0.791); I² = 17.6%; P = 0.235 | 0.703 (95%CI: 0.594-0.793); I² = 64.0%; P = 0.027 | 0.147 (95%CI: 0.074-0.220); I² = 60.2%; P = 0.041 | 0.213 (95%CI: -0.756 to 1.182); I² = 60.3%; P = 0.035 |
| Number of patients: ≥ 100 | 5 | 1721 | 0.891 (95%CI: 0.838-0.928); I² = 54.0%; P = 0.066 | 0.922 (95%CI: 0.870-0.954); I² = 83.1%; P = 0.000 | 0.650 (95%CI: 0.614-0.684); I² = 6.7%; P = 0.432 | 0.816 (95%CI: 0.742-0.872); I² = 78.8%; P = 0.000 | 0.163 (95%CI: 0.106-0.219); I² = 66.1%; P = 0.037 | -0.387 (95%CI: -1.112 to 0.339); I² = 3.4%; P = 0.306 |
| Study design: Paired | 2 | 1200 | 0.868 (95%CI: 0.783-0.923); I² = 71.1%; P = 0.063 | 0.944 (95%CI: 0.879-0.975); I² = 90.4%; P = 0.001 | 0.650 (95%CI: 0.586-0.710); I² = 33.8%; P = 0.219 | 0.766 (95%CI: 0.600-0.877); I² = 91.1%; P = 0.001 | 0.196 (95%CI: 0.125-0.266); I² = 0.0%; P = 0.378 | -0.774 (95%CI: -1.475 to -0.073); I² = 0.0%; P = 0.536 |
| Study design: Others | 8 | 913 | 0.904 (95%CI: 0.856-0.937); I² = 44.0%; P = 0.116 | 0.868 (95%CI: 0.803-0.913); I² = 57.7%; P = 0.018 | 0.694 (95%CI: 0.640-0.743); I² = 30.1%; P = 0.159 | 0.773 (95%CI: 0.684-0.842); I² = 75.2%; P = 0.000 | 0.141 (95%CI: 0.092-0.189); I² = 56.8%; P = 0.026 | 0.075 (95%CI: -0.542 to 0.693); I² = 40.5%; P = 0.101 |
| Type of lesion: BE | 5 | 446 | 0.896 (95%CI: 0.821-0.942); I² = 49.9%; P = 0.083 | 0.831 (95%CI: 0.746-0.891); I² = 16.3%; P = 0.276 | 0.703 (95%CI: 0.632-0.766); I² = 0.0%; P = 0.746 | 0.721 (95%CI: 0.608-0.812); I² = 76.1%; P = 0.002 | 0.142 (95%CI: 0.072-0.211); I² = 64.8%; P = 0.022 | 0.151 (95%CI: -0.805 to 1.107); I² = 62.6%; P = 0.032 |
| Type of lesion: Others | 5 | 1667 | 0.890 (95%CI: 0.837-0.928); I² = 53.0%; P = 0.070 | 0.930 (95%CI: 0.891-0.956); I² = 70.8%; P = 0.009 | 0.658 (95%CI: 0.618-0.696); I² = 43.9%; P = 0.039 | 0.810 (95%CI: 0.723-0.874); I² = 75.3%; P = 0.001 | 0.167 (95%CI: 0.112-0.221); I² = 22.9%; P = 0.118 | -0.355 (95%CI: -1.110 to 0.401); I² = 9.3%; P = 0.296 |
| Imaging modality: WLE | 4 | 931 | 0.906 (95%CI: 0.835-0.949); I² = 63.0%; P = 0.025 | 0.935 (95%CI: 0.877-0.967); I² = 59.7%; P = 0.044 | 0.662 (95%CI: 0.596-0.722); I² = 26.1%; P = 0.255 | 0.764 (95%CI: 0.634-0.858); I² = 88.1%; P = 0.000 | 0.202 (95%CI: 0.134-0.269); I² = 0.1%; P = 0.438 | 0.560 (95%CI: -0.458 to 1.579); I² = 65.3%; P = 0.020 |
| Imaging modality: Others | 6 | 1182 | 0.886 (95%CI: 0.827-0.926); I² = 26.8%; P = 0.253 | 0.861 (95%CI: 0.790-0.911); I² = 73.1%; P = 0.002 | 0.685 (95%CI: 0.630-0.735); I² = 31.3%; P = 0.127 | 0.775 (95%CI: 0.679-0.849); I² = 66.3%; P = 0.007 | 0.128 (95%CI: 0.074-0.183); I² = 69.1%; P = 0.005 | -0.530 (95%CI: -1.261 to 0.201); I² = 0.9%; P = 0.536 |
| AI model for segmentation: U-Net | 4 | 1587 | 0.887 (95%CI: 0.828-0.927); I² = 59.7%; P = 0.048 | 0.939 (95%CI: 0.904-0.961); I² = 69.1%; P = 0.015 | 0.648 (95%CI: 0.611-0.682); I² = 11.0%; P = 0.323 | 0.810 (95%CI: 0.710-0.881); I² = 83.2%; P = 0.000 | 0.187 (95%CI: 0.155-0.220); I² = 0.0%; P = 0.735 | -0.376 (95%CI: -1.170 to 0.418); I² = 17.4%; P = 0.187 |
| AI model for segmentation: Others | 6 | 526 | 0.900 (95%CI: 0.835-0.941); I² = 43.6%; P = 0.119 | 0.834 (95%CI: 0.765-0.886); I² = 7.3%; P = 0.358 | 0.724 (95%CI: 0.660-0.779); I² = 0.0%; P = 0.327 | 0.737 (95%CI: 0.632-0.820); I² = 72.4%; P = 0.002 | 0.118 (95%CI: 0.069-0.167); I² = 63.1%; P = 0.021 | 0.093 (95%CI: -0.783 to 0.970); I² = 52.1%; P = 0.060 |

Meta-regression of AI-CAD group specificity (Table 4) showed that both the WLE modality and the U-Net model independently enhanced AI-CAD group specificity (P < 0.01). After adjustment, the residual between-study variance was nil (τ² = 0; I² = 0%), and the entire heterogeneity was explained (R² = 100%), indicating that these two factors almost completely accounted for the variability in specificity. For the accuracy difference, meta-regression (Table 4) revealed a baseline advantage of 10.7 percentage points for the AI-CAD group under non-WLE, non-U-Net conditions. After incorporating WLE and U-Net as covariates, residual heterogeneity remained substantial (I² = 51%), the explained variance was modest (R² = 25%), and β was not statistically significant (P > 0.05). The robustness of the above findings was verified by leave-one-out (LOO) sensitivity analysis (Figure 8). In the specificity model, the regression coefficients (β) remained statistically significant throughout, with CIs overlapping those of the full model, indicating a stable and reliable result. In contrast, the accuracy difference model showed greater variability in β estimates, with CIs crossing 0 in some iterations, suggesting limited explanatory power and model stability.

Figure 8 Sensitivity analysis of image modality and artificial intelligence model effects. A: Leave-one-out (LOO) sensitivity analysis of artificial intelligence-enhanced real-time computer-aided detection systems (AI-CAD) group specificity, stratified by imaging modality; B: LOO sensitivity analysis of accuracy difference, stratified by imaging modality; C: LOO sensitivity analysis of AI-CAD group specificity, stratified by artificial intelligence (AI) model; D: LOO sensitivity analysis of accuracy difference, stratified by AI model.
Table 4 Meta-regression examining the impact of imaging modality and artificial intelligence model on artificial intelligence-enhanced real-time computer-aided detection systems specificity and accuracy gain.
Outcome: Specificity
  Intercept: β = 1.418 (95%CI: 1.099-1.737), P = 0.000; τ² = 0, I² = 0%, R² = 100%
  WLE (vs others): β = 0.816 (95%CI: 0.307-1.324), P = 0.002
  UNet (vs others): β = 0.986 (95%CI: 0.558-1.415), P = 0.000
Outcome: Accuracy difference
  Intercept: β = 0.107 (95%CI: 0.043-0.170), P = 0.001; τ² = 0.002, I² = 51.4%, R² = 98.5%
  WLE (vs others): β = 0.059 (95%CI: -0.022 to 0.140), P = 0.157
  UNet (vs others): β = 0.051 (95%CI: -0.029 to 0.131), P = 0.212
DISCUSSION

AI systems have advanced rapidly in recent years and are now widely applied across various domains of clinical medicine, including EGD for the detection of precancerous lesions. Multiple AI-CAD models targeting UGI precancerous conditions have been developed and introduced into clinical practice. In 2020, de Groof et al[21] first evaluated an AI system for diagnosing BE, reporting a sensitivity of 91%, a specificity of 89%, and an accuracy of 90%. The same year, AI-CAD systems for CAG and GIM also emerged, with diagnostic performance metrics consistently around 90%. Notably, Zhang et al[22] reported a sensitivity of 94.5%, a specificity of 94.0%, and an accuracy of 94.2% for their CAG detection system. These findings collectively suggest that current AI-CAD systems for UGI precancerous lesion detection are approaching clinically acceptable diagnostic performance. However, as highlighted in a recent review by Spadaccini et al[23], despite their promising potential, AI-CAD systems remain subject to misclassification risks, including both false positives and false negatives, and their clinical effectiveness and practical feasibility require further validation. Given the growing body of evidence and the absence of a previously published meta-analysis on this topic, the present study comprehensively evaluated the diagnostic performance of AI-CAD systems in UGI precancerous lesion detection, based on all available literature published up to April 30, 2025.

The pooled sensitivity, specificity, and AUC of AI-CAD systems for the diagnosis of UGI precancerous lesions were 0.89 (95%CI: 0.85-0.92), 0.89 (95%CI: 0.84-0.93), and 0.921 (95%CI: 0.841-0.948), respectively. In contrast, endoscopists achieved corresponding values of 0.67 (95%CI: 0.63-0.71), 0.77 (95%CI: 0.70-0.83), and 0.733 (95%CI: 0.678-0.818). The pooled diagnostic accuracy difference was 0.16 (95%CI: 0.12-0.20), and the pooled difference in log(DOR) was -0.19 (95%CI: -0.75 to 0.37). The non-overlapping summary estimates of sensitivity between groups indicate that AI-CAD significantly outperformed endoscopists in lesion detection. A similar, though smaller, difference was observed for specificity, suggesting that the primary advantage of AI-CAD lies in improving sensitivity and reducing the risk of missed diagnoses. Heterogeneity analysis revealed substantial variability across most analyses: I² exceeded 50% in all but one comparison, with several values approaching 80%, indicating moderate to high heterogeneity. Notably, only the sensitivity forest plot for endoscopists showed I² < 50%, reflecting relatively consistent sensitivity across clinicians. In contrast, the greater variability in specificity may reflect individual differences in diagnostic strategy, training, or adherence to guidelines. These findings also support the potential of AI-CAD to reduce diagnostic variability and mitigate errors in clinical endoscopy. The SROC curves further corroborated the diagnostic advantage of AI-CAD: the AI curve was notably closer to the upper-left corner, with an AUC > 0.9 and a narrower 95%CI that did not overlap with that of the endoscopists. This indicates that AI-CAD systems achieved high sensitivity with acceptable false-positive rates, offering superior diagnostic performance with less variability.
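The I² statistic referenced throughout quantifies the share of total variability attributable to between-study heterogeneity rather than chance. As a rough illustration of how it is derived from Cochran's Q — using made-up study-level estimates, not the data analyzed here — a minimal sketch:

```python
import numpy as np

def cochran_q_i2(estimates, variances):
    """Cochran's Q and the I^2 heterogeneity statistic for study-level estimates."""
    y = np.asarray(estimates, dtype=float)
    w = 1.0 / np.asarray(variances, dtype=float)
    pooled = np.sum(w * y) / np.sum(w)        # fixed-effect pooled estimate
    q = np.sum(w * (y - pooled) ** 2)         # Cochran's Q
    df = len(y) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, i2

# Two illustrative sets of study-level sensitivities (hypothetical numbers)
_, i2_consistent = cochran_q_i2([0.90, 0.88, 0.91, 0.89], [0.001] * 4)
_, i2_variable = cochran_q_i2([0.95, 0.70, 0.92, 0.65], [0.001] * 4)
print(f"consistent studies: I2 = {i2_consistent:.1f}%")
print(f"variable studies:   I2 = {i2_variable:.1f}%")
```

Tightly clustered estimates yield I² near 0%, while widely scattered estimates push I² toward 100%, which is the pattern behind the moderate-to-high values reported above.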
These results suggest that AI-CAD can enhance diagnostic consistency while being less susceptible to factors such as operator fatigue, experience, or lesion conspicuity. In addition to conventional diagnostic metrics, this analysis incorporated diagnostic accuracy difference and log(DOR) difference as complementary effect measures. Diagnostic accuracy, defined as the proportion of true positives and true negatives among the total population, provides an intuitive measure of the overall correctness of a diagnostic test. The accuracy difference reflects the extent to which AI-CAD improves this correctness relative to endoscopists. The log(DOR), a composite metric that integrates both sensitivity and specificity, is commonly used to assess overall discriminatory performance; log(DOR) difference indicates the extent to which AI-CAD enhances diagnostic discrimination compared to conventional endoscopy. The analysis of accuracy difference yielded a pooled estimate of 0.16 with a 95%CI that did not cross zero, indicating that the use of AI-CAD may reduce missed or incorrect diagnoses by 16 per 100 patients undergoing EGD. Moreover, most studies consistently favored AI-CAD over endoscopists alone. In contrast, the log(DOR) difference yielded a negative point estimate with a 95%CI that crossed zero, suggesting that AI-CAD did not significantly improve overall discriminatory power. This discrepancy may be attributed to elevated false-positive rates in some AI models, which may have offset the gains in sensitivity and attenuated the net diagnostic advantage.
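For a single 2×2 table, the metrics defined above reduce to simple arithmetic. A brief sketch follows; the counts are illustrative only, chosen to roughly mirror the pooled estimates, and are not patient data from the included studies.

```python
import math

def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, accuracy, and log diagnostic odds ratio
    from a single 2x2 table of true/false positives and negatives."""
    total = tp + fp + fn + tn
    dor = (tp * tn) / (fp * fn)              # diagnostic odds ratio
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / total,
        "log_dor": math.log(dor),
    }

# Illustrative counts only (200 patients per arm, ~50% lesion prevalence)
ai = diagnostic_metrics(tp=90, fp=15, fn=10, tn=85)       # AI-CAD arm
reader = diagnostic_metrics(tp=67, fp=23, fn=33, tn=77)   # endoscopist arm

accuracy_difference = ai["accuracy"] - reader["accuracy"]
log_dor_difference = ai["log_dor"] - reader["log_dor"]
print(f"accuracy difference: {accuracy_difference:.3f}")
print(f"log(DOR) difference: {log_dor_difference:.3f}")
```

Because the DOR multiplies the true-positive and true-negative odds, a rise in false positives can offset a sensitivity gain and leave log(DOR) nearly unchanged, which is the mechanism proposed above for the null log(DOR) difference.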

Most summary effects exhibited notable heterogeneity: except for endoscopists' sensitivity, I² exceeded 50% for every outcome and reached 70%-80% for specificity and the accuracy difference, indicating moderate-to-high between-study variability. By contrast, the direction of the pooled sensitivity advantage was consistent and its heterogeneity more limited, whereas log(DOR) differences fluctuated markedly across studies. Subgroup analyses identified technical variables (imaging modality and AI segmentation model) as the principal drivers of heterogeneity. Variation in imaging modality primarily altered the false-positive rate, and U-Net-based segmentation exerted the strongest influence on false-positive control; together these factors accounted for almost all of the variability in specificity and a sizeable proportion of the variability in accuracy. The subsequent meta-regression confirmed that imaging modality and segmentation model almost completely explained the heterogeneity in specificity and partially explained the heterogeneity in accuracy. Routine WLE combined with a U-Net-based AI-CAD system is therefore recommended for first-line deployment, as it delivers the most stable gain in specificity. This advantage may be attributed to the broader accessibility of WLE data and the abundance of training samples, as well as WLE's preservation of full-spectrum visible-light information. Although WLE lacks the enhanced visualization of mucosal microvasculature and surface texture provided by modalities such as NBI or BLI, it retains critical structural details that may be essential for accurate lesion identification. These characteristics confer significant advantages for the training and deployment of AI-CAD systems. The U-Net model also adapts better than other segmentation architectures to limited data, high noise levels, and high inter-case variability.
In multi-disease, multicenter settings it generates stable outputs and is well suited to early lesion detection tasks with prominent morphologic heterogeneity. While AI-CAD systems showed higher sensitivity, this may be accompanied by increased false-positive rates in some models, potentially leading to unnecessary biopsies or interventions. Future development should focus on optimizing specificity and integrating a two-step human-AI review process to minimize overtreatment risk. A human-machine collaboration strategy in which a high-specificity AI-CAD system performs the initial screen and the endoscopist then conducts a focused review can avoid overtreatment and reduce unnecessary biopsies and surgeries, which is of great significance for the diagnosis and treatment of precancerous lesions. In the future, when hardware permits, multi-stage and ensemble models could also be introduced to re-examine initially positive results, further enhancing specificity. LOO analysis also confirmed that no single study drove the pooled results, indicating that the meta-regression findings were robust.
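The effect of such a two-step review can be approximated analytically: if a case is called positive only when the endoscopist confirms the AI-CAD flag, and the two readers err roughly independently, specificity rises while sensitivity falls. The following small sketch plugs in the pooled estimates from this meta-analysis as illustrative inputs; the conditional-independence assumption is a simplification, not a finding of this study.

```python
def serial_confirmation(sens_ai, spec_ai, sens_reader, spec_reader):
    """Combined operating point when a case is positive only if the
    endoscopist confirms the AI-CAD flag (independence assumed)."""
    sens = sens_ai * sens_reader                          # both must detect it
    spec = 1.0 - (1.0 - spec_ai) * (1.0 - spec_reader)    # both must be fooled
    return sens, spec

# Pooled estimates from this meta-analysis used as plug-in values
sens, spec = serial_confirmation(sens_ai=0.89, spec_ai=0.89,
                                 sens_reader=0.67, spec_reader=0.77)
print(f"combined sensitivity: {sens:.3f}")   # lower than AI-CAD alone
print(f"combined specificity: {spec:.3f}")   # higher than either alone
```

Under these plug-in values, confirmation review pushes specificity above 0.97 but lowers sensitivity to about 0.60, illustrating why such a workflow suits settings where false positives, rather than missed lesions, are the dominant cost.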

In addition, we found that the specificity and accuracy in the BE scenario were lower than for other lesions, and removing the BE studies in the LOO analysis significantly increased the β estimate. This may be due to the complex structure of BE lesions; further improvements in diagnostic efficacy may require multimodal imaging, ensemble AI models, and retraining strategies. In the paired-study scenario, heterogeneity was generally high and sensitivity was lower than in the non-paired scenario. This may reflect the small number of paired studies: because patient-level paired data were unavailable, the covariates required for McNemar or bivariate models could not be constructed, and the two groups had to be analyzed as independent samples under a random-effects model, which may have affected the final results. Only two studies adopted a paired design, limiting the feasibility of more precise paired statistical models and the robustness of direct AI-endoscopist comparisons; this constitutes a limitation of the present meta-analysis and highlights the need for large, multicenter, prospectively paired datasets. Several other factors should also be noted. Although this meta-analysis comprehensively included all eligible studies published up to April 2025, only 10 studies met the inclusion criteria. This limited number may reduce the generalizability of our conclusions, especially when extrapolating to unrepresented populations or healthcare settings. Some included studies had relatively small sample sizes, with the smallest cohort containing only 61 participants; such small samples may introduce statistical instability and increase the potential for random error in pooled effect estimates.
Several studies also adopted retrospective designs, which may introduce selection bias and reduce the internal validity of the pooled results. Moreover, most included studies were conducted in Asia or Europe; the lack of data from other regions may introduce bias, and the absence of evaluation in low-resource settings or across regional disease phenotypes may limit the external generalizability of AI-CAD systems. Finally, our meta-analysis did not address the cost-effectiveness, operational complexity, or workflow impact of AI-CAD systems; these factors are particularly important for adoption in resource-limited settings and should be evaluated in future studies.

Future directions include: (1) Confirmatory large-scale prospective cohorts to mitigate residual bias; (2) Detailed error profiling and threshold optimization to further reduce false positives; and (3) Cost-effectiveness and patient-outcome studies to substantiate real-world value.

CONCLUSION

This meta-analysis of 10 studies (2113 patients in total) confirmed that real-time AI-CAD significantly improves endoscopic recognition of precancerous lesions of the UGI tract. Compared with endoscopists alone, AI-CAD achieved higher pooled sensitivity and specificity, with accuracy increased by 16 percentage points and an AUC greater than 0.92, highlighting its strong diagnostic performance. Heterogeneity analyses indicated that WLE combined with U-Net-based segmentation was the most consistent, high-specificity configuration, providing a practical path for clinical application. Overall, the evidence suggests that integrating AI-CAD into routine EGD can standardize practice, reduce missed lesions, and ultimately support earlier intervention. Large-scale, multicenter paired cohort trials and cost-effectiveness studies are now needed to verify these benefits across populations and healthcare settings.

Footnotes

Provenance and peer review: Invited article; Externally peer reviewed.

Peer-review model: Single blind

Specialty type: Gastroenterology and hepatology

Country of origin: China

Peer-review report’s classification

Scientific Quality: Grade A, Grade B

Novelty: Grade A, Grade B

Creativity or Innovation: Grade B, Grade B

Scientific Significance: Grade A, Grade B

P-Reviewer: Liu QS, MD, Adjunct Professor, Chief Physician, China; Wei QZ, Post Doctoral Researcher, China S-Editor: Qu XL L-Editor: A P-Editor: Zhang L

References
1. Sung H, Ferlay J, Siegel RL, Laversanne M, Soerjomataram I, Jemal A, Bray F. Global Cancer Statistics 2020: GLOBOCAN Estimates of Incidence and Mortality Worldwide for 36 Cancers in 185 Countries. CA Cancer J Clin. 2021;71:209-249.
2. Huang LM, Yang WJ, Huang ZY, Tang CW, Li J. Artificial intelligence technique in detection of early esophageal cancer. World J Gastroenterol. 2020;26:5959-5969.
3. Sexton RE, Al Hallak MN, Diab M, Azmi AS. Gastric cancer: a comprehensive review of current and future treatment strategies. Cancer Metastasis Rev. 2020;39:1179-1203.
4. Suzuki H, Oda I, Abe S, Sekiguchi M, Mori G, Nonaka S, Yoshinaga S, Saito Y. High rate of 5-year survival among patients with early gastric cancer undergoing curative endoscopic submucosal dissection. Gastric Cancer. 2016;19:198-205.
5. Pimenta-Melo AR, Monteiro-Soares M, Libânio D, Dinis-Ribeiro M. Missing rate for gastric cancer during upper gastrointestinal endoscopy: a systematic review and meta-analysis. Eur J Gastroenterol Hepatol. 2016;28:1041-1049.
6. Hyun YS, Han DS, Bae JH, Park HS, Eun CS. Interobserver variability and accuracy of high-definition endoscopic diagnosis for gastric intestinal metaplasia among experienced and inexperienced endoscopists. J Korean Med Sci. 2013;28:744-749.
7. Ebigbo A, Messmann H, Lee SH. Artificial Intelligence Applications in Image-Based Diagnosis of Early Esophageal and Gastric Neoplasms. Gastroenterology. 2025;169:396-415.e2.
8. Wang J, Li Y, Chen B, Cheng D, Liao F, Tan T, Xu Q, Liu Z, Huang Y, Zhu C, Cao W, Yao L, Wu Z, Wu L, Zhang C, Xiao B, Xu M, Liu J, Li S, Yu H. A real-time deep learning-based system for colorectal polyp size estimation by white-light endoscopy: development and multicenter prospective validation. Endoscopy. 2024;56:260-270.
9. Li N, Yang J, Li X, Shi Y, Wang K. Accuracy of artificial intelligence-assisted endoscopy in the diagnosis of gastric intestinal metaplasia: A systematic review and meta-analysis. PLoS One. 2024;19:e0303421.
10. Huang C, Song Y, Dong J, Yang F, Guo J, Sun S. Diagnostic performance of AI-assisted endoscopy diagnosis of digestive system tumors: an umbrella review. Front Oncol. 2025;15:1519144.
11. van der Putten J, de Groof J, Struyvenberg M, Boers T, Fockens K, Curvers W, Schoon E, Bergman J, van der Sommen F, de With PHN. Multi-stage domain-specific pretraining for improved detection and localization of Barrett's neoplasia: A comprehensive clinically validated study. Artif Intell Med. 2020;107:101914.
12. Yan T, Wong PK, Choi IC, Vong CM, Yu HH. Intelligent diagnosis of gastric intestinal metaplasia based on convolutional neural network and limited number of endoscopic images. Comput Biol Med. 2020;126:104026.
13. Struyvenberg MR, de Groof AJ, Fonollà R, van der Sommen F, de With PHN, Schoon EJ, Weusten BLAM, Leggett CL, Kahn A, Trindade AJ, Ganguly EK, Konda VJA, Lightdale CJ, Pleskow DK, Sethi A, Smith MS, Wallace MB, Wolfsen HC, Tearney GJ, Meijer SL, Vieth M, Pouw RE, Curvers WL, Bergman JJ. Prospective development and validation of a volumetric laser endomicroscopy computer algorithm for detection of Barrett's neoplasia. Gastrointest Endosc. 2021;93:871-879.
14. Hussein M, González-Bueno Puyal J, Lines D, Sehgal V, Toth D, Ahmad OF, Kader R, Everson M, Lipman G, Fernandez-Sordo JO, Ragunath K, Esteban JM, Bisschops R, Banks M, Haefner M, Mountney P, Stoyanov D, Lovat LB, Haidry R. A new artificial intelligence system successfully detects and localises early neoplasia in Barrett's esophagus by using convolutional neural networks. United European Gastroenterol J. 2022;10:528-537.
15. Zhao Q, Jia Q, Chi T. Deep learning as a novel method for endoscopic diagnosis of chronic atrophic gastritis: a prospective nested case-control study. BMC Gastroenterol. 2022;22:352.
16. Zhao Q, Chi T. Deep learning model can improve the diagnosis rate of endoscopic chronic atrophic gastritis: a prospective cohort study. BMC Gastroenterol. 2022;22:133.
17. Abdelrahim M, Saiko M, Maeda N, Hossain E, Alkandari A, Subramaniam S, Parra-Blanco A, Sanchez-Yague A, Coron E, Repici A, Bhandari P. Development and validation of artificial neural networks model for detection of Barrett's neoplasia: a multicenter pragmatic nonrandomized trial (with video). Gastrointest Endosc. 2023;97:422-434.
18. Zhao Q, Jia Q, Chi T. U-Net deep learning model for endoscopic diagnosis of chronic atrophic gastritis and operative link for gastritis assessment staging: a prospective nested case-control study. Therap Adv Gastroenterol. 2023;16:17562848231208669.
19. Meinikheim M, Mendel R, Palm C, Probst A, Muzalyova A, Scheppach MW, Nagl S, Schnoy E, Römmele C, Schulz DAH, Schlottmann J, Prinz F, Rauber D, Rückert T, Matsumura T, Fernández-Esparrach G, Parsa N, Byrne MF, Messmann H, Ebigbo A. Influence of artificial intelligence on the diagnostic performance of endoscopists in the assessment of Barrett's esophagus: a tandem randomized and video trial. Endoscopy. 2024;56:641-649.
20. Tao X, Zhu Y, Dong Z, Huang L, Shang R, Du H, Wang J, Zeng X, Wang W, Wang J, Li Y, Deng Y, Wu L, Yu H. An artificial intelligence system for chronic atrophic gastritis diagnosis and risk stratification under white light endoscopy. Dig Liver Dis. 2024;56:1319-1326.
21. de Groof AJ, Struyvenberg MR, Fockens KN, van der Putten J, van der Sommen F, Boers TG, Zinger S, Bisschops R, de With PH, Pouw RE, Curvers WL, Schoon EJ, Bergman JJGHM. Deep learning algorithm detection of Barrett's neoplasia with high accuracy during live endoscopic procedures: a pilot study (with video). Gastrointest Endosc. 2020;91:1242-1250.
22. Zhang Y, Li F, Yuan F, Zhang K, Huo L, Dong Z, Lang Y, Zhang Y, Wang M, Gao Z, Qin Z, Shen L. Diagnosing chronic atrophic gastritis by gastroscopy using artificial intelligence. Dig Liver Dis. 2020;52:566-572.
23. Spadaccini M, Vespa E, Chandrasekar VT, Desai M, Patel HK, Maselli R, Fugazza A, Carrara S, Anderloni A, Franchellucci G, De Marco A, Hassan C, Bhandari P, Sharma P, Repici A. Advanced imaging and artificial intelligence for Barrett's esophagus: What we should and soon will do. World J Gastroenterol. 2022;28:1113-1122.