Review
Copyright ©The Author(s) 2025.
World J Gastroenterol. Sep 28, 2025; 31(36): 110742
Published online Sep 28, 2025. doi: 10.3748/wjg.v31.i36.110742
Table 1 Summary of artificial intelligence in esophageal diseases

| Disease | Application | Ref. | Study design | Region/country | Modality | Test set | AI model | Main findings |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| BE | Diagnosis | Rosenfeld et al[19] | R | United Kingdom | Questionnaires | 1299 patients | ML | ML model with 8 factors (e.g., age) predicts BE (AUC 0.86/0.81), facilitating high-risk screening |
| | Diagnosis | Abdelrahim et al[20] | P | Europe | WLI | 75 patients | CNN | An AI system detected Barrett's neoplasia in real-time endoscopy with 93.8% sensitivity, significantly higher than endoscopists (63.5%) |
| | Diagnosis | Struyvenberg et al[21] | R | Europe | NBI | 157 videos | CNN | Developed a DL-based CAD system for Barrett's neoplasia in NBI videos: 83% accuracy, 85% sensitivity, 83% specificity, processing at 38 fps |
| | Diagnosis | Hashimoto et al[22] | R | United States | WLI, NBI | 1832 images | CNN | AI detects Barrett's early neoplasia at 95.4% accuracy, 96.4% sensitivity via CNN, with real-time lesion localization |
| Esophageal carcinoma, ESCC | Diagnosis | Tokai et al[23] | R | Japan | WLI, NBI | 2042 images | CNN | AI outperformed 13 endoscopists in assessing ESCC invasion depth (accuracy: 80.9%; AUC 0.7873), demonstrating superior diagnostic capability |
| | Diagnosis | Fukuda et al[24] | R | Japan | NBI, BLI | 28333 images | CNN | AI outperformed endoscopists in ESCC detection sensitivity (91% vs 79%) and characterization accuracy (88% vs 75%) |
| | Diagnosis | Li et al[25] | R | China | WLI, NBI | 759 patients | DL | CAD-NBI surpasses CAD-WLI in accuracy/specificity for early ESCC (P < 0.05). Endoscopist combination yields optimal diagnosis (94.9% accuracy, 92.4% sensitivity, 96.7% specificity) |
| | Diagnosis | Guo et al[26] | R | China | NBI | 13144 images | DL | This DL model demonstrates high sensitivity (image 98.04%, video 96.1%) and specificity (image 95.03%, video 99.9%) in real-time diagnosis of esophageal precancerous and early SCC |
| | Diagnosis | Ohmori et al[27] | R | Japan | WLI, NBI/BLI, ME | 21597 images | CNN | AI detected ESCC via non-magnifying endoscopy (NBI/BLI) with 100% sensitivity. With magnification, accuracy reached 83%, comparable to expert endoscopists |
Table 2 Summary of artificial intelligence in gastric diseases

| Disease | Application | Ref. | Study design | Country/region | Modality | Test set | AI model | Main findings |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| H. pylori infection | Diagnosis | Martin et al[43] | R | United States | Gastric biopsy | 406 patients | CNN | DCNNs accurately recognize gastric pathology damage patterns, especially H. pylori gastritis, and serve as effective screening tools |
| | Diagnosis | Mohan et al[44] | R | China, Japan | WLI, BLI, LCI | - | CNN | For H. pylori infection diagnosis, CNN achieved 87% accuracy, sensitivity, and specificity, comparable to endoscopists (82.9% accuracy) |
| | Diagnosis | Nakashima et al[45] | R | Japan | LCI, WLI | 515 patients | CNN | Developed LCI/DL-based CAD classifying H. pylori infection into uninfected, active, and post-eradication statuses with 84.2%, 82.5%, and 79.2% accuracy, respectively. Outperforms WLI and matches expert endoscopists |
| Gastric polyp | Diagnosis | Yuan et al[46] | R | China | WLI | 9443 patients | DCNNs | AI achieved 96.2% accuracy and 88.0% sensitivity for gastric polyps. With AI, junior endoscopists' accuracy significantly improved (96.9%→97.6%), matching seniors |
| | Diagnosis | Cao et al[47] | R | China | Gastroscopic imaging | 2270 images | DL | Improved YOLOv3 with feature fusion boosts small polyp detection in gastroscopic images to 91.6% accuracy, resolving complex background interference |
| GC | Diagnosis | Horiuchi et al[48] | R | Japan | ME-NBI | 2828 images | CNN | CNN system distinguishes EGC from gastritis (sensitivity 95.4%, NPV 91.7%, accuracy 85.3%), aiding clinical diagnosis |
| | Diagnosis | Li et al[39] | P & R | China | ME-NBI | 20341 images | CNN | ME-NBI-based CNN achieves 90.91% accuracy for early GC; 91.18% sensitivity (superior to experts), 90.64% specificity (comparable); overall outperforms non-experts |
| | Diagnosis | Bu et al[49] | P | China | Liquid biopsy | 150 samples | ML | NanoFisher, developed for efficient plasma EV isolation and combined with metabolomics and machine learning, achieves 92% accuracy in EGC diagnosis |
| | Treatment | Wang et al[50] | R | China | CECT | 244 patients | ML | CT radiomics distinguishes T2 from T3/T4 GC, guiding neoadjuvant chemotherapy selection |
| | Treatment | Shang et al[51] | R | China | CECT | 311 images | DL | A nomogram built from radiomic and DL features via automated spleen segmentation effectively predicts GC serosal invasion, providing a noninvasive tool for surgical planning |
| | Treatment | Kang et al[52] | R | South Korea | CT, WLI, biopsy | 2927 patients | DL | Developed a Transformer-based multimodal AI system integrating endoscopic images and clinical data. Accurately predicts EGC lymph node metastasis risk (AUC 0.908), guiding treatment decisions |
| | Treatment | Chen et al[53] | R | China | Laparoscopic surgery video | 2460 images | DL | Developed AI models that accurately identify perigastric vessels, enhancing safety and reducing bleeding risks in laparoscopic gastrectomy |
| | Prognosis | Zhang et al[54] | R | China | CT | 669 patients | DCNN | A CT-based radiomics nomogram integrating radiomics and clinical factors (e.g., CEA) effectively predicts early recurrence in advanced GC preoperatively (AUC 0.806-0.831) |
| | Prognosis | Dong et al[55] | R | China, Italy | CECT | 730 patients | DL | DLRN accurately predicts GC lymph node metastasis number (C-index 0.797-0.822), outperforming clinical staging and correlating significantly with survival |
| | Prognosis | Huang et al[56] | R | China | CECT | 205 patients | ML | Developed an ML nomogram combining clinical factors (T/N stage) and CT radiomics to predict gastric cancer PNI (validation AUC 0.885), aiding prognosis |
Table 3 Summary of artificial intelligence in intestinal diseases

| Disease | Application | Ref. | Study design | Country/region | Modality | Test set | AI model | Main findings |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Crohn's disease | Diagnosis | Li et al[66] | R | China | CTE, histopathology | 167 patients | ML | A validated CTE-based radiomics model accurately distinguishes moderate-severe from none-mild fibrosis in Crohn's bowel walls, significantly outperforming radiologists' visual assessments |
| | Diagnosis | Majtner et al[67] | P | Denmark | pan-CE | 7744 images | DL | Auto-detects Crohn's ulcers with 98.4% accuracy. Comparable small/large bowel accuracy (98.5% vs 98.1%). Distinguishes severity (κ = 0.72) |
| | Diagnosis | Klang et al[68] | R | Israel | CE | 27892 images | CNN | DL model detects Crohn's enteric strictures at 93.5% accuracy (AUC 0.989), precisely distinguishing strictures from ulcers (including severe), enabling automated CE diagnosis |
| | Treatment | Konikoff et al[69] | R | Israel | APCT | 101 patients | ML | Developed an ML model using indicators such as NLR to predict Crohn's complication risk in emergency CT (AUC 0.774), enabling risk stratification to reduce unnecessary scans |
| | Treatment | Con et al[70] | R | Australia | Serum biomarkers | 146 patients | DL | RNN with serial biomarkers (AUC 0.754) outperforms logistic regression (AUC 0.659) in predicting biochemical remission (CRP < 5 mg/L) at 12 months post-anti-TNF therapy in Crohn's disease |
| | Prognosis | Ungaro et al[71] | P | United States, Canada | PEA | 265 patients | ML | ML model identified 5 plasma protein markers for penetrating (B3) and 4 for stricturing (B2) complications, outperforming traditional models (B3 AUC 0.79) |
| | Prognosis | Stidham et al[72] | R | United States | Laboratory examination | 2809 patients | ML | Machine learning leveraging routine longitudinal lab data predicts Crohn's surgical risk (AUC 0.78) |
| SI-NETs | Diagnosis | Kjellman et al[73] | P | Northern Europe | PET/CT, MRI etc. | 278 patients | ML | Multi-plasma protein markers with Random Forest boost SI-NET diagnosis (sensitivity 89%, specificity 91%, AUC 0.99) |
| | Diagnosis | Clift et al[74] | R | United Kingdom | EHR | 382 patients | ML | XGBoost using primary care EHRs effectively identifies undiagnosed high-risk SI-NET patients (AUC 0.869) |
Table 4 Summary of artificial intelligence in colorectal diseases

| Disease | Application | Ref. | Study design | Country/region | Modality | Test set | AI model | Main findings |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Ulcerative colitis | Diagnosis | Sutton et al[99] | R | Norway | Endoscopy | 8000 images | CNN | AI (especially DenseNet121 model) accurately distinguishes UC from non-UC pathology (AUC 0.999) and grades endoscopic activity (mild/severe, AUC 0.90) |
| | Diagnosis | Lo et al[100] | R | Denmark | WLI | 1484 images | CNN | The CNN model achieved 84% accuracy in distinguishing UC endoscopic severity (Mayo score 0-3), significantly outperforming existing models and standardizing clinical assessment |
| | Diagnosis | Ruan et al[101] | R | China | Colonoscope | 1772 patients | CNN | AI model detects UC/CD at 99.1% accuracy vs physicians' 78%-92.2%, enhancing clinical efficiency |
| | Diagnosis | Gutierrez Becker et al[102] | R | Europe etc. | Endoscopy | 1672 videos | CNN | This model directly analyzes raw colonoscopy videos, automatically assessing UC severity (MCES) with high accuracy (AUC 0.84-0.85), reducing manual annotation needs |
| | Treatment | Bossuyt et al[103] | P | Belgium, Japan | Endoscopy | 100 patients | ML | The RD algorithm objectively evaluates UC endoscopic and histologic activity, closely correlated with RHI (r = 0.74), and is sensitive for monitoring therapeutic response |
| | Treatment | Iacucci et al[104] | P | Europe, North America | HD-WLE, VCE | 283 patients | CNN | AI system accurately differentiates UC endoscopic activity/remission (AUC 0.94) and predicts histological remission (83% accuracy), comparable to physicians |
| | Prognosis | Huang et al[105] | R | China | Colonoscope | 856 images | DL, ML | DL/ML-CAD diagnoses mucosal healing (MES 0-1) with 94.5% accuracy and complete healing (MES 0) at 89.0% |
| | Prognosis | Popa et al[106] | R | Romania | Colonoscope | 55 patients | ML | ML models accurately predict endoscopic disease activity one year after anti-TNFα therapy in UC patients (90% accuracy in test set, 100% in validation set) |
| | Prognosis | Takenaka et al[107] | P | Japan | Endoscopy | 2012 patients | DNN | DNN achieves 90.1% accuracy for endoscopic remission and 92.9% for histological remission (UC), reducing biopsy needs |
| | Prognosis | Maeda et al[108] | P | Japan | Endocytoscope, NBI | 145 patients | ML | Real-time AI endoscopy predicts relapse risk in UC remission by analyzing mucosal microvessels (AI-Active 28.4% vs AI-Healing 4.9%, P < 0.001) |
| Colorectal polyps | Diagnosis | Wang et al[109] | R | China | Colonoscope | 1600 patients | CNN | Enhanced GAP model achieves > 98% accuracy (TPR > 96%, TNR > 98%) for colon polyp detection with reduced parameters, enabling lightweight yet accurate diagnosis |
| | Diagnosis | Song et al[89] | P & R | South Korea | NBI | 1169 samples | DL | CAD with NBI predicts polyp histology at 81.3%-82.4% accuracy, outperforming junior physicians (63.8%-71.8%) and matching experts (82.4%-87.3%), enhancing junior diagnostic performance |
| | Diagnosis | Jin et al[110] | P & R | South Korea | NBI | 2450 images | CNN | AI assistance significantly boosts endoscopists' (especially novices') accuracy for small polyps (< 5 mm) (73.8%→85.6%) and reduces time (3.92→3.37 seconds/polyp) |
| | Diagnosis | Sakamoto et al[111] | R | Japan | WLI, LCI, BLI | 1788 images | DL | CADe sensitivity > 94% (WLI/LCI) and CADx accuracy > 93% (WLI/BLI), rivaling expert endoscopists |
| | Diagnosis | Zachariah et al[112] | R | United States | WLI, NBI | 6223 images | CNN | CNN real-time prediction of colorectal polyp pathology meets PIVI standards: 97% adenoma NPV, > 93% surveillance interval concordance |
| | Treatment | Wickstrøm et al[113] | R | Europe | Colonoscope | 912 images | FCNs | CNNs rely on polyp shape/edges for segmentation; error risk rises significantly in uncertain areas. FCNs combine uncertainty with interpretability visualization, helping doctors pinpoint high-risk regions quickly |
| | Treatment | Su et al[114] | P | China | Colonoscope | 659 patients | DCNN | AQCS significantly boosts adenoma detection (28.9% vs 16.5%, P < 0.001) and polyp detection, and optimizes withdrawal time and bowel prep during colonoscopy |
| CRC | Diagnosis | Luo et al[115] | P & R | China | Liquid biopsy | 3315 patients | ML | Plasma ctDNA methylation markers (e.g., cg10673833) enable early CRC diagnosis (AUC 0.96) and high-risk group screening (sensitivity 89.7%) |
| | Diagnosis | Arabameri et al[116] | R | France, United States, Austria | Fecal microbiota analysis | 350 patients | ML | Combining GRNN and DBFS (new feature selection) identified 6 key microbial markers (e.g., Clostridium), enabling high-precision CRC detection (AUC 0.911) but insufficient adenoma sensitivity (AUC 0.724) |
| | Diagnosis | Zeng et al[117] | P | United States | OCT | 26000 images | CNN | PR-OCT system using OCT and RetinaNet distinguishes CRC from normal tissue in real time with high accuracy (sensitivity 100%, specificity 99.7%, AUC 0.998) |
| | Treatment | Wang et al[118] | R | China | MRI | 240 patients | Faster R-CNN | Faster R-CNN detects positive CRM on rectal cancer pre-op high-resolution MRI with 93.2% accuracy (AUC 0.953); 0.2 seconds/image, highly feasible and efficient |
| | Treatment | Yang et al[119] | R | China | MRI | 89 patients | ML | Pre-treatment ADC radiomics predicts locally advanced rectal cancer resistance to neoadjuvant chemoradiotherapy (AUC 0.83/91.3% accuracy) |
| | Treatment | Fu et al[120] | R | United States | MRI | 43 patients | DL | DL-based radiomics significantly outperform handcrafted features in predicting neoadjuvant chemoradiotherapy response for locally advanced rectal cancer (AUC 0.73 vs 0.64) |
| | Prognosis | Xu et al[121] | R | China | CT, MRI etc. | 999 patients | ML | GradientBoosting and LightGBM effectively predict stage IV CRC recurrence risk (AUC up to 0.881); key factors: chemotherapy, age, LogCEA, CEA, anesthesia duration |
| | Prognosis | Zhao et al[122] | R | China | Clinical data | 7205 patients | ML | ML-based NCDB nomogram predicts metastatic rectal cancer 3-year OS (C-index > 0.77, internal/external validation), outperforming prior models |
| | Prognosis | Reichling et al[123] | R | France | IHC, WSI | 1018 patients | ML | DGMuneS integrating tumor-stroma/CD3+/tumor features better predicts stage III colon cancer recurrence than traditional immune scores (C-index 0.601 vs 0.578) |
| | Prognosis | Skrede et al[124] | R | Norway, United Kingdom | H&E staining | 2467 patients | CNN | Developed a DL-based prognostic biomarker (DoMore v1-CRC) using only routine H&E-stained slides, effectively stratifying stage II/III CRC risk and outperforming existing markers |
| | Prognosis | Väyrynen et al[125] | P | United States | H&E staining | 1504 samples | ML | ML on H&E slides links dense stromal lymphocytes/eosinophils and their peri-tumoral localization to significantly improved CRC-specific survival |
Table 5 Summary of artificial intelligence in hepatic disease

| Disease | Application | Ref. | Study design | Country/region | Modality | Test set | AI model | Main findings |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Hepato-cirrhosis | Diagnosis | Rhyou and Yoo[146] | R | South Korea | US | 4950 images | DL | A patch-based DL network for ultrasound cirrhosis diagnosis using synthetic image augmentation, achieving 99.95% accuracy, 100% sensitivity, and 99.9% specificity |
| | Diagnosis | Luetkens et al[147] | R | Germany | MRI | 465 patients | CNN | ResNet50 distinguishes alcoholic from nonalcoholic cirrhosis on MRI: AUC 0.82, 75% accuracy |
| | Diagnosis | Chang et al[148] | R | United States | Liver biopsy, FibroScan etc. | 1370 patients | ML | ML models (especially Random Forest) outperform traditional non-invasive tests (e.g., FibroScan, FIB-4) in identifying significant fibrosis and cirrhosis in NAFLD patients |
| | Diagnosis | Mazumder et al[149] | R | United States | CT, liver biopsy etc. | 351 patients | DL, ML | Combining AI-extracted CT radiomics with routine lab data boosts liver cirrhosis prediction accuracy (AUC 0.84-0.85) |
| | Diagnosis | Guo et al[150] | P | United Kingdom | NMR spectroscopy | 64005 patients | ML | A plasma metabolomics and ML-based nomogram accurately predicts 10-year hepatic cirrhosis complication risk (AUC 0.861), outperforming conventional metrics |
| Hepatic encephalopathy, HE | Diagnosis | Calvo Córdoba et al[151] | P | Spain | VOG | 47 patients | SVM | Automated VOG with SVM detects MHE in 7-10 minutes (93% sensitivity/specificity), faster than PHES (25-40 minutes) |
| | Diagnosis | Sparacia et al[152] | R | Italy | MRI | 124 patients | ML | MRI radiomics with KNN predicts HE presence (76.5% accuracy); MLP predicts HE severity (≥ stage 2, 94.1%), demonstrating potential for HE diagnosis and staging |
| | Diagnosis | Chen et al[153] | R | China | MRI | 53 patients | SVM | SVM model using gray matter volume discriminates cirrhosis patients with/without MHE at 83.02% accuracy |
| | Treatment | Liu et al[154] | R | China | TIPS | 218 patients | ML | A logistic regression model (AUC 0.825) accurately predicts OHE post-TIPS |
| | Treatment | Zhong et al[155] | R | China | TIPS | 207 patients | ANN | ANN model accurately predicts post-TIPS OHE (C-index = 0.863), with 15.9% incidence within 3 months, providing a clinical stratification tool |
| Liver cancer | Diagnosis | Gao et al[156] | R | China | CECT | 723 patients | DL | This model effectively distinguishes malignant liver tumors (HCC, ICC, metastases), achieving 72.6% test-set accuracy and improving physician ICC sensitivity by 26.9% |
| | Diagnosis | Xu et al[157] | R | China | CT | 1049 patients | DL | Swin-Transformer model simplifies LI-RADS classification, effectively distinguishes HCC from non-HCC, and enhances diagnostic performance with clinical data |
| | Diagnosis | Li et al[158] | R | China | DECT | 262 patients | DL | Dual-energy CT deep-learning radiomics nomogram noninvasively predicts HCC MTM subtype, outperforming clinical-radiologic models (AUC 0.87-0.91) |
| | Diagnosis | Ma et al[159] | R | China | CT, MRI | 211 patients | ML | CT/MRI radiomics + clinical features (SVM) achieves the highest accuracy (82.4%) in distinguishing HCC from non-HCC, significantly outperforming single-modality models |
| | Treatment | Hua et al[144] | R | China | CECT | 151 patients | ML | Developed and validated a pretreatment CT-based radiomics model predicting response and survival outcomes after triple therapy in unresectable HCC to guide clinical decisions |
| | Treatment | Xu et al[143] | R | China | CECT | 458 patients | DL | The model significantly outperforms single models (externally validated AUC 0.896), effectively distinguishing survival differences (P < 0.001), providing a tool for personalized treatment |
| | Treatment | An et al[160] | R | China | IATs | 2959 patients | ML | Developed MLDSM to risk-stratify unresectable HCC patients undergoing transarterial therapies (alone/combined) for 12-month mortality, guiding clinical treatment decisions (e.g., TACE, HAIC) |
| | Prognosis | Cao et al[161] | R | China | CT, MRI etc. | 466 patients | DL, ML | Developed pre-/post-op dual-phase DeepSurv model predicting HCC recurrence post-liver transplant; outperformed Milan Criteria (C-index 0.765-0.839), guiding individualized surveillance |
| | Prognosis | Altaf et al[162] | R | Pakistan | CT, MRI etc. | 192 patients | CNN | AI model with tumor size, AFP, and grade predicts post-transplant HCC recurrence risk (validation AUC 0.77) |
| | Prognosis | Dong et al[163] | R | United States | Clinical data | 2038 patients | ML | XGBoost model effectively predicts 1-, 3-, and 5-year survival (AUC > 0.7) in AFP-positive HCC patients, outperforming other algorithms, offering a clinical tool for early intervention |
| | Prognosis | Yan et al[164] | R | China | MRI | 285 patients | CNN | Deep learning-based nomogram integrating imaging, MVI, and tumor number significantly outperforms traditional models in predicting early HCC recurrence (AUC 0.949 vs 0.751) |