Copyright
©The Author(s) 2025.
World J Gastroenterol. Sep 28, 2025; 31(36): 110742
Published online Sep 28, 2025. doi: 10.3748/wjg.v31.i36.110742
Table 1 Summary of artificial intelligence in esophageal diseases
Disease | Application | Ref. | Study design | Country/region | Modality | Test set | AI model | Main findings |
BE | Diagnosis | Rosenfeld et al[19] | R | United Kingdom | Questionnaires | 1299 patients | ML | ML model with 8 factors (e.g., age) predicts BE (AUC 0.86/0.81), facilitating high-risk screening |
BE | Diagnosis | Abdelrahim et al[20] | P | Europe | WLI | 75 patients | CNN | An AI system detected Barrett's neoplasia in real-time endoscopy with 93.8% sensitivity, significantly higher than endoscopists (63.5%) |
BE | Diagnosis | Struyvenberg et al[21] | R | Europe | NBI | 157 videos | CNN | Developed a DL-based CAD system for Barrett's neoplasia in NBI videos: 83% accuracy, 85% sensitivity, 83% specificity, processing at 38 fps |
BE | Diagnosis | Hashimoto et al[22] | R | United States | WLI, NBI | 1832 images | CNN | AI detects Barrett's early neoplasia at 95.4% accuracy, 96.4% sensitivity via CNN, with real-time lesion localization |
Esophageal carcinoma, ESCC | Diagnosis | Tokai et al[23] | R | Japan | WLI, NBI | 2042 images | CNN | AI outperformed 13 endoscopists in assessing ESCC invasion depth (accuracy: 80.9%; AUC 0.7873), demonstrating superior diagnostic capability |
Esophageal carcinoma, ESCC | Diagnosis | Fukuda et al[24] | R | Japan | NBI, BLI | 28333 images | CNN | AI outperformed endoscopists in ESCC detection sensitivity (91% vs 79%) and characterization accuracy (88% vs 75%) |
Esophageal carcinoma, ESCC | Diagnosis | Li et al[25] | R | China | WLI, NBI | 759 patients | DL | CAD-NBI surpasses CAD-WLI in accuracy/specificity for early ESCC (P < 0.05). Endoscopist combination yields optimal diagnosis (94.9% accuracy, 92.4% sensitivity, 96.7% specificity) |
Esophageal carcinoma, ESCC | Diagnosis | Guo et al[26] | R | China | NBI | 13144 images | DL | This DL model demonstrates high sensitivity (image 98.04%, video 96.1%) and specificity (image 95.03%, video 99.9%) in real-time diagnosis of esophageal precancerous and early SCC |
Esophageal carcinoma, ESCC | Diagnosis | Ohmori et al[27] | R | Japan | WLI, NBI/BLI, ME | 21597 images | CNN | AI detected ESCC via non-magnifying endoscopy (NBI/BLI) with 100% sensitivity. With magnification, accuracy reached 83%, comparable to expert endoscopists |
Table 2 Summary of artificial intelligence in gastric diseases
Disease | Application | Ref. | Study design | Country/region | Modality | Test set | AI model | Main findings |
H. pylori infection | Diagnosis | Martin et al[43] | R | United States | Gastric biopsy | 406 patients | CNN | DCNNs accurately recognize gastric pathology damage patterns, especially H. pylori gastritis, and serve as effective screening tools |
H. pylori infection | Diagnosis | Mohan et al[44] | R | China, Japan | WLI, BLI, LCI | - | CNN | For H. pylori infection diagnosis, CNN achieved 87% accuracy, sensitivity, and specificity, comparable to endoscopists (82.9% accuracy) |
H. pylori infection | Diagnosis | Nakashima et al[45] | R | Japan | LCI, WLI | 515 patients | CNN | Developed LCI/DL-based CAD classifying H. pylori infection into uninfected, active, and post-eradication statuses with 84.2%, 82.5%, 79.2% accuracy. Outperforms WLI and matches expert endoscopists |
Gastric polyp | Diagnosis | Yuan et al[46] | R | China | WLI | 9443 patients | DCNNs | AI achieved 96.2% accuracy and 88.0% sensitivity for gastric polyps. With AI, junior endoscopists' accuracy significantly improved (96.9%→97.6%), matching seniors |
Gastric polyp | Diagnosis | Cao et al[47] | R | China | Gastroscopic imaging | 2270 images | DL | Improved YOLOv3 with feature fusion boosts small polyp detection in gastroscopic images to 91.6% accuracy, resolving complex background interference |
GC | Diagnosis | Horiuchi et al[48] | R | Japan | ME-NBI | 2828 images | CNN | CNN system distinguishes EGC from gastritis (sensitivity 95.4%, NPV 91.7%, accuracy 85.3%), aiding clinical diagnosis |
GC | Diagnosis | Li et al[39] | P & R | China | ME-NBI | 20341 images | CNN | ME-NBI-based CNN achieves 90.91% accuracy for early GC; 91.18% sensitivity (superior to experts), 90.64% specificity (comparable); overall outperforms non-experts |
GC | Diagnosis | Bu et al[49] | P | China | Liquid biopsy | 150 samples | ML | NanoFisher, developed for efficient plasma EV isolation and combined with metabolomics and machine learning, achieves 92% accuracy in EGC diagnosis |
GC | Treatment | Wang et al[50] | R | China | CECT | 244 patients | ML | CT radiomics distinguishes T2 from T3/T4 GC, guiding neoadjuvant chemotherapy selection |
GC | Treatment | Shang et al[51] | R | China | CECT | 311 images | DL | A nomogram built from radiomic and DL features via automated spleen segmentation effectively predicts GC serosal invasion, providing a noninvasive tool for surgical planning |
GC | Treatment | Kang et al[52] | R | South Korea | CT, WLI, biopsy | 2927 patients | DL | Developed a Transformer-based multimodal AI system integrating endoscopic images and clinical data. Accurately predicts EGC lymph node metastasis risk (AUC 0.908), guiding treatment decisions |
GC | Treatment | Chen et al[53] | R | China | Laparoscopic surgery video | 2460 images | DL | Developed AI models that accurately identify perigastric vessels, enhancing safety and reducing bleeding risks in laparoscopic gastrectomy |
GC | Prognosis | Zhang et al[54] | R | China | CT | 669 patients | DCNN | A CT-based nomogram integrating radiomics and clinical factors (e.g., CEA) effectively predicts early recurrence in advanced GC preoperatively (AUC 0.806-0.831) |
GC | Prognosis | Dong et al[55] | R | China, Italy | CECT | 730 patients | DL | DLRN accurately predicts GC lymph node metastasis number (C-index 0.797-0.822), outperforming clinical staging and correlating significantly with survival |
GC | Prognosis | Huang et al[56] | R | China | CECT | 205 patients | ML | Developed an ML nomogram combining clinical factors (T/N stage) and CT radiomics to predict gastric cancer PNI (validation AUC 0.885), aiding prognosis |
Table 3 Summary of artificial intelligence in intestinal diseases
Disease | Application | Ref. | Study design | Country/region | Modality | Test set | AI model | Main findings |
Crohn's disease | Diagnosis | Li et al[66] | R | China | CTE, histopathology | 167 patients | ML | A validated CTE-based radiomics model accurately distinguishes moderate-severe from none-mild fibrosis in Crohn's bowel walls, significantly outperforming radiologists' visual assessments |
Crohn's disease | Diagnosis | Majtner et al[67] | P | Denmark | pan-CE | 7744 images | DL | Auto-detects Crohn's ulcers with 98.4% accuracy. Comparable small/large bowel accuracy (98.5% vs 98.1%). Distinguishes severity (κ = 0.72) |
Crohn's disease | Diagnosis | Klang et al[68] | R | Israel | CE | 27892 images | CNN | DL model detects Crohn's enteric strictures at 93.5% accuracy (AUC 0.989), precisely distinguishing strictures from ulcers (including severe), enabling automated CE diagnosis |
Crohn's disease | Treatment | Konikoff et al[69] | R | Israel | APCT | 101 patients | ML | Developed an ML model using indicators, such as NLR, to predict Crohn's complication risk in emergency CT (AUC 0.774), enabling risk stratification to reduce unnecessary scans |
Crohn's disease | Treatment | Con et al[70] | R | Australia | Serum biomarkers | 146 patients | DL | RNN with serial biomarkers (AUC 0.754) outperforms logistic regression (AUC 0.659) in predicting biochemical remission (CRP < 5 mg/L) at 12 months post-anti-TNF therapy in Crohn's disease |
Crohn's disease | Prognosis | Ungaro et al[71] | P | United States, Canada | PEA | 265 patients | ML | ML model identified 5 plasma protein markers for penetrating (B3) and 4 for stricturing (B2) complications, outperforming traditional models (B3 AUC 0.79) |
Crohn's disease | Prognosis | Stidham et al[72] | R | United States | Laboratory examination | 2809 patients | ML | Machine learning leveraging routine longitudinal lab data predicts Crohn's surgical risk (AUC 0.78) |
SINENs | Diagnosis | Kjellman et al[73] | P | Northern Europe | PET/CT, MRI etc. | 278 patients | ML | Multi-plasma protein markers with Random Forest boost SI-NETs diagnosis (Sensitivity 89%, Specificity 91%, AUC 0.99) |
SINENs | Diagnosis | Clift et al[74] | R | United Kingdom | EHR | 382 patients | ML | XGBoost using primary care EHRs effectively identifies undiagnosed high-risk SI-NET patients (AUC 0.869) |
Table 4 Summary of artificial intelligence in colorectal diseases
Disease | Application | Ref. | Study design | Country/region | Modality | Test set | AI model | Main findings |
Ulcerative colitis | Diagnosis | Sutton et al[99] | R | Norway | Endoscopy | 8000 images | CNN | AI (especially DenseNet121 model) accurately distinguishes UC from non-UC pathology (AUC 0.999) and grades endoscopic activity (mild/severe, AUC 0.90) |
Ulcerative colitis | Diagnosis | Lo et al[100] | R | Denmark | WLI | 1484 images | CNN | The CNN model achieved 84% accuracy in distinguishing UC endoscopic severity (Mayo score 0-3), significantly outperforming existing models and standardizing clinical assessment |
Ulcerative colitis | Diagnosis | Ruan et al[101] | R | China | Colonoscope | 1772 patients | CNN | AI model detects UC/CD at 99.1% accuracy vs physicians' 78%-92.2%, enhancing clinical efficiency |
Ulcerative colitis | Diagnosis | Gutierrez Becker et al[102] | R | Europe etc. | Endoscopy | 1672 videos | CNN | This model directly analyzes raw colonoscopy videos, automatically assessing UC severity (MCES) with high accuracy (AUC 0.84-0.85), reducing manual annotation needs |
Ulcerative colitis | Treatment | Bossuyt et al[103] | P | Belgium, Japan | Endoscopy | 100 patients | ML | The RD algorithm objectively evaluates UC endoscopic and histologic activity, closely correlated with RHI (r = 0.74), and is sensitive for monitoring therapeutic response |
Ulcerative colitis | Treatment | Iacucci et al[104] | P | Europe, North America | HD-WLE, VCE | 283 patients | CNN | AI system accurately differentiates UC endoscopic activity/remission (AUC 0.94) and predicts histological remission (83% accuracy), comparable to physicians |
Ulcerative colitis | Prognosis | Huang et al[105] | R | China | Colonoscope | 856 images | DL, ML | DL/ML-CAD diagnoses mucosal healing (MES 0-1) with 94.5% accuracy and complete healing (MES 0) at 89.0% |
Ulcerative colitis | Prognosis | Popa et al[106] | R | Romania | Colonoscope | 55 patients | ML | ML models accurately predict endoscopic disease activity one year after anti-TNFα therapy in UC patients (90% accuracy in test set, 100% in validation set) |
Ulcerative colitis | Prognosis | Takenaka et al[107] | P | Japan | Endoscopy | 2012 patients | DNN | DNN achieves 90.1% accuracy for endoscopic remission and 92.9% for histological remission (UC), reducing biopsy needs |
Ulcerative colitis | Prognosis | Maeda et al[108] | P | Japan | Endocytoscope, NBI | 145 patients | ML | Real-time AI endoscopy predicts relapse risk in UC remission by analyzing mucosal microvessels (AI-Active 28.4% vs AI-Healing 4.9%, P < 0.001) |
Colorectal polyps | Diagnosis | Wang et al[109] | R | China | Colonoscope | 1600 patients | CNN | Enhanced GAP model achieves > 98% accuracy (TPR > 96%, TNR > 98%) for colon polyp detection with reduced parameters, enabling lightweight yet accurate diagnosis |
Colorectal polyps | Diagnosis | Song et al[89] | P & R | South Korea | NBI | 1169 samples | DL | CAD with NBI predicts polyp histology at 81.3%-82.4% accuracy, outperforming junior physicians (63.8%-71.8%) and matching experts (82.4%-87.3%), enhancing junior diagnostic performance |
Colorectal polyps | Diagnosis | Jin et al[110] | P & R | South Korea | NBI | 2450 images | CNN | AI assistance significantly boosts endoscopists' (especially novices') accuracy for small polyps (< 5 mm) (73.8%→85.6%) and reduces time (3.92→3.37 seconds/polyp) |
Colorectal polyps | Diagnosis | Sakamoto et al[111] | R | Japan | WLI, LCI, BLI | 1788 images | DL | CADe sensitivity > 94% (WLI/LCI), CADx accuracy > 93% (WLI/BLI), rivaling expert endoscopists |
Colorectal polyps | Diagnosis | Zachariah et al[112] | R | United States | WLI, NBI | 6223 images | CNN | CNN real-time prediction of colorectal polyp pathology meets PIVI standards: 97% adenoma NPV, > 93% surveillance interval concordance |
Colorectal polyps | Treatment | Wickstrøm et al[113] | R | Europe | Colonoscope | 912 images | FCNs | CNNs rely on polyp shape/edges for segmentation; error risk rises significantly in uncertain areas. FCNs combine uncertainty with interpretability visualization, helping doctors pinpoint high-risk regions fast |
Colorectal polyps | Treatment | Su et al[114] | P | China | Colonoscope | 659 patients | DCNN | AQCS significantly boosts adenoma detection (28.9% vs 16.5%, P < 0.001), polyp detection, and optimizes withdrawal time and bowel prep during colonoscopy |
CRC | Diagnosis | Luo et al[115] | P & R | China | Liquid biopsy | 3315 patients | ML | Plasma ctDNA methylation markers (e.g., cg10673833) enable early CRC diagnosis (AUC 0.96) and high-risk group screening (Sensitivity 89.7%) |
CRC | Diagnosis | Arabameri et al[116] | R | France, United States, Austria | Fecal microbiota analysis | 350 patients | ML | Combining GRNN and DBFS (new feature selection) identified 6 key microbial markers (e.g., Clostridium), enabling high-precision CRC detection (AUC 0.911) but insufficient adenoma sensitivity (AUC 0.724) |
CRC | Diagnosis | Zeng et al[117] | P | United States | OCT | 26000 images | CNN | PR-OCT system using OCT and RetinaNet distinguishes CRC from normal tissue in real-time with high accuracy (sensitivity 100%, specificity 99.7%, AUC 0.998) |
CRC | Treatment | Wang et al[118] | R | China | MRI | 240 patients | Faster R-CNN | Faster R-CNN detects positive CRM on rectal cancer pre-op high-resolution MRI with 93.2% accuracy (AUC 0.953); 0.2 seconds/image, highly feasible and efficient |
CRC | Treatment | Yang et al[119] | R | China | MRI | 89 patients | ML | Pre-treatment ADC radiomics predicts locally advanced rectal cancer resistance to neoadjuvant chemoradiotherapy (AUC 0.83/91.3% accuracy) |
CRC | Treatment | Fu et al[120] | R | United States | MRI | 43 patients | DL | DL-based radiomics significantly outperform handcrafted features in predicting neoadjuvant chemoradiotherapy response for locally advanced rectal cancer (AUC 0.73 vs 0.64) |
CRC | Prognosis | Xu et al[121] | R | China | CT, MRI etc. | 999 patients | ML | GradientBoosting and LightGBM effectively predict stage IV CRC recurrence risk (AUC up to 0.881); key factors: chemotherapy, age, LogCEA, CEA, anesthesia duration |
CRC | Prognosis | Zhao et al[122] | R | China | Clinical data | 7205 patients | ML | ML-based NCDB nomogram predicts metastatic rectal cancer 3-year OS (C-index > 0.77, internal/external validation), outperforming prior models |
CRC | Prognosis | Reichling et al[123] | R | France | IHC, WSI | 1018 patients | ML | DGMuneS integrating tumor-stroma/CD3+/tumor features better predicts stage III colon cancer recurrence than traditional immune scores (C-index 0.601 vs 0.578) |
CRC | Prognosis | Skrede et al[124] | R | Norway, United Kingdom | H&E staining | 2467 patients | CNN | Developed a DL-based prognostic biomarker (DoMore v1-CRC) using only routine H&E-stained slides, effectively stratifying Stage II/III CRC risk and outperforming existing markers |
CRC | Prognosis | Väyrynen et al[125] | P | United States | H&E staining | 1504 samples | ML | ML on H&E slides links dense stromal lymphocytes/eosinophils and their peri-tumoral localization to significantly improved CRC-specific survival |
Table 5 Summary of artificial intelligence in hepatic disease
Disease | Application | Ref. | Study design | Country/region | Modality | Test set | AI model | Main findings |
Hepato-cirrhosis | Diagnosis | Rhyou and Yoo[146] | R | South Korea | US | 4950 images | DL | A patch-based DL network for ultrasound cirrhosis diagnosis using synthetic image augmentation, achieving 99.95% accuracy, 100% sensitivity, and 99.9% specificity |
Hepato-cirrhosis | Diagnosis | Luetkens et al[147] | R | Germany | MRI | 465 patients | CNN | ResNet50 distinguishes alcoholic from nonalcoholic cirrhosis on MRI: AUC 0.82, 75% accuracy |
Hepato-cirrhosis | Diagnosis | Chang et al[148] | R | United States | Liver biopsy, FibroScan etc. | 1370 patients | ML | ML models (especially Random Forest) outperform traditional non-invasive tests (e.g., FibroScan, FIB-4) in identifying significant fibrosis and cirrhosis in NAFLD patients |
Hepato-cirrhosis | Diagnosis | Mazumder et al[149] | R | United States | CT, liver biopsy etc. | 351 patients | DL, ML | Combining AI-extracted CT radiomics with routine lab data boosts liver cirrhosis prediction accuracy (AUC 0.84-0.85) |
Hepato-cirrhosis | Diagnosis | Guo et al[150] | P | United Kingdom | NMR spectroscopy | 64005 patients | ML | A plasma metabolomics and ML-based nomogram accurately predicts 10-year hepatic cirrhosis complication risk (AUC 0.861), outperforming conventional metrics |
Hepatic encephalopathy, HE | Diagnosis | Calvo Córdoba et al[151] | P | Spain | VOG | 47 patients | SVM | Automated VOG with SVM detects MHE in 7-10 minutes (93% sensitivity/specificity), outperforming PHES (25-40 minutes) |
Hepatic encephalopathy, HE | Diagnosis | Sparacia et al[152] | R | Italy | MRI | 124 patients | ML | MRI radiomics with KNN predicts HE presence (76.5% accuracy); MLP predicts HE severity (≥ stage 2, 94.1%), demonstrating potential for HE diagnosis and staging |
Hepatic encephalopathy, HE | Diagnosis | Chen et al[153] | R | China | MRI | 53 patients | SVM | SVM model using gray matter volume discriminates cirrhosis patients with/without MHE at 83.02% accuracy |
Hepatic encephalopathy, HE | Treatment | Liu et al[154] | R | China | TIPS | 218 patients | ML | A logistic regression model (AUC 0.825) accurately predicts OHE post-TIPS |
Hepatic encephalopathy, HE | Treatment | Zhong et al[155] | R | China | TIPS | 207 patients | ANN | ANN model accurately predicts post-TIPS OHE (C-index = 0.863), with 15.9% incidence within 3 months, providing a clinical stratification tool |
Liver cancer | Diagnosis | Gao et al[156] | R | China | CECT | 723 patients | DL | This model effectively distinguishes malignant liver tumors (HCC, ICC, metastases), achieving 72.6% test-set accuracy and improving physician ICC sensitivity by 26.9% |
Liver cancer | Diagnosis | Xu et al[157] | R | China | CT | 1049 patients | DL | Swin-Transformer model simplifies LI-RADS classification, effectively distinguishes HCC from non-HCC, and enhances diagnostic performance with clinical data |
Liver cancer | Diagnosis | Li et al[158] | R | China | DECT | 262 patients | DL | Dual-energy CT deep-learning radiomics nomogram noninvasively predicts HCC MTM subtype, outperforming clinical-radiologic models (AUC 0.87-0.91) |
Liver cancer | Diagnosis | Ma et al[159] | R | China | CT, MRI | 211 patients | ML | CT/MRI radiomics + clinical features (SVM) achieves highest HCC diagnostic accuracy (82.4%), significantly outperforms single-modality models, and distinguishes HCC vs non-HCC |
Liver cancer | Treatment | Hua et al[144] | R | China | CECT | 151 patients | ML | Developed and validated a pretreatment CT-based radiomics model predicting response and survival outcomes after triple therapy in unresectable HCC to guide clinical decisions |
Liver cancer | Treatment | Xu et al[143] | R | China | CECT | 458 patients | DL | The model significantly outperforms single models (externally validated AUC 0.896), effectively distinguishing survival differences (P < 0.001), providing a tool for personalized treatment |
Liver cancer | Treatment | An et al[160] | R | China | IATs | 2959 patients | ML | Developed MLDSM to risk-stratify unresectable HCC patients undergoing transarterial therapies (alone/combined) for 12-month mortality, guiding clinical treatment decisions (e.g., TACE, HAIC) |
Liver cancer | Prognosis | Cao et al[161] | R | China | CT, MRI etc. | 466 patients | DL, ML | Developed pre-/post-op dual-phase DeepSurv model predicting HCC recurrence post-liver transplant; outperformed Milan Criteria (C-index 0.765-0.839), guiding individualized surveillance |
Liver cancer | Prognosis | Altaf et al[162] | R | Pakistan | CT, MRI etc. | 192 patients | CNN | AI model with tumor size, AFP, and grade predicts post-transplant HCC recurrence risk (validation AUC 0.77) |
Liver cancer | Prognosis | Dong et al[163] | R | United States | Clinical data | 2038 patients | ML | XGBoost model effectively predicts 1-, 3-, and 5-year survival (AUC > 0.7) in AFP-positive HCC patients, outperforming other algorithms, offering a clinical tool for early intervention |
Liver cancer | Prognosis | Yan et al[164] | R | China | MRI | 285 patients | CNN | Deep learning-based nomogram integrating imaging, MVI, and tumor number significantly outperforms traditional models in predicting early HCC recurrence (AUC 0.949 vs 0.751) |
- Citation: Ren SQ, Chen JM, Cai C. Translational artificial intelligence in gastrointestinal and hepatic disorders: Advancing intelligent clinical decision-making for diagnosis, treatment, and prognosis. World J Gastroenterol 2025; 31(36): 110742
- URL: https://www.wjgnet.com/1007-9327/full/v31/i36/110742.htm
- DOI: https://dx.doi.org/10.3748/wjg.v31.i36.110742