Copyright
©The Author(s) 2025.
World J Gastroenterol. Oct 28, 2025; 31(40): 111499
Published online Oct 28, 2025. doi: 10.3748/wjg.v31.i40.111499
Table 1 Take-home messages for practicing endoscopists based on current guidelines
| Society | Current position on AI in colonoscopy |
| ESGE[16] | Supports CADe use to reduce AMR, provided false positives are minimal and WT is not prolonged; allows CADx for “leave-in-situ”/“resect and discard” strategies if accuracy is equivalent to that of experts. Recommends AI integration to help standardize performance among less experienced endoscopists |
| BMJ[17] | Recommends against routine CADe use due to concerns over overdiagnosis, minimal clinical benefit, and increased surveillance; recommendation graded as “weak” |
| AGA[19] | Does not issue a definitive recommendation for or against CADe use, citing uncertainty in long-term outcomes |
| WEO[20] | Acknowledges CADe/CADx potential but urges cost-effectiveness assessment; highlights the need for high-quality real-world studies across healthcare systems |
| Ref. | Year | Study type | Patient number | Number of studies | Primary outcome | Artificial intelligence vs conventional colonoscopy (95%CI) |
| Barua et al[37] | 2020 | Systematic review and meta-analysis | 4311 | 5 RCTs | ADR, PDR, APC, PPC, aAPC | ADR: 29.6% vs 19.3%; RR = 1.52 (1.31 to 1.77); PDR: 45.4% vs 30.6%; RR = 1.48 (1.37 to 1.60); No difference in aAPC; PPC: 0.93 vs 0.51; Mean difference 0.42 (0.33 to 0.50); APC: 0.41 vs 0.23; Mean difference 0.18 (0.13 to 0.22) |
| Aziz et al[108] | 2020 | Systematic review and meta-analysis | 2815 | 3 RCTs | ADR | 32.9% vs 20.8%; RR = 1.58 (1.39 to 1.80) |
| Mohan et al[109] | 2020 | Systematic review and meta-analysis | 4962 | 6 RCTs | ADR | 32.8% vs 21.1%; RR = 1.5 (1.30 to 1.72) |
| Ashat et al[110] | 2021 | Systematic review and meta-analysis | 5058 | 6 RCTs | ADR | 33.7% vs 22.9%; OR = 1.76 (1.55 to 2.00) |
| Hassan et al[56] | 2021 | Systematic review and meta-analysis | 4354 | 5 RCTs | ADR | 36.6% vs 25.2%; RR = 1.44 (1.27 to 1.62) |
| Deliwala et al[61] | 2021 | Systematic review and meta-analysis | 4996 | 6 RCTs | ADR, PDR | ADR: OR = 1.77 (1.50 to 2.08); PDR: OR = 1.91 (1.68 to 2.16) |
| Nazarian et al[51] | 2021 | Systematic review and meta-analysis | 5577 subjects for polyp detection with ADR and PDR | 48 studies (18 studies for polyp detection, 22 studies for polyp characterization and 8 studies for PDR) | ADR, PDR, polyp characterization | PDR: OR = 1.75 (1.56 to 1.96); ADR: OR = 1.53 (1.32 to 1.77) |
| Li et al[59] | 2021 | Systematic review and meta-analysis | 4311 | 5 studies | ADR, PDR | PDR: OR = 1.91 (1.68 to 2.16); ADR: OR = 1.75 (1.52 to 2.01) |
| Zhang et al[60] | 2021 | Systematic review and meta-analysis | 5427 | 7 RCTs | ADR, PDR | PDR: OR = 1.95 (1.75 to 2.19); ADR: OR = 1.72 (1.52 to 1.95) |
| Spadaccini et al[48] | 2021 | Systematic review and network meta-analysis | 34445 | 50 RCTs | ADR | CADe vs HD white-light endoscopy OR = 1.78 (1.44 to 2.18); CADe vs chromoendoscopy OR = 1.45 (1.14 to 1.85); CADe vs increased mucosal visualization systems OR = 1.54 (1.22 to 1.94) |
| Huang et al[41] | 2022 | Systematic review and meta-analysis | 6629 | 10 RCTs | ADR, PDR | ADR: 35.4% vs 24.9%; RR = 1.43 (1.33 to 1.53)/OR = 1.45 (1.32 to 1.59); PDR: 48.6% vs 33.8%; RR = 1.44 (1.35 to 1.53)/OR = 1.90 (1.70 to 2.11) |
| Shah et al[62] | 2023 | Systematic review and meta-analysis | 10928 | 14 RCTs | ADR, PDR | ADR: OR = 1.52 (1.39 to 1.67); PDR: OR = 1.48 (1.37 to 1.61) |
| Hassan et al[57] | 2023 | Systematic review and meta-analysis | 18232 | 21 RCTs | ADR, APC, aAPC, number of serrated lesions/colonoscopy, number of polypectomies for nonneoplastic lesions, WT | ADR: 44.0% vs 35.9%; RR = 1.24 (1.16 to 1.33) |
| Lou et al[58] | 2023 | Systematic review and meta-analysis | 27404 | 33 RCTs | ADR, APC | ADR: RR = 1.242 (1.159 to 1.332); APC: IRR = 1.390 (1.277 to 1.513) |
| Adiwinata et al[50] | 2023 | Systematic review and meta-analysis | NA | 13 RCTs | ADR, PDR | PDR: OR = 1.46 (1.13 to 1.89); ADR: OR = 1.58 (1.37 to 1.82) |
| Shiha et al[43] | 2023 | Systematic review and meta-analysis | 11340 | 12 RCTs | ADR | 41.4% vs 33%; RR = 1.26 (1.18 to 1.35) |
| Wei et al[111] | 2024 | Systematic review and meta-analysis | 11660 | 12 studies | ADR | 36.3% vs 35.8%; RR = 1.13 (1.01 to 1.28) |
| Aziz et al[112] | 2024 | Systematic review and network meta-analysis | 61172 | 94 RCTs | ADR | Autofluorescence imaging: RR = 1.33 (1.06 to 1.66); Dye-based chromoendoscopy: RR = 1.32 (1.17 to 1.50); Endo cuff: RR = 1.19 (1.04 to 1.35); Endo cuff vision: RR = 1.26 (1.13 to 1.41); Endoring: RR = 1.30 (1.10 to 1.52); Flexible spectral imaging color enhancement: RR = 1.26 (1.09 to 1.46); Full-spectrum endoscopy: RR = 1.40 (1.19 to 1.65); High definition: RR = 1.41 (1.28 to 1.54); Linked color imaging: RR = 1.21 (1.08 to 1.36); Narrow band imaging: RR = 1.33 (1.18 to 1.48); Water exchange: RR = 1.22 (1.06 to 1.42); Water immersion: RR = 1.47 (1.19 to 1.82) |
| Soleymanjahi et al[18] | 2024 | Systematic review and meta-analysis | 36201 | 44 RCTs | APC, ACN | APC: 0.98 vs 0.78; IRD = 0.22 (0.16 to 0.28); ACN: 0.16 vs 0.15; IRD = 0.01 (-0.01 to 0.02) |
| Patel et al[46] | 2024 | Systematic review and meta-analysis | 9782 | 8 non-randomized studies | ADR | 44% vs 38%; RR = 1.11 (0.97 to 1.28) |
| Gangwani et al[113] | 2024 | Network meta-analysis | 22560 | 26 studies (20 RCTs, 3 retrospective, 3 prospective studies) | ADR | AI vs single operator: RR = 1.1 (0.9 to 1.2) |
| Lee et al[44] | 2024 | Systematic review and meta-analysis | 17413 | 24 RCTs | ADR | Tandem studies: RR = 1.18 (1.08 to 1.30); Parallel studies: RR = 1.26 (1.17 to 1.35); Overall ADR: RR = 1.24 (1.17 to 1.31) |
| Makar et al[40] | 2025 | Systematic review and meta-analysis | 23861 | 28 RCTs | ADR | ADR: RR = 1.20 (1.14 to 1.27) |
| Spadaccini et al[55] | 2025 | Systematic review and meta-analysis | 5421 | 10 RCTs | ADR | 0.62 vs 0.52; RR = 1.19 (1.08 to 1.31) |
| Ref. | Year | Study type | Patient number | Number of studies | Primary outcome | Artificial intelligence vs conventional colonoscopy (95%CI) |
| Hassan et al[57] | 2023 | Systematic review and meta-analysis | 18232 | 21 RCTs | AMR | AMR: 16% vs 35%; RR = 0.45 (0.35 to 0.58) |
| Lou et al[58] | 2023 | Systematic review and meta-analysis | 27404 | 33 RCTs | AMR | AMR: RR = 0.495 (0.390 to 0.627) |
| Maida et al[52] | 2024 | Systematic review and meta-analysis | 1718 | 6 RCTs | AMR, PMR | PMR: 16.3% vs 38.1%; RR = 0.44 (0.33 to 0.60); AMR: 15.3% vs 34.1%; RR = 0.46 (0.38 to 0.55) |
| Makar et al[40] | 2025 | Systematic review and meta-analysis | 23861 | 28 RCTs | AMR | AMR: RR = 0.45 (0.37 to 0.54) |
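The RR values and 95%CIs in the tables above are pooled estimates from meta-analyses. As a minimal sketch of how a risk ratio and its confidence interval are obtained for a single two-arm comparison, the following uses the standard log-scale (Wald-type) interval. The function name and the event counts are hypothetical and are not taken from any cited study; the pooling step used by the meta-analyses is not reproduced here.

```python
import math


def risk_ratio(events_a, total_a, events_b, total_b, z=1.96):
    """Risk ratio of arm A vs arm B with a Wald-type CI on the log scale.

    z=1.96 corresponds to a two-sided 95% confidence interval.
    """
    risk_a = events_a / total_a
    risk_b = events_b / total_b
    rr = risk_a / risk_b
    # Standard error of log(RR) for a single 2x2 table
    se_log_rr = math.sqrt(
        1 / events_a - 1 / total_a + 1 / events_b - 1 / total_b
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical counts: patients with at least one adenoma detected
# in an AI-assisted arm (300/1000) vs a conventional arm (200/1000).
rr, ci_low, ci_high = risk_ratio(300, 1000, 200, 1000)
print(f"RR = {rr:.2f} ({ci_low:.2f} to {ci_high:.2f})")
```

An interval that excludes 1 indicates a statistically significant difference between arms at the chosen level. For miss-rate outcomes such as AMR, an RR below 1 favors the AI arm.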
Table 4 Summary of study characteristics evaluating artificial intelligence performance for invasion depth prediction and polyp characterization in colonoscopy
| Ref. | Year | Study type | Image type | AI algorithm | Patient number training set | Patient number validation set | Patient number testing set | Primary outcome | Sensitivity (%) | Specificity (%) | Accuracy (%) |
| Luo et al[69] | 2021 | Single center retrospective | WLI | Deep learning | 556 | 137 | | Invasion depth Tis/T1a vs T1b/> T2 | 91.2 | 91 | 91.1 |
| Minami et al[70] | 2022 | Single center retrospective | WLI, NBI, CCE | Deep learning | 91 | 49 | 56 | Submucosal invasion depth SM1 vs SM2/3 | 87.2 | 35.7 | 74.4 |
| Lu et al[68] | 2022 | Multicenter retrospective | WLI, NBI, BLI | Deep learning | 305 | 140 | | Invasion depth LGD/HGD/IM/SM1 vs SM2/advanced CRC | 90.0 | 94.2 | 93.8 |
| Nemoto et al[72] | 2023 | Multicenter retrospective | WLI | Deep learning | 1084 | 400 | | Invasion depth Tis/T1a vs T1b | 59.8 | 94.4 | 87.3 |
| Tokunaga et al[73] | 2021 | Single center retrospective | WLI | Deep learning | 824 | 211 | | Invasion depth LGD/HGD/SM1 vs SM2/advanced CRC | 96.7 | 75.0 | 90.3 |
| Nakajima et al[71] | 2022 | Multicenter retrospective | WLI | Deep learning | 313 | 44 | | Invasion depth Tis/T1a vs T1b | 81.0 | 87.0 | 84.0 |
| Song et al[75] | 2020 | Single center retrospective | NBI | Deep learning | 624 | 545 | | Invasion depth SSP/BA/SM1 vs SM2/3 | 58.8 | 93.3 | 81.3 |
| Lui et al[74] | 2019 | Single center retrospective | WLI, NBI | Deep learning | 1652 | 76 | | Invasion depth polyps ≥ 2 cm adenoma/SM1 vs SM2 | 94.6 | 92.3 | 94.3 |
| Yao et al[76] | 2023 | Multicenter retrospective | WLI, IEE | Deep learning | 339 | 198 | | Invasion depth large SSPs ≥ 10 mm | 78.8 | 96.2 | 90.4 |
| Racz et al[66] | 2022 | Single center retrospective | NBI | Machine learning | 279 | | | Polyp characterization non-neoplastic vs neoplastic | 92.2 | 77.6 | 86.6 |
| Ham et al[67] | 2025 | Single center retrospective | WLI | Deep learning | 2696 | 476 | | Polyp characterization low vs high-risk adenomas ≤ 10 mm | 75.6 | 95.7 | 93.8 |
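The sensitivity, specificity, and accuracy columns in Table 4 all derive from the same confusion matrix. The sketch below shows the standard definitions, using hypothetical counts (not data from any cited study) chosen to illustrate how accuracy can remain high while sensitivity is modest when the positive class (for example, deep invasion) is uncommon in the test set. The function name is illustrative only.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return sensitivity, specificity and accuracy from confusion-matrix counts.

    tp/fn: positive cases (e.g., deeply invasive lesions) called correctly/missed.
    tn/fp: negative cases called correctly/incorrectly.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
    }


# Hypothetical test set: 50 positive lesions, 200 negative lesions.
metrics = diagnostic_metrics(tp=30, fp=10, tn=190, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

With these counts, sensitivity is 60% while accuracy is 88%, because the larger negative class dominates the accuracy calculation. Accuracy alone can therefore overstate performance on the clinically critical class.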
Table 5 Practical tips for incorporating artificial intelligence into endoscopy training
| Practical tips for trainees using AI in colonoscopy |
| Use AI as a complement, not a replacement for clinical judgment |
| Review false positive alerts to learn distinguishing features |
| Combine AI with feedback from supervisors |
| Practice with and without AI assistance |
| Incorporate AI into video-based self-review |
| Be aware of potential deskilling with overuse |
| Participate in structured training programs including AI modules |
- Citation: Dimopoulou K, Spinou M, Ioannou A, Nakou E, Zormpas P, Tribonias G. Artificial intelligence in colonoscopy: Enhancing quality indicators for optimal patient outcomes. World J Gastroenterol 2025; 31(40): 111499
- URL: https://www.wjgnet.com/1007-9327/full/v31/i40/111499.htm
- DOI: https://dx.doi.org/10.3748/wjg.v31.i40.111499
