Copyright: ©Author(s) 2026.
World J Gastroenterol. Mar 28, 2026; 32(12): 115990
Published online Mar 28, 2026. doi: 10.3748/wjg.v32.i12.115990
Table 1 Explanation of deep convolutional neural network architectures
| Architecture name | Core innovation/structural features | Relevance in medical field |
| VGGNet | Adopts a concise structure of “stacked small convolutional kernels (3 × 3) + pooling layers”, enhancing feature extraction capability by increasing network depth | A classic model for basic feature extraction in medical images, suitable for preliminary lesion detection and medical image classification (e.g., X-ray disease screening), laying the foundation for subsequent architectures in medical AI |
| ResNet | Introduces “residual connections” (skip connections that add a block's input to its output) to mitigate the vanishing gradient problem in deep network training, enabling the construction of ultra-deep networks | Significantly improves feature extraction accuracy for complex medical images, applicable to pathological section analysis and 3D medical image segmentation (e.g., tumor boundary extraction), serving as a core architecture for disease diagnosis models |
| DenseNet | Employs “dense connections” (each layer receives the feature maps of all preceding layers) to enhance feature propagation efficiency and reduce parameter redundancy | Excels in fine-grained analysis of medical images, such as micro-lesion recognition and multi-modal medical image fusion (e.g., combining CT and MRI images), demonstrating distinct advantages in precision medical diagnosis |
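The structural ideas in Table 1 can be illustrated with a little arithmetic. The following is a minimal sketch in plain Python, not code from the cited architectures: the function names, the channel count of 64, and the growth rate of 32 are hypothetical values chosen for illustration. It shows why stacking 3 × 3 kernels covers a larger receptive field with fewer weights, how a residual connection adds the input back to a block's output, and how the input width grows inside a densely connected block.

```python
def stacked_receptive_field(num_layers, kernel=3):
    """Receptive field of `num_layers` stacked stride-1 convolutions."""
    return 1 + num_layers * (kernel - 1)


def conv_weights(kernel, channels):
    """Weights in one kernel x kernel convolution with `channels` input
    and output channels (biases ignored)."""
    return kernel * kernel * channels * channels


def residual_block(x, f):
    """Residual connection: element-wise x + f(x)."""
    return [xi + fi for xi, fi in zip(x, f(x))]


def dense_block_channels(in_channels, growth_rate, num_layers):
    """Input channels seen by each layer when every layer receives the
    concatenated outputs of all earlier layers (growth_rate per layer)."""
    return [in_channels + i * growth_rate for i in range(num_layers)]


# Two stacked 3 x 3 layers see the same 5 x 5 window as one 5 x 5 layer,
# with fewer weights (64 channels is a hypothetical width).
print(stacked_receptive_field(2))                 # 5
print(2 * conv_weights(3, 64), conv_weights(5, 64))  # 73728 102400

# Hypothetical residual branch that halves its input.
print(residual_block([1.0, 2.0], lambda v: [0.5 * t for t in v]))  # [1.5, 3.0]

# Hypothetical dense block: 64 input channels, growth rate 32, 4 layers.
print(dense_block_channels(64, 32, 4))            # [64, 96, 128, 160]
```

Real implementations operate on multi-channel feature maps rather than flat lists; the lists here only stand in for tensors to keep the sketch dependency-free.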
Table 2 Summary of comparisons of major artificial intelligence-assisted endoscopy systems
| System name | Target site and function | Key performance metrics | Validation status and characteristics |
| Deep learning-based endoscopy systems | Esophagus, stomach: Early cancer detection and diagnosis | Sensitivity for early gastric cancer > 90%, specificity > 80%; detection accuracy for early esophageal cancer comparable to that of expert endoscopists | Mostly in clinical research phase: Validation often involves single-center or retrospective studies; demonstrates potential to match or surpass human experts in specific tasks |
| AI-assisted capsule endoscopy systems | Small bowel: Automatic detection of ulcers, bleeding, polyps, etc. | Sensitivity for small bowel lesions > 95%, specificity > 90%; significantly increases reading speed, reducing physician workload by > 70% | Validated by multicenter prospective studies: Some systems have received regulatory approval and are in clinical use; aims to address the inefficiency of analyzing large capsule endoscopy image volumes |
| CADe colonoscopy systems | Colorectum: Real-time polyp detection (CADe) | Increases adenoma detection rate by an absolute 5%-10%; particularly effective for detecting small polyps (< 5 mm) and flat adenomas | Some systems approved by FDA, CE, NMPA: Supported by the highest level of evidence (multicenter RCTs); integrated into commercial endoscopy platforms; value is pronounced in community practice settings |
| CADx colonoscopy systems | Colorectum: Real-time polyp characterization (CADx) | Accuracy for optical diagnosis of adenomatous polyps > 90%; enables reliable “diagnose-and-leave” or “resect-and-discard” strategies with > 90% confidence | Some features approved and commercialized: Integrated with CADe systems; aims to provide “see-and-diagnose” capability, reducing unnecessary polypectomies and screening costs |
| AI-assisted laryngoscopy/pharyngeal diagnosis systems | Pharynx, larynx: Early cancer detection | Sensitivity for laryngopharyngeal cancer 90%-93%, specificity > 92%; capable of multimodal analysis integrating voice signals and images | Primarily in prospective research or pilot project phase: Sample sizes are relatively small, but shows great promise for multimodal AI in complex anatomical sites |
| AI-assisted high-resolution anoscopy | Anal canal: Detection of high-grade squamous intraepithelial lesions (HSIL) | Shows high accuracy in differentiating HSIL from other conditions; can integrate HPV status for risk prediction | Research is very preliminary and exploratory: Limited sample sizes; a promising tool for screening specific high-risk populations but requires further validation |
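The performance metrics in Table 2 follow standard definitions, which the sketch below makes explicit. It is illustrative only: the function names and all counts are hypothetical and are not data from the studies summarized in the table. Sensitivity and specificity are computed from confusion-matrix counts, and the adenoma detection rate (ADR) gain is expressed as an absolute difference between two arms.

```python
def sensitivity(tp, fn):
    """Fraction of true lesions that the system flags: TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of lesion-free cases correctly left unflagged: TN / (TN + FP)."""
    return tn / (tn + fp)


def absolute_adr_increase(pos_ai, n_ai, pos_ctrl, n_ctrl):
    """Absolute difference in adenoma detection rate between an AI-assisted
    arm and a control arm. ADR = patients with >= 1 adenoma / patients examined."""
    return pos_ai / n_ai - pos_ctrl / n_ctrl


# Hypothetical counts, for illustration only.
sens = sensitivity(tp=92, fn=8)
spec = specificity(tn=170, fp=30)
gain = absolute_adr_increase(pos_ai=180, n_ai=500, pos_ctrl=145, n_ctrl=500)

print(f"sensitivity = {sens:.0%}, specificity = {spec:.0%}")
print(f"absolute ADR increase = {gain * 100:.0f} percentage points")
```

Reporting the ADR gain as an absolute difference, as Table 2 does, avoids confusion with a relative increase: in this hypothetical example, 29% to 36% is a 7-point absolute gain but roughly a 24% relative gain.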
- Citation: Ning ZX, Xiao JJ, Zhou ZX. Artificial intelligence-assisted endoscopy in the detection of early gastrointestinal cancer: Progress, challenges, and future directions. World J Gastroenterol 2026; 32(12): 115990
- URL: https://www.wjgnet.com/1007-9327/full/v32/i12/115990.htm
- DOI: https://dx.doi.org/10.3748/wjg.v32.i12.115990
