Review
Copyright ©The Author(s) 2026.
Artif Intell Gastroenterol. Jan 8, 2026; 7(1): 115498
Published online Jan 8, 2026. doi: 10.35712/aig.v7.i1.115498
Table 1 Application of multimodal data in gastrointestinal tumors
Data type | Core characteristics and key technologies | Main clinical application scenarios | AI empowerment and value
Imaging data | CT: High spatial resolution, rapid imaging, morphological analysis; MRI: Excellent soft tissue contrast (DWI, DCE), microenvironment assessment; PET: High metabolic sensitivity (SUV), assessment of biological activity | Tumor localization, staging, efficacy evaluation, recurrence monitoring | AI application: Automatic segmentation based on CNN; radiomics feature mining. Value: Improves diagnostic consistency, predicts efficacy and metastasis risk
Endoscopic data | Provides HD real-time visualization of the mucosal layer; chromo/electronic staining enhances contrast | Early screening and diagnosis (e.g., early gastric cancer, colorectal polyp detection) | AI application: CNN models for automatic lesion identification, classification, and depth assessment. Value: Increases early detection rate, assists treatment decisions
Omics data | Genomics: Reveals driver mutations (e.g., HER2). Transcriptomics/proteomics/metabolomics: Reflects gene expression, protein function, metabolic status | Deciphering tumor heterogeneity, predicting treatment response and prognosis, facilitating personalized therapy | AI application: Feature selection and dimension reduction; multimodal fusion (e.g., GNN model StereoMM, drug response prediction model DROEG). Value: Mines molecular mechanisms, enables precise typing, predicts drug sensitivity
Table 2 Core framework of multimodal data fusion technologies
Core stage | Key methods/technologies | Core challenges & solutions | Primary application value
Data preprocessing & standardization | Imaging data: N4 bias field correction, CLAHE, SMORE; Text data: Tokenization, word embedding, LLMs (e.g., BioBERT, GPT-4o); Standardization: Z-score, batch normalization, FHIR standard | Challenges: Data heterogeneity, missing values, noise, privacy. Solutions: Dedicated preprocessing, automated tools, unified standards (e.g., FHIR) | Improves data quality & consistency, lays foundation for fusion
Fusion strategy | Early fusion (data-level): Directly concatenates raw data. Middle fusion (feature-level): Multi-stream CNN, attention mechanism, GNNs. Late fusion (decision-level): Weighted averaging, voting, meta-learning | Challenges: Data heterogeneity, inter-modal relationships, information loss. Solutions: Select/combine strategies based on data traits and task goals (e.g., using attention to capture cross-modal dependencies) | Integrates multi-source complementary information, enhances model robustness & prediction accuracy
Model training & validation | Training techniques: Data augmentation, handling missing values, regularization, early stopping; Validation methods: K-fold cross-validation, external validation, multi-center validation; Evaluation metrics: ACC, AUC, sensitivity, specificity, F1-score | Challenges: Data imbalance, overfitting, generalization. Solutions: Employ rigorous internal/external validation, use explainable AI (e.g., SHAP) to enhance trust | Ensures model reliability, stability, and clinical applicability, promotes clinical translation
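As a minimal sketch of three items listed in Table 2 (Z-score standardization, decision-level fusion by weighted averaging, and the threshold-based evaluation metrics), the following Python example uses only the standard library. The modality names, probabilities, weights, and labels are hypothetical values chosen for illustration; they are not taken from any study cited in this review, and in practice the weights would typically be tuned on a validation set.

```python
from statistics import mean, pstdev


def z_score(values):
    """Standardize one feature to zero mean and unit (population) SD."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def late_fusion(prob_by_modality, weights):
    """Decision-level fusion: weighted average of per-modality probabilities."""
    total = sum(weights[m] for m in prob_by_modality)
    n_patients = len(next(iter(prob_by_modality.values())))
    return [
        sum(weights[m] * probs[i] for m, probs in prob_by_modality.items()) / total
        for i in range(n_patients)
    ]


def binary_metrics(y_true, y_prob, threshold=0.5):
    """Accuracy, sensitivity, specificity, and F1 at a fixed threshold."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        return num / den if den else 0.0

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


# Hypothetical per-modality malignancy probabilities for four patients.
probs = {
    "ct": [0.9, 0.2, 0.6, 0.4],
    "endoscopy": [0.8, 0.4, 0.3, 0.1],
    "omics": [0.7, 0.1, 0.8, 0.6],
}
# Hypothetical modality weights (assumed, not from the source).
weights = {"ct": 0.5, "endoscopy": 0.3, "omics": 0.2}
labels = [1, 0, 1, 1]  # hypothetical ground truth

fused = late_fusion(probs, weights)
metrics = binary_metrics(labels, fused)
print(fused)
print(metrics)
```

Late fusion keeps each modality's model independent, so a missing modality can be handled by dropping its entry and renormalizing the remaining weights, which `late_fusion` does through the `total` term. AUC is omitted here because it requires ranking across thresholds rather than a single cut-off.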
Table 3 Clinical applications of multimodal artificial intelligence in personalized gastrointestinal cancer therapy
Application area | Core function | Key technologies/data | Primary value
Intelligent diagnosis & staging | Early screening & precise staging: Enhances tumor identification and classification, predicts metastasis risk | Imaging data: CT, EUS, PET/CT; Omics data: Radiomics, genomics; Clinical data: EHR | Increases early detection rates, reduces missed diagnoses; enables more accurate preoperative staging to inform treatment decisions
Treatment optimization | Treatment response prediction: Guides the selection of surgery, radiotherapy, chemotherapy, and targeted/immunotherapy regimens | Multimodal fusion models: e.g., MuMo model; Data integration: Radiomics, genomics, immunomics, tumor microbiome | Accurately predicts efficacy, avoids unnecessary treatments; guides personalized medication (e.g., targeted drug combinations) to overcome drug resistance and improve response rates
Prognostic assessment & follow-up management | Risk stratification & recurrence prediction: Precisely assesses patient survival and recurrence risk. Dynamic follow-up management: Enables personalized long-term monitoring | Prognostic models: Integrate clinical, imaging, genomic data. Intelligent systems: Clinical Decision Support Systems, EHR analysis | Enables precise risk stratification to guide adjuvant therapy; improves follow-up efficiency, provides timely recurrence alerts, and optimizes resource allocation
Table 4 Challenges and future directions of multimodal artificial intelligence in gastrointestinal cancer therapy
Core challenges | Key technologies/methods | Future directions
Data quality & privacy protection: Data heterogeneity (divergent formats/standards); data noise (equipment/operator variations). Patient privacy risks (esp. genomic/imaging data) | Data standardization: Common data models (e.g., OMOP CDM, medical imaging CDM); Privacy-preserving techniques: FL, DP, blockchain; Legal compliance: Frameworks like GDPR to enhance policy transparency | To build a more secure and reliable data environment, promoting seamless integration and controlled sharing of high-quality data
Model interpretability & clinical acceptability: "Black-box" problem erodes clinical trust. Opaque decision-making hinders regulatory approval & integration | Explainable AI: Attention mechanisms, prototype networks (ProtoPNet), counterfactual explanations; Interpretability tools: LIME, SHAP, Grad-CAM for visualization & feature importance ranking; Clinical integration: Displaying model uncertainty & key decision factors in CDSS | To develop transparent and trustworthy AI systems, enhance clinician trust, and promote deep integration of AI into clinical workflows
Multi-center collaboration & standardization: Significant data heterogeneity across centers (equipment, protocols, populations). Poor model generalizability, hindering cross-institutional application | Multi-center data sharing & standardization: Unified data formats and acquisition standards; Privacy-preserving collaborative training: Federated learning for joint modeling; Standardized multimodal databases: Integrating genomics, radiomics, and other multidimensional data | To promote large-scale, high-quality multi-center collaboration, establish industry standards, and improve model generalizability and clinical applicability
Technical integration & clinical translation: Reliance on large annotated datasets limits generalizability. Barriers in translating research findings to clinical application | Emerging ML paradigms: RL for dynamic treatment optimization; SSL to reduce annotation dependency; Integrating novel data types: e.g., digital pathology, patient behavior data; Robust clinical validation: Validating model efficacy and robustness through clinical trials and RWD | To integrate multimodal AI with cutting-edge technologies and validate it through rigorous clinical trials, ultimately enabling its routine use in personalized therapy
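Table 4 lists federated learning (FL) as a privacy-preserving route to multi-center training. One widely used aggregation rule is federated averaging (FedAvg), in which each center trains locally and shares only model parameters, which a coordinator averages weighted by local sample size. The sketch below assumes FedAvg purely for illustration; the table does not specify an algorithm, and the center sizes and parameter vectors are hypothetical. Real deployments usually add secure aggregation or differential-privacy noise, which is omitted here.

```python
def federated_average(client_updates):
    """Aggregate local model parameters, weighted by local sample count.

    client_updates: list of (n_samples, parameters) tuples, one per center.
    Only parameters leave each center; patient-level data stays local.
    """
    if not client_updates:
        raise ValueError("need at least one client update")
    total = sum(n for n, _ in client_updates)
    dim = len(client_updates[0][1])
    if any(len(params) != dim for _, params in client_updates):
        raise ValueError("all centers must share the same model shape")
    return [
        sum(n * params[i] for n, params in client_updates) / total
        for i in range(dim)
    ]


# Hypothetical round: two centers with 100 and 300 patients.
updates = [
    (100, [1.0, 2.0]),
    (300, [3.0, 4.0]),
]
global_params = federated_average(updates)
print(global_params)
```

Weighting by sample count means larger centers pull the global model toward their local optimum, which is one reason the cross-center heterogeneity noted in the table can still degrade generalizability even when data never leave the institution.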
Table 5 Translation roadmap for clinical application of reinforcement learning and self-supervised learning in gastrointestinal tumors
Phase | Timeframe | Core objective | Key technical milestones | Clinical & regulatory milestones
Short-term | 1-3 years | Foundational development & algorithmic validation | (1) Complete SSL model pre-training using large-scale historical data; (2) Construct RL simulation environments based on historical outcomes; and (3) Validate superior predictive accuracy of integrated models vs baselines on retrospective data | (1) Publication of proof-of-concept studies; and (2) Establishment of open-source benchmark datasets and simulation platforms
Mid-term | 3-5 years | Clinical trials in limited settings & system integration | (1) Develop interpretable, human-in-the-loop CDSS; (2) Model outputs serve as assistive decision aids for clinicians; and (3) Validate system usability and clinician acceptance in prospective observational studies | (1) Obtain initial regulatory approval (e.g., as Class II medical device software); and (2) Develop clinical workflow integration guidelines
Long-term | 5+ years | Widespread integration & adaptive learning systems | (1) Achieve multi-center deployment using privacy-preserving techniques (e.g., federated learning); (2) Explore regulated continuous learning and model adaptation; and (3) Conduct large-scale RCTs with OS as a primary endpoint | (1) Confirm clinical benefit through high-level evidence; (2) Establish new standards for individualized care; and (3) Advocate for healthcare reimbursement policy coverage