Published online Jan 7, 2026. doi: 10.3748/wjg.v32.i1.112090
Revised: July 30, 2025
Accepted: November 27, 2025
Published online: January 7, 2026
Processing time: 172 Days and 10.8 Hours
The accurate prediction of lymph node metastasis (LNM) is crucial for managing locally advanced (T3/T4) colorectal cancer (CRC). However, both traditional his
To develop and validate a case-level multiple-instance learning (MIL) framework that mimics a pathologist's comprehensive review, and thereby to improve LNM prediction in T3/T4 CRC.
The whole-slide images of 130 patients with T3/T4 CRC were retrospectively collected. A case-level MIL framework utilising the CONCH v1.5 and UNI2-h deep learning models was trained on features from all haematoxylin and eosin-stained primary tumour slides for each patient. These pathological features were subsequently integrated with clinical data, and model performance was evaluated using the area under the curve (AUC).
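The case-level idea described above can be illustrated with a minimal sketch: patch features from every slide of one patient are pooled into a single bag, which is then reduced to one case embedding. The attention-based pooling used here is an assumption chosen for illustration, since the abstract does not specify the aggregator. The function names, the toy dimensions, and the random stand-in features are likewise hypothetical and are not outputs of CONCH v1.5 or UNI2-h.

```python
import numpy as np

def build_case_bag(slide_features):
    """Stack patch features from every slide of one patient into a single bag."""
    return np.concatenate(slide_features, axis=0)

def attention_pool(bag, V, w):
    """Attention-based MIL pooling (assumed aggregator): one weight per patch."""
    scores = np.tanh(bag @ V) @ w            # shape: (n_patches,)
    scores = scores - scores.max()           # subtract max for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()  # softmax, sums to 1
    return weights @ bag, weights            # case embedding, per-patch attention

# Toy data: random vectors stand in for foundation-model patch features.
rng = np.random.default_rng(0)
dim, hidden = 16, 8                          # hypothetical sizes
slides = [rng.normal(size=(n, dim)) for n in (5, 3, 7)]  # three slides, one patient
V = rng.normal(size=(dim, hidden))           # attention parameters (untrained)
w = rng.normal(size=hidden)

bag = build_case_bag(slides)                 # one bag per patient, not per slide
case_embedding, attention = attention_pool(bag, V, w)
```

In a slide-level setup, each of the three arrays in `slides` would instead be pooled and labelled separately. In this sketch, the per-patch `attention` weights are the kind of quantity that could be mapped back onto the slides to highlight high-attention regions.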
The case-level framework outperformed slide-level training for LNM prediction: with the CONCH v1.5 model, the mean AUC (± SD) was 0.899 ± 0.033 for case-level training vs 0.814 ± 0.083 for slide-level training. Integrating pathology features with clinical data further enhanced performance, yielding a top model with a mean AUC of 0.904 ± 0.047, in sharp contrast to a clinical-only model (mean AUC 0.584 ± 0.084). Crucially, a pathologist’s review confirmed that the model-identified high-attention regions correspond to known high-risk histopathological features.
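For readers unfamiliar with the reporting format, the following sketch shows one way a "mean AUC ± SD" figure can be computed. It assumes the AUC is evaluated once per repeated run or cross-validation fold and then summarised; the abstract states neither the resampling scheme nor whether a sample or population SD was used, so the sample SD here is an assumption. The helper names and toy values are hypothetical and are not study data.

```python
import numpy as np

def roc_auc(labels, scores):
    """AUC as the probability that a positive case outscores a negative one.

    Ties between a positive and a negative score count as 0.5.
    """
    labels = np.asarray(labels)
    scores = np.asarray(scores, dtype=float)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]       # every positive-negative pair
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / (len(pos) * len(neg)))

def summarise(aucs):
    """Mean and sample standard deviation (ddof=1) of per-run AUC values."""
    aucs = np.asarray(aucs, dtype=float)
    return float(aucs.mean()), float(aucs.std(ddof=1))

# Toy example: four cases, two with LNM (label 1) and two without (label 0).
toy_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])   # 3 of 4 pairs ranked correctly

# Toy per-run AUCs (not from the study) summarised as mean and SD.
mean_auc, sd_auc = summarise([0.8, 0.9, 1.0])
```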
A case-level MIL framework outperformed slide-level training for predicting LNM in locally advanced (T3/T4) CRC. The method shows promise for risk stratification and therapy decisions but requires further validation.
Core Tip: To better predict lymph node metastasis (LNM) in locally advanced colorectal cancer, this pilot study developed a case-level deep learning framework. By analysing all primary tumour pathology slides of each patient, emulating a pathologist's workflow, the model achieved a mean area under the curve (AUC) of 0.899, outperforming slide-level training (0.814). Integrating clinical data further increased the mean AUC to 0.904. This interpretable approach is a promising tool for refining LNM risk assessments and guiding adjuvant therapy decisions.
