Luo ZC, Guo HY, Tang X, Chen XR, Zhang CY, Cui YT, Zuo J, Li HR, Hou XM, Chen H, Song SB, Wang XF. Predicting the magnitude of risk for non-curative endoscopic submucosal dissection in superficial esophageal cancer using explainable artificial intelligence. World J Gastrointest Oncol 2026; 18(2): 114782 [DOI: 10.4251/wjgo.v18.i2.114782]
Corresponding Author of This Article
Xian-Fei Wang, Chief Physician, Full Professor, Department of Gastroenterology, Affiliated Hospital of North Sichuan Medical College, No. 1 Maoyuan South Road, Shunqing District, Nanchong 637000, Sichuan Province, China. wangxianfei@nsmc.edu.cn
Research Domain of This Article
Gastroenterology & Hepatology
Article-Type of This Article
Retrospective Study
Open-Access Policy of This Article
This article is an open-access article which was selected by an in-house editor and fully peer-reviewed by external reviewers. It is distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/
Journal Information of This Article
Publication Name
World Journal of Gastrointestinal Oncology
ISSN
1948-5204
Publisher of This Article
Baishideng Publishing Group Inc, 7041 Koll Center Parkway, Suite 160, Pleasanton, CA 94566, USA
World J Gastrointest Oncol. Feb 15, 2026; 18(2): 114782 Published online Feb 15, 2026. doi: 10.4251/wjgo.v18.i2.114782
Predicting the magnitude of risk for non-curative endoscopic submucosal dissection in superficial esophageal cancer using explainable artificial intelligence
Zi-Chen Luo, Hai-Yang Guo, Xin-Rui Chen, Cheng-Yu Zhang, Yu-Tong Cui, Ji Zuo, Hao-Rui Li, Xue-Mei Hou, Hao Chen, Shao-Bi Song, Xian-Fei Wang, Department of Gastroenterology, Affiliated Hospital of North Sichuan Medical College, Nanchong 637000, Sichuan Province, China
Xiao Tang, Department of Gastroenterology, Langzhong People’s Hospital, Langzhong 637400, Sichuan Province, China
Xian-Fei Wang, Department of Gastroenterology, Sichuan Branch of National Clinical Research Center for Digestive Diseases, Nanchong 637000, Sichuan Province, China
Author contributions: Luo ZC and Guo HY contributed equally to this work as co-first authors; Wang XF, Luo ZC, and Guo HY designed the study, analyzed the data, drafted the article, and critically revised the article; Luo ZC, Zhang CY, Guo HY, Chen XR, Cui YT, Zuo J, Li HR, Hou XM, Chen H, Song SB, and Tang X collected the data. All authors have read and approved the final manuscript.
Institutional review board statement: The study was reviewed and approved by the Ethics Committee of The Affiliated Hospital of North Sichuan Medical College, No. 2024ER140-1.
Informed consent statement: The requirement for informed consent was waived by the Ethics Committee due to the retrospective nature of the study.
Conflict-of-interest statement: All the authors report no relevant conflicts of interest for this article.
Data sharing statement: The original datasets presented in the study are included in the article; further inquiries can be directed to the corresponding author.
Received: September 29, 2025 Revised: October 31, 2025 Accepted: December 19, 2025 Published online: February 15, 2026 Processing time: 127 Days and 20.9 Hours
Abstract
BACKGROUND
Endoscopic submucosal dissection (ESD) serves as a critical treatment modality for superficial esophageal cancer. However, non-curative resection is significantly associated with residual tumors and unfavorable prognosis. Effective preoperative predictive tools are currently lacking.
AIM
To develop and validate a machine learning-based prediction model for accurate preoperative assessment of the risk of non-curative ESD resection.
METHODS
This multicenter retrospective study included 366 superficial esophageal cancer patients from the Affiliated Hospital of North Sichuan Medical College as a training set, and 129 patients from Langzhong People’s Hospital as an independent external validation set. Predictors were selected using least absolute shrinkage and selection operator and multivariate logistic regression. Nine machine learning classifiers, including logistic regression, LightGBM, and XGBoost, were integrated to develop the models, and SHapley Additive exPlanations (SHAP) were employed to achieve risk visualization.
RESULTS
Key predictive factors identified included esophageal stricture, computed tomography-based esophageal wall thickening > 7 mm, endoscopically estimated invasion depth > superficial layer (SM1) (endoscopic ultrasound or magnifying endoscopy with narrow-band imaging collectively referred to as EOM > SM1), multiple lesions, circumferential ratio ≥ 3/4, and preoperative pathological type. The logistic regression model constructed with these factors demonstrated optimal performance (training set area under the curve (AUC) = 0.887; internal validation AUC = 0.872; external validation AUC = 0.849). SHAP analysis further revealed computed tomography-based esophageal wall thickening > 7 mm and EOM > SM1 as core risk-driving factors.
CONCLUSION
The logistic regression prediction model developed in this study effectively identifies patients at high risk of non-curative resection prior to ESD. By incorporating SHAP-based interpretability, the model provides a reliable and transparent tool to support clinical decision-making.
Core Tip: This multicenter study produced an online, interpretable prediction tool that quantifies the preoperative risk of non-curative endoscopic submucosal dissection for superficial esophageal cancer. With a clear cutoff (SHapley Additive exPlanations value ≥ 0.185), it provides immediate, transparent guidance: High-risk patients are directed to radical surgery, while low-risk ones are confirmed as endoscopic submucosal dissection candidates, ensuring the first treatment choice is optimal.
Esophageal cancer ranks as the seventh most common malignancy globally. Its early stage, termed superficial esophageal cancer (SEC), is defined as a lesion with tumor infiltration confined to the mucosa (T1a) or submucosa (T1b)[1]. SEC accounts for approximately 20% to 30% of newly diagnosed esophageal cancer cases. It is characterized by a relatively low risk of lymph node metastasis (less than 5% for intramucosal carcinoma and 10% to 20% for submucosal carcinoma) and the potential for cure through minimally invasive techniques[2]. With advances in the early diagnosis and minimally invasive treatment of esophageal cancer, the survival of SEC patients has improved significantly, achieving a five-year survival rate of over 90%[3].
Endoscopic submucosal dissection (ESD) is currently the preferred treatment for achieving curative outcomes in SEC. Its key advantage lies in enabling en bloc resection while preserving anatomical integrity, thereby providing optimal specimens for accurate pathological assessment. However, the therapeutic efficacy of ESD largely depends on whether curative resection is attained. Resection is deemed non-curative if postoperative pathology reveals poorly differentiated carcinoma, positive margins, lymphovascular invasion (LVI), or submucosal invasion reaching the deeper submucosal layer (SM2) or beyond (i.e., ≥ 200 μm beyond the muscularis mucosae)[4]. The incidence of non-curative resection (NCR) may be as high as 20%, and NCR is associated with a 40% increase in local recurrence risk and a reduction of over 30% in five-year survival[5]. Current clinical decision-making relies predominantly on postoperative histopathology, with a notable lack of effective preoperative risk stratification tools. This leads to two major clinical dilemmas: Low-risk patients may undergo overtreatment (e.g., direct surgical resection), while high-risk patients often endure a redundant pathway of “ESD attempt, NCR, secondary surgery”. Such inefficiencies not only delay optimal treatment but also increase the risks of procedural complications and healthcare costs.
Current predictive models face three major limitations. First, many rely on unimodal indicators, such as esophageal wall thickness on computed tomography (CT) or isolated biomarkers, without integrating SEC-specific multimodal risk features, including circumferential ratio (CR) and depth of invasion grading. Second, most models are developed using single-center, small-sample datasets and lack independent external validation, resulting in limited generalizability. Third, conventional statistical methods have limited capacity to capture complex interactions among high-dimensional variables and often yield models with low interpretability (i.e., “black-box” nature), which hinders clinical adoption[6,7]. Machine learning (ML) techniques, particularly ensemble algorithms such as XGBoost and LightGBM, offer a promising approach to overcome these challenges by leveraging their ability to model nonlinear relationships. When coupled with interpretability tools like SHapley Additive exPlanations (SHAP), ML provides a pathway toward more transparent and clinically acceptable prediction models.
This study, utilizing a multicenter cohort from the Affiliated Hospital of North Sichuan Medical College (training cohort, n = 366) and Langzhong People’s Hospital (independent external validation cohort, n = 129), developed and validated for the first time a preoperative prediction model for the risk of non-curative ESD. Variable selection was performed using least absolute shrinkage and selection operator (LASSO) regression, and six SEC-specific key predictors, including preoperative endoscopically estimated deep submucosal invasion, esophageal stricture, and CR ≥ 3/4, were identified via multivariable logistic regression. The predictive performance of nine ML classifiers was systematically compared. The optimal model was integrated with SHAP to achieve individualized risk visualization, providing transparent and reliable decision support for clinical practice.
MATERIALS AND METHODS
Materials
This multicenter retrospective study enrolled 552 patients with clinically staged SEC (cT1a/T1b) who underwent ESD at the Affiliated Hospital of North Sichuan Medical College between January 2017 and December 2024, constituting the initial training cohort. Patients were excluded based on the following criteria: (1) Histology other than squamous cell carcinoma; (2) Receipt of neoadjuvant therapy; (3) Missing data exceeding 10% across the 30 candidate predictor variables; and (4) Incomplete clinical records (defined as the absence of key documents such as operative or pathology reports, precluding complete data extraction). After applying these criteria, 42, 28, 53, and 63 patients were excluded, respectively, resulting in a final training cohort of 366 patients. An independent external validation cohort was established from 215 eligible patients treated at Langzhong People’s Hospital between January 2020 and December 2024. After applying the same exclusion criteria (31 with non-squamous histology, 19 receiving neoadjuvant therapy, 27 with > 10% missing data, and 9 with incomplete records), 129 patients were included. All enrolled patients met the following criteria: (1) Preoperative contrast-enhanced CT confirmed the absence of lymph node or distant metastasis (cN0M0); (2) All predictive variables were assessed within one week prior to ESD; and (3) Postoperative pathology confirmed either curative resection or NCR. The patient selection flowchart is presented in Figure 1.
Figure 1
Flowchart of patients included in the analysis.
Study variables
Clinical indicators encompassed patient age, sex, and pretreatment inflammatory markers, including neutrophil-to-lymphocyte ratio, platelet-to-lymphocyte ratio, platelet-to-neutrophil ratio, and systemic immune-inflammation index (SII). Also considered were body mass index; comorbidities such as hypertension, cardiovascular disorder, diabetes, renal impairment, and chronic obstructive pulmonary disease; and family history of esophageal cancer. Endoscopic observations included brownish discoloration, chronic esophagitis, esophageal stricture, and invasion depth estimated by endoscopic ultrasound (EUS) or magnifying endoscopy with narrow-band imaging (ME-NBI), collectively referred to as EOM. Contrast-enhanced CT findings involved esophageal wall thickness, tumor enhancement patterns, and esophageal stricture, while lesion characteristics covered anatomic location, multiple lesions, CR, and Paris classification. Histopathological characteristics incorporated preoperative pathological type (PPT) and features linked to NCR, such as LVI, depth of invasion, tumor differentiation, and resection margin status. Lifestyle factors included long-term smoking, alcohol use, and dietary habits, specifically regular consumption of preserved foods, high-temperature foods, fried items, fruits, and vegetables, in addition to oral hygiene practices.
Definition of terms
NCR is defined as a post-ESD pathological state in SEC wherein at least one high-risk feature is identified, including invasion beyond the superficial submucosa (SM1), LVI, poorly differentiated or undifferentiated carcinoma, or positive resection margins (either horizontal or vertical). This outcome fails to meet the criteria for curative resection and necessitates further multimodal treatment such as adjuvant surgery or chemoradiotherapy[8].
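The definition above amounts to an "any of four features" rule. The following is a minimal sketch of that rule; the class and field names are illustrative assumptions, not part of the study's software.

```python
from dataclasses import dataclass


@dataclass
class PostEsdPathology:
    """Post-ESD pathology findings (hypothetical field names)."""
    invasion_beyond_sm1: bool          # invasion deeper than the superficial submucosa (SM1)
    lymphovascular_invasion: bool      # LVI present
    poorly_or_undifferentiated: bool   # poorly differentiated or undifferentiated carcinoma
    positive_margin: bool              # horizontal or vertical margin positive


def is_non_curative(p: PostEsdPathology) -> bool:
    """Return True if at least one high-risk feature is present (NCR)."""
    return any((
        p.invasion_beyond_sm1,
        p.lymphovascular_invasion,
        p.poorly_or_undifferentiated,
        p.positive_margin,
    ))


# Example: LVI alone is enough to classify the resection as non-curative.
example = PostEsdPathology(False, True, False, False)
ncr_flag = is_non_curative(example)
```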
All patients underwent thin-slice contrast-enhanced CT examination using the following parameters: 1 mm slice thickness, 120 kV tube voltage, and automated tube current modulation. Iodinated contrast medium (iopromide, 350 mgI/mL) was administered intravenously at a dose of 1.5 mL/kg body weight. Image acquisition was performed during the portal venous phase with a delay of 70 seconds, and multiplanar reconstruction as well as curved planar reformation were applied for detailed analysis. The thickness of the esophageal wall (mucosa-submucosa complex) was objectively measured at the endoscopically confirmed lesion site using synchronized triplanar views. Two radiologists independently performed three measurements each, and the mean value was calculated. Interobserver agreement for CT-based esophageal wall thickness measurements was excellent, with an intraclass correlation coefficient of 0.921 [95% confidence interval (CI): 0.879-0.953]. Esophageal wall thickness was categorized into three grades: Mild (< 5 mm), moderate (5-7 mm), and severe (> 7 mm)[9,10]. Due to their hypervascular nature, tumor regions exhibited significantly higher enhancement compared to the adjacent normal esophageal tissue.
The depth of tumor invasion was assessed using either EUS or ME-NBI, collectively referred to as EOM. According to the 2002 Paris Classification for gastrointestinal tumors[11], invasion depth was categorized into mucosal layers: M1 (confined to the epithelium), M2 (invasion into the lamina propria), and M3 (involving the muscularis mucosae); and submucosal layers, which were subdivided into SM1 (upper third, ≤ 200 μm), SM2 (middle third), and SM3 (lower third). Under EUS evaluation using a 20 MHz mini-probe, SM1 invasion was characterized by localized thinning (≤ 2 mm) of the hyperechoic submucosal layer with an intact muscularis propria. In contrast, SM2 or deeper invasion was defined by interruption of the submucosal layer accompanied by hypoechoic lesions extending into the middle or deep submucosa (> 200 μm) or the muscularis propria. Using ME-NBI, SM1 lesions simultaneously exhibited intraepithelial papillary capillary loop type B2 (irregular branching with > 2 × caliber variation) and type IV pit pattern (irregular cerebriform structures) within a uniform non-ulcerated mucosal background. SM2+ invasion was identified by the presence of intraepithelial papillary capillary loop type B3 (fragmented, worm-like vessels) or type Bx (avascular areas with abnormal thick vessels), often accompanied by a type V pit pattern (loss of mucosal structure), brownish turbid background mucosa, and/or ulceration or protrusion.
Endoscopic observations also included brownish discoloration under narrow-band imaging, suggestive of inflammatory, vascular, or neoplastic changes[12]; chronic esophagitis, manifesting as mucosal erythema, edema, erosion, or ulceration, sometimes with white plaques or granularity; and esophageal stricture, which was defined according to comprehensive clinical-imaging criteria as meeting either of the following: (1) Endoscopic visualization of definite luminal narrowing coupled with a clear sense of resistance or impaired passage of a standard gastroscope (diameter 9.8 mm); or (2) Preoperative CT imaging demonstrating focal esophageal narrowing accompanied by proximal luminal dilation (diameter > 2 cm) or content retention[13]. Lifestyle factors were systematically assessed during postoperative follow-up using a structured scoring system encompassing dietary habits (frequency of fruit/vegetable, pickled, and fried food intake), oral hygiene (daily brushing frequency), and consumption of high-temperature foods (based on subjectively tolerated temperature). Intake frequency was dichotomized as occasional (≤ 3 times/week) or frequent (> 3 times/week)[14].
Statistical analysis
Continuous variables were presented as median with interquartile range, and group comparisons were conducted using the Mann-Whitney U test. Categorical variables were summarized as n (%), and associations were evaluated with Pearson’s χ2 test or Fisher’s exact test, as appropriate. A two-sided P value of less than 0.05 was considered statistically significant. All analyses were performed using R version 4.2.3 (with the gtsummary package, version 1.7.2) and Python scikit-learn version 1.1.3.
Development and validation of a prediction model for NCR
This study included 366 patients with SEC (cT1a/T1b) who underwent ESD at the Affiliated Hospital of North Sichuan Medical College as the training cohort, and 129 patients from Langzhong People’s Hospital serving as an external test set for independent validation. The predictive performance of multiple ML algorithms was systematically compared to identify the optimal model, which was subsequently interpreted using the SHAP framework for visual explanation.
During data preprocessing, minimal random missing values (< 3% for all variables) were imputed using the k-nearest neighbors (KNN) algorithm (k = 5) implemented in Python scikit-learn (v1.1.3). This method preserved data integrity under the low missingness that remained after the stringent exclusion criteria were applied. Clinical cutoff values for continuous variables such as age, platelet-to-lymphocyte ratio, and neutrophil-to-lymphocyte ratio were determined by receiver operating characteristic (ROC) curve analysis; for instance, age was dichotomized using a cutoff of 67 years. These variables were dichotomized to reduce model complexity and enhance clinical interpretability, which supports straightforward risk stratification, targeted interventions, and clinical decision-making.
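The preprocessing described above can be sketched as follows. This is an illustration only: the toy values are invented, the column layout is an assumption, and the direction of the age split (> 67 coded as 1) is assumed because the text reports only the cutoff value.

```python
import numpy as np
from sklearn.impute import KNNImputer

# Toy matrix (invented values): rows are patients, columns are
# age, neutrophil-to-lymphocyte ratio, platelet-to-lymphocyte ratio.
# np.nan marks a missing entry.
X = np.array([
    [55.0, 2.1, 120.0],
    [70.0, 3.4, np.nan],
    [62.0, np.nan, 150.0],
    [68.0, 2.9, 140.0],
    [74.0, 3.8, 180.0],
    [59.0, 2.3, 130.0],
])

# KNN imputation with k = 5, as stated in the text.
imputer = KNNImputer(n_neighbors=5)
X_imputed = imputer.fit_transform(X)

# Dichotomize age at the reported cutoff of 67 years.
# Assumption: values above the cutoff are coded 1.
AGE_CUTOFF = 67
age_high = (X_imputed[:, 0] > AGE_CUTOFF).astype(int)
```

Dichotomizing after imputation keeps the imputer working on the original continuous scale, which is one reasonable ordering; the article does not state which order was used.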
For feature selection and modeling, significant predictors were identified via LASSO regression (R glmnet v4.1.8) combined with multivariable logistic regression (R v4.2.3) at a significance level of P < 0.05, effectively addressing multicollinearity. Nine ML models were constructed, including XGBoost, logistic regression, LightGBM, AdaBoost, decision tree, gradient boosting classifier, Gaussian naive Bayes, multilayer perceptron, and KNN classifier. Model parameters were optimized over five repeated training cycles, and performance was comprehensively evaluated using ROC curves, decision curve analysis, and calibration plots.
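A comparison of several classifiers on the same predictors might look like the sketch below. It uses synthetic data from `make_classification` in place of the study cohort, covers only four of the nine model families, and scores them by five-fold stratified cross-validated AUC; all of these choices are assumptions made for illustration, not the authors' pipeline.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a 366-patient training cohort with six predictors.
X, y = make_classification(
    n_samples=366, n_features=6, n_informative=4, n_redundant=0,
    weights=[0.75, 0.25], random_state=0,
)

models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(max_depth=3, random_state=0),
    "gaussian_nb": GaussianNB(),
    "knn": KNeighborsClassifier(n_neighbors=5),
}

# Same folds for every model so the AUC values are comparable.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
mean_auc = {
    name: cross_val_score(model, X, y, cv=cv, scoring="roc_auc").mean()
    for name, model in models.items()
}

for name, auc in sorted(mean_auc.items(), key=lambda kv: -kv[1]):
    print(f"{name}: mean AUC = {auc:.3f}")
```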
For validation and interpretation, the optimal model underwent 5-fold cross-validation, with stability assessed via learning curves implemented in Python scikit-learn v1.1.3. SHAP analysis (Python SHAP v0.43.0) was employed to quantify feature contributions and to develop an online predictive tool along with a clinical nomogram. Finally, two representative cases were selected to demonstrate the practical utility and interpretability of the model in real-world clinical scenarios.
RESULTS
Baseline analysis of the training and external test sets
The training cohort in this study consisted of 366 patients with SEC who underwent ESD at the Affiliated Hospital of North Sichuan Medical College. An independent external test set was established using 129 contemporary patients from Langzhong People’s Hospital. The demographic and clinical baseline characteristics of both cohorts are summarized in Table 1.
Table 1 Baseline demographic profile and clinical parameters of the study cohort, n (%).
This study employed LASSO regression to identify independent predictors of NCR. The method offers a dual mechanism: It mitigates overfitting through coefficient shrinkage while simultaneously addressing multicollinearity among variables[15,16]. Analysis of the coefficient shrinkage path (Figure 2A) and cross-validation curve (Figure 2B) revealed that at λ = 0.058 (the largest λ whose cross-validated error lies within one standard error of the minimum), the model selected seven key variables: SII, esophageal wall thickness, esophageal stricture, EOM, multiple lesions, CR of the lesion, and PPT.
Figure 2 Least absolute shrinkage and selection operator regression analysis for feature selection.
A: Coefficient paths of 30 variables vs log(λ). Vertical lines indicate key λ values: 0.023 (9 variables, minimum cross-validated deviance) and 0.058 (7 core variables under 1-SE rule); B: Cross-validation curve shows deviance vs log(λ) with error bands. λ = 0.023 gives minimum deviance; λ = 0.058 provides optimal parsimony. Together, these demonstrate regularization’s control of model complexity and prediction performance.
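The one-standard-error (1-SE) rule mentioned in the caption can be sketched in a few lines: find the λ with the lowest cross-validated error, then take the largest λ whose error is still within one standard error of that minimum. The λ grid below includes the two values from the caption, but the deviance and standard-error numbers are invented for illustration.

```python
def select_lambda_1se(lambdas, cv_mean, cv_se):
    """Return (lambda_min, lambda_1se) from a cross-validation curve.

    lambdas: candidate penalty values
    cv_mean: mean cross-validated error for each lambda
    cv_se:   standard error of that mean for each lambda
    """
    i_min = min(range(len(lambdas)), key=lambda i: cv_mean[i])
    threshold = cv_mean[i_min] + cv_se[i_min]
    # Largest penalty whose error does not exceed the threshold.
    within = [lam for lam, err in zip(lambdas, cv_mean) if err <= threshold]
    return lambdas[i_min], max(within)


# Hypothetical cross-validation curve (deviance values are made up).
lambdas = [0.005, 0.011, 0.023, 0.036, 0.058, 0.090]
cv_mean = [0.92, 0.88, 0.85, 0.86, 0.88, 0.95]
cv_se = [0.04, 0.04, 0.04, 0.04, 0.04, 0.04]

lambda_min, lambda_1se = select_lambda_1se(lambdas, cv_mean, cv_se)
```

A larger λ shrinks more coefficients to zero, which is why the 1-SE choice yields the sparser seven-variable model.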
Confounding adjustment
To further control for potential confounding factors, multivariable logistic regression analysis was performed subsequent to variable selection via LASSO regression. From the seven candidate predictors identified by LASSO, six independent risk factors were ultimately retained based on statistical significance (P < 0.05) in the stepwise regression analysis (Table 2). Although SII was selected by LASSO, it did not reach statistical significance (P = 0.063) in the multivariable model and was therefore excluded from the final prediction model. The six retained factors were: Esophageal wall thickness > 7 mm on CT, EOM > SM1, esophageal stricture, multiple lesions, CR ≥ 3/4, and PPT of esophageal squamous cell carcinoma (ESCC).
Table 2 Multivariable logistic regression analysis of non-curative resection after endoscopic submucosal dissection for superficial esophageal cancer.
Comparative analysis identifying logistic regression as the preferred model
In the comprehensive model evaluation, the logistic regression model demonstrated optimal clinical applicability and reliability. On the validation set, it achieved an area under the curve (AUC) of 0.869 (95%CI: 0.774-0.965), significantly outperforming both XGBoost (0.831) and LightGBM (0.836), with the lowest performance degradation from the training set (only 1.5% decay rate; Figure 3A and B). The model also exhibited excellent calibration, with a Brier score of 0.103 closest to ideal, indicating minimal deviation between predicted and observed risks (Figure 3C). Decision curve analysis showed that within the clinically critical threshold range of 30% to 50%, the model’s net benefit consistently exceeded other models by over 35% (Figure 3D). Furthermore, the logistic regression model demonstrated consistently superior performance in both the training and validation sets on the precision-recall curve. It achieved an average precision of 0.688 on the validation set, comparable to the gradient boosting decision tree model (0.691), but with a narrower CI (ΔCI width: 0.099 vs 0.065; Figure 3E and F). In summary, the logistic regression model achieved the best balance of predictive accuracy, stability, and clinical utility, supporting its recommendation as the preferred tool for predicting NCR risk.
Figure 3 Comprehensive performance evaluation of machine learning models.
A: Receiver operating characteristic curve and area under the curve values for the training set; B: Receiver operating characteristic curve and area under the curve values for the validation set using five 7:3 random splits; C: Calibration curve shows predicted vs observed probabilities. Dashed diagonal indicates ideal reference. Solid lines show model performance. Better calibration is indicated by closer fit to diagonal and lower Brier scores (in parentheses); D: Decision curve analysis compares models. Black dashed line: All patients undergo non-curative resection; red dashed line: No intervention; E: Precision-recall curve and average precision (AP) for training set; F: Precision-recall curve and AP for validation set (Y-axis: Precision; X-axis: Recall). The logistic regression model showed consistently superior performance. Superiority is determined either by complete curve encapsulation or higher AP values for intersecting curves. Models are color-coded with mean and 95% confidence intervals. ROC: Receiver operating characteristic; AUC: Area under the curve; CI: Confidence interval; PR: Precision-recall.
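Two of the metrics used above have compact definitions. The Brier score is the mean squared difference between predicted probability and observed outcome, and the net benefit at threshold probability p_t in decision curve analysis is TP/n - (FP/n) × p_t/(1 - p_t). The sketch below implements both in plain Python on invented toy data; it does not reproduce the study's figures.

```python
def brier_score(y_true, y_prob):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)


# Toy outcomes (1 = non-curative resection) and predicted risks, invented.
y_true = [1, 0, 0, 1, 0, 1, 0, 0]
y_prob = [0.9, 0.2, 0.4, 0.7, 0.1, 0.3, 0.6, 0.05]

bs = brier_score(y_true, y_prob)
nb_50 = net_benefit(y_true, y_prob, threshold=0.5)
```

Evaluating `net_benefit` over a grid of thresholds (for example 0.30 to 0.50) and plotting the result against the treat-all and treat-none strategies gives a decision curve.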
Superior performance and generalization of the logistic regression model
The logistic regression classifier demonstrated excellent performance in this task. The model was developed using the training set with 5-fold cross-validation. The AUC on the external test set differed from that on the validation set by less than 10%, indicating a well-fitted model[17] and confirming that the logistic model is suitable for classification tasks on this dataset. As observed in the ROC curves (Figure 4A-C), the AUC values for the training set, validation set, and test set were 0.887, 0.872, and 0.849, respectively, demonstrating strong discriminatory ability between positive and negative classes. Furthermore, the learning curve (Figure 4D) indicated that as the training sample size increased, both training and validation accuracy steadily converged and stabilized, reflecting a reliable learning process and strong generalization capability.
Figure 4 Performance evaluation of the logistic regression model across the training, validation, and external test sets.
A: Receiver operating characteristic (ROC) curve and area under the curve (AUC) value for the training set; B: ROC curve and AUC values for the validation set, constructed through random selection of 30% of training cases with 5-fold cross-validation. Five solid lines represent individual validation fold outcomes; C: ROC curve and AUC value for the independent external test set; D: Learning curves show performance progression, with training and validation sets represented by red and blue dashed lines, respectively. ROC: Receiver operating characteristic; CI: Confidence interval.
Visualization of the ML prediction model for NCR
SHAP analysis was used to evaluate how each feature contributes to predictions in the medical diagnostic model. The summary plot (Figure 5A) shows the direction and distribution of effects for six key features. Blue and red dots indicate lower and higher feature values, corresponding to negative and positive impacts on predictions, respectively. The analysis revealed that EOM and esophageal wall thickness had a clear biphasic effect: Low values reduced the prediction output, while high values increased it. Similarly, CR and PPT showed higher values associated with positive effects and lower values with negative effects[18]. In contrast, multiple lesions and esophageal stricture had more variable influences.
Figure 5 SHapley Additive exPlanations interpretability analysis for the non-curative resection prediction model.
A: Feature contribution summary. The horizontal axis indicates the SHapley Additive exPlanations value (log-odds impact on prediction); the vertical axis lists clinical predictors. Red and blue dots indicate high and low feature values, respectively; B: Predictor importance ranking. Bar length reflects the mean |SHapley Additive exPlanations| value, quantifying each predictor’s contribution to model decisions. SHAP: SHapley Additive exPlanations; CR: Circumferential ratio; EOM: Endoscopic ultrasound or magnifying endoscopy with narrow-band imaging; PPT: Preoperative pathological type.
The mean absolute SHAP values (Figure 5B) provided a global importance ranking: EOM > esophageal wall thickness > CR > PPT > multiple lesions > esophageal stricture. All features were positively correlated with the model output. EOM, esophageal wall thickness, and CR were the strongest predictors, with SHAP values mainly in the high-impact zone (|SHAP| > 0.1). Less influential features like esophageal stricture had smaller effects (|SHAP| < 0.05).
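For a linear model such as logistic regression, SHAP values on the log-odds scale have a closed form when features are treated as independent: the contribution of feature j is its coefficient times the deviation of the patient's value from the background mean. The sketch below illustrates this and the mean |SHAP| ranking; the coefficients, background means, and patients are hypothetical and are not taken from Table 2.

```python
def linear_shap(weights, background_means, x):
    """Per-feature SHAP values (log-odds) for a linear model,
    assuming independent features: w_j * (x_j - mean_j)."""
    return [w * (xj - mu) for w, mu, xj in zip(weights, background_means, x)]


def mean_abs_shap(weights, background_means, rows):
    """Global importance: mean absolute SHAP value per feature."""
    per_row = [linear_shap(weights, background_means, r) for r in rows]
    return [sum(abs(v[j]) for v in per_row) / len(per_row)
            for j in range(len(weights))]


# Hypothetical model with three binary predictors.
features = ["EOM > SM1", "wall thickness > 7 mm", "esophageal stricture"]
weights = [2.0, 1.5, 0.5]            # invented log-odds coefficients
background_means = [0.3, 0.4, 0.2]   # invented feature prevalences

patient = [1, 1, 0]
phi = linear_shap(weights, background_means, patient)

# Additivity: contributions sum to (patient log-odds - average log-odds).
logit = lambda x: sum(w * xi for w, xi in zip(weights, x))
gap = logit(patient) - logit(background_means)

importance = mean_abs_shap(weights, background_means,
                           [[1, 1, 0], [0, 0, 0], [0, 1, 1]])
```

The intercept cancels in the additivity check, which is why it is omitted from `logit` here.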
This two-part analysis clearly shows the direction and magnitude of feature contributions, helping explain how the model makes decisions. The SHAP risk cutoff value of 0.185 for clinical decision-making was determined during the model training phase by maximizing Youden’s index (J = sensitivity + specificity - 1), which optimally balances the trade-off between correctly identifying positive and negative cases. This pre-defined cutoff was subsequently applied to the external validation set and built into an online prediction tool (https://www.xsmartanalysis.com/model/list/predict/model/html?mid=27022&symbol=717549NY9Me5qXNB7874). Clinical recommendations are as follows: (1) SHAP value ≥ 0.185 suggests high risk, and surgical treatment is recommended; and (2) SHAP value < 0.185 supports the use of ESD. To demonstrate clinical applicability, Figure 6 presents two example cases: One accurately predicted negative (SHAP = 0.01) and one accurately predicted positive (SHAP = 0.75). These examples illustrate the model’s interpretability and practical value. Additionally, a nomogram based on the model is provided for ease of use in clinical practice (Figure 7).
Figure 7 Nomogram of the machine learning prediction model for non-curative resection in superficial esophageal cancer.
CR: Circumferential ratio; EOM: Endoscopic ultrasound or magnifying endoscopy with narrow-band imaging; PPT: Postoperative pathological type.
DISCUSSION
Pathological findings indicative of NCR, such as depth of invasion > SM1, LVI, positive margins, or undifferentiated histology, provide clear justification for completion radical surgery in patients with SEC initially treated with ESD[19,20]. This salvage therapeutic strategy, however, exposes patients to repeat surgical trauma, treatment delays, and increased consumption of medical resources. Our study included 105 patients with NCR. Of these, 40 underwent completion surgery, and 22 transitioned to chemoradiotherapy due to contraindications to further surgery. Further analysis classified 18 patients as low-risk after pathological review, leading to intensive surveillance; 12 received palliative care due to advanced age (> 85 years) or severe cardiopulmonary comorbidities (American Society of Anesthesiologists class ≥ IV); and 13 were lost to follow-up during the critical decision-making interval. This profile highlights a critical weakness in current clinical pathways: The inability to preoperatively identify high-risk patients accurately, resulting in unnecessary ESD procedures and associated complications for some individuals. To address this limitation, we developed an ML model that integrates multidimensional preoperative features from clinical, imaging, and lifestyle domains to achieve early prediction of NCR risk. This model offers substantial clinical value by identifying high-risk patients who can proceed directly to radical surgery, thereby minimizing the chain of medical burdens associated with futile ESD and promoting a paradigm shift from postoperative salvage to preemptive management.
Using LASSO and multivariable logistic regression analysis, we identified six risk factors significantly associated with NCR from an initial set of 30 clinical variables. These factors include EOM > SM1, esophageal wall thickness, multiple lesions, CR, esophageal stricture, and PPT. As a key biological indicator of deep SM, EOM > SM1 was confirmed as a central high-risk factor for NCR [P < 0.001; odds ratio (OR) = 10.252; 95%CI: 4.211-28.176], with a non-curative rate of 56.3% in this subgroup. The pathological mechanism involves a triple cascading effect in the submucosa: First, the SM2 exhibits a significantly higher density of lymphatic networks compared to the SM1, greatly increasing the probability of LVI. Current consensus indicates that the lymph node metastasis rate for SM2 lesions may exceed 20%, far beyond the acceptable safety threshold (< 3%) for endoscopic local therapy such as ESD[21-23]. Second, deep invasive tumors can induce substantial collagen deposition in the submucosa, increasing tissue stiffness and fibrosis, which leads to difficulty in identifying the dissection plane during operation and exacerbates tissue retraction. These pathological changes elevate the risk of incomplete en bloc resection and positive vertical margins[24-27]. Third, SM2 lesions are typically dominated by malignant phenotypes. Strong evidence suggests that, compared to SM1 lesions, SM2 lesions often demonstrate higher histological grades (e.g., poorly differentiated or undifferentiated) and significantly increased probability of LVI[28-30]. Current preoperative assessment methods remain limited in distinguishing SM1 from SM2 lesions, particularly in determining deep SM, resulting in considerable discrepancy between preoperative diagnosis and postoperative pathological findings[31,32]. Studies indicate that misclassifying SM2 lesions as superficial, leading to inappropriate ESD, may significantly increase the NCR rate[33,34].
Although international guidelines consider endoscopic features suggestive of invasion deeper than SM1 an absolute contraindication for ESD[35], accurate discrimination of borderline lesions remains a challenge in clinical practice. To address this limitation, we developed a predictive model for NCR that integrates artificial intelligence algorithms with multimodal radiomic features, thereby providing evidence-based support for treatment decision-making.
A study of 404 patients with SEC demonstrated that esophageal wall thickness > 7 mm on contrast-enhanced chest CT was a high-risk factor for NCR, consistent with the findings of the present investigation (P = 0.003; OR = 3.065; 95%CI: 1.463-6.498)[36]. This imaging biomarker holds significant clinical translational value: Contrast-enhanced CT may serve as an effective complement to endoscopy, enabling more comprehensive preoperative evaluation[37]. As a convenient and quantifiable objective tool, it effectively compensates for the limitations of endoscopy in assessing the full-thickness esophageal structure. Specifically, when endoscopic evaluation yields uncertain results (e.g., SM1/SM2 borderline lesions), a CT-based wall thickness > 7 mm provides critical decision-making support for escalating treatment strategies, such as recommending radical surgery instead of ESD, thereby mitigating the risk of NCR due to misclassification. In this study, the NCR rate reached 48.4% among patients positive for this indicator, further validating its clinical utility as an effective risk-stratification marker.
Additional analysis revealed that multiple lesions reflect spatial tumor heterogeneity and high-risk distribution patterns. The endoscopic detection of multiple lesions was identified as an independent risk factor for NCR in patients with SEC undergoing endoscopic resection (P = 0.015; OR = 2.666; 95%CI: 1.210-5.865). The underlying risk mechanisms include the following: First, multiple lesions often indicate more diffuse growth patterns and higher malignant potential, significantly increasing the likelihood of occult micro-invasion, microvascular invasion, or skip metastases, features frequently missed during preoperative evaluation, leading to failure in meeting curative criteria upon postoperative pathological assessment[38-40]. Second, the presence of multiple lesions, particularly those located in different anatomical regions, substantially increases the technical difficulty of endoscopic procedures, including precise lesion localization, margin delineation, maintenance of a stable visual field, and control of the dissection plane, often resulting in positive margins or residual disease[41]. Third, large resection specimens complicate pathological evaluation, elevating the risk of overlooking high-risk microscopic foci or margin involvement. Thus, multiple lesions not only signify aggressive tumor biology but also contribute to NCR through compounded technical and pathological challenges.
This study also confirmed that esophageal stricture (P = 0.003; OR = 3.411; 95%CI: 1.512-7.760) and CR > 3/4 (P = 0.002; OR = 3.724; 95%CI: 1.669-8.799) are independent risk factors for NCR in SEC. Although their mechanisms partially overlap, each contributes uniquely to resection failure: Esophageal stricture primarily imposes anatomic constraints, such as visual obstruction and instrumental limitation, leading to loss of dissection plane control, while extensive circumferential involvement induces mechanical destabilization by compromising mucosal traction, resulting in residual margins. Both factors indicate higher malignant potential; stenosis is often associated with deep SM and fibrotic response, and large circumferential lesions reflect a tendency toward circumferential spread. Each independently elevates the risk of occult LVI[42-44]. Moreover, luminal stenosis independently reduces the accuracy of EUS staging[45], and large circumferential lesions independently increase errors in pathological evaluation[46,47], collectively introducing systematic deviations into the diagnostic-therapeutic pathway. To address these challenges, a stratified intervention strategy is recommended: Balloon dilation may reestablish operational access in stenotic lesions, whereas circumferential precutting can help restore traction planes for largely circumferential lesions. Ultra-early endoscopic surveillance within one month postprocedure should be implemented to detect any residual disease.
Moreover, patients diagnosed with ESCC based on preoperative pathological examination exhibited a significantly higher risk of NCR compared to those with high-grade intraepithelial neoplasia or low-grade intraepithelial neoplasia (P = 0.002; OR = 2.892; 95%CI: 1.513-5.791)[48]. This finding underscores that for individuals with ESCC, regardless of tumor stage, the possibility of NCR should be thoroughly evaluated during therapeutic strategy formulation to optimize treatment decision-making.
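For readers less familiar with the statistics reported above, the relation between a logistic regression coefficient and an odds ratio can be written in its standard textbook form. The symbols (β_j for the coefficient of predictor x_j, SE for its standard error) are notation introduced here. The source does not state how its 95%CIs were computed, so the Wald interval is shown only as one common choice.

```latex
\operatorname{logit} P(\mathrm{NCR}=1 \mid x)
  = \ln\frac{P(\mathrm{NCR}=1 \mid x)}{1 - P(\mathrm{NCR}=1 \mid x)}
  = \beta_0 + \sum_{j=1}^{6} \beta_j x_j ,
\qquad
\mathrm{OR}_j = e^{\beta_j},
\qquad
95\%\,\mathrm{CI}_{\mathrm{Wald}}
  = \exp\!\bigl(\hat{\beta}_j \pm 1.96\,\mathrm{SE}(\hat{\beta}_j)\bigr)
```

As a worked example, the reported OR of 10.252 for EOM > SM1 corresponds to a coefficient of ln(10.252) ≈ 2.33 on the log-odds scale.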
Through systematic evaluation of nine ML algorithms, we identified the logistic regression model as optimal for predicting the risk of NCR. The model demonstrated robust generalizability, maintaining an AUC of 0.849 on external validation. Notably, this performance was sustained despite significant baseline differences between the training and validation cohorts in variables such as age, body mass index, and prevalence of chronic obstructive pulmonary disease. While such heterogeneity reflects real-world clinical practice and may influence feature distributions, the preserved discriminatory ability underscores the model’s robustness. Nevertheless, future validation in more diverse populations is warranted to ensure consistent performance across different clinical settings.
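The AUC quoted above has a direct probabilistic reading: it is the probability that a randomly chosen NCR case receives a higher predicted risk than a randomly chosen curative case. A minimal sketch of that computation follows; the scores and labels are hypothetical and do not reproduce the reported value of 0.849.

```python
def auc(scores, labels):
    """AUC as the fraction of positive-negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win,
    which matches the Mann-Whitney formulation of the AUC.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical validation scores (1 = non-curative resection, 0 = curative).
scores = [0.02, 0.05, 0.10, 0.15, 0.19, 0.30, 0.12, 0.45, 0.75, 0.22]
labels = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]

auc_value = auc(scores, labels)
print(f"AUC = {auc_value:.3f}")
```

Because the AUC depends only on the ranking of scores, it is unaffected by the choice of decision cutoff, which is why discrimination (AUC) and the cutoff-based recommendation are reported separately.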
Compared with existing prediction models for SEC, our approach demonstrates competitive performance. For instance, Cui et al[36] reported an AUC of 0.82 for predicting SM > 200 μm, and Ruan et al[49] achieved an AUC of 0.819 for predicting LVI. Our model specifically targets the composite endpoint of NCR while maintaining comparable discriminative ability (external validation AUC 0.849). More importantly, our framework offers enhanced clinical utility through the integration of multimodal preoperative features and SHAP-based explainability, providing transparent, individualized risk assessment to guide ESD candidate selection.
To facilitate clinical translation, we developed an interpretable prediction framework incorporating SHAP analysis and deployed an online platform featuring a graphical nomogram. However, widespread implementation faces several challenges. These include the technical integration of the tool with existing electronic health record systems and its adaptation to institution-specific clinical workflows. Additional barriers involve addressing clinicians’ traditional reliance on postoperative pathology for definitive decision-making and conclusively demonstrating the tool’s utility in improving patient outcomes. Prospective, multicenter studies are essential to validate the tool’s impact on therapeutic decision-making and clinical endpoints before routine adoption can be recommended.
Several limitations of this study should be acknowledged. First, the retrospective design and sample size from only two centers necessitate validation in larger, prospective, multicenter cohorts. Second, the assessment of lifestyle factors during postoperative follow-up is susceptible to recall bias; future studies would benefit from prospective, objective data collection prior to treatment. Third, although KNN imputation was used to handle the minimal missing data (< 3% for all variables), sensitivity analyses comparing different imputation methods were not performed. While the low missingness rate makes a substantial impact on the conclusions unlikely, more comprehensive missing data strategies are advisable for future studies with larger datasets. Finally, the dichotomization of continuous variables, while clinically practical, may oversimplify underlying biological relationships. Additionally, non-concurrent data collection periods may introduce temporal bias due to evolution in endoscopic and imaging technologies over time.
CONCLUSION
This study established an ML-based predictive framework for assessing the risk of NCR, utilizing multi-center data. Among the algorithms evaluated, the logistic regression model exhibited superior performance. Integration of SHAP interpretability enabled visualization of individualized risk profiles, effectively addressing the current gap in predictive tools for post-ESD outcomes. This tool facilitates accurate identification of high-risk patients by clinicians, supports optimized treatment planning, and contributes to the reduction of unnecessary medical procedures and associated costs.
Footnotes
Provenance and peer review: Unsolicited article; Externally peer reviewed.
Peer-review model: Single blind
Specialty type: Oncology
Country of origin: China
Peer-review report’s classification
Scientific Quality: Grade A, Grade C
Novelty: Grade A, Grade C
Creativity or Innovation: Grade A, Grade C
Scientific Significance: Grade A, Grade B
P-Reviewer: Feng JB, Assistant Professor, China; Pacal I, Associate Professor, Türkiye
S-Editor: Wu S
L-Editor: A
P-Editor: Zhao YQ
REFERENCES
Nagao S, Nishimura M, Koseki M, Beauvais J, Laszkowska M, Tang L, Strong VE, Schattner MA. Treatment outcomes of non-curative endoscopic submucosal dissection for superficial gastric neoplasia: A retrospective study at a tertiary care center in the United States. DEN Open. 2025;5:e70034.
Lee JW, Cho CJ, Kim DH, Ahn JY, Lee JH, Choi KD, Song HJ, Park SR, Lee HJ, Kim YH, Lee GH, Jung HY, Kim SB, Kim JH, Park SI. Long-Term Survival and Tumor Recurrence in Patients with Superficial Esophageal Cancer after Complete Non-Curative Endoscopic Resection: A Single-Center Case Series. Clin Endosc. 2018;51:470-477.
Pimentel-Nunes P, Dinis-Ribeiro M, Ponchon T, Repici A, Vieth M, De Ceglie A, Amato A, Berr F, Bhandari P, Bialek A, Conio M, Haringsma J, Langner C, Meisner S, Messmann H, Morino M, Neuhaus H, Piessevaux H, Rugge M, Saunders BP, Robaszkiewicz M, Seewald S, Kashin S, Dumonceau JM, Hassan C, Deprez PH. Endoscopic submucosal dissection: European Society of Gastrointestinal Endoscopy (ESGE) Guideline. Endoscopy. 2015;47:829-854.
Draganov PV, Aihara H, Karasik MS, Ngamruengphong S, Aadam AA, Othman MO, Sharma N, Grimm IS, Rostom A, Elmunzer BJ, Jawaid SA, Westerveld D, Perbtani YB, Hoffman BJ, Schlachterman A, Siegel A, Coman RM, Wang AY, Yang D. Endoscopic Submucosal Dissection in North America: A Large Prospective Multicenter Study. Gastroenterology. 2021;160:2317-2327.e2.
Sato D, Sasabe M, Mitsui T, Furue Y, Yoshii T, Hara H, Oka D, Fukuda T, Yoda Y. Impact of time from diagnosis to endoscopic submucosal dissection on curability in superficial esophageal squamous cell carcinoma. DEN Open. 2025;5:e70035.
ASGE standards of practice committee; Forbes N, Elhanafi SE, Al-Haddad MA, Thosani NC, Draganov PV, Othman MO, Ceppa EP, Kaul V, Feely MM, Sahin I, Buxbaum JL, Calderwood AH, Chalhoub JM, Coelho-Prabhu N, Desai M, Fujii-Lau LL, Kohli DR, Kwon RS, Machicado JD, Marya NB, Pawa S, Ruan W, Sheth SG, Storm AC, Thiruvengadam NR, Qumseya BJ; (ASGE Standards of Practice Committee Chair). American Society for Gastrointestinal Endoscopy guideline on endoscopic submucosal dissection for the management of early esophageal and gastric cancers: summary and recommendations. Gastrointest Endosc. 2023;98:271-284.
Li B, Li B, Jiang H, Yang Y, Zhang X, Su Y, Hua R, Gu H, Guo X, Ye B, Yang Y, He Y, Sun Y, Piessen G, Hochwald SN, Cuesta MA, Birdas TJ, Li Z; written on behalf of the AME Thoracic Surgery Collaborative Group. The value of enhanced CT scanning for predicting lymph node metastasis along the right recurrent laryngeal nerve in esophageal squamous cell carcinoma. Ann Transl Med. 2020;8:1632.