Published online Sep 28, 2025. doi: 10.3748/wjg.v31.i36.111293
Revised: July 28, 2025
Accepted: August 21, 2025
Published online: September 28, 2025
Processing time: 84 Days and 3.4 Hours
Advanced esophageal squamous cell carcinoma (ESCC) has an extremely poor prognosis. Preoperative chemoradiotherapy (CRT) can significantly prolong survival, especially in those who achieve pathological complete response (pCR). However, the pretherapeutic prediction of pCR remains challenging.
To predict pCR and survival in ESCC patients undergoing CRT using an artificial intelligence (AI)-based diffusion-weighted magnetic resonance imaging (DWI-MRI) radiomics model.
We retrospectively analyzed 70 patients with ESCC who underwent curative surgery following CRT. For each patient, pre-treatment tumors were semi-automatically segmented in three dimensions from DWI-MRI images (b = 0, 1000 second/mm²), and a total of 76 radiomics features were extracted from each segmented tumor. Using these features as explanatory variables and pCR as the objective variable, machine learning models for predicting pCR were developed using AutoGluon, an automated machine learning library, and validated by stratified double cross-validation.
pCR was achieved in 15 patients (21.4%). Apparent diffusion coefficient skewness demonstrated the highest predictive performance [area under the curve (AUC) = 0.77]. Gray-level co-occurrence matrix (GLCM) entropy (b = 1000 second/mm²) was an independent prognostic factor for relapse-free survival (RFS) (hazard ratio = 0.32, P = 0.009). In Kaplan-Meier analysis, patients with high GLCM entropy showed significantly better RFS (P < 0.001, log-rank). The best-performing machine learning model achieved an AUC of 0.85. The predicted pCR-positive group showed significantly better RFS than the predicted pCR-negative group (P = 0.007, log-rank).
AI-based radiomics analysis of DWI-MRI images in ESCC has the potential to accurately predict the effect of CRT before treatment and contribute to constructing optimal treatment strategies.
Core Tip: Accurately predicting pathological complete response (pCR) to chemoradiotherapy in esophageal squamous cell carcinoma remains a critical clinical challenge. This study introduces a novel artificial intelligence-based model leveraging radiomics features from pre-treatment diffusion-weighted magnetic resonance imaging. By integrating semi-automated three-dimensional tumor segmentation with an automated machine learning framework, our model demonstrated high predictive accuracy for pCR (area under the curve = 0.85) and successfully stratified patients into distinct prognostic groups based on relapse-free survival. This non-invasive biomarker is a promising tool for constructing optimal treatment strategies, thereby advancing personalized medicine and significantly improving patient outcomes.
- Citation: Hirata A, Hayano K, Tochigi T, Kurata Y, Shiraishi T, Sekino N, Nakano A, Matsumoto Y, Toyozumi T, Uesato M, Ohira G. Predicting pathological complete response to chemoradiotherapy using artificial intelligence-based magnetic resonance imaging radiomics in esophageal squamous cell carcinoma. World J Gastroenterol 2025; 31(36): 111293
- URL: https://www.wjgnet.com/1007-9327/full/v31/i36/111293.htm
- DOI: https://dx.doi.org/10.3748/wjg.v31.i36.111293
Esophageal squamous cell carcinoma (ESCC) is a significant global health concern, characterized by a high incidence in East Asia and a generally poor prognosis[1]. A multidisciplinary treatment approach, including surgery, chemotherapy, and radiation therapy, has been established as a standard therapeutic strategy to improve clinical outcomes. In Japan, neoadjuvant chemotherapy (nCT) followed by surgery is the recommended standard for resectable locally advanced ESCC. The JCOG1109 (NExT) trial demonstrated that nCT with a triplet regimen (fluorouracil, cisplatin, and docetaxel) significantly improved survival compared to neoadjuvant chemoradiotherapy (nCRT), supporting the preference for nCT as the optimal preoperative strategy in Japan[2]. This may be partly due to the increased perioperative complications and late toxicities associated with radiotherapy, which can attenuate the long-term survival benefit of nCRT.
On the other hand, several studies have consistently demonstrated that nCRT yields a significantly higher pathological complete response (pCR) rate compared to nCT alone. Furthermore, patients who achieve pCR exhibit a favorable prognosis, with 5-year overall survival rates exceeding 80%[3,4]. This suggests that nCRT rather than nCT may be the optimal treatment strategy for certain patients with advanced ESCCs. However, accurately predicting treatment response to nCRT remains a significant clinical challenge. The ability to predict pCR prior to initiating nCRT could substantially improve treatment strategies for patients with advanced ESCC. Moreover, reliable prediction of pCR to CRT may facilitate a “watch-and-wait” approach (non-operative management strategy), allowing for esophageal preservation and reducing surgery-related morbidity. Therefore, the development of robust and accurate predictive biomarkers for response to nCRT is crucial for enabling personalized neoadjuvant treatment and improving clinical outcomes in advanced ESCC.
Diffusion-weighted imaging (DWI) is a functional imaging technique sensitive to water diffusion, capable of detecting tumor microstructural changes at the cellular level. It has been increasingly used in oncology due to its ability to non-invasively characterize tumor biology. The apparent diffusion coefficient (ADC), a quantitative parameter derived from DWI, reflects tissue cellularity and structural abnormalities. ADC maps, calculated from different b values, enable evaluation of the tumor microenvironment and treatment response[5,6].
Radiomics is an advanced image analysis technique that transforms medical images into high-dimensional quantitative data, enabling objective characterization of tumor phenotypes[7]. Previous studies have demonstrated that radiomics based on clinical imaging can offer valuable insights for predicting cancer staging, treatment response, and prognosis[8,9]. In esophageal cancer, radiomics analyses using computed tomography (CT) or 18F-fluorodeoxyglucose positron emission tomography (FDG-PET) images have been applied to predict prognosis and radiotherapy response[10-12]. However, radiomics research utilizing magnetic resonance imaging (MRI) in ESCC remains very limited[13,14]. Studies specifically evaluating MRI-based radiomics features for predicting pCR in ESCC patients undergoing surgery after nCRT are rarely reported[15,16]. The purpose of this study was to construct a predictive model using artificial intelligence (AI)-based radiomics analysis of pre-treatment DWI-MRI to predict pCR and survival in patients with ESCC undergoing surgery following nCRT, thereby aiming to establish an optimal strategy for personalized treatment.
This retrospective study enrolled 70 patients with ESCC who underwent curative surgery after CRT at our institution between 2007 and 2019. All patients had histopathologically confirmed ESCC from biopsy specimens and underwent MRI prior to treatment. Clinical stage was determined according to the 8th edition of the Union for International Cancer Control tumor-node-metastasis classification[17], based on upper endoscopic examination (including tumor biopsy), barium esophagography, chest and abdominal CT scans, MRI, and FDG-PET.
Patients received concurrent CRT consisting of 5-fluorouracil (5-FU), cisplatin, and irradiation ranging from 40 to 60 Gy. The chemotherapy regimen involved a continuous 24-hour infusion of 5-FU (500 mg/m²/day) from day 0 to 4, administered alongside cisplatin (15 mg/m²/day) given as a 2-hour infusion from day 1 to 5. The radiation fields were designed as an extended T-shape encompassing the primary tumor and regional lymph nodes, specifically targeting the supraclavicular, mediastinal, and upper abdominal regions. Radiotherapy commenced on day 1 of chemotherapy, delivered at a dose of 2 Gy/day, five days a week, for four weeks, accumulating an initial total dose of 40 Gy.
Following the initial 40 Gy radiation, all patients underwent re-evaluation by CT to assess tumor resectability. If the tumor was diagnosed as resectable, patients proceeded to radical esophagectomy with three-field lymphadenectomy approximately four weeks after completing CRT. Conversely, if the tumor remained unresectable, an additional 20 Gy of irradiation was administered, bringing the total dose to 60 Gy (defined as definitive CRT). For patients whose tumors became resectable after receiving the full 60 Gy, salvage surgery was performed.
The pathological response of the primary tumor to CRT was evaluated based on the Japanese Classification of Esophageal Cancer, 12th Edition[18]. The criteria for pathological response were categorized as follows: Grade 3 indicated no viable cancer cells in the primary tumor; Grade 2 indicated that viable cancer cells occupied less than one-third of the residual tumor; Grade 1 indicated that viable cancer cells occupied more than one-third of the residual tumor; Grade 0 indicated no discernible therapeutic effect. A pCR was defined as a grade 3 response in the primary tumor with no evidence of pathological lymph node metastases, consistent with prior reports linking the absence of residual cancer cells in both the primary tumor and lymph nodes to improved outcomes.
MRI examinations were performed before treatment using a 1.5 T whole-body scanner (Achieva 1.5 T Nova Dual; Philips Medical, Best, The Netherlands). T2-weighted fast spin-echo images were obtained with the following parameters: Repetition time (TR)/echo time (TE): 1100/110 ms; Slice thickness: 4 mm; Matrix size: 256 × 204; Field of view: 320 mm × 320 mm. DWI scans were obtained using a single-shot spin-echo type of echo-planar sequence, and fat signals were suppressed using short-tau inversion recovery. The imaging parameters were as follows: TR/TE: 7800/65 ms; Slice thickness: 4 mm; Matrix size: 160 × 125; Field of view: 400 mm × 400 mm; b value: 0 and 1000 second/mm²; and acquisition time: 7 minutes with free-breathing scanning.
All image processing was performed on a dedicated workstation, BD score (PixSpace, Ltd., Fukuoka, Japan), which is certified as a medical device in Japan. An ADC map was generated from DWI acquired with b values of 0 and 1000 second/mm². In addition, a computed DWI at a b value of 2000 second/mm² was synthesized from the acquired DWI data. Computed DWI is a relatively new technique in which non-acquired DWI at higher b values can be mathematically derived from directly acquired lower b value DWI and it has the potential to improve the lesion-to-background contrast[19].
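The two derived images described above can be illustrated with the standard mono-exponential diffusion model, S(b) = S0 · exp(−b · ADC). The following is a minimal sketch under that assumption only; the text does not describe how the workstation implements these steps, and the function names and voxel values are hypothetical.

```python
import math

def adc_from_two_b(s_low, s_high, b_low=0.0, b_high=1000.0):
    """Voxel ADC (mm^2/s) from two DWI signals under a mono-exponential model."""
    if s_low <= 0 or s_high <= 0:
        raise ValueError("signals must be positive")
    return math.log(s_low / s_high) / (b_high - b_low)

def computed_dwi(s0, adc, b_target=2000.0):
    """Extrapolate the signal expected at a b value that was not acquired."""
    return s0 * math.exp(-b_target * adc)

# Hypothetical voxel: signal 1000 at b = 0 and 300 at b = 1000 s/mm^2.
adc = adc_from_two_b(1000.0, 300.0)   # about 1.2 x 10^-3 mm^2/s
s2000 = computed_dwi(1000.0, adc)     # signal synthesized for b = 2000 s/mm^2
```

Under this model, tissue with a low ADC loses less signal as b increases, which is why a synthesized high-b image can improve lesion-to-background contrast.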
The software automatically extracted high-signal intensity areas from the DWI data in three dimensions (3D). Subsequently, on the maximum intensity projection images, the initial automated segmentation was manually edited to exclude non-tumorous hyperintense structures, thereby isolating the final tumor volume for analysis. On the BD score display, the first ADC range (0.3-1.0 × 10⁻³ mm²/second) was presented in red, the second range (1.0-1.5 × 10⁻³ mm²/second) in yellow, and the third range (1.5-2.0 × 10⁻³ mm²/second) in green (Figure 1). Texture analysis was performed, and radiomics parameters, including those derived from histogram analysis and gray-level co-occurrence matrix (GLCM), were calculated.
For each of the 70 patients, texture analysis was performed on ADC, DWI (b = 0 second/mm²), DWI (b = 1000 second/mm²), and computed DWI (b = 2000 second/mm²) images to quantify tumor heterogeneity. The extracted features were divided into two primary groups: Histogram parameters and GLCM features.
For histogram parameters, we calculated first-order statistics including mean, median, maximum, minimum, upper quartile value, lower quartile value, skewness, kurtosis, and entropy to describe global intensity distribution.
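As an illustration of these first-order statistics, the sketch below computes them for a list of voxel values from one tumor. It is not the workstation's implementation: the moment definitions (population form, non-excess kurtosis), the quartile method, and the 32-bin histogram used for entropy are assumptions, and the input values are hypothetical.

```python
import math
import statistics
from collections import Counter

def histogram_features(values, n_bins=32):
    """First-order statistics of the voxel values inside one segmented tumor."""
    n = len(values)
    mean = statistics.fmean(values)
    # Central moments (population form); other software may apply bias corrections.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    lower_q, _, upper_q = statistics.quantiles(values, n=4, method="inclusive")
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # avoid a zero bin width for constant input
    counts = Counter(min(int((v - lo) / width), n_bins - 1) for v in values)
    return {
        "mean": mean, "median": statistics.median(values),
        "maximum": hi, "minimum": lo,
        "upper_quartile": upper_q, "lower_quartile": lower_q,
        "skewness": m3 / m2 ** 1.5 if m2 > 0 else 0.0,
        "kurtosis": m4 / m2 ** 2 if m2 > 0 else 0.0,
        "entropy": -sum(c / n * math.log2(c / n) for c in counts.values()),
    }

# Hypothetical ADC values (x 10^-3 mm^2/s): most voxels low, a few high.
feats = histogram_features([0.9, 1.0, 1.0, 1.1, 1.2, 1.4, 2.6])
```

In this toy input most voxels sit at low values with a short tail of high values, so the skewness is positive.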
For GLCM features, we captured second-order spatial relationships between voxels. GLCMs were computed at a distance of 1 voxel across four direction angles (0°, 45°, 90°, and 135°); the final feature values were averaged across these directions to ensure rotational invariance (see Parekh and Jacobs[20] for the mathematical definition and calculation method). Before GLCM calculation, image intensities were quantized using two approaches: a mean ± 3 SD range and a min-max range. Quantization normalized the gray levels, reduced computational complexity, and stabilized the extracted features.
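The GLCM procedure can be sketched on a single 2D slice as follows. This is a simplified illustration, not the study's software: it uses only the min-max quantization, a symmetric normalized matrix, and an assumed number of gray levels, and it computes entropy as the one example feature. All names are hypothetical.

```python
import math

# (row, column) steps for a distance of 1 voxel at 0, 45, 90 and 135 degrees.
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}

def quantize_minmax(image, levels):
    """Map intensities onto integer gray levels 0..levels-1 over the min-max range."""
    flat = [v for row in image for v in row]
    lo, hi = min(flat), max(flat)
    scale = (levels - 1) / (hi - lo) if hi > lo else 0.0
    return [[int(round((v - lo) * scale)) for v in row] for row in image]

def glcm(q, levels, offset):
    """Symmetric, normalized co-occurrence matrix for one direction."""
    dr, dc = offset
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(q), len(q[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = q[r][c], q[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count each pair in both orders
    total = sum(map(sum, counts))
    if total == 0:
        raise ValueError("image too small for this offset")
    return [[x / total for x in row] for row in counts]

def glcm_entropy(p):
    return -sum(x * math.log2(x) for row in p for x in row if x > 0)

def mean_glcm_entropy(image, levels=8):
    """Entropy averaged over the four directions."""
    q = quantize_minmax(image, levels)
    per_angle = [glcm_entropy(glcm(q, levels, off)) for off in OFFSETS.values()]
    return sum(per_angle) / len(per_angle)

uniform = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]
checker = [[0, 9, 0], [9, 0, 9], [0, 9, 0]]
```

A uniform patch yields zero entropy because only one gray-level pair ever occurs, whereas the checkerboard patch spreads probability over two pairs in every direction.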
In this study, we initially selected a total of 76 radiomics features. Recognizing that not all features would be equally suitable for evaluation due to irrelevance or redundancy, we applied a feature selection process to retain only the most informative variables, thereby reducing dimensionality and improving model performance.
To enhance model stability and generalizability, feature selection was conducted in two sequential steps. First, we performed the Mann-Whitney U test on each radiomics feature to identify those significantly associated with pCR. Only features exhibiting statistically significant differences were retained for the subsequent step. Second, a wrapper method using backward elimination was applied to further reduce feature dimensionality and optimize model performance. Within this backward elimination process, a logistic regression model was used to evaluate the performance of feature subsets at each iteration.
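The two selection steps can be sketched as below. This is an illustrative outline rather than the authors' code: the Mann-Whitney P value uses a normal approximation without tie or continuity correction, and the `score` callable stands in for the logistic-regression evaluation of a feature subset, whose exact metric is not specified in the text. All names and the toy data are hypothetical.

```python
import math

def mann_whitney_p(x, y):
    """Two-sided P value for the Mann-Whitney U test (normal approximation)."""
    n1, n2 = len(x), len(y)
    # U counts the pairs in which x exceeds y; ties count one half.
    u = sum((a > b) + 0.5 * (a == b) for a in x for b in y)
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma
    return math.erfc(abs(z) / math.sqrt(2))

def univariate_filter(features, labels, alpha=0.05):
    """Step 1: keep features whose values differ between pCR and non-pCR cases."""
    keep = []
    for name, values in features.items():
        pos = [v for v, y in zip(values, labels) if y == 1]
        neg = [v for v, y in zip(values, labels) if y == 0]
        if mann_whitney_p(pos, neg) < alpha:
            keep.append(name)
    return keep

def backward_elimination(candidates, score):
    """Step 2: drop one feature at a time while the subset score does not get worse."""
    selected = list(candidates)
    best = score(selected)
    while len(selected) > 1:
        trials = [(score([f for f in selected if f != d]), d) for d in selected]
        trial_score, drop = max(trials)
        if trial_score < best:
            break
        best = trial_score
        selected.remove(drop)
    return selected

# Toy data: feature "a" separates the classes, feature "b" does not.
labels = [0] * 6 + [1] * 6
features = {"a": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
            "b": [1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6]}
kept = univariate_filter(features, labels)

# Toy score: rewards the informative features "a" and "c", penalizes subset size.
def toy_score(subset):
    return len(set(subset) & {"a", "c"}) - 0.1 * len(subset)

final = backward_elimination(["a", "b", "c"], toy_score)
```

In practice the score would be computed on training data only, for example as a cross-validated AUC, so that the selection does not see the samples later used for evaluation.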
Following feature selection, the retained features were standardized using the StandardScaler method to ensure consistent scaling across all variables. Subsequently, machine learning models were constructed utilizing the AutoGluon framework (https://github.com/autogluon/autogluon), which automates the entire machine learning process from data processing to model training, tuning, and model fusion. Its strengths include simplicity, robustness, fault tolerance, predictable training time, and enhanced accuracy in identifying feature relevance. AutoGluon automatically trains various machine learning models on the same data, such as light gradient boosting machine, categorical boosting, extreme gradient boosting, extremely randomized trees (employing both entropy and Gini criteria), random forest (using both entropy and Gini criteria), neural networks, and k-nearest neighbors (with both uniform and distance-based weighting).
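The standardization step amounts to a z-score transform. The sketch below is a dependency-free illustration, assuming the mean and SD are estimated on the training portion only and then applied to held-out samples; it uses the population SD, which is how scikit-learn's StandardScaler is documented to scale. Function names and values are hypothetical.

```python
import statistics

def fit_scaler(train_values):
    """Learn the mean and SD of one feature from training data only."""
    mean = statistics.fmean(train_values)
    sd = statistics.pstdev(train_values)  # population SD (divides by n)
    return mean, (sd if sd > 0 else 1.0)  # constant features are left unscaled

def apply_scaler(values, mean, sd):
    """Apply previously learned parameters to any set of values."""
    return [(v - mean) / sd for v in values]

# Hypothetical GLCM entropy values for a training fold and one held-out patient.
train = [8.9, 9.1, 9.3, 9.6, 9.8]
held_out = [9.4]
mean, sd = fit_scaler(train)
train_z = apply_scaler(train, mean, sd)
held_out_z = apply_scaler(held_out, mean, sd)
```

Reusing the training parameters for held-out data keeps information from the evaluation samples out of the fitted pipeline.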
The entire model development process was embedded within a nested cross-validation framework to prevent overfitting and ensure robust generalization performance. Stratified sampling was applied throughout the cross-validation procedures to preserve the class distribution of the outcome variable pCR. Inner cross-validation (5-fold, repeated 10 times) was used for feature selection and hyperparameter tuning. Outer cross-validation (5-fold, repeated 5 times), also stratified, was performed to assess the generalization performance of the models. For each outer loop, predicted probabilities of pCR were calculated, and the average of the five predictions per sample was used as the final prediction score. All machine learning procedures were conducted using Python 3.10.12 with AutoGluon 1.3.1. The summary of the processing is shown in Figure 2.
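The outer loop of this scheme can be sketched as follows: stratified folds are drawn repeatedly, each sample receives one held-out prediction per repeat, and the predictions are averaged per sample. This is an outline under stated assumptions, not the study's pipeline; `fit_predict` is a hypothetical callable that would wrap the inner cross-validation, feature selection, and model training, and here it is replaced by a trivial prevalence predictor.

```python
import random
from collections import defaultdict

def stratified_folds(labels, k, rng):
    """Split sample indices into k folds with similar class proportions."""
    folds = [[] for _ in range(k)]
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    for members in by_class.values():
        rng.shuffle(members)
        for pos, idx in enumerate(members):
            folds[pos % k].append(idx)
    return folds

def repeated_oof_scores(labels, fit_predict, k=5, repeats=5, seed=0):
    """Average the held-out predictions each sample receives across repeats."""
    rng = random.Random(seed)
    totals = [0.0] * len(labels)
    for _ in range(repeats):
        for test_idx in stratified_folds(labels, k, rng):
            held_out = set(test_idx)
            train_idx = [i for i in range(len(labels)) if i not in held_out]
            # Feature selection and tuning (inner CV) would use train_idx only.
            for i, p in zip(test_idx, fit_predict(train_idx, test_idx)):
                totals[i] += p
    return [t / repeats for t in totals]

labels = [1] * 15 + [0] * 55  # class sizes as in the cohort: 15 pCR, 55 non-pCR

def prevalence_model(train_idx, test_idx):
    """Placeholder model: predicts the training-set pCR rate for every test case."""
    rate = sum(labels[i] for i in train_idx) / len(train_idx)
    return [rate] * len(test_idx)

folds = stratified_folds(labels, 5, random.Random(0))
scores = repeated_oof_scores(labels, prevalence_model)
```

With 15 pCR and 55 non-pCR cases, each of the five folds holds 3 pCR and 11 non-pCR cases, so every training split contains 12 pCR cases out of 56.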
Continuous variables were presented as their median (range). For comparing differences between the pCR group and the non-pCR group, we utilized the Mann-Whitney U test for continuous data, and both the χ² test and Fisher’s exact test for categorical variables, as appropriate. To identify the optimal threshold for predicting pCR, receiver operating characteristic (ROC) analysis was conducted. The impact of radiomics parameters and pathological factors on patient survival was evaluated using Cox’s proportional hazards model. Survival probabilities, specifically for relapse-free survival (RFS), were estimated using the Kaplan-Meier method. Differences in survival curves were then assessed for statistical significance with the log-rank test. All statistical computations were performed using JMP 18.2 (SAS Institute, Inc., Cary, NC, United States). A P value below 0.05 was considered to indicate statistical significance. The statistical methods of this study were reviewed by Kobayashi H at the Department of Surgery and Gastroenterology, Funabashi Municipal Medical Center.
The study population comprised 70 patients with ESCC (60 men and 10 women) with a median age of 66 years (range, 45-78 years). Baseline demographic and clinical characteristics are summarized in Table 1. Median follow-up was 23.7 months (range, 1.9-120.4 months). Pathological response of the primary tumor to CRT was classified as grade 3 (pCR) in 15 patients (21.4%), grade 2 in 27 patients (38.6%), and grade 1 in 28 patients (40.0%). Patients who achieved pCR had a favorable prognosis. Kaplan-Meier analysis demonstrated that these patients showed significantly longer RFS than those without pCR (5-year RFS, 83.9% vs 33.9%; log-rank P = 0.009) (Figure 3). No statistically significant differences were observed in the patient characteristics summarized in Table 1 between the pCR and non-pCR groups.
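The ROC step can be illustrated with a short sketch. The criterion used to choose the optimal threshold is not stated in the text; the Youden index (sensitivity + specificity − 1) is assumed here as one common choice. The scores and labels are hypothetical, and higher scores are assumed to indicate pCR (a feature with the opposite direction would be negated first).

```python
def roc_auc(scores, labels):
    """AUC as the probability that a pCR case scores higher than a non-pCR case."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def youden_cutoff(scores, labels):
    """Return (cutoff, sensitivity, specificity) maximizing the Youden index."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for cut in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cut)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cut)
        sens, spec = tp / n_pos, tn / n_neg
        if best is None or sens + spec - 1 > best[0]:
            best = (sens + spec - 1, cut, sens, spec)
    return best[1:]

# Hypothetical feature values and pCR labels (1 = pCR) for eight patients.
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
auc = roc_auc(scores, labels)
cutoff, sensitivity, specificity = youden_cutoff(scores, labels)
```

When two cutoffs give the same Youden index, this sketch keeps the lower one, which favors sensitivity.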
Table 1 Baseline demographic and clinical characteristics
Characteristic | All (n = 70) | pCR (n = 15) | Non-pCR (n = 55) | P value
Age (years), median (range) | 66 (45-78) | 64 (52-76) | 66 (45-78) | 0.97
Gender: Male/female | 60/10 | 12/3 | 48/7 | 0.48
Tumor site: Cervical/upper/middle/lower | 5/14/33/18 | 1/5/6/3 | 4/9/27/15 | 0.60
TNM staging: cT1/2/3/4 | 0/2/19/49 | 0/0/3/12 | 0/2/16/37 | 0.56
TNM staging: cN0/1/2/3 | 1/17/29/23 | 0/1/5/9 | 1/16/24/14 | 0.06
TNM staging: cStage I/II/III/IVA | 0/0/10/60 | 0/0/0/15 | 0/0/10/45 | 0.075
Total radiation dose: 40 Gy/up to 60 Gy | 59/11 | 11/4 | 48/7 | 0.19
Among the 76 individual radiomics features, ADC skewness demonstrated the highest predictive performance for pCR, with an area under the curve (AUC) of 0.77. Following in descending order of performance were GLCM entropy (b = 1000 second/mm²) with an AUC of 0.76, GLCM autocorrelation (b = 0 second/mm²) with an AUC of 0.76, histogram-based skewness (b = 0 second/mm²) with an AUC of 0.73, and kurtosis (b = 0 second/mm²) with an AUC of 0.72. These five features represented the highest AUC values among the single radiomics features analyzed (Table 2).
Table 2 Predictive performance for pCR of the top five single radiomics features and the machine learning radiomics model
Predictor | AUC | Sensitivity | Specificity | Accuracy | Cut off | 95%CI | P value
ADC skewness | 0.77 | 0.67 | 0.82 | 0.79 | 0.37 | 0.61-0.88 | 0.005
GLCM entropy (b = 1000 second/mm²) | 0.76 | 0.87 | 0.62 | 0.67 | 9.28 | 0.64-0.85 | 0.002
GLCM autocorrelation (b = 0 second/mm²) | 0.76 | 0.53 | 0.93 | 0.84 | 5276.1 | 0.60-0.87 | 0.009
Skewness (b = 0 second/mm²) | 0.73 | 0.87 | 0.42 | 0.64 | 0.72 | 0.61-0.84 | 0.006
Kurtosis (b = 0 second/mm²) | 0.72 | 0.73 | 0.64 | 0.66 | 5.05 | 0.55-0.82 | 0.005
Machine learning radiomics model | 0.85 | 0.80 | 0.85 | 0.81 | NA | 0.73-0.93 | < 0.001
Univariate and multivariate Cox regression analyses were performed to identify prognostic factors for RFS among both radiomics and pathological features (Table 3). For the radiomics features, patients were divided into two groups using the optimal cut-off value for predicting pCR, which was determined by the ROC analysis, and this variable was subsequently incorporated into the Cox regression models.
Table 3 Univariate and multivariate Cox regression analyses for RFS
Variable | Univariate HR | Univariate 95%CI | Univariate P value | Multivariate HR | Multivariate 95%CI | Multivariate P value
Radiomics features | | | | | |
ADC skewness | 0.42 | 0.16-1.09 | 0.076 | | |
GLCM entropy (b = 1000 second/mm²) | 0.25 | 0.11-0.56 | 0.001 | 0.32 | 0.14-0.75 | 0.009
GLCM autocorrelation (b = 0 second/mm²) | 2.00 | 0.61-6.57 | 0.25 | | |
Skewness (b = 0 second/mm²) | 0.60 | 0.29-1.22 | 0.16 | | |
Kurtosis (b = 0 second/mm²) | 0.76 | 0.37-1.54 | 0.45 | | |
Pathological features | | | | | |
pT3 vs pT0-2 | 1.08 | 0.54-2.18 | 0.82 | | |
pN + vs pN - | 3.14 | 1.48-6.63 | 0.003 | 2.20 | 1.00-4.80 | 0.042
Grade 3 or 2 vs grade 1 | 0.50 | 0.25-1.01 | 0.054 | | |
In the univariate analysis, GLCM entropy (b = 1000 second/mm²) was significantly associated with better RFS [hazard ratio (HR) = 0.25, 95% confidence interval (CI): 0.11-0.56, P = 0.001]. Other radiomics features did not show a significant association with RFS. Among the pathological features, positive lymph node status (pN + vs pN -) was identified as a significant predictor for poorer RFS (HR = 3.14, 95%CI: 1.48-6.63, P = 0.003).
To assess the independent prognostic value of these variables, features found to be significant in the univariate analysis were included in a multivariate Cox regression model. The multivariate analysis confirmed that GLCM entropy (b = 1000 second/mm²) remained a significant independent predictor of RFS (HR = 0.32, 95%CI: 0.14-0.75, P = 0.009). pN status also remained an independent prognostic factor (HR = 2.20, 95%CI: 1.00-4.80, P = 0.042). In Kaplan-Meier analysis, patients with high GLCM entropy (b = 1000 second/mm²) tumors exhibited significantly better 5-year RFS rates compared to those with low GLCM entropy (72.9% vs 18.0%, P < 0.001) (Figure 3).
A two-step feature selection process (Mann-Whitney U test and backward elimination) was performed, which led to the identification of four optimal radiomics features for pCR prediction: b = 0 GLCM entropy, b = 1000 GLCM entropy, b = 1000 GLCM correlation, and ADC skewness. These features subsequently provided input for the development of machine learning models within the AutoGluon framework.
The best-performing model, KNeighborsDist_BAG_L1, achieved an AUC of 0.85, sensitivity of 0.80, specificity of 0.85, and accuracy of 0.81, as summarized in Table 2. Notably, this machine learning radiomics model achieved a higher AUC than any single radiomics feature (0.85 vs 0.77 for ADC skewness, the best single feature).
In Kaplan-Meier analysis, the predicted pCR-positive group (n = 24) showed significantly better RFS than the predicted pCR-negative group (n = 46), with 5-year RFS rates of 73.1% vs 32.1% (log-rank test, P = 0.007) (Figure 3).
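For readers unfamiliar with how the RFS rates quoted in these comparisons are obtained, the product-limit (Kaplan-Meier) estimate can be sketched as below. The follow-up times and event indicators are hypothetical and unrelated to the study data; the function names are illustrative.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each time with at least one event."""
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        relapsed = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)  # events plus censored cases
        if relapsed > 0:
            survival *= 1 - relapsed / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve

def survival_at(curve, t):
    """Estimated survival probability at time t (step function)."""
    s = 1.0
    for time, surv in curve:
        if time <= t:
            s = surv
    return s

# Hypothetical follow-up in months; event = 1 for relapse or death, 0 for censoring.
times = [6, 12, 12, 20, 30, 60]
events = [1, 1, 0, 1, 0, 0]
curve = kaplan_meier(times, events)
rfs_60 = survival_at(curve, 60)  # estimated 5-year RFS for this toy group
```

Censored patients lower the number at risk without lowering the curve, which is why the estimate differs from a simple proportion of relapse-free patients.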
In this study, we developed and validated a model to predict pCR in patients with ESCC following nCRT. Our study is characterized by a novel approach that combines semi-automated 3D tumor segmentation of pre-treatment MRI-DWI images with an automated machine learning framework for pCR prediction, allowing for highly reproducible and accurate prediction of treatment response. The resulting machine learning radiomics model achieved a high predictive performance, with the best-performing model yielding an AUC of 0.85. These results demonstrate the potential of this semi-automated 3D radiomics approach as a non-invasive biomarker, enabling the construction of optimal treatment strategies for each patient.
DWI and its quantitative value, ADC, are established techniques for evaluating tumor cellularity and microstructural integrity. The utility of pre-treatment ADC values as a predictive biomarker for nCRT response in ESCC has been investigated, but the results remain controversial. A systematic review by Vollenbrock et al[21] revealed conflicting evidence on the predictive role of pre-treatment ADC. The review pointed out that some studies linked high ADC values to a good response, while others demonstrated an inverse relationship, underscoring the high variability of the findings. A significant limitation is that a simple mean ADC value represents only a one-dimensional average, failing to capture the spatial heterogeneity of the tumor’s microenvironment, which is crucial for treatment response[14]. Radiomics analysis, a process that extracts high-dimensional quantitative features from medical images, provides a powerful framework to overcome the limitations of simple quantitative metrics. This approach enables the objective quantification of intra-tumoral heterogeneity, a crucial biological property that cannot be fully captured by single-value parameters. The potential of radiomics to serve as a non-invasive biomarker for diagnosis, staging, and the prediction of prognosis and treatment response has been widely demonstrated[7,22]. In this study, we constructed a predictive model for pCR based on features derived from a 3D volumetric analysis of the entire tumor.
In this study, ADC skewness was the most powerful single predictor of pCR (AUC = 0.77), strongly reinforcing the findings of our previous report[4]. Notably, the conclusion is strengthened by enhanced methodological robustness; the semi-automated segmentation used here improves objectivity and reproducibility compared to the manual approach in our prior study, which we believe enhances its reliability as a biomarker. High ADC skewness reflects a distribution skewed towards lower ADC values. In our previous report[4], we hypothesized this indicates a microenvironment with less tumor stroma[23] and a higher density of highly proliferative cells (high Ki-67 index)[24], characteristics that confer greater sensitivity to CRT.
Furthermore, our study identified GLCM entropy (b = 1000 second/mm²) as an independent predictor for RFS, with high entropy values associated with a more favorable prognosis. This finding is consistent with previous reports on PET imaging in neuroendocrine tumors, which also found high entropy predicted better survival[25,26]. They hypothesized that high GLCM entropy, which reflects greater voxel-level derangement, is associated with a well-differentiated histology. This biological characteristic, in turn, reflects a better treatment response or lower overall malignancy. Therefore, it is suspected that in our study as well, high entropy reflects a similar underlying biological characteristic, leading to the observed favorable prognosis.
In this study, we utilized the automated machine learning framework, AutoGluon, to comprehensively analyze a multitude of radiomics features, thereby constructing a model with highly accurate pCR predictive performance. This AI technique automates algorithm selection and hyperparameter tuning, which is particularly advantageous for radiomics analysis that characteristically involves many features. This approach is particularly valuable as it reduces researcher bias in model selection and allows for a more extensive exploration of models than manual methods permit, thus enhancing the objectivity and reproducibility of the final predictive model[27].
Previous machine learning-based radiomics studies for predicting pCR in ESCC following nCRT have primarily focused on CT and FDG-PET. For example, Yang et al[12] reported an AUC of 0.79 with a CT-based radiomics model in the testing cohort, while Murakami et al[11] achieved a high predictive performance with an AUC of 0.95 in the testing data using PET images. Additionally, Li et al[28] showed the value of incorporating peritumoral information from CT images, reporting an AUC of 0.749 in their external validation cohort. More recently, Zhang et al[29] demonstrated high accuracy by combining deep learning with CT images, achieving an AUC of up to 0.92 in validation sets.
Radiomics studies using MRI to predict treatment response in esophageal cancer following nCRT remain limited, but reports on its application have emerged in recent years. For example, Lu et al[30] achieved an AUC of 0.831 for predicting good responders using delta-radiomics from pre- and post-treatment T2-weighted images. In studies focusing specifically on pCR prediction, Liu et al[15] and Liu et al[16] reported an AUC of 0.885 in a multicenter study using T2-weighted images and also achieved an AUC of 0.868 with a multi-modal approach combining CT and MRI.
Our machine learning model achieved an AUC of 0.85, demonstrating a pCR prediction accuracy comparable to these previous reports. Although MRI has limitations, such as susceptibility to motion artifacts and longer scanning times, it remains a highly valuable non-invasive biomarker compared with CT and PET, as it is free from radiation exposure and does not require contrast agents. Furthermore, our study differs from previous reports in that it incorporates semi-automated 3D segmentation of the entire tumor. We believe this approach increases the reliability and clinical value of our findings. The prognostic value of our model is further highlighted by the significantly better RFS observed in the predicted pCR-positive group compared to the predicted non-pCR group (5-year RFS: 73.1% vs 32.1%; P = 0.007). These findings demonstrate the model’s potential as an effective aid for clinical decision-making processes by identifying patients most likely to benefit from nCRT. In particular, a high predicted probability of pCR to CRT could serve as a critical biomarker to support organ-preserving “watch-and-wait” strategies for select patients, helping to optimize therapeutic pathways and improve patient outcomes.
This study has several limitations. First, this was a retrospective, single-center study. This design is susceptible to selection bias and limits the generalizability of our findings. Therefore, a large-scale, prospective, multicenter validation study is required to confirm the robustness and clinical utility of our model. Second, the sample size of our cohort was relatively small, which prevented us from creating a separate hold-out test set. Although we employed a nested cross-validation framework to mitigate overfitting, it will be necessary in the future to validate the model using a larger dataset that can be split into training and test sets, in order to improve its performance and stability. Third, radiomics features are known to be sensitive to imaging parameters. For reproducibility and reliability, the consensus and standardization of DWI protocols should be considered for future applications. Additionally, although we used a semi-automated 3D segmentation method to enhance objectivity, the manual editing step may still be subject to some degree of inter-observer variability. To address this limitation, we are currently developing a fully automated segmentation technique.
In conclusion, we developed an AI-based radiomics model using pre-treatment MRI to predict pCR in ESCC patients following nCRT. The model demonstrated high predictive accuracy, confirming its potential as a non-invasive biomarker. This approach could ultimately enhance clinical decision-making for personalized medicine and contribute to improved patient outcomes.
1. Zhao YX, Zhao HP, Zhao MY, Yu Y, Qi X, Wang JH, Lv J. Latest insights into the global epidemiological features, screening, early diagnosis and prognosis prediction of esophageal squamous cell carcinoma. World J Gastroenterol. 2024;30:2638-2656.
2. Kato K, Machida R, Ito Y, Daiko H, Ozawa S, Ogata T, Hara H, Kojima T, Abe T, Bamba T, Watanabe M, Kawakubo H, Shibuya Y, Tsubosa Y, Takegawa N, Kajiwara T, Baba H, Ueno M, Takeuchi H, Nakamura K, Kitagawa Y; JCOG1109 investigators. Doublet chemotherapy, triplet chemotherapy, or doublet chemotherapy combined with radiotherapy as neoadjuvant treatment for locally advanced oesophageal cancer (JCOG1109 NExT): a randomised, controlled, open-label, phase 3 trial. Lancet. 2024;404:55-66.
3. Duan X, Yue J, Wang S, Zhao F, Zhang W, Qie S, Jiang H. Prognostic role of the pathological status following neoadjuvant chemoradiotherapy and surgery in esophageal squamous cell carcinoma. BMC Cancer. 2025;25:61.
4. Hirata A, Hayano K, Ohira G, Imanishi S, Hanaoka T, Murakami K, Aoyagi T, Shuto K, Matsubara H. Volumetric histogram analysis of apparent diffusion coefficient for predicting pathological complete response and survival in esophageal cancer patients treated with chemoradiotherapy. Am J Surg. 2020;219:1024-1029.
5. Fokkinga E, Hernandez-Tamames JA, Ianus A, Nilsson M, Tax CMW, Perez-Lopez R, Grussu F. Advanced Diffusion-Weighted MRI for Cancer Microstructure Assessment in Body Imaging, and Its Relationship With Histology. J Magn Reson Imaging. 2024;60:1278-1304.
6. Hayano K, Ohira G, Hirata A, Aoyagi T, Imanishi S, Tochigi T, Hanaoka T, Shuto K, Matsubara H. Imaging biomarkers for the treatment of esophageal cancer. World J Gastroenterol. 2019;25:3021-3029.
7. Lambin P, Leijenaar RTH, Deist TM, Peerlings J, de Jong EEC, van Timmeren J, Sanduleanu S, Larue RTHM, Even AJG, Jochems A, van Wijk Y, Woodruff H, van Soest J, Lustberg T, Roelofs E, van Elmpt W, Dekker A, Mottaghy FM, Wildberger JE, Walsh S. Radiomics: the bridge between medical imaging and personalized medicine. Nat Rev Clin Oncol. 2017;14:749-762.
8. Watanabe H, Hayano K, Ohira G, Imanishi S, Hanaoka T, Hirata A, Kano M, Matsubara H. Quantification of Structural Heterogeneity Using Fractal Analysis of Contrast-Enhanced CT Image to Predict Survival in Gastric Cancer Patients. Dig Dis Sci. 2021;66:2069-2074.
9. Wesdorp NJ, Hellingman T, Jansma EP, van Waesberghe JTM, Boellaard R, Punt CJA, Huiskens J, Kazemier G. Advanced analytics and artificial intelligence in gastrointestinal cancer: a systematic review of radiomics predicting response to treatment. Eur J Nucl Med Mol Imaging. 2021;48:1785-1794.
10. | Hou Z, Ren W, Li S, Liu J, Sun Y, Yan J, Wan S. Radiomic analysis in contrast-enhanced CT: predict treatment response to chemoradiotherapy in esophageal carcinoma. Oncotarget. 2017;8:104444-104454. [RCA] [PubMed] [DOI] [Full Text] [Full Text (PDF)] [Cited by in Crossref: 37] [Cited by in RCA: 60] [Article Influence: 7.5] [Reference Citation Analysis (0)] |
11. | Murakami Y, Kawahara D, Tani S, Kubo K, Katsuta T, Imano N, Takeuchi Y, Nishibuchi I, Saito A, Nagata Y. Predicting the Local Response of Esophageal Squamous Cell Carcinoma to Neoadjuvant Chemoradiotherapy by Radiomics with a Machine Learning Method Using (18)F-FDG PET Images. Diagnostics (Basel). 2021;11:1049. [RCA] [PubMed] [DOI] [Full Text] [Full Text (PDF)] [Cited by in Crossref: 2] [Cited by in RCA: 13] [Article Influence: 3.3] [Reference Citation Analysis (0)] |
12. | Yang Z, He B, Zhuang X, Gao X, Wang D, Li M, Lin Z, Luo R. CT-based radiomic signatures for prediction of pathologic complete response in esophageal squamous cell carcinoma after neoadjuvant chemoradiotherapy. J Radiat Res. 2019;60:538-545. [RCA] [PubMed] [DOI] [Full Text] [Full Text (PDF)] [Cited by in Crossref: 39] [Cited by in RCA: 65] [Article Influence: 10.8] [Reference Citation Analysis (0)] |
13. | Hou Z, Li S, Ren W, Liu J, Yan J, Wan S. Radiomic analysis in T2W and SPAIR T2W MRI: predict treatment response to chemoradiotherapy in esophageal squamous cell carcinoma. J Thorac Dis. 2018;10:2256-2267. [RCA] [PubMed] [DOI] [Full Text] [Cited by in Crossref: 19] [Cited by in RCA: 33] [Article Influence: 4.7] [Reference Citation Analysis (0)] |
14. | Li Z, Han C, Wang L, Zhu J, Yin Y, Li B. Prognostic Value of Texture Analysis Based on Pretreatment DWI-Weighted MRI for Esophageal Squamous Cell Carcinoma Patients Treated With Concurrent Chemo-Radiotherapy. Front Oncol. 2019;9:1057. [RCA] [PubMed] [DOI] [Full Text] [Full Text (PDF)] [Cited by in Crossref: 3] [Cited by in RCA: 4] [Article Influence: 0.7] [Reference Citation Analysis (0)] |
15. | Liu Y, Wang Y, Hu X, Wang X, Xue L, Pang Q, Zhang H, Ma Z, Deng H, Yang Z, Sun X, Men Y, Ye F, Men K, Qin J, Bi N, Zhang J, Wang Q, Hui Z. Multimodality deep learning radiomics predicts pathological response after neoadjuvant chemoradiotherapy for esophageal squamous cell carcinoma. Insights Imaging. 2024;15:277. [RCA] [PubMed] [DOI] [Full Text] [Full Text (PDF)] [Cited by in RCA: 4] [Reference Citation Analysis (0)] |
16. | Liu Y, Wang Y, Wang X, Xue L, Zhang H, Ma Z, Deng H, Yang Z, Sun X, Men Y, Ye F, Men K, Qin J, Bi N, Wang Q, Hui Z. MR radiomics predicts pathological complete response of esophageal squamous cell carcinoma after neoadjuvant chemoradiotherapy: a multicenter study. Cancer Imaging. 2024;24:16. [RCA] [PubMed] [DOI] [Full Text] [Reference Citation Analysis (0)] |
17. | Rice TW, Patil DT, Blackstone EH. 8th edition AJCC/UICC staging of cancers of the esophagus and esophagogastric junction: application to clinical practice. Ann Cardiothorac Surg. 2017;6:119-130. [RCA] [PubMed] [DOI] [Full Text] [Cited by in Crossref: 303] [Cited by in RCA: 531] [Article Influence: 66.4] [Reference Citation Analysis (0)] |
18. | Doki Y, Tanaka K, Kawachi H, Shirakawa Y, Kitagawa Y, Toh Y, Yasuda T, Watanabe M, Kamei T, Oyama T, Seto Y, Murakami K, Arai T, Muto M, Mine S. Japanese Classification of Esophageal Cancer, 12th Edition: Part II. Esophagus. 2024;21:216-269. [RCA] [PubMed] [DOI] [Full Text] [Cited by in Crossref: 1] [Cited by in RCA: 29] [Article Influence: 29.0] [Reference Citation Analysis (0)] |
19. | Blackledge MD, Leach MO, Collins DJ, Koh DM. Computed diffusion-weighted MR imaging may improve tumor detection. Radiology. 2011;261:573-581. [RCA] [PubMed] [DOI] [Full Text] [Cited by in Crossref: 120] [Cited by in RCA: 135] [Article Influence: 9.6] [Reference Citation Analysis (0)] |
20. | Parekh V, Jacobs MA. Radiomics: a new application from established techniques. Expert Rev Precis Med Drug Dev. 2016;1:207-226. [RCA] [PubMed] [DOI] [Full Text] [Cited by in Crossref: 239] [Cited by in RCA: 255] [Article Influence: 28.3] [Reference Citation Analysis (0)] |
21. | Vollenbrock SE, Voncken FEM, Bartels LW, Beets-Tan RGH, Bartels-Rutten A. Diffusion-weighted MRI with ADC mapping for response prediction and assessment of oesophageal cancer: A systematic review. Radiother Oncol. 2020;142:17-26. [RCA] [PubMed] [DOI] [Full Text] [Cited by in Crossref: 12] [Cited by in RCA: 20] [Article Influence: 3.3] [Reference Citation Analysis (0)] |
22. | Gillies RJ, Kinahan PE, Hricak H. Radiomics: Images Are More than Pictures, They Are Data. Radiology. 2016;278:563-577. [RCA] [PubMed] [DOI] [Full Text] [Full Text (PDF)] [Cited by in Crossref: 4541] [Cited by in RCA: 5573] [Article Influence: 619.2] [Reference Citation Analysis (3)] |
23. | Driessen JP, Caldas-Magalhaes J, Janssen LM, Pameijer FA, Kooij N, Terhaard CH, Grolman W, Philippens ME. Diffusion-weighted MR imaging in laryngeal and hypopharyngeal carcinoma: association between apparent diffusion coefficient and histologic findings. Radiology. 2014;272:456-463. [RCA] [PubMed] [DOI] [Full Text] [Cited by in Crossref: 97] [Cited by in RCA: 113] [Article Influence: 10.3] [Reference Citation Analysis (0)] |
24. | Surov A, Meyer HJ, Wienke A. Associations between apparent diffusion coefficient (ADC) and KI 67 in different tumors: a meta-analysis. Part 1: ADC(mean). Oncotarget. 2017;8:75434-75444. [RCA] [PubMed] [DOI] [Full Text] [Full Text (PDF)] [Cited by in Crossref: 84] [Cited by in RCA: 111] [Article Influence: 13.9] [Reference Citation Analysis (0)] |
25. | Werner RA, Ilhan H, Lehner S, Papp L, Zsótér N, Schatka I, Muegge DO, Javadi MS, Higuchi T, Buck AK, Bartenstein P, Bengel F, Essler M, Lapa C, Bundschuh RA. Pre-therapy Somatostatin Receptor-Based Heterogeneity Predicts Overall Survival in Pancreatic Neuroendocrine Tumor Patients Undergoing Peptide Receptor Radionuclide Therapy. Mol Imaging Biol. 2019;21:582-590. [RCA] [PubMed] [DOI] [Full Text] [Cited by in Crossref: 44] [Cited by in RCA: 66] [Article Influence: 11.0] [Reference Citation Analysis (0)] |
26. | Pellegrino S, Panico M, Bologna R, Morra R, Servetto A, Bianco R, Del Vecchio S, Fonti R. Texture Analysis of 68Ga-DOTATOC PET/CT Images for the Prediction of Outcome in Patients with Neuroendocrine Tumors. Biomedicines. 2025;13:1286. [RCA] [PubMed] [DOI] [Full Text] [Full Text (PDF)] [Reference Citation Analysis (0)] |
27. | Lin S, Ma Z, Yao Y, Huang H, Chen W, Tang D, Gao W. Automatic machine learning accurately predicts the efficacy of immunotherapy for patients with inoperable advanced non-small cell lung cancer using a computed tomography-based radiomics model. Diagn Interv Radiol. 2025;31:130-140. [RCA] [PubMed] [DOI] [Full Text] [Cited by in RCA: 2] [Reference Citation Analysis (0)] |
28. | Li Z, Wang F, Zhang H, Xie S, Peng L, Xu H, Wang Y. A radiomics strategy based on CT intra-tumoral and peritumoral regions for preoperative prediction of neoadjuvant chemoradiotherapy for esophageal cancer. Eur J Surg Oncol. 2024;50:108052. [RCA] [PubMed] [DOI] [Full Text] [Reference Citation Analysis (0)] |
29. | Zhang Z, Luo T, Yan M, Shen H, Tao K, Zeng J, Yuan J, Fang M, Zheng J, Bermejo I, Dekker A, Ruysscher D, Wee L, Zhang W, Jiang Y, Ji Y. Voxel-level radiomics and deep learning for predicting pathologic complete response in esophageal squamous cell carcinoma after neoadjuvant immunotherapy and chemotherapy. J Immunother Cancer. 2025;13:e011149. [RCA] [PubMed] [DOI] [Full Text] [Full Text (PDF)] [Reference Citation Analysis (0)] |
30. | Lu S, Wang C, Liu Y, Chu F, Jia Z, Zhang H, Wang Z, Lu Y, Wang S, Yang G, Qu J. The MRI radiomics signature can predict the pathologic response to neoadjuvant chemotherapy in locally advanced esophageal squamous cell carcinoma. Eur Radiol. 2024;34:485-494. [RCA] [PubMed] [DOI] [Full Text] [Cited by in Crossref: 5] [Cited by in RCA: 15] [Article Influence: 15.0] [Reference Citation Analysis (0)] |