Kang BY, Qiao YH, Zhu J, Hu BL, Zhang ZC, Li JP, Pei YJ. Serum calcium-based interpretable machine learning model for predicting anastomotic leakage after rectal cancer resection: A multi-center study. World J Gastroenterol 2025; 31(19): 105283 [DOI: 10.3748/wjg.v31.i19.105283]
Corresponding Author of This Article
Yan-Jiang Pei, MD, PhD, Professor, Department of Digestive Surgery, Honghui Hospital, Xi'an Jiaotong University, No. 555 Youyi East Road, Beilin District, Xi’an 710032, Shaanxi Province, China. 15829329200@126.com
Research Domain of This Article
Gastroenterology & Hepatology
Article-Type of This Article
Retrospective Study
Open-Access Policy of This Article
This article is an open-access article which was selected by an in-house editor and fully peer-reviewed by external reviewers. It is distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/
Bo-Yu Kang, Yi-Huan Qiao, Jun Zhu, Ze-Cheng Zhang, Ji-Peng Li, Department of Digestive Surgery, Xijing Hospital of Digestive Diseases, Xi’an 710032, Shaanxi Province, China
Jun Zhu, Department of General Surgery, The Southern Theater Air Force Hospital, Guangzhou 510000, Guangdong Province, China
Bao-Liang Hu, Yan'an Medical College, Yan'an University, Yan’an 716000, Shaanxi Province, China
Ji-Peng Li, Department of Experiment Surgery, Xijing Hospital, Xi’an 710032, Shaanxi Province, China
Yan-Jiang Pei, Department of Digestive Surgery, Honghui Hospital, Xi'an Jiaotong University, Xi’an 710032, Shaanxi Province, China
Co-corresponding authors: Ji-Peng Li and Yan-Jiang Pei.
Author contributions: Kang BY and Qiao YH contributed equally to this study as co-first authors; Li JP and Pei YJ contributed equally to this study as co-corresponding authors; Kang BY was responsible for study conceptualization and design, data acquisition, analysis, and interpretation, and manuscript drafting, review, and editing; Qiao YH was responsible for study conceptualization and design, data acquisition, analysis, and interpretation, and manuscript review and editing; Zhu J was responsible for data acquisition, analysis, and interpretation, statistical analysis, and manuscript review and editing; Hu BL was responsible for data analysis and interpretation, and statistical analysis. Li JP and Pei YJ were responsible for manuscript review and editing.
Supported by National Natural Science Foundation of China, No. 82172781; and Shaanxi Health Scientific Research Innovation Team Project, No. 2024TD-06.
Institutional review board statement: The previous study was approved by the ethics committee of the First Affiliated Hospital of Air Force Military Medical University (approval No. KY20212211-N-1).
Informed consent statement: This study was a secondary analysis of retrospective data and was a retrospective, multi-cohort, observational study using de-identified data. Therefore, consent and research ethics committee approval was not required.
Conflict-of-interest statement: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Data sharing statement: The data that support the findings of this study are available from the corresponding author upon reasonable request.
Received: January 17, 2025 Revised: March 27, 2025 Accepted: April 27, 2025 Published online: May 21, 2025 Processing time: 124 Days and 15.2 Hours
Abstract
BACKGROUND
Despite the promising prospects of utilizing artificial intelligence and machine learning (ML) for comprehensive disease analysis, few models constructed have been applied in clinical practice due to their complexity and the lack of reasonable explanations. In contrast to previous studies with small sample sizes and limited model interpretability, we developed a transparent eXtreme Gradient Boosting (XGBoost)-based model supported by multi-center data, using patients' basic information and clinical indicators to forecast the occurrence of anastomotic leakage (AL) after rectal cancer resection surgery. The model demonstrated robust predictive performance and identified clinically relevant thresholds, which may assist physicians in optimizing perioperative management.
AIM
To develop an interpretable ML model for accurately predicting the occurrence probability of AL after rectal cancer resection and define our clinical alert values for serum calcium ions.
METHODS
Data from patients who underwent anterior resection of the rectum for rectal carcinoma at the Department of Digestive Surgery, Xijing Hospital of Digestive Diseases, Air Force Medical University, and at Shaanxi Provincial People's Hospital were retrospectively collected from January 2011 to December 2021. Ten ML models were integrated to analyze the data and develop the predictive models. Receiver operating characteristic (ROC) curves, calibration curves, decision curve analysis, accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and F1 score were used to evaluate model performance. We employed the SHapley Additive exPlanations (SHAP) algorithm to explain the feature importance of the optimal model.
RESULTS
A total of ten features were integrated to construct the predictive model and identify the optimal model. XGBoost was considered the best-performing model with an area under the ROC curve (AUC) of 0.984 (95% confidence interval: 0.972-0.996) in the test set (accuracy: 0.925; sensitivity: 0.92; specificity: 0.927). Furthermore, the model achieved an AUC of 0.703 in external validation. The interpretable SHAP algorithm revealed that the serum calcium ion level was the crucial factor influencing the predictions of the model.
CONCLUSION
A superior predictive model, leveraging clinical data, has been crafted by employing the most effective XGBoost from a selection of ten algorithms. This model, by predicting the occurrence of AL in patients after rectal cancer resection, has identified the significant role of serum calcium ion levels, providing guidance for clinical practice. The integration of SHAP provides a clear interpretation of the model's predictions.
Core Tip: Ten machine learning models were established using ten factors and interpreted using the SHapley Additive exPlanations model. Through model evaluation and comparison, we selected the best prediction model and performed external validation in multiple centers. We found for the first time that perioperative serum calcium ion level plays an important role in the occurrence of anastomotic leakage (AL) after anterior resection of rectal cancer, and proposed that preoperative serum calcium level lower than 2.1 and postoperative calcium level lower than 2.2 are clinical warning values for the occurrence of AL.
Colorectal cancer is recognized as a critical public health issue, being the third most common cancer and the second leading cause of cancer-related mortality worldwide[1]. Rectal cancer is likewise one of the most common and severe diseases threatening human health worldwide[2], and its global burden is expected to increase by 2040[3]. Up to 20% of patients undergoing low anterior resection for rectal cancer experience anastomotic leakage (AL)[4]. AL is a severe complication after rectal cancer surgery, leading to increased permanent stoma formation and cancer recurrence[5,6]. Therefore, it is of great importance to investigate the risk factors for AL to reduce its incidence.
Machine learning (ML), an artificial intelligence (AI)-based predictive tool, holds significant advantages in dealing with the complex relationships between diseases and contributing factors in the medical field[7,8]. Currently, ML is extensively employed for predicting diseases and survival outcomes, aiding physicians in making precise clinical decisions[9-12]. SHapley Additive exPlanations (SHAP) is a tool for explaining predictions made by ML models. By leveraging the SHAP algorithm to analyze and interpret individual model predictions, as well as to provide a more comprehensive view of the impact of inputs on the output, the model gains increased clinical value. Currently, this type of interpretable predictive model has been successfully applied across various medical fields, such as predictions for sepsis and hepatocellular carcinoma[13,14].
Despite the promising prospects of utilizing AI and ML for comprehensive disease analysis, few models have been applied in clinical practice due to their complexity and the lack of reasonable explanations[15,16]. In contrast to previous studies with small sample sizes and limited interpretability, we developed a transparent eXtreme Gradient Boosting (XGBoost)-based model supported by multi-center data, using patients' basic information and clinical indicators to forecast the occurrence of AL after rectal cancer resection. The model demonstrated robust predictive performance and identified clinically relevant thresholds, which may assist physicians in optimizing perioperative management.
MATERIALS AND METHODS
Data and participants
This study encompassed 1818 patients diagnosed with rectal cancer, all of whom underwent anterior resection of the rectum for rectal carcinoma at the Department of Digestive Surgery, Xijing Hospital of Digestive Diseases, Air Force Medical University, from January 2011 to December 2021. The external validation data was collected from Shaanxi Provincial People's Hospital, encompassing 60 cases of patients who underwent anterior resection for rectal cancer from January 2021 to January 2024. The implementation of radical resection for rectal carcinoma strictly adhered to the treatment guidelines of the corresponding period, with standard surgical procedures employed. The case information came from electronic medical records. This study was a secondary analysis of retrospective data and was a retrospective, multi-cohort, observational study using de-identified data. Therefore, informed consent and research ethics committee approval were not required. The study protocol adhered to the ethical guidelines of the 1995 Declaration of Helsinki, and the previous study was approved by the ethics committee of the First Affiliated Hospital of Air Force Military Medical University (approval No. KY20212211-N-1).
Inclusion and exclusion criteria
Inclusion criteria: (1) Age ≥ 18 years; (2) Patients with confirmed AL after anterior resection of the rectum for rectal cancer; and (3) Primary rectal carcinoma confirmed by preoperative pathology.
Exclusion criteria: (1) Non-unifocal primary cancer lesions; (2) Development of rectourethral or rectovaginal leakage; (3) Incomplete clinical data; (4) Abnormal or unclear clinical data; (5) All cases with severe heart, lung, and brain diseases or serious infections; and (6) All cases with coagulation dysfunction or severe blood disorders. The flowchart of patient selection and model construction is illustrated in Figure 1.
Figure 1 Flowchart of study procedure.
SHAP: SHapley Additive exPlanations.
Definitions and data preprocessing
The diagnosis of rectal cancer was based on the 2021 guidelines from the National Comprehensive Cancer Network for the diagnosis and treatment of rectal cancer. The definition of AL adopted the criteria proposed by The International Study Group of Rectal Cancer in 2009, and patients with three grades of A, B, and C were included in the study[17]. Based on 28 continuous variables, the median was taken as the cutoff value for dichotomization and 11 categorical variables were analyzed in dummy form. Preoperative and postoperative patient data were collected three days prior to surgery and one day after surgery, respectively, and subsequently entered into the electronic medical records. The data did not contain missing values.
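The preprocessing described above (dichotomizing continuous variables at the median and dummy-coding categorical variables) can be illustrated with a minimal sketch. The study's analyses were performed in R; this Python version is only an illustration. The function names, the example values, and the rule that values equal to the median are labeled "low" are all assumptions, since the text does not specify how ties were handled.

```python
from statistics import median


def dichotomize_at_median(values):
    """Split a continuous variable into 'low'/'high' labels at its median."""
    cutoff = median(values)
    # Assumption: values equal to the median are labeled "low".
    labels = ["high" if v > cutoff else "low" for v in values]
    return cutoff, labels


def dummy_encode(categories):
    """Dummy-code a categorical variable: one 0/1 column per observed level."""
    levels = sorted(set(categories))
    rows = [[1 if c == level else 0 for level in levels] for c in categories]
    return levels, rows


# Hypothetical values for illustration only (not study data).
calcium = [2.05, 2.31, 2.18, 2.40, 2.12]
cutoff, labels = dichotomize_at_median(calcium)
levels, rows = dummy_encode(["T1", "T3", "T2", "T3", "T4"])

print(cutoff, labels)
print(levels, rows)
```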
Study variables
Based on patient information, clinical variables, and hematological indicators, we included 39 variables for analysis and screened for important factors using least absolute shrinkage and selection operator (LASSO) regression, stepwise logistic regression, and the Boruta algorithm. For LASSO regression, we used the optimal regularization parameter (lambda), divided the entire dataset into 10 folds, and sequentially incorporated each variable into the model, stopping when the area under the receiver operating characteristic (ROC) curve (AUC) no longer increased, thereby conducting feature selection. Nine variables were considered significant in all three analyses. Preoperative and postoperative serum creatinine levels, as well as postoperative cystatin C, were considered to be of significant importance in both regression models. These three variables all reflect renal function. To minimize potential interactions and considering that postoperative serum creatinine is more commonly used in clinical practice, we selected postoperative serum creatinine for inclusion in the model construction.
Model construction and validation
The patients (n = 1818) were randomly assigned to training and testing datasets in a 7:3 ratio. To evaluate the impact of data imbalance, models were constructed and evaluated using both raw data and SMOTE-resampled data (n = 3721), and the most effective data processing method was selected. Subsequently, an optimal predictive model was developed utilizing 10 distinct ML algorithms: logistic regression, support vector machine, gradient boosting machines, neural network, random forest, XGBoost, K-nearest neighbors, AdaBoost, light gradient boosting machine, and CatBoost. The performance of the models was assessed using ROC curves, calibration curves, decision curve analysis (DCA), accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and F1 score. Additionally, the final model was evaluated through external validation to assess the precision and dependability of its predictions.
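The threshold-based metrics listed above all derive from the four cells of a confusion matrix. The following sketch shows the standard definitions; it is not the study's R code, and the counts are hypothetical values chosen to mimic a rare outcome rather than figures from the test set.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard metrics from confusion-matrix counts.

    Assumes every denominator is non-zero (at least one actual positive,
    one actual negative, one predicted positive, one predicted negative).
    """
    sensitivity = tp / (tp + fn)            # recall, true positive rate
    specificity = tn / (tn + fp)            # true negative rate
    ppv = tp / (tp + fp)                    # precision
    npv = tn / (tn + fn)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)  # harmonic mean
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": f1,
    }


# Hypothetical counts: 25 patients with the outcome, 500 without.
metrics = classification_metrics(tp=20, fp=30, tn=470, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these hypothetical counts, accuracy and specificity are high while PPV and F1 remain modest, which is the usual pattern when the outcome is rare and is one reason the F1 score is reported alongside accuracy.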
Evaluating predictive factors using SHAP values
SHAP, originally developed by Lundberg and Lee, is an approach grounded in cooperative game theory for elucidating ML models[18]. By assigning SHAP values that quantify the contribution of each predictor variable, SHAP creates a consistent framework ideal for assessing variable contributions within predictive models, thus mitigating the challenge of model opacity. Distinct from alternative methodologies, SHAP values offer local interpretability, enhancing the comprehension of model intricacies. Positive SHAP values indicate that the variable is more instrumental in predicting patients with AL, while negative values suggest that the variable is more conducive to predicting patients without AL.
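To make the game-theoretic idea concrete, the sketch below computes exact Shapley values for a toy three-feature risk score by enumerating all feature coalitions. It is an illustration only: the toy model, its coefficients, and the single all-zero reference patient are hypothetical, and the study itself used the kernelshap package in R, which approximates these values against background data rather than enumerating coalitions.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict(x) relative to predict(baseline).

    Features inside a coalition take the patient's value; features outside
    it take the baseline value (a simplifying assumption of this sketch).
    """
    n = len(x)

    def value(coalition):
        z = [x[j] if j in coalition else baseline[j] for j in range(n)]
        return predict(z)

    phis = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of feature i to this coalition.
                phi += weight * (value(s | {i}) - value(s))
        phis.append(phi)
    return phis


def toy_risk(z):
    """Hypothetical risk score with one interaction term (not the study model)."""
    low_calcium, neoadjuvant, t4 = z
    return 0.05 + 0.20 * low_calcium + 0.10 * neoadjuvant + 0.15 * low_calcium * t4


patient = [1, 1, 1]
reference = [0, 0, 0]
phi = shapley_values(toy_risk, patient, reference)
print(phi)
print(sum(phi), toy_risk(patient) - toy_risk(reference))
```

The values sum to the difference between the patient's prediction and the reference prediction, and the interaction term is shared equally between the two features involved. Positive values push the prediction toward AL, in line with the interpretation given above.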
Statistical analysis
Statistical analyses were performed using R, version 4.4.1 (R Foundation for Statistical Computing). SMOTE resampling was performed using the R package DMwR. LASSO regression, the Boruta algorithm, and logistic regression were run using the R packages glmnet and Boruta. Ten ML models were built using the R packages e1071, gbm, caret, XGBoost, nnet, Adaboost, lightgbm, and Catboost. ROC curves, DCA, and calibration curves were generated using the R packages pROC, rmda, and riskRegression. All continuous variables in this study were dichotomized at the median, and all categorical variables are presented as frequencies and percentages. The χ2 test was used to compare the differences between groups, and P < 0.05 was considered statistically significant. The kernelshap and shiny packages in R were utilized to ascertain the significance and hierarchy of variables within the model. The directional influence of each variable on the outcome was established based on SHAP values and detailed visual explanations were generated at the individual observation level.
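For readers unfamiliar with SMOTE, the core idea is to create synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority neighbors. The sketch below illustrates that idea only. It is not the DMwR implementation used in the study, and the function name, parameters, and feature values are hypothetical.

```python
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points from a list of minority-class points."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest minority neighbors of x (squared Euclidean distance),
        # excluding x itself.
        neighbors = sorted(
            (p for p in minority if p is not x),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)),
        )[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # position along the segment from x to neighbor
        synthetic.append([a + gap * (b - a) for a, b in zip(x, neighbor)])
    return synthetic


# Hypothetical minority-class samples with two continuous features.
minority_points = [[2.0, 150.0], [2.1, 180.0], [1.9, 210.0], [2.05, 165.0]]
new_points = smote_sketch(minority_points, n_new=6)
for point in new_points:
    print(point)
```

Because every synthetic point lies on a segment between two existing minority samples, any noise in those samples is reproduced in the synthetic data, which is consistent with the overfitting concern raised when comparing raw and resampled models.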
RESULTS
Patients’ baseline characteristics
This retrospective study included 1818 patients (average age, 61.6 years; range, 20-89 years) from January 2011 to December 2021, including 1119 males and 699 females, of whom 86 (4.73%) suffered from AL. The baseline data of included patients are shown in Table 1. Baseline data for the training set, validation set, and external validation are shown in Supplementary Tables 1-3.
Table 1 Baseline characteristics of patients with and without anastomotic leakage, n (%).
| Variable | NAL (n = 1732) | AL (n = 86) | P value |
| --- | --- | --- | --- |
| Sex = male | 1055 (60.9) | 64 (74.4) | 0.016 |
| Tobacco or alcohol | | | < 0.001 |
| No | 1256 (72.5) | 46 (53.5) | |
| Tobacco | 258 (14.9) | 18 (20.9) | |
| Alcohol | 25 (1.4) | 1 (1.2) | |
| All | 193 (11.1) | 21 (24.4) | |
| T stage | | | < 0.001 |
| T1 | 192 (11.1) | 2 (2.3) | |
| T2 | 453 (26.2) | 17 (19.8) | |
| T3 | 995 (57.4) | 43 (50.0) | |
| T4 | 92 (5.3) | 24 (27.9) | |
| N stage | | | 0.025 |
| N0 | 291 (16.8) | 12 (14.0) | |
| N1 | 1260 (72.7) | 57 (66.3) | |
| N2 | 181 (10.5) | 17 (19.8) | |
| Histological type | | | < 0.001 |
| Adenocarcinoma | 36 (2.1) | 0 (0.0) | |
| Adenosquamous carcinoma | 0 (0.0) | 6 (7.0) | |
| Others | 1663 (96.0) | 80 (93.0) | |
| Multi | 1 (0.1) | 0 (0.0) | |
| Neuroendocrine | 17 (1.0) | 0 (0.0) | |
| Stromal | 15 (0.9) | 0 (0.0) | |
| Family = yes | 32 (1.8) | 10 (11.6) | < 0.001 |
| CD34 = yes | 647 (37.4) | 67 (77.9) | < 0.001 |
| FOBT = M1 | 1276 (73.7) | 64 (74.4) | 0.978 |
| Location = others | 76 (4.4) | 2 (2.3) | 0.517 |
| Age = high | 871 (50.3) | 39 (45.3) | 0.433 |
| Neoadjuvant = yes | 142 (8.2) | 57 (66.3) | < 0.001 |
There was no significant difference in age, location, or other aspects between the AL group and NAL group (P = 0.433, P = 0.517). In the univariate analysis, significant differences (P < 0.05) were observed between the AL group and the NAL group with respect to family history, tobacco and alcohol history, CD34, T stage, preoperative and postoperative blood calcium levels, preoperative platelet count, preoperative plasma albumin and globulin levels, positive lymph node count (rLNs), and neoadjuvant therapy. Upon further multivariate analysis, significant differences were observed between the AL group and the NAL group in family history, tobacco and alcohol history, CD34, T stage, preoperative and postoperative blood calcium levels, preoperative platelet count, rLNs, and neoadjuvant therapy. These factors also emerged as common key variables across the three multivariate analysis results. Based on statistical analysis and clinical experience, we ultimately selected ten variables, namely, family history, tobacco and alcohol history, CD34, T stage, preoperative and postoperative blood calcium levels, preoperative platelet count, postoperative creatinine, rLNs, and neoadjuvant therapy, as parameters for the ML model (Supplementary Figure 1). Correlation coefficients between these variables are presented as a correlation matrix (Figure 2) and all the correlation coefficients were below 0.8, which demonstrated no serious collinearity. At this point, we identified the significant impact of serum calcium ion, as both preoperative and postoperative levels are important determinants of AL.
Figure 2 Interaction between variables in a correlation matrix.
PrCa: Preoperative calcium ion concentration; PrCRE: Preoperative serum creatinine concentration; PoCa: Postoperative calcium ion concentration; rLNs: Positive lymph node count; PrPLT: Preoperative platelet concentration.
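The collinearity screen described above, in which every pairwise correlation coefficient is compared against 0.8, can be sketched as follows. The use of Pearson's coefficient, the helper names, and the toy binary columns are assumptions made for illustration; the text does not state which coefficient was used. A duplicated column is included deliberately so that one pair is flagged.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def collinear_pairs(columns, limit=0.8):
    """Return (name_a, name_b, r) for every pair with |r| >= limit."""
    flagged = []
    for a, b in combinations(columns, 2):
        r = pearson(columns[a], columns[b])
        if abs(r) >= limit:
            flagged.append((a, b, r))
    return flagged


# Hypothetical dichotomized variables (1 = high, 0 = low), not study data.
columns = {
    "PrCa": [0, 1, 0, 1, 1, 0],
    "PoCa": [0, 1, 1, 1, 0, 0],
    "PrCa_copy": [0, 1, 0, 1, 1, 0],  # deliberate duplicate of PrCa
}
flagged = collinear_pairs(columns)
print(flagged)
```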
Comparative performance analysis of ML models for predicting risk of AL
We trained 10 different ML prediction models on the 10 key factors, using both a dataset of 1273 raw samples and a dataset of 3721 samples balanced with the synthetic minority over-sampling technique (SMOTE). The performance of each model in the test set is shown in Table 2.
Table 2 Evaluation indicators in testing set of 10 machine learning models.
| Model | Accuracy | Sensitivity/PPV | Specificity/NPV | F1 |
| --- | --- | --- | --- | --- |
| Logistic.R | 0.916 | 0.98 | 0.912 | 0.521 |
| SVM.R | 0.873 | 0.76 | 0.879 | 0.355 |
| GBM.R | 0.901 | 0.96 | 0.898 | 0.471 |
| NeuralNetwork.R | 0.89 | 0.98 | 0.885 | 0.455 |
| RandomForest.R | 0.974 | 0.56 | 0.994 | 0.667 |
| XGBoost.R | 0.925 | 0.92 | 0.927 | 0.708 |
| KNN.R | 0.919 | 0.88 | 0.921 | 0.5 |
| Adaboost.R | 0.903 | 0.8 | 0.908 | 0.43 |
| LightGBM.R | 0.916 | 0.92 | 0.915 | 0.5 |
| CatBoost.R | 0.906 | 0.96 | 0.904 | 0.485 |
| Logistic.S | 0.914 | 0.98 | 0.91 | 0.515 |
| SVM.S | 0.952 | 0.92 | 0.954 | 0.639 |
| GBM.S | 0.956 | 0.96 | 0.956 | 0.667 |
| NeuralNetwork.S | 0.958 | 0.96 | 0.958 | 0.676 |
| RandomForest.S | 0.95 | 0.76 | 0.96 | 0.585 |
| XGBoost.S | 0.939 | 0.96 | 0.938 | 0.593 |
| KNN.S | 0.952 | 0.84 | 0.958 | 0.618 |
| Adaboost.S | 0.804 | 0.92 | 0.798 | 0.301 |
| LightGBM.S | 0.927 | 0.96 | 0.925 | 0.545 |
| CatBoost.S | 0.943 | 0.96 | 0.942 | 0.608 |
Most models demonstrated satisfactory predictive performance in terms of accuracy, sensitivity, specificity, PPV, NPV, and F1 score. When comparing the models derived from the raw dataset and the SMOTE-resampled dataset, it was observed that while the resampled group exhibited higher accuracy, the F1 score and calibration curve were inferior. The F1 score, as the harmonic mean of precision and recall, accurately shows a model's ability to predict minority classes[19]. The calibration curve, which compares predicted and actual outcomes, indicates how well the model performs in real-world scenarios[20]. When early stopping was applied, SMOTE resampling of the data led to worse F1 scores and calibration curves for AL patients. This suggests that resampling the minority data amplifies the noise and raises the risk of overfitting. XGBoost, Logistic regression, and Neural Network performed well on the ROC curve, with the highest AUC values of 0.988, 0.986 and 0.984, respectively, signifying their robust predictive capacity for AL. In terms of calibration curves and F1 scores, XGBoost outperformed the other two models, leading us to conclude that XGBoost is the optimal model in terms of performance for predicting AL occurrence, with the best PPV (0.92) and NPV (0.927). DCA is a method for evaluating the potential impact of predictive models on clinical decision-making and it assesses the clinical utility of a model by comparing the net benefits of different threshold probabilities for treatment decisions[21]. The calibration curve in predictive modeling serves as a graphical representation that delineates the agreement between the predicted probabilities and the observed outcomes, thereby assessing the model's accuracy[22]. Therefore, to further evaluate the fitting effect of the XGBoost model, we drew its DCA curve and calibration curve. 
The DCA curve shows that the XGBoost model provides a net benefit across threshold probabilities from approximately 0.05 to 0.9. Despite the data imbalance, the model performed well on the calibration curves, ROC curves, and confusion matrices, indicating high predictive performance (Figures 3 and 4).
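The net benefit plotted in a decision curve is conventionally defined as the proportion of true positives minus the proportion of false positives weighted by the odds of the threshold probability. The sketch below uses that standard definition and compares a model with the "treat all" strategy. It is not the rmda code used in the study, and the outcomes and predicted probabilities are hypothetical.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of intervening on patients with predicted risk >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of intervening on every patient regardless of prediction."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical outcomes (1 = AL) and predicted probabilities.
y_true = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.1, 0.2, 0.6, 0.05, 0.3, 0.1, 0.7, 0.05, 0.1]

for pt in (0.1, 0.3, 0.5):
    print(pt, net_benefit(y_true, y_prob, pt), net_benefit_treat_all(y_true, pt))
```

A model is considered clinically useful at a given threshold when its net benefit exceeds both the "treat all" value and zero (the "treat none" strategy).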
Figure 3 Testing and evaluation of eXtreme Gradient Boosting model based on raw data.
A: Area under the receiver operating characteristic curve comparison between models; B: The confusion matrix of eXtreme Gradient Boosting model; C: Comparison of decision curve analysis curves between models; D: Comparison of calibration curves between models. AUC: Area under the receiver operating characteristic curve; ROC: Receiver operating characteristic; SVM: Support vector machine; GBM: Gradient boosting machines; KNN: K-nearest neighbors; LightGBM: Light gradient boosting machine.
Figure 4 Testing and evaluation of eXtreme Gradient Boosting model based on SMOTE-resampled data.
A: Area under the receiver operating characteristic curve comparison between models; B: The confusion matrix of eXtreme Gradient Boosting model; C: Comparison of decision curve analysis curves between models; D: Comparison of calibration curves between models. AUC: Area under the receiver operating characteristic curve; ROC: Receiver operating characteristic; SVM: Support vector machine; GBM: Gradient boosting machines; KNN: K-nearest neighbors; LightGBM: Light gradient boosting machine.
External validation of the model
Although the model exhibited satisfactory predictive performance in the training and validation cohorts, the generalizability of findings from a single-center study is inherently constrained. To bolster the evidence for the model's predictive utility and to underscore the pivotal role of serum calcium ion concentrations, we pursued external validation. This rigorous assessment culminated in an AUC of 0.703 (95% confidence interval: 0.525 to 0.881) (Figure 5).
Figure 5 Receiver operating characteristic curve analysis of the eXtreme Gradient Boosting model based on the external validation dataset.
AUC: Area under the receiver operating characteristic curve; ROC: Receiver operating characteristic; SVM: Support vector machine; GBM: Gradient boosting machines; KNN: K-nearest neighbors; LightGBM: Light gradient boosting machine.
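The AUC reported for the external cohort can be read as the probability that a randomly chosen patient with AL receives a higher predicted risk than a randomly chosen patient without AL. A minimal sketch of that pairwise definition follows; it is not the pROC implementation used in the study, and the labels and scores are hypothetical.

```python
def auc_pairwise(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    Assumes at least one positive and one negative label.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = AL) and predicted risks.
labels_ext = [1, 0, 1, 0, 0]
scores_ext = [0.8, 0.3, 0.4, 0.5, 0.1]
auc_value = auc_pairwise(labels_ext, scores_ext)
print(auc_value)
```

With few positive cases, each misranked pair moves the estimate noticeably, which helps explain why confidence intervals are wide in small validation cohorts.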
Model interpretation using SHAP
While ML models can achieve high predictive accuracy, their decision-making processes are often opaque, limiting their interpretability in clinical contexts. The application of SHAP has effectively demystified the "black box" nature of ML models, endowing them with interpretability. Through the lens of SHAP, we are able to gain a deeper comprehension of the predictive processes and, consequently, enhance the reliability of our models[23]. This approach allows for a more transparent understanding of how features contribute to the output and the individual characteristics of patients, which is crucial for the credibility and effectiveness of ML applications. In this study, we employed SHAP values to quantify the contribution of each variable to the predictive outcomes for patients (Figure 6). The greater the SHAP value, the more significant the increase in the incidence of AL that the variable was associated with. This method provided a robust framework for understanding the impact of individual factors on AL risk. Therefore, we analyzed one AL patient and one NAL patient separately (Figure 7). By using the interaction of significant factors diagram, we can clarify the effects of different factor combinations on the model and gain a deeper understanding of its behavior (Figure 8).
Figure 6 Interpretation of the eXtreme Gradient Boosting model using SHapley Additive exPlanations.
A: Mean importance ranking of features displayed by SHapley Additive exPlanations (SHAP); B: Beeswarm plot of feature attributions in SHAP. SHAP: SHapley Additive exPlanations.
Figure 7 Interpretation of the light gradient boosting machine model using SHapley Additive exPlanations.
A: A patient who did not develop anastomotic leakage; B: A patient who developed anastomotic leakage. SHAP: SHapley Additive exPlanations; PoCa: Postoperative calcium ion concentration; rLNs: Positive lymph node count; PrPLT: Preoperative platelet concentration.
Figure 8 Interaction of significant factors based on SHapley Additive exPlanations.
SHAP: SHapley Additive exPlanations; PrCa: Preoperative calcium ion concentration; PrCRE: Preoperative serum creatinine concentration; PoCa: Postoperative calcium ion concentration; rLNs: Positive lymph node count; PrPLT: Preoperative platelet concentration.
DISCUSSION
AL is a serious complication after anterior resection of rectal cancer, with a clinical incidence ranging from 3% to 20%, and can lead to severe mortality and poor prognosis[24]. In recent studies, AL has been confirmed to be associated with poor overall survival and disease-free survival[25]. Although new surgical methods have been recognized to be able to improve surgical outcomes[26], the occurrence of AL still poses a significant threat to patients. Therefore, reducing the occurrence of AL or intervening early in AL becomes a top priority.
In this study, to better predict the occurrence of AL and clarify the risk threshold for its development, we converted the clinical information, hematological indicators, and surgical details of 1818 patients into binary variables for analysis and performed external validation in 60 patients. This approach not only refined the model's predictive capabilities but also provided a clearer benchmark for clinical decision-making and perioperative care. This study included 86 cases of AL, a larger number of positive cases than in previous studies, which contributes to a more accurate and reliable model. In selecting patients, we applied the criterion that missing values should be less than 10%. To ensure data authenticity and minimize potential biases, we opted not to impute missing values and instead excluded patients with incomplete clinical data.
From a methodological perspective, to improve the accuracy of the model variables, we used three variable importance assessment methods and selected variables that were deemed significant by at least two of these methods for the construction of our ML model. Furthermore, because SMOTE resampling may amplify noise and cause overfitting, we constructed models using two sets of data: one before and one after SMOTE resampling. Comparing the two allowed us to select the more robust and reliable data processing approach. This study incorporated 10 distinct mainstream ML models. We employed a validation set to assess the models, conducting a comprehensive comparison that considered both graphical representations and data metrics. Ultimately, we determined that the XGBoost model demonstrated the best overall performance. Moreover, this model demonstrated commendable predictive performance on imbalanced datasets as well, which aligned with the clinical reality of the low incidence of AL. Although the occurrence of AL is highly correlated with the medical environment and the surgical team at the time of surgery, the construction of this model can serve two main purposes. On the one hand, it allows for the estimation of the likelihood of an anastomotic leak in patients, guiding the medical team to pay closer attention to those at risk. On the other hand, it helps in identifying significant factors and determining clinical alert values, thereby improving patient outcomes.
In the importance ranking of features displayed by SHAP, which was used to assess the risk factors for the development of anastomotic leaks, the number of positive lymph nodes was considered the most significant prognostic factor. It is associated with tumor spread and metastasis, indicating the need for a more extensive lymph node dissection. In terms of hematological indicators, preoperative and postoperative serum calcium ion levels and preoperative platelet level were identified as the primary contributors. By synthesizing previous research and clinical experience, it is posited that both of these factors are intricately linked to the processes of infection and healing at the anastomotic site. A study in 2021 demonstrated that postoperative serum calcium ion level may be used to identify patients at risk for AL, and postoperative low serum calcium ion level can represent a risk factor for AL in digestive surgery[27]. In our study, we found that postoperative serum calcium ion levels have a greater impact on AL compared to preoperative serum calcium ion levels. This also highlights the importance of continuous blood calcium monitoring. In the context of coagulation, calcium ions, also known as clotting factor IV, can activate the intrinsic pathway of blood coagulation in conjunction with other coagulation factors, thereby accelerating the formation and activation of thrombin[28].
From a mechanistic perspective, existing studies have demonstrated that local calcium can modulate keratinocytes and fibroblasts, as well as contribute to the formation of the stratum corneum lipid barrier through signal transduction and gene expression[29]. Further research indicates that calcium flashes (rapid calcium waves) dependent on TRP channels are involved in the early stages of wound healing, which could partially explain the poor healing of intestinal anastomoses associated with low calcium levels[30]. Building on these findings, researchers have utilized calcium silicate ceramics to stimulate adipose-derived stem cells, thereby promoting angiogenesis and enhancing skin wound healing, as confirmed in animal models[31]. Beyond these functions, calcium ions can also enhance the antimicrobial activity of antimicrobial dressings by damaging and destroying bacterial cell membranes, as well as by oxidizing bacterial media, which further contributes to bacterial killing[32,33]. Clinical application gives basic research greater significance, and studies on the role of calcium in wound healing have been performed in animal models[34]. Therefore, in current clinical practice, a reasonable first step is to maintain normal serum calcium ion levels in patients to minimize the risk of anastomotic leaks. In the future, it may be possible to further develop calcium-rich dressings or sutures to promote the healing of anastomotic sites in patients. The preoperative systemic immune-inflammation index (SII) in patients with colorectal cancer is considered a marker of systemic inflammatory status and is associated with patient prognosis[35]. SII is calculated with the formula SII = (P × N)/L, where P, N, and L refer to platelet, neutrophil, and lymphocyte counts, respectively.
Therefore, because platelet count is in the numerator of SII, we believe that elevated preoperative platelet counts may indicate a heightened systemic inflammatory response, which could increase the risk of AL and worsen patient prognosis.
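The SII formula given above can be written as a short function. This is a minimal sketch: the function name is ours, the input values are invented, and it assumes that all three counts are expressed in the same unit (for example, 10^9/L), which the text does not specify.

```python
def systemic_immune_inflammation_index(platelets, neutrophils, lymphocytes):
    """Compute SII = (P x N) / L.

    All three counts are assumed to share the same unit (e.g. 10^9/L).
    """
    if lymphocytes <= 0:
        raise ValueError("lymphocyte count must be positive")
    return platelets * neutrophils / lymphocytes


# Invented example counts, not patient data from this study.
sii = systemic_immune_inflammation_index(platelets=250.0, neutrophils=4.0, lymphocytes=2.0)
print(sii)  # 500.0
```

Because the platelet count is a multiplicative factor, doubling it doubles SII when the other two counts are held fixed.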
In this research, we found that neoadjuvant therapy is a risk factor for AL after anterior resection of rectal cancer, which is consistent with previous studies[36]. A new significant risk factor was also identified: Perioperative serum calcium ion levels. Patients with preoperative serum calcium ion levels below 2.2 mmol/L or postoperative serum calcium ion levels below 2.06 mmol/L are considered to be at risk for AL. This finding can not only aid clinicians in identifying high-risk populations for AL, but also provide clinical alert values. We posit that calcium plays a crucial role in anastomotic healing by exerting antimicrobial effects, promoting hemostasis, and enhancing the function of keratinocytes and fibroblasts. Consequently, hypocalcemia significantly affects the risk of AL in patients after proctectomy for rectal cancer.
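The two alert values reported above can be applied as a simple screening check. The sketch below is illustrative and is not the prediction model or the UI described in this study: the function and constant names are ours, and the unit is assumed to be mmol/L.

```python
PREOP_CALCIUM_ALERT = 2.2    # preoperative alert value from the text (assumed mmol/L)
POSTOP_CALCIUM_ALERT = 2.06  # postoperative alert value from the text (assumed mmol/L)


def calcium_alerts(preop_calcium=None, postop_calcium=None):
    """Return which perioperative serum calcium alerts are triggered.

    A measurement triggers an alert when it is strictly below its alert
    value. Missing measurements (None) never trigger an alert.
    """
    alerts = []
    if preop_calcium is not None and preop_calcium < PREOP_CALCIUM_ALERT:
        alerts.append("preoperative")
    if postop_calcium is not None and postop_calcium < POSTOP_CALCIUM_ALERT:
        alerts.append("postoperative")
    return alerts


# Invented measurements, not patient data from this study.
print(calcium_alerts(preop_calcium=2.3, postop_calcium=2.0))  # ['postoperative']
```

Such a check flags only the calcium-related alert values; it does not replace the full model, which combines these measurements with the other selected variables.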
Our multicenter study constructed an ML predictive model with high predictive accuracy that performed well in both internal and external validation. Furthermore, the dichotomization of variables within our model has yielded clinically relevant threshold values for alerting purposes. To enhance clinical utility and facilitate adoption across a broad spectrum of hospitals, we have built an intuitive user interface (UI) designed for medical practitioners. This interface enables clinicians to conduct real-time assessments of patients' risk for AL and thus to modify treatment strategies in a timely and informed manner. At present, we have collected the survival data of 963 patients, and further survival analysis is planned.
However, our study was not without limitations. First, as a retrospective study, it was inevitably subject to information bias, including omissions and errors in data collection. Additionally, missing data had to be handled during the data processing phase. Furthermore, owing to the limitations of our center's database, some important factors may have been overlooked in model construction and multicenter validation, such as body mass index and the distance from the lower edge of the tumor to the dentate line. This omission may result in a less comprehensive model and could diminish its accuracy and stability. The lower AUC in the external validation may be attributed to the following reasons: (1) The external validation dataset is relatively small and there are differences in the baseline characteristics of patients from different centers; and (2) Variations exist in the testing conditions of hematological indicators across different centers, and the testing techniques for hematological indicators evolved during the study period. Finally, while this study provided an interactive, physician-friendly UI, a gap remains before it meets clinical application standards. Therefore, we plan to develop a related app based on this model to facilitate its widespread use in clinical practice.
CONCLUSION
We have developed an ML predictive model based on perioperative serum calcium ion levels and other indices, which has shown excellent performance in predicting the occurrence of AL following anterior resection for rectal cancer. The application of SHAP has significantly enhanced the model's interpretability, playing a crucial role in both understanding the model and facilitating its clinical application. During model construction, we also identified the significant role of perioperative serum calcium ion levels and defined clinical alert values, which aid in the early warning of AL and provide prognostic information for patients.
Footnotes
Provenance and peer review: Unsolicited article; Externally peer reviewed.
Peer-review model: Single blind
Specialty type: Gastroenterology and hepatology
Country of origin: China
Peer-review report’s classification
Scientific Quality: Grade A, Grade A, Grade A, Grade B, Grade B
Novelty: Grade A, Grade A, Grade A, Grade B, Grade B
Creativity or Innovation: Grade A, Grade A, Grade A, Grade B, Grade B
Scientific Significance: Grade A, Grade A, Grade A, Grade B, Grade B
P-Reviewer: Abdelsamad A, Li JT, Xu DW; S-Editor: Lin C; L-Editor: Wang TQ; P-Editor: Zhao S
Thomas CE, Lin Y, Kim M, Kawaguchi ES, Qu C, Um CY, Lynch BM, Van Guelpen B, Tsilidis K, Carreras-Torres R, van Duijnhoven FJB, Sakoda LC, Campbell PT, Tian Y, Chang-Claude J, Bézieau S, Budiarto A, Palmer JR, Newcomb PA, Casey G, Le Marchandz L, Giannakis M, Li CI, Gsur A, Newton C, Obón-Santacana M, Moreno V, Vodicka P, Brenner H, Hoffmeister M, Pellatt AJ, Schoen RE, Dimou N, Murphy N, Gunter MJ, Castellví-Bel S, Figueiredo JC, Chan AT, Song M, Li L, Bishop DT, Gruber SB, Baurley JW, Bien SA, Conti DV, Huyghe JR, Kundaje A, Su YR, Wang J, Keku TO, Woods MO, Berndt SI, Chanock SJ, Tangen CM, Wolk A, Burnett-Hartman A, Wu AH, White E, Devall MA, Díez-Obrero V, Drew DA, Giovannucci E, Hidaka A, Kim AE, Lewinger JP, Morrison J, Ose J, Papadimitriou N, Pardamean B, Peoples AR, Ruiz-Narvaez EA, Shcherbina A, Stern MC, Chen X, Thomas DC, Platz EA, Gauderman WJ, Peters U, Hsu L. Characterization of Additive Gene-environment Interactions For Colorectal Cancer Risk. Epidemiology. 2025;36:126-138.
de'Angelis N, Schena CA, Azzolina D, Carra MC, Khan J, Gronnier C, Gaujoux S, Bianchi PP, Spinelli A, Rouanet P, Martínez-Pérez A, Pessaux P; Association Française de Chirurgie (AFC). Histopathological outcomes of transanal, robotic, open, and laparoscopic surgery for rectal cancer resection. A Bayesian network meta-analysis of randomized controlled trials. Eur J Surg Oncol. 2025;51:109481.
Penna M, Hompes R, Arnold S, Wynn G, Austin R, Warusavitarne J, Moran B, Hanna GB, Mortensen NJ, Tekkis PP; International TaTME Registry Collaborative. Incidence and Risk Factors for Anastomotic Failure in 1594 Patients Treated by Transanal Total Mesorectal Excision: Results From the International TaTME Registry. Ann Surg. 2019;269:700-711.
Hu W, Jin T, Pan Z, Xu H, Yu L, Chen T, Zhang W, Jiang H, Yang W, Xu J, Zhu F, Dai H. An interpretable ensemble learning model facilitates early risk stratification of ischemic stroke in intensive care unit: Development and external validation of ICU-ISPM. Comput Biol Med. 2023;166:107577.
Shao S, Zhao Y, Lu Q, Liu L, Mu L, Qin J. Artificial intelligence assists surgeons' decision-making of temporary ileostomy in patients with rectal cancer who have received anterior resection. Eur J Surg Oncol. 2023;49:433-439.
Chen KA, Goffredo P, Butler LR, Joisa CU, Guillem JG, Gomez SM, Kapadia MR. Prediction of Pathologic Complete Response for Rectal Cancer Based on Pretreatment Factors Using Machine Learning. Dis Colon Rectum. 2024;67:387-397.
Liu Y, Shi J, Liu W, Tang Y, Shu X, Wang R, Chen Y, Shi X, Jin J, Li D. A deep neural network predictor to predict the sensitivity of neoadjuvant chemoradiotherapy in locally advanced rectal cancer. Cancer Lett. 2024;589:216641.
Ma J, Bo Z, Zhao Z, Yang J, Yang Y, Li H, Yang Y, Wang J, Su Q, Wang J, Chen K, Yu Z, Wang Y, Chen G. Machine Learning to Predict the Response to Lenvatinib Combined with Transarterial Chemoembolization for Unresectable Hepatocellular Carcinoma. Cancers (Basel). 2023;15.
Kagawa Y, Smith JJ, Fokas E, Watanabe J, Cercek A, Greten FR, Bando H, Shi Q, Garcia-Aguilar J, Romesser PB, Horvat N, Sanoff H, Hall W, Kato T, Rödel C, Dasari A, Yoshino T. Future direction of total neoadjuvant therapy for locally advanced rectal cancer. Nat Rev Gastroenterol Hepatol. 2024;21:444-455.
Kulu Y, Ulrich A, Bruckner T, Contin P, Welsch T, Rahbari NN, Büchler MW, Weitz J; International Study Group of Rectal Cancer. Validation of the International Study Group of Rectal Cancer definition and severity grading of anastomotic leakage. Surgery. 2013;153:753-761.
Islam MM, Rahman MJ, Rabby MS, Alam MJ, Pollob SMAI, Ahmed NAMF, Tawabunnahar M, Roy DC, Shin J, Maniruzzaman M. Predicting the risk of diabetic retinopathy using explainable machine learning algorithms. Diabetes Metab Syndr. 2023;17:102919.
Moosavi SM, Ghassabian S. Linearity of Calibration Curves for Analytical Methods: A Review of Criteria for Assessment of Method Reliability. In: Stauffer MT, editor. Calibration and Validation of Analytical Methods - A Sampling of Current Approaches. London: Intech Open, 2018.
Abdelsamad A, Elsheikh A, Eltantawy M, Othman AM, Arif F, Atallah H, Elderiny H, Zayed H, Alshal MM, Ali MM, Elmorsi AH, Rashad S, Elagezy F, Gebauer F, Langenbach MR, Hamdy NM. A battle of surgical strategies: Clinically enlarged lateral lymph nodes in patients with locally advanced rectal cancer; extended mesorectal excision (e-TME) versus traditional surgery (TME-alone) a meta-analysis. Pathol Res Pract. 2025;269:155874.
Budin C, Staniloaie D, Vasile D, Ilco A, Balan DG, Popa CC, Stiru O, Tulin A, Enyedi M, Miricescu D, Georgescu DE, Georgescu TF, Badiu DC, Mihai DA. Hypocalcemia: A possible risk factor for anastomotic leak in digestive surgery. Exp Ther Med. 2021;21:523.
Souad G, Baghdadi C. Effect of calcium phosphate synthesis conditions on its physico-chemical properties and evaluation of its antibacterial activity. Mater Res Express. 2020;7:015040.
Spinelli A, Foppa C, Carvello M, Sacchi M, De Lucia F, Clerico G, Carrano FM, Maroli A, Montorsi M, Heald RJ. Transanal Transection and Single-Stapled Anastomosis (TTSS): A comparison of anastomotic leak rates with the double-stapled technique and with transanal total mesorectal excision (TaTME) for rectal cancer. Eur J Surg Oncol. 2021;47:3123-3129.