Published online Oct 15, 2025. doi: 10.4251/wjgo.v17.i10.111163
Revised: July 16, 2025
Accepted: September 1, 2025
Processing time: 112 Days and 6.1 Hours
Delayed wound healing is a common clinical complication following gastric cancer radical surgery, adversely affecting patient prognosis. With advances in artificial intelligence, machine learning offers a promising approach for deve
To construct machine learning-based risk prediction models for delayed wound healing after gastric cancer surgery to support clinical decision-making.
We reviewed a total of 514 patients who underwent gastric cancer radical surgery under general anesthesia from January 1, 2014 to September 30, 2023. Seventy percent of the dataset was selected as the training set and 30% as the validation set. Decision trees, support vector machines, and logistic regression were used to construct risk prediction models. Model performance was evaluated using accuracy, recall, precision, F1 index, area under the receiver operating characteristic curve, and decision curve analysis.
This study included five variables: Sex, elderly, duration of abdominal drainage, preoperative white blood cell (WBC) count, and absolute value of neutrophils. These variables were selected based on their clinical relevance and statistical significance in predicting delayed wound healing. The decision tree model outperformed the logistic regression and support vector machine models in both the training and validation sets, achieving higher accuracy, F1 index, recall, and area under the curve (AUC) values. The support vector machine model showed higher accuracy, recall, and F1 index than logistic regression, but a lower AUC. Sex, elderly, duration of abdominal drainage, preoperative WBC count, and absolute value of neutrophils were found to be strong predictors of delayed wound healing. Patients with a longer duration of abdominal drainage had a significantly higher risk of delayed wound healing, with a risk ratio of 1.579 compared to those with a shorter duration of abdominal drainage. Similarly, preoperative WBC count, sex, elderly, and absolute value of neutrophils were associated with a higher risk of delayed wound healing, highlighting the importance of these variables in the model.
The model can identify high-risk patients based on sex, elderly, duration of abdominal drainage, preoperative WBC count, and absolute value of neutrophils, and can provide valuable insights for clinical decision-making.
Core Tip: Delayed wound healing after gastric cancer surgery poses a significant risk to patient recovery. This study developed three machine learning models (decision tree, support vector machine, and logistic regression) for predicting postoperative delayed wound healing. The decision tree model demonstrated the best performance, achieving an accuracy of 90.1% and an area under the curve of 0.951 in the validation set. Key predictors included duration of abdominal drainage and preoperative white blood cell count. These findings offer clinicians a data-driven tool for early identification of high-risk patients and improved perioperative management.
- Citation: An Y, Sun YG, Feng S, Wang YS, Chen YY, Jiang J. Constructing a prediction model for delayed wound healing after gastric cancer radical surgery based on three machine learning algorithms. World J Gastrointest Oncol 2025; 17(10): 111163
- URL: https://www.wjgnet.com/1948-5204/full/v17/i10/111163.htm
- DOI: https://dx.doi.org/10.4251/wjgo.v17.i10.111163
Gastric cancer is the fourth most common cancer worldwide and the third most common cause of cancer death, which poses a huge threat to human health and survival[1]. To date, surgery remains the cornerstone for treating gastric cancer[2]. However, due to the special nature of the surgical site, malnutrition or infection may occur, which can affect the healing of the surgical site and lead to delayed healing. Research has shown that delayed healing at the surgical site is a common complication in cancer patients[3]. Therefore, it is crucial to identify and determine the risk factors for delayed healing at the surgical site. Machine learning methods have been widely applied in the medical field. Through machine learning algorithms, disease-related features can be screened from a large amount of clinical data, thereby establishing a clinical prediction model to predict the risk of individual related diseases and provide reference for clinical diagnosis and treatment[4]. Common machine learning algorithms include logistic regression, support vector machines, and decision trees[5].
At present, most of the research on delayed healing of the surgical site after gastric cancer radical surgery has focused on identifying the risk factors involved, and no clinical predictive models for delayed healing of the surgical site after gastric cancer radical surgery have been found. This study reviewed clinical data of patients undergoing radical gastrectomy for gastric cancer. Based on machine learning algorithms, three prediction models for delayed healing at the surgical site after radical gastrectomy were constructed using logistic regression, support vector machine, and decision tree methods, and their predictive performance was compared to provide reference for clinical diagnosis and treatment.
We reviewed and analyzed 514 patients who underwent radical gastrectomy for gastric cancer under general anesthesia from January 1, 2014 to September 30, 2023. The postoperative pathological diagnosis was gastric cancer. Patient gender was not limited, patient age was ≥ 18 years, body mass index (BMI) was ≥ 18.5 kg/m², and American Society of Anesthesiologists (ASA) score was I-III. This study was conducted in accordance with the principles outlined in the Declaration of Helsinki and was approved by the Ethics Committee of Affiliated Hospital of Weifang Medical University (No. wyfy-2024-ky-141). As this work is a retrospective study, the ethics committee waived the requirement for patient informed consent. The personal identifiers of the included subjects were completely removed, and the data analysis was anonymized. Exclusion criteria were as follows: (1) Inability to collect clinical information or missing information exceeding 20%; and (2) Surgery not performed under general anesthesia.
The data were accessed for research purposes on October 6, 2023. The authors had access to information that could identify individual participants during or after data collection.
The diagnostic criteria for delayed healing of the surgical site were: (1) There was still exudation from the wound three days after surgery, or the width of the wound opening was ≥ 1 cm, and the length of the non-healing area was ≥ 2 cm; or (2) Failure to remove stitches 14 days after surgery. If either of these two conditions was met, the case was defined as delayed healing of the surgical site[6].
Patients were divided into two groups based on whether delayed healing occurred at the surgical site: Non-delayed healing group (N group) and delayed healing group (D group). Patients were randomly allocated to the training and validation sets in a 7:3 ratio using the createDataPartition function in the caret package of R language software (version 4.3.0).
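The 7:3 allocation can be illustrated with a short sketch. This is a Python stand-in, not the authors' R code: it mimics the stratified behavior of createDataPartition, which samples within each outcome class so that both sets keep a similar proportion of delayed healing. The function name `stratified_split`, the seed, and the label list are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=0.7, seed=42):
    """Split indices into training/validation sets, sampling within each class."""
    rng = random.Random(seed)  # hypothetical seed, fixed only for reproducibility
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, valid_idx = [], []
    for label in sorted(by_class):
        members = by_class[label][:]
        rng.shuffle(members)
        cut = round(len(members) * train_fraction)
        train_idx.extend(members[:cut])
        valid_idx.extend(members[cut:])
    return sorted(train_idx), sorted(valid_idx)


# Hypothetical labels mirroring the cohort sizes: 388 "N" and 126 "D" patients.
labels = ["N"] * 388 + ["D"] * 126
train_idx, valid_idx = stratified_split(labels)
```

Because of rounding within each class, the set sizes produced by this sketch can differ slightly from those reported in the study.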
We searched the hospital medical record system and recorded the following patient data: Sex, elderly (defined by the World Health Organization as the elderly population ≥ 60 years old), BMI (World Health Organization classification: < 18.5 kg/m² for underweight, ≥ 30.0 kg/m² for obesity[7]), ASA score, preoperative complications, length of incision, surgical scope, operation (open surgery or laparoscopic), duration of abdominal drainage, length of operation, duration of anesthesia, smoking and drinking history, preoperative parenteral nutrition duration, preoperative blood routine and blood glucose levels, etc.
We used SPSS 25.0 for statistical analysis of the data. Measurement data are expressed as mean ± SD or median and interquartile range, while count data are expressed as rate or percentage (%). Three risk prediction models were constructed for delayed healing at the surgical site after gastric cancer radical surgery based on machine learning algorithms using R 4.4.0, including logistic regression, support vector machine, and decision tree.
In the logistic regression model, the glm function from the stats package in R was used to fit the model, with the link function set to logit and the maximum number of iterations set to 25. The model parameters were estimated by maximum likelihood. For the decision tree model, the rpart function from the rpart package in R was employed. The complexity parameter was set to 0.01 to balance model complexity and generalization ability, controlling tree growth and preventing overfitting. The minimum number of samples required to split a node was set to 20, and the minimum number of samples in a terminal node was set to 7, to avoid generating overly small leaf nodes and improve the model’s generalization ability. The maximum depth of the tree was limited to 30 to restrict tree complexity and prevent overfitting. In the support vector machine model, the svm function from the e1071 package in R was used. The kernel type was set to “radial”, indicating a radial basis function kernel. The cost parameter was set to 1, and the tolerance of the stopping criterion was set to 0.001 for convergence precision. The data were standardized using z-score normalization, which subtracts the mean and divides by the standard deviation for each feature. This places all features on the same scale, preventing any single feature from having an undue influence on the model and improving both performance and convergence speed. The support vector machine type was specified as “C-classification” to construct a C-support vector machine model for classification.
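The z-score step described above can be sketched as follows. This is a minimal Python illustration, not the authors' R code; the helper names `fit_scaler` and `apply_scaler` and the drainage values are hypothetical, and the sample standard deviation (n − 1 denominator) is an assumption. The sketch reuses the training-set mean and standard deviation for the validation set, which is a common practice; the source does not state how the validation data were scaled.

```python
from statistics import mean, stdev


def fit_scaler(train_values):
    """Return the mean and sample standard deviation of a training-set feature."""
    return mean(train_values), stdev(train_values)


def apply_scaler(values, center, scale):
    """Apply z-score normalization: subtract the mean, divide by the SD."""
    if scale == 0:
        raise ValueError("feature is constant; z-score is undefined")
    return [(v - center) / scale for v in values]


# Hypothetical durations of abdominal drainage, in days.
train_drainage = [8, 10, 11, 12, 14, 18, 21]
valid_drainage = [9, 13, 20]

center, scale = fit_scaler(train_drainage)
train_scaled = apply_scaler(train_drainage, center, scale)
valid_scaled = apply_scaler(valid_drainage, center, scale)
```

After scaling, the training feature has mean 0 and standard deviation 1, so features measured in days and in cell counts contribute on a comparable scale to the radial kernel.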
We reviewed a total of 528 patients who underwent radical gastrectomy for gastric cancer, each with 38 characteristics. We calculated the proportion of missing values for each feature. If the missing values exceeded 20% of the total, the feature was removed. Otherwise, we used the mice package in R to perform multiple imputation of the missing values with the random forest method. This study ultimately retained information from 514 patients for model construction. The clinical and demographic characteristics of the patients are shown in Table 1.
| Variables | Total (n = 514) | N group (n = 388) | D group (n = 126) | P value |
| Preoperative variables | | | | |
| Age, years | 63.5 ± 10.1 | 63.0 ± 10.1 | 65.2 ± 10.0 | 0.031 |
| Elderly | 356 (69.3) | 256 (66.0) | 100 (79.4) | 0.005 |
| Sex | | | | 0.301 |
| Male | 385 (74.9) | 295 (76.0) | 90 (71.4) | |
| Female | 129 (25.1) | 93 (24.0) | 36 (28.6) | |
| Body mass index, kg/m² | 22.6 ± 3.3 | 22.5 ± 3.4 | 22.8 ± 3.0 | 0.505 |
| ASA score | | | | 0.320 |
| I | 58 (11.3) | 48 (12.4) | 10 (7.9) | |
| II | 343 (66.7) | 260 (67.0) | 83 (65.9) | |
| III | 113 (22.0) | 80 (20.6) | 33 (26.2) | |
| Hypertension | 136 (26.4) | 100 (25.8) | 36 (28.6) | 0.536 |
| Diabetes | 42 (8.1) | 30 (7.7) | 12 (9.5) | 0.524 |
| Chronic obstructive pulmonary disease | 7 (1.3) | 7 (1.8) | 0 (0) | 0.282 |
| Cardiovascular diseases | 43 (8.3) | 36 (9.3) | 7 (5.6) | 0.190 |
| Obesity (body mass index ≥ 30 kg/m²) | 6 (1.1) | 6 (1.5) | 0 (0) | 0.354 |
| Underweight (body mass index < 18.5 kg/m²) | 119 (23.1) | 88 (22.7) | 31 (24.6) | 0.657 |
| Smoking history | 245 (47.6) | 187 (48.2) | 58 (46.0) | 0.673 |
| Drinking history | 214 (41.6) | 163 (42.0) | 51 (40.5) | 0.762 |
| History of abdominal surgery | 44 (8.5) | 30 (7.7) | 14 (11.1) | 0.239 |
| Emergency | 34 (6.6) | 28 (7.2) | 6 (4.8) | 0.335 |
| Weight loss, kg, median (IQR) | 2.9 (0, 5.0) | 2.0 (0, 5.0) | 0 (0, 5.0) | 0.534 |
| Hospitalization duration, day | 4 (3, 7) | 4 (2, 7) | 4 (3, 7) | 0.497 |
| Total parenteral nutrition | 156 (30.3) | 124 (32.0) | 32 (25.4) | 0.164 |
| Hemoglobin concentration, g/L, median (IQR) | 123.0 (102.0, 138.0) | 124.0 (102.0, 139.0) | 120.5 (99.0, 135.8) | 0.230 |
| Hypoproteinemia (< 35 g/L) | 121 (23.5) | 93 (24.0) | 28 (22.2) | 0.688 |
| Absolute value of neutrophils, 10⁹/L, median (IQR) | 3.0 (1.5, 4.5) | 3.0 (1.8, 4.7) | 2.7 (0.9, 4.0) | 0.032 |
| Absolute value of lymphocytes, 10⁹/L, median (IQR) | 1.6 (1.2, 2.4) | 1.6 (1.2, 2.3) | 1.7 (1.2, 2.4) | 0.966 |
| Neutrophil to lymphocyte ratio, median (IQR) | 2.2 (1.3, 3.6) | 2.2 (1.3, 3.5) | 2.3 (1.1, 3.7) | 0.701 |
| White blood cell count, 10⁹/L | 6.8 ± 3.2 | 6.7 ± 2.7 | 7.1 ± 4.3 | 0.329 |
| Blood glucose concentration, mmol/L | 6.0 ± 2.1 | 5.9 ± 1.9 | 6.3 ± 2.4 | 0.067 |
| Intraoperative and postoperative variables | | | | |
| Length of incision, cm | 9.3 ± 4.8 | 9.1 ± 4.8 | 9.1 ± 4.9 | 0.005 |
| Infection | 23 (4.4) | 16 (4.1) | 7 (5.6) | 0.499 |
| Surgical scope | | | | 0.158 |
| Complete gastrectomy | 128 (24.9) | 93 (24.0) | 35 (27.7) | |
| Partial gastrectomy | 250 (48.6) | 198 (51.0) | 52 (41.3) | |
| Multi organ combined surgery | 136 (26.5) | 97 (25.0) | 39 (31.0) | |
| Operation | | | | 0.505 |
| Laparoscopic | 418 (81.4) | 313 (80.7) | 105 (83.3) | |
| Open surgery | 96 (18.6) | 75 (19.3) | 21 (16.7) | |
| Total input, mL, median (IQR) | 2900 (2500, 3500) | 2875 (2500, 3500) | 3000 (2500, 3500) | 0.479 |
| Bleeding, mL, median (IQR) | 100.0 (50.0, 150.0) | 65.0 (50.0, 150.0) | 100.0 (50.0, 187.5) | 0.243 |
| Urine output, mL, median (IQR) | 400 (250, 600) | 400 (250, 600) | 450 (300, 650) | 0.123 |
| Length of operation, hour | 4.3 ± 1.4 | 4.2 ± 1.4 | 4.6 ± 1.4 | 0.005 |
| Destination | | | | 0.563 |
| Post anesthesia care unit | 181 (35.2) | 133 (34.3) | 48 (38.1) | |
| Ward | 330 (64.2) | 252 (64.9) | 78 (61.9) | |
| Intensive care unit | 3 (0.6) | 3 (0.8) | 0 (0) | |
| Duration of abdominal drainage, day | 12.9 ± 5.1 | 11.2 ± 3.8 | 18.0 ± 5.2 | < 0.001 |
| Duration of total parenteral nutrition, day, median (IQR) | 9.0 (7.0, 11.0) | 8.0 (6.0, 10.0) | 10.0 (7.0, 12.0) | 0.497 |
| C-reactive protein concentration, mg/L, median (IQR) | 46.5 (27.0, 78.3) | 47.5 (27.0, 76.5) | 45.1 (27.9, 79.9) | 0.891 |
| Blood glucose concentration, mmol/L | 8.1 ± 3.2 | 8.0 ± 3.1 | 8.4 ± 3.6 | 0.247 |
For feature selection, least absolute shrinkage and selection operator (LASSO) regression was used to reduce the dimensionality of the data and extract the most important predictors, thereby avoiding overfitting. The optimal parameter (lambda) in LASSO regression was selected by cross-validation, with the lambda value giving the minimum mean squared error taken as optimal. The variables selected by LASSO regression were entered into multivariate logistic regression, and the variables with a P value < 0.05 were retained: Sex (female), elderly, duration of abdominal drainage, preoperative white blood cell (WBC) count, and preoperative absolute value of neutrophils. These five characteristics were ultimately included as input features for the decision tree, support vector machine, and logistic regression models. The correlation between the five features is shown in Figure 1.
The Pearson correlation coefficient measures the degree of linear correlation between two variables[8]. As shown in Figure 1, the correlations among the five features were weak. For example, the correlation coefficient between the preoperative absolute value of neutrophils and the duration of abdominal drainage was 0.01, a positive but negligible correlation. The correlation coefficient between preoperative WBC count and preoperative absolute value of neutrophils was 0.38, a positive correlation, but not a strong one.
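For readers unfamiliar with the coefficient, the following Python sketch shows how a pairwise Pearson correlation of the kind plotted in Figure 1 is computed. The function name `pearson_r` and the sample values are hypothetical and are not taken from the study data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation: covariance divided by the product of the spreads."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical preoperative values for six patients (10^9/L).
wbc = [5.2, 6.8, 7.1, 4.9, 9.3, 6.0]
neutrophils = [2.9, 3.1, 4.4, 2.2, 3.8, 3.5]
r = pearson_r(wbc, neutrophils)
```

A value near 0 indicates little linear association, while values near 1 or −1 indicate a strong positive or negative linear association.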
After feature selection, the categorical variables entered into the decision tree, support vector machine, and logistic regression algorithms (sex and elderly) were assigned numeric values, and the continuous variables were entered as their original values, as shown in Table 2.
| Attribute name | Attribute value | Assignment |
| Sex | Male | 1 |
| | Female | 0 |
| Elderly (≥ 60 years) | Yes | 1 |
| | No | 0 |
We used the createDataPartition function in the caret package of R language software to randomly assign postoperative gastric cancer patients to a training set (70%) and a validation set (30%) for the construction and validation of predictive models, respectively. The patient information for the training set and validation set is shown in Table 3.
| Variables | Training set (n = 362), N group (n = 270) | Training set (n = 362), D group (n = 92) | Validation set (n = 152), N group (n = 118) | Validation set (n = 152), D group (n = 34) |
| Elderly (%) | 179 (66.3) | 72 (78.3) | 77 (65.3) | 28 (82.4) |
| Sex (%) | | | | |
| Male | 205 (75.9) | 68 (73.9) | 80 (67.8) | 22 (64.7) |
| Female | 65 (24.1) | 24 (26.1) | 38 (32.2) | 12 (35.3) |
| Absolute value of neutrophils, 10⁹/L | 3.6 ± 2.6 | 3.1 ± 2.3 | 3.4 ± 2.6 | 2.7 ± 2.0 |
| White blood cell count, 10⁹/L | 6.7 ± 2.8 | 7.1 ± 4.7 | 6.8 ± 2.6 | 6.9 ± 2.7 |
| Duration of abdominal drainage, day | 11.3 ± 3.7 | 17.8 ± 5.0 | 10.9 ± 3.9 | 18.5 ± 5.3 |
The confusion matrices of three clinical prediction models are shown in Figure 2. The figure shows the confusion matrix of the prediction models established by three machine learning methods on the training and validation sets of this study. It can be seen that both in the training set and the validation set, true positive and true negative data occupy the majority of the dataset, indicating that the three machine learning methods used in this study to predict delayed healing at the surgical site after gastric cancer radical surgery were relatively reliable.
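The metrics reported for each model follow directly from the four cells of a confusion matrix. The sketch below uses the standard definitions in Python; the function name `classification_metrics` and the example counts are hypothetical and do not reproduce Figure 2.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall, and F1 index from confusion-matrix counts."""
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts for illustration only.
example = classification_metrics(tp=8, fp=2, tn=9, fn=1)
```

Which class is treated as "positive" changes precision, recall, and F1 index, so the choice should be stated whenever these metrics are compared across models.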
The receiver operating characteristic (ROC) curves of the three machine learning prediction models on the training and validation sets are shown in Figure 3. The ROC curve of the decision tree was optimal in both the training and validation sets, followed by logistic regression, and the difference in area under the curve (AUC) between the two datasets was not significant. The support vector machine had the lowest AUC in both the training and validation sets, indicating that its performance was inferior to that of the decision tree and logistic regression.
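The AUC has a simple probabilistic reading: it is the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case, with ties counted as one half. The Python sketch below computes it by direct pairwise comparison; the function name `roc_auc`, the scores, and the labels are hypothetical, and this is not the routine the authors used in R.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties = 0.5)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical predicted probabilities; 1 = delayed healing, 0 = non-delayed healing.
scores = [0.9, 0.4, 0.6, 0.1]
labels = [1, 1, 0, 0]
auc = roc_auc(scores, labels)
```

The pairwise loop is quadratic in the number of cases, which is acceptable for a cohort of a few hundred patients; rank-based formulas give the same value more efficiently.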
Using the confusion matrix, the accuracy, recall, F1 index, and AUC of the three prediction models were calculated. According to the DeLong test, the AUC of every model was significantly better than that of the random model (P < 0.05) in both sets, indicating that the prediction performance was statistically significant. In the training set, the AUC of the logistic regression model was 0.924 [95% confidence interval (CI): 0.897-0.950], the AUC of the decision tree was 0.962 (95%CI: 0.944-0.979), and the AUC of the support vector machine was 0.749 (95%CI: 0.697-0.802). The DeLong test gave a P value < 0.001 for each of these three models against the random model. In the validation set, the AUC of the logistic regression model was 0.937 (95%CI: 0.900-0.970), the AUC of the decision tree was 0.951 (95%CI: 0.920-0.979), and the AUC of the support vector machine was 0.773 (95%CI: 0.685-0.855). The DeLong test gave a P value < 0.001 for each of these three models against the random model.
In addition, we conducted the DeLong test to compare the AUCs of the models. In the training set, the P value for the comparison between the decision tree and logistic regression was 0.423, the P value for the comparison between the decision tree and the support vector machine was less than 0.001, and the P value for the comparison between logistic regression and the support vector machine was less than 0.001. In the validation set, the P value for the comparison between the decision tree and logistic regression was 0.001, the P value for the comparison between the decision tree and the support vector machine was less than 0.001, and the P value for the comparison between logistic regression and the support vector machine was less than 0.001 (Table 4).
| Datasets | Prediction models | Precision | Accuracy | Recall | F1 index | Area under the receiver operating characteristic curve (95%CI) | P value¹ | P value² |
| Training set | Decision tree | 0.951 | 0.917 | 0.937 | 0.944 | 0.962 (0.944-0.979) | 4.00 × 10⁻³⁷ | 0.423³ |
| | Logistic regression | 0.848 | 0.823 | 0.930 | 0.887 | 0.924 (0.897-0.950) | 1.40 × 10⁻³⁰ | < 0.001⁵ |
| | SVM | 0.861 | 0.845 | 0.944 | 0.901 | 0.749 (0.697-0.802) | 1.21 × 10⁻⁸ | < 0.001⁴ |
| Validation set | Decision tree | 0.940 | 0.901 | 0.932 | 0.936 | 0.951 (0.920-0.979) | 2.72 × 10⁻²¹ | 0.001³ |
| | Logistic regression | 0.869 | 0.855 | 0.957 | 0.911 | 0.937 (0.900-0.970) | 4.86 × 10⁻²⁰ | < 0.001⁵ |
| | SVM | 0.890 | 0.875 | 0.958 | 0.922 | 0.773 (0.685-0.855) | 7.30 × 10⁻⁷ | < 0.001⁴ |
In the training and validation sets, the accuracy, F1 index, recall and AUC of the decision tree model were superior to logistic regression and support vector machine models, indicating that the decision tree had good generalization ability in constructing a risk prediction model for delayed healing of the surgical site after gastric cancer radical surgery. In addition, in the training and validation sets, the accuracy, recall, and F1 index of the support vector machine model were better than logistic regression, but the AUC was lower, indicating that the generalization ability of the support vector machine in constructing a risk prediction model for delayed healing of the surgical site after gastric cancer radical surgery was better than logistic regression. These results are shown in Table 4.
In addition, we analyzed in detail the model performance metrics at the clinical threshold corresponding to the maximized Youden index. In the comprehensive comparison of all indicators, the decision tree model showed better overall performance than the other models. These results are shown in Table 5. In the training set, the maximized Youden index of the decision tree model was 0.822 at a threshold of 0.130, under which the precision was 0.657, the recall was 1.000, and the F1 index was 0.793, indicating a strong capability for detecting delayed wound healing. In the validation set, the decision tree model maintained excellent performance: the maximized Youden index was 0.856 at a threshold of 0.219, under which the recall was 1.000 and the F1 index was 0.800.
| Datasets | Prediction models | Precision | Accuracy | Recall | F1 index | Youden index | Best threshold |
| Training set | Decision tree | 0.657 | 0.867 | 1.000 | 0.793 | 0.822 | 0.130 |
| | Logistic regression | 0.641 | 0.856 | 0.989 | 0.778 | 0.800 | 0.192 |
| | SVM | 0.641 | 0.856 | 0.989 | 0.778 | 0.800 | 0.173 |
| Validation set | Decision tree | 0.667 | 0.888 | 1.000 | 0.800 | 0.856 | 0.219 |
| | Logistic regression | 0.611 | 0.855 | 0.971 | 0.750 | 0.793 | 0.168 |
| | SVM | 0.750 | 0.921 | 0.971 | 0.846 | 0.877 | 0.195 |
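The Youden index in Table 5 equals sensitivity + specificity − 1. As a worked check, the Python sketch below recomputes the decision tree row for the training set. The confusion-matrix counts are not reported in the source: they are back-calculated here under the assumptions that the D group is the positive class and that the training set contains 92 D-group and 270 N-group patients (Table 3). The function name `youden_index` is hypothetical.

```python
def youden_index(tp, fp, tn, fn):
    """Youden index J = sensitivity + specificity - 1."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1


# Back-calculated (assumed) counts for the decision tree, training set:
# recall 1.000 with 92 positives implies fn = 0; precision 0.657 then implies fp = 48.
tp, fp, tn, fn = 92, 48, 222, 0

j = youden_index(tp, fp, tn, fn)               # about 0.822
precision = tp / (tp + fp)                     # about 0.657
accuracy = (tp + tn) / (tp + fp + tn + fn)     # about 0.867
f1 = 2 * tp / (2 * tp + fp + fn)               # about 0.793
```

Under these assumed counts, the recomputed values agree with the decision tree training-set row of Table 5 to three decimal places.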
In the training set, the decision curves of the three models (decision tree, logistic regression and support vector machine) showed that each model provided a greater net benefit in predicting delayed wound healing than the treat-all or treat-none strategies when the threshold probability was 0%-89%, 0%-72% and 0%-85%, respectively (Figure 4A-C). Similarly, in the validation set, the decision curves showed that the three models provided a greater net benefit than the treat-all or treat-none strategies when the threshold probability was 0%-77%, 0%-73% and 0%-95%, respectively (Figure 4D-F).
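Decision curve analysis compares strategies by their net benefit at each threshold probability. The Python sketch below uses the standard net benefit formula, in which false positives are weighted by the odds of the threshold probability; the function names and the example counts are hypothetical and are not taken from Figure 4.

```python
def net_benefit(tp, fp, n, threshold):
    """Net benefit of a model: TP/n - FP/n * pt / (1 - pt)."""
    if not 0 <= threshold < 1:
        raise ValueError("threshold probability must be in [0, 1)")
    odds = threshold / (1 - threshold)
    return tp / n - (fp / n) * odds


def net_benefit_treat_all(prevalence, threshold):
    """Net benefit of treating everyone; the treat-none strategy is always 0."""
    if not 0 <= threshold < 1:
        raise ValueError("threshold probability must be in [0, 1)")
    odds = threshold / (1 - threshold)
    return prevalence - (1 - prevalence) * odds


# Hypothetical cohort: 100 patients, 25 with delayed healing; at a threshold
# probability of 0.2 the model flags 20 true positives and 10 false positives.
model_nb = net_benefit(tp=20, fp=10, n=100, threshold=0.2)
treat_all_nb = net_benefit_treat_all(prevalence=0.25, threshold=0.2)
```

A model is useful at a given threshold when its net benefit exceeds both the treat-all value and zero, which is the comparison summarized by the threshold ranges reported above.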
In the past decade, artificial intelligence (AI) has become a hot topic both inside and outside the scientific community, with numerous articles covering topics such as machine learning, deep learning, and AI[9]. The performance and attention of machine learning and AI applications in academic research and industrial fields have also been significantly improved[10]. Given the latest advances in machine learning, the application of this technology in the medical field has achieved gratifying results[11]. In many cases, the complexity and unpredictability of human physiology have been proven to be better described through machine learning algorithms[12].
This study reviewed the relevant clinical data of patients who underwent radical gastrectomy for gastric cancer in our hospital, and ultimately identified five risk factors, namely sex (female), elderly, duration of abdominal drainage, preoperative WBC count, and preoperative absolute value of neutrophils. Gastric cancer is the third most common cause of cancer death, posing a huge threat to human health and survival[1]. To date, radical surgery for gastric cancer remains the preferred treatment method[13]. Due to the unique nature of the surgical site, postoperative complications such as delayed healing and infection may occur. A study found that elderly patients are more prone to delayed healing at the surgical site due to a decrease in the robustness and repair ability of damaged tissue[14].
Advanced age is a well-documented risk factor for delayed wound healing. The physiological changes associated with aging, such as decreased collagen production, reduced blood flow, and impaired immune function, contribute to slower and less effective tissue repair[15]. Additionally, elderly patients often have comorbid conditions such as diabetes and hypertension, which further complicate the healing process. Although the exact mechanisms are not fully understood, gender differences in wound healing have been observed. Hormonal differences, such as lower estrogen levels in postmenopausal women, may contribute to slower healing rates. Additionally, differences in immune response and inflammatory markers between males and females could play a role in the observed gender disparities in wound healing[16]. Moreover, prolonged abdominal drainage is often associated with postoperative complications. The presence of an abdominal drain can disrupt the normal healing process by introducing foreign bodies and potential pathogens into the wound site[17]. The findings of Hajibandeh et al[17] were consistent with these results.
Elevated preoperative WBC count is indicative of an ongoing inflammatory or infectious process. Inflammation is a natural part of the wound healing process, but excessive or chronic inflammation can impair tissue repair and lead to delayed healing[18]. Elevated WBC count may also reflect the presence of systemic infection or sepsis, which can further complicate postoperative recovery[19]. Besides, neutrophils are key players in the immune response and are crucial for the early stages of wound healing. An elevated neutrophil count can indicate an ongoing infection or inflammation, which may disrupt the normal healing process[20]. The findings of Heuer et al[20] were consistent with these results.
The results of this study indicated that decision tree, logistic regression, and support vector machine models constructed based on machine learning algorithms could accurately predict the risk of delayed healing at the surgical site after radical gastrectomy for gastric cancer. By comparing the performance of the three models in the dataset, it was found that the decision tree model outperformed the logistic regression and support vector machine models in terms of accuracy, F1 index, recall and AUC in both the training and validation sets. In addition, the support vector machine model had better accuracy, recall, and F1 index than logistic regression, but lower AUC. The model with the best comprehensive predictive ability was the decision tree model.
Furthermore, we conducted a detailed analysis of the three models’ performance using the Youden index. The Youden index evaluates overall discrimination (sensitivity + specificity − 1), but clinical applications prioritize balancing precision with the risk of missing or misdiagnosing cases. The decision tree model achieved a recall rate of 1.000 in both the training and validation sets, indicating “zero false negatives” in positive case identification, a critical requirement for disease screening and diagnosis. In contrast, the logistic regression and support vector machine models demonstrated slightly lower recall rates, indicating a potential risk of missing diagnoses.
In terms of algorithm characteristics, the support vector machine is widely used for classification and regression problems. It is suitable for high-dimensional data and is based on the principle of structural risk minimization, with good generalization ability. However, it requires complete input data and is sensitive to missing data, which may reduce model performance, and it is difficult to apply to large-scale datasets[21]. The logistic regression algorithm is easy to understand and implement and has high computational efficiency and strong interpretability, but it is easily affected by outliers. Moreover, when the model is complex or the training data are scarce, overfitting is prone to occur[22]. The decision tree algorithm is a nonparametric supervised learning method mainly applied to classification and regression problems[23]. It is presented as a tree diagram, which is easy to understand and interpret, and it automatically selects the features with the greatest impact on the target variable for splitting, helping to reveal key features in the data. In addition, the decision tree algorithm is robust and not sensitive to missing values in the input data; it can handle missing values to a certain extent as well as various types of data.
This study has certain limitations. First, only internal validation was performed, so the results may have regional limitations. Second, this is a retrospective study and there may be selection bias. In the future, the sample size should be expanded, multicenter studies should be conducted, and the practicality of the model should be tested in clinical practice, so that the results of this study can provide a reliable reference for the prevention of delayed healing at the surgical site in gastric cancer patients undergoing radical surgery.
This study applied machine learning algorithms to predict delayed wound healing at the surgical site in gastric cancer patients undergoing radical surgery. Decision trees have better performance and generalization ability in constructing a risk prediction model for delayed healing, which can provide better reference for clinical decision-making.
All authors contributed to the design, analysis, critical interpretation of the data, and critical revision of the manuscript. The authors thank the patients who agreed to participate in this study and all researchers and clinical staff who supported this study.
