Constructing a prediction model for delayed wound healing after gastric cancer radical surgery based on three machine learning algorithms

doi:10.4251/wjgo.v17.i10.111163

Journal: World Journal of Gastrointestinal Oncology
ISSN: 1948-5204
Publisher: Baishideng Publishing Group Inc, 7041 Koll Center Parkway, Suite 160, Pleasanton, CA 94566, USA

Retrospective Study Open Access

World J Gastrointest Oncol. Oct 15, 2025; 17(10): 111163
Published online Oct 15, 2025. doi: 10.4251/wjgo.v17.i10.111163

Yan An, Yin-Gui Sun, Shuo Feng, Yun-Sheng Wang, Yuan-Yuan Chen, Jun Jiang

Yan An, Yin-Gui Sun, Shuo Feng, Yun-Sheng Wang, Yuan-Yuan Chen, Jun Jiang, Affiliated Hospital of Shandong Second Medical University (Clinical Medical College), Weifang 261000, Shandong Province, China

ORCID number: Yan An (0009-0008-5205-2546); Jun Jiang (0000-0003-3091-9512).

Co-corresponding authors: Yuan-Yuan Chen and Jun Jiang.

Author contributions: Sun YG, Feng S, Wang YS, Chen YY and Jiang J contributed to material preparation, data collection and analysis; An Y wrote the first draft of the manuscript; An Y, Sun YG, Feng S, Wang YS, Chen YY and Jiang J contributed to the study conception and design and commented on previous versions of the manuscript; all authors have read and approved the final manuscript.

Supported by the Shandong Province Traditional Chinese Medicine Technology Project, No. Q-2023147; the Weifang Health Commission Research Project, No. WFWSJK-2023-033; the Weifang City Science and Technology Development Plan (Medical Category), No. 2023YX057; the Weifang Medical University 2022 Campus Level Education and Teaching Reform and Research Project, No. 2022YB051; Norman Bethune Public Welfare Foundation, No. ezmr2023-037; and Special Research Project on Optimized Management of Acute Pain, Wu Jieping Medical Foundation.

Institutional review board statement: The study was reviewed and approved by the Ethics Committee of Affiliated Hospital of Weifang Medical University (No. wyfy-2024-ky-141).

Informed consent statement: As this work is a retrospective study, the ethics committee waived the requirement for patient informed consent.

Conflict-of-interest statement: The authors declare that they have no conflict of interest.

Data sharing statement: No additional data are available.

Open Access: This article is an open-access article that was selected by an in-house editor and fully peer-reviewed by external reviewers. It is distributed in accordance with the Creative Commons Attribution NonCommercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: https://creativecommons.org/Licenses/by-nc/4.0/

Corresponding author: Jun Jiang, MD, Doctor, Affiliated Hospital of Shandong Second Medical University (Clinical Medical College), Kuiwen District, Weifang 261000, Shandong Province, China. fyjiangjun@sdsmu.edu.cn

Received: June 25, 2025
Revised: July 16, 2025
Accepted: September 1, 2025
Published online: October 15, 2025
Processing time: 112 Days and 6.1 Hours

Abstract

BACKGROUND

Delayed wound healing is a common clinical complication following gastric cancer radical surgery, adversely affecting patient prognosis. With advances in artificial intelligence, machine learning offers a promising approach for developing predictive models that can identify high-risk patients and support early clinical intervention.

AIM

To construct machine learning-based risk prediction models for delayed wound healing after gastric cancer surgery to support clinical decision-making.

METHODS

We reviewed a total of 514 patients who underwent gastric cancer radical surgery under general anesthesia from January 1, 2014 to September 30, 2023. Seventy percent of the dataset was used as the training set and 30% as the validation set. Decision tree, support vector machine, and logistic regression algorithms were used to construct risk prediction models. Model performance was evaluated using accuracy, recall, precision, F1 index, the area under the receiver operating characteristic curve, and decision curve analysis.

RESULTS

Five variables were included: Sex, elderly, duration of abdominal drainage, preoperative white blood cell (WBC) count, and absolute value of neutrophils. These variables were selected based on their clinical relevance and statistical significance in predicting delayed wound healing. The decision tree model outperformed the logistic regression and support vector machine models in both the training and validation sets, achieving higher accuracy, precision, F1 index, and area under the curve (AUC) values, with comparable recall. The support vector machine model showed higher accuracy, recall, and F1 index than logistic regression, but a lower AUC. Patients with a longer duration of abdominal drainage had a significantly higher risk of delayed wound healing, with a risk ratio of 1.579 compared to those with a shorter duration of abdominal drainage. Preoperative WBC count, sex, elderly, and absolute value of neutrophils were also associated with a higher risk of delayed wound healing.

CONCLUSION

A model based on sex, elderly, duration of abdominal drainage, preoperative WBC count, and absolute value of neutrophils can identify high-risk patients and provide valuable insights for clinical decision-making.

Key Words: Machine learning; Logistic regression; Support vector machine; Decision tree; Delayed healing; Prediction model; Gastric cancer

Core Tip: Delayed wound healing after gastric cancer surgery poses a significant risk to patient recovery. This study developed three machine learning models (decision tree, support vector machine, and logistic regression) for predicting postoperative delayed wound healing. The decision tree model demonstrated the best performance, achieving an accuracy of 90.1% and an area under the curve of 0.951 in the validation set. Key predictors included duration of abdominal drainage and preoperative white blood cell count. These findings offer clinicians a data-driven tool for early identification of high-risk patients and improved perioperative management.

Citation: An Y, Sun YG, Feng S, Wang YS, Chen YY, Jiang J. Constructing a prediction model for delayed wound healing after gastric cancer radical surgery based on three machine learning algorithms. World J Gastrointest Oncol 2025; 17(10): 111163
URL: https://www.wjgnet.com/1948-5204/full/v17/i10/111163.htm
DOI: https://dx.doi.org/10.4251/wjgo.v17.i10.111163

INTRODUCTION

Gastric cancer is the fourth most common cancer worldwide and the third most common cause of cancer death, which poses a huge threat to human health and survival[1]. To date, surgery remains the cornerstone for treating gastric cancer[2]. However, due to the special nature of the surgical site, malnutrition or infection may occur, which can affect the healing of the surgical site and lead to delayed healing. Research has shown that delayed healing at the surgical site is a common complication in cancer patients[3]. Therefore, it is crucial to identify and determine the risk factors for delayed healing at the surgical site. Machine learning methods have been widely applied in the medical field. Through machine learning algorithms, disease-related features can be screened from a large amount of clinical data, thereby establishing a clinical prediction model to predict the risk of individual related diseases and provide reference for clinical diagnosis and treatment[4]. Common machine learning algorithms include logistic regression, support vector machines, and decision trees[5].

At present, most of the research on delayed healing of the surgical site after gastric cancer radical surgery has focused on identifying the risk factors involved, and no clinical predictive models for delayed healing of the surgical site after gastric cancer radical surgery have been found. This study reviewed clinical data of patients undergoing radical gastrectomy for gastric cancer. Based on machine learning algorithms, three prediction models for delayed healing at the surgical site after radical gastrectomy were constructed using logistic regression, support vector machine, and decision tree methods, and their predictive performance was compared to provide reference for clinical diagnosis and treatment.

MATERIALS AND METHODS

Data collection

We reviewed and analyzed 514 patients who underwent radical gastrectomy for gastric cancer under general anesthesia from January 1, 2014 to September 30, 2023. The postoperative pathological diagnosis was gastric cancer. Inclusion criteria were as follows: Either sex; age ≥ 18 years; body mass index (BMI) ≥ 18.5 kg/m²; and American Society of Anesthesiologists (ASA) score I-III. Exclusion criteria were as follows: Inability to collect clinical information or missing information exceeding 20%; surgery not performed under general anesthesia. This study was conducted in accordance with the principles of the Declaration of Helsinki and was approved by the Ethics Committee of Affiliated Hospital of Weifang Medical University (No. wyfy-2024-ky-141). As this work is a retrospective study, the ethics committee waived the requirement for patient informed consent. The personal identifiers of the included subjects were completely removed, and the data analysis was anonymized.

The data were accessed for research purposes on October 6, 2023. The authors had access to information that could identify individual participants during or after data collection.

Diagnostic criteria for delayed healing

The diagnostic criteria for delayed healing of the surgical site were: (1) There was still exudation from the wound three days after surgery, or the width of the wound opening was ≥ 1 cm and the length of the non-healing area was ≥ 2 cm; or (2) The stitches could not be removed 14 days after surgery. Meeting either of these two conditions was defined as delayed healing of the surgical site[6].

Grouping methods

Patients were divided into two groups based on whether delayed healing occurred at the surgical site: Non delayed healing group (N group) and delayed healing group (D group). Patients were randomly allocated to the training and validation sets in a 7:3 ratio using the createDataPartition function in the caret package of R (version 4.3.0).
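
The 7:3 allocation described above can be illustrated with a minimal sketch. The study used createDataPartition in R; the Python function below (stratified_split, a hypothetical name) only imitates the idea of sampling within each outcome group so that both sets keep a similar share of delayed healing cases. The rounding rule and the random seed are assumptions, so the sketch is not expected to reproduce the 362/152 partition reported by the authors.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Return (train_indices, validation_indices), sampled within each class.

    Hypothetical helper for illustration; not the authors' R code.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train, validation = [], []
    for indices in by_class.values():
        shuffled = indices[:]
        rng.shuffle(shuffled)
        # Assumption: round to the nearest whole patient within each class.
        n_train = round(train_fraction * len(shuffled))
        train.extend(shuffled[:n_train])
        validation.extend(shuffled[n_train:])
    return sorted(train), sorted(validation)


# Group sizes taken from Table 1: 388 non-delayed (N) and 126 delayed (D).
labels = ["N"] * 388 + ["D"] * 126
train_idx, valid_idx = stratified_split(labels)
print(len(train_idx), len(valid_idx))
```

Sampling within each group, rather than from the pooled cohort, keeps the proportion of delayed healing cases nearly identical in the two sets, which matters when the outcome is present in only about a quarter of patients.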

Observed indicators

We searched the hospital medical record system and recorded the following patient data: Sex, elderly (defined by the World Health Organization as the population ≥ 60 years old), BMI (World Health Organization classification: < 18.5 kg/m² for underweight, ≥ 30.0 kg/m² for obesity[7]), ASA score, preoperative complications, length of incision, surgical scope, operation (open or laparoscopic surgery), duration of abdominal drainage, length of operation, duration of anesthesia, smoking and drinking history, preoperative parenteral nutrition duration, and preoperative routine blood test results and blood glucose levels.

Statistical analysis

We used SPSS 25.0 for statistical analysis of the data. Measurement data are expressed as mean ± SD or median and interquartile range, while count data are expressed as rate or percentage (%). Three risk prediction models were constructed for delayed healing at the surgical site after gastric cancer radical surgery based on machine learning algorithms using R 4.4.0, including logistic regression, support vector machine, and decision tree.

In the logistic regression model, the glm function from the stats package in R was used to fit the model, with the link function set to logit and the maximum number of iterations set to 25. The model parameters were estimated by maximum likelihood.

For the decision tree model, the rpart function from the rpart package in R was used. The complexity parameter was set to 0.01 to balance model complexity against generalization ability, controlling tree growth and limiting overfitting. The minimum number of samples required to split a node was set to 20, and the minimum number of samples in a terminal node was set to 7, to avoid generating overly small leaf nodes. The maximum depth of the tree was limited to 30.

For the support vector machine model, the svm function from the e1071 package in R was used. The model type was specified as “C-classification” and the kernel type as “radial”, indicating a radial basis function kernel. The cost parameter was set to 1, and the tolerance of the stopping criterion was set to 0.001. The data were standardized using z-score normalization, which subtracts the mean and divides by the standard deviation for each feature. This places all features on the same scale, prevents any single feature from having an undue influence on the model, and improves both performance and convergence speed.
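
The z-score standardization mentioned for the support vector machine can be sketched as follows. This is an illustrative Python snippet rather than the authors' R code; the function name zscore and the sample drainage durations are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption.

```python
from statistics import mean, stdev


def zscore(values):
    """Standardize a feature: subtract the mean, divide by the sample SD."""
    center = mean(values)
    spread = stdev(values)  # assumption: sample SD with n - 1 denominator
    if spread == 0:
        raise ValueError("a constant feature cannot be standardized")
    return [(value - center) / spread for value in values]


# Hypothetical durations of abdominal drainage, in days.
drainage_days = [8, 10, 12, 14, 21]
scaled = zscore(drainage_days)
print([round(value, 2) for value in scaled])
```

When a model is later applied to a validation set, the mean and standard deviation estimated on the training set are normally reused, so that the validation data do not influence the scaling.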

RESULTS

We initially reviewed 528 patients who underwent radical gastrectomy for gastric cancer, each with 38 features. We calculated the proportion of missing values for each feature, and features with more than 20% missing values were removed. The remaining missing values were filled by multiple imputation with the random forest method in the mice package. In total, 514 patients were retained for model construction. The clinical and demographic characteristics of the patients are shown in Table 1.
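
The 20% missingness rule can be expressed as a short filter. The sketch below is a hypothetical Python illustration (the function features_to_keep and the toy records are not from the study); it covers only the screening step and does not attempt the multiple imputation itself.

```python
def features_to_keep(rows, max_missing=0.20):
    """Return feature names whose missing fraction does not exceed max_missing.

    rows is a list of dicts, one per patient; None marks a missing value.
    """
    kept = []
    for name in rows[0]:
        missing = sum(1 for row in rows if row.get(name) is None)
        if missing / len(rows) <= max_missing:
            kept.append(name)
    return kept


# Toy records with hypothetical values.
rows = [
    {"wbc": 6.1, "crp": None, "age": 64},
    {"wbc": 7.4, "crp": None, "age": 58},
    {"wbc": None, "crp": 40.2, "age": 71},
    {"wbc": 5.9, "crp": 51.0, "age": 66},
    {"wbc": 8.2, "crp": 33.5, "age": 49},
]
kept = features_to_keep(rows)
print(kept)
```

In this toy example, a feature with exactly 20% missing values is kept, because the text removes a feature only when the missing proportion exceeds 20%.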

Table 1 Clinical and demographic characteristics of the two groups of patients, mean ± SD/n (%).

Variables	Total (n = 514)	N group (n = 388)	D group (n = 126)	P value
Preoperative variables
Age, years	63.5 ± 10.1	63.0 ± 10.1	65.2 ± 10.0	0.031
Elderly	356 (69.3)	256 (66.0)	100 (79.4)	0.005
Sex				0.301
Male	385 (74.9)	295 (76.0)	90 (71.4)
Female	129 (25.1)	93 (24.0)	36 (28.6)
Body mass index, kg/m²	22.6 ± 3.3	22.5 ± 3.4	22.8 ± 3.0	0.505
ASA score				0.320
I	58 (11.3)	48 (12.4)	10 (7.9)
II	343 (66.7)	260 (67.0)	83 (65.9)
III	113 (22.0)	80 (20.6)	33 (26.2)
Hypertension	136 (26.4)	100 (25.8)	36 (28.6)	0.536
Diabetes	42 (8.1)	30 (7.7)	12 (9.5)	0.524
Chronic obstructive pulmonary disease	7 (1.3)	7 (1.8)	0 (0)	0.282
Cardiovascular diseases	43 (8.3)	36 (9.3)	7 (5.6)	0.190
Obesity (body mass index ≥ 30 kg/m²)	6 (1.1)	6 (1.5)	0 (0)	0.354
Underweight (body mass index < 18.5 kg/m²)	119 (23.1)	88 (22.7)	31 (24.6)	0.657
Smoking history	245 (47.6)	187 (48.2)	58 (46.0)	0.673
Drinking history	214 (41.6)	163 (42.0)	51 (40.5)	0.762
History of abdominal surgery	44 (8.5)	30 (7.7)	14 (11.1)	0.239
Emergency	34 (6.6)	28 (7.2)	6 (4.8)	0.335
Weight loss, kg, median (IQR)	2.9 (0, 5.0)	2.0 (0, 5.0)	0 (0, 5.0)	0.534
Hospitalization duration, day	4 (3, 7)	4 (2, 7)	4 (3, 7)	0.497
Total parenteral nutrition	156 (30.3)	124 (32.0)	32 (25.4)	0.164
Hemoglobin concentration, g/L, median (IQR)	123.0 (102.0, 138.0)	124.0 (102.0, 139.0)	120.5 (99.0, 135.8)	0.230
Hypoproteinemia (< 35 g/L)	121 (23.5)	93 (24.0)	28 (22.2)	0.688
Absolute value of neutrophils, 10⁹/L, median (IQR)	3.0 (1.5, 4.5)	3.0 (1.8, 4.7)	2.7 (0.9, 4.0)	0.032
Absolute value of lymphocytes, 10⁹/L, median (IQR)	1.6 (1.2, 2.4)	1.6 (1.2, 2.3)	1.7 (1.2, 2.4)	0.966
Neutrophil to lymphocyte ratio, median (IQR)	2.2 (1.3, 3.6)	2.2 (1.3, 3.5)	2.3 (1.1, 3.7)	0.701
White blood cell count, 10⁹/L	6.8 ± 3.2	6.7 ± 2.7	7.1 ± 4.3	0.329
Blood glucose concentration, mmol/L	6.0 ± 2.1	5.9 ± 1.9	6.3 ± 2.4	0.067
Intraoperative and postoperative variables
Length of incision, cm	9.3 ± 4.8	9.1 ± 4.8	9.1 ± 4.9	0.005
Infection	23 (4.4)	16 (4.1)	7 (5.6)	0.499
Surgical scope				0.158
Complete gastrectomy	128 (24.9)	93 (24.0)	35 (27.7)
Partial gastrectomy	250 (48.6)	198 (51.0)	52 (41.3)
Multi organ combined surgery	136 (26.5)	97 (25.0)	39 (31.0)
Operation				0.505
Laparoscopic	418 (81.4)	313 (80.7)	105 (83.3)
Open surgery	96 (18.6)	75 (19.3)	21 (16.7)
Total input, mL, median (IQR)	2900 (2500, 3500)	2875 (2500, 3500)	3000 (2500, 3500)	0.479
Bleeding, mL, median (IQR)	100.0 (50.0, 150.0)	65.0 (50.0, 150.0)	100.0 (50.0, 187.5)	0.243
Urine output, mL, median (IQR)	400 (250, 600)	400 (250, 600)	450 (300, 650)	0.123
Length of operation, hour	4.3 ± 1.4	4.2 ± 1.4	4.6 ± 1.4	0.005
Destination				0.563
Post anesthesia care unit	181 (35.2)	133 (34.3)	48 (38.1)
Ward	330 (64.2)	252 (64.9)	78 (61.9)
Intensive care unit	3 (0.6)	3 (0.8)	0 (0)
Duration of abdominal drainage, day	12.9 ± 5.1	11.2 ± 3.8	18.0 ± 5.2	< 0.001
Duration of total parenteral nutrition, day, median (IQR)	9.0 (7.0, 11.0)	8.0 (6.0, 10.0)	10.0 (7.0, 12.0)	0.497
C-reactive protein concentration, mg/L, median (IQR)	46.5 (27.0, 78.3)	47.5 (27.0, 76.5)	45.1 (27.9, 79.9)	0.891
Blood glucose concentration, mmol/L	8.1 ± 3.2	8.0 ± 3.1	8.4 ± 3.6	0.247

N: Non delayed healing group; D: Delayed healing group; IQR: Interquartile range; ASA: American Society of Anesthesiologists.

Feature encoding

Least absolute shrinkage and selection operator (LASSO) regression was used for feature selection, reducing the dimension of the data and extracting the most important predictors to avoid overfitting. The optimal penalty parameter (lambda) was selected by cross-validation as the value that minimized the mean squared error. The variables selected by LASSO regression were entered into multivariate logistic regression, and those with a P value < 0.05 were retained: Sex (female), elderly, duration of abdominal drainage, preoperative white blood cell (WBC) count, and preoperative absolute value of neutrophils. These five features were used as the inputs for the decision tree, support vector machine, and logistic regression models. The correlation between the five features is shown in Figure 1.
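
For reference, the LASSO step can be written in a generic form. The notation below is an assumption introduced for illustration (n patients, p candidate features, a per-patient loss ℓ, and a cross-validation error CV); the source does not state which loss or software was used.

```latex
\hat{\beta}(\lambda) = \arg\min_{\beta_0,\,\beta}
  \left\{ \frac{1}{n} \sum_{i=1}^{n} \ell\bigl(y_i,\ \beta_0 + x_i^{\top}\beta\bigr)
  + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert \right\},
\qquad
\hat{\lambda} = \arg\min_{\lambda} \mathrm{CV}(\lambda)
```

The L1 penalty shrinks some coefficients exactly to zero, which is why the method acts as a feature selector: only features with nonzero coefficients at the chosen lambda proceed to the multivariate logistic regression.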

Figure 1 Correlation analysis of risk factors for delayed healing at the surgical site after radical gastrectomy for gastric cancer. WBC: White blood cell.

The Pearson correlation coefficient measures the degree of linear correlation between two variables[8]. As shown in Figure 1, the correlations among the five features were weak. For example, the correlation coefficient between the preoperative absolute value of neutrophils and the duration of abdominal drainage was 0.01, which was positive but negligible. The correlation coefficient between preoperative WBC count and preoperative absolute value of neutrophils was 0.38, which was positive but not strong.
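
As a reminder of how the coefficient is obtained, the following Python sketch computes Pearson's r from two equal-length lists. The function name pearson_r and the sample values are hypothetical and are not the study data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation: covariance divided by the product of the spreads."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)


# Hypothetical preoperative values (10^9/L) for five patients.
wbc = [5.2, 6.8, 7.1, 9.4, 4.9]
neutrophils = [2.9, 3.1, 4.6, 5.0, 3.3]
print(round(pearson_r(wbc, neutrophils), 2))
```

Values close to 0 indicate little linear association, so weakly correlated predictors can enter the same model with limited risk of collinearity.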

After feature selection, the categorical variables entered into the decision tree, support vector machine, and logistic regression algorithms (sex and elderly) were assigned numeric codes, and the continuous variables were entered as their original values, as shown in Table 2.

Table 2 Data after feature encoding.

Attribute name	Attribute value	Assignment
Sex	Male	1
Sex	Female	0
Elderly (≥ 60 years)	Yes	1
Elderly (≥ 60 years)	No	0

We used the createDataPartition function in the caret package of R to randomly assign postoperative gastric cancer patients to a training set (70%) and a validation set (30%) for the construction and validation of the predictive models. The patient information for the training set and validation set is shown in Table 3.

Table 3 Patient information in the training and validation sets, mean ± SD/n (%).

Variables	Training set (n = 362): N group (n = 270)	Training set: D group (n = 92)	Validation set (n = 152): N group (n = 118)	Validation set: D group (n = 34)
Elderly (%)	179 (66.3)	72 (78.3)	77 (65.3)	28 (82.4)
Sex (%)
Male	205 (75.9)	68 (73.9)	80 (67.8)	22 (64.7)
Female	65 (24.1)	24 (26.1)	38 (32.2)	12 (35.3)
Absolute value of neutrophils, 10⁹/L	3.6 ± 2.6	3.1 ± 2.3	3.4 ± 2.6	2.7 ± 2.0
White blood cell count, 10⁹/L	6.7 ± 2.8	7.1 ± 4.7	6.8 ± 2.6	6.9 ± 2.7
Duration of abdominal drainage, day	11.3 ± 3.7	17.8 ± 5.0	10.9 ± 3.9	18.5 ± 5.3

N: Non delayed healing group; D: Delayed healing group.

Establishment and performance comparison of the three machine learning models

The confusion matrices of the three prediction models on the training and validation sets are shown in Figure 2. In both sets, true positives and true negatives accounted for the majority of cases, indicating that the three machine learning methods predicted delayed healing at the surgical site after gastric cancer radical surgery relatively reliably.

Figure 2 Confusion matrix. A-C: Training set models: Decision tree (A); Logistic regression (B); Support vector machine (C); D-F: Validation set models: Decision tree (D); Logistic regression (E); Support vector machine (F).

The receiver operating characteristic (ROC) curves of the three machine learning prediction models on the training and validation sets are shown in Figure 3. The decision tree had the best ROC curve in both the training and validation sets, followed by logistic regression, and the area under the curve (AUC) of each differed little between the two datasets. The support vector machine had the lowest AUC in both sets, indicating that its discrimination was inferior to that of the decision tree and logistic regression.

Figure 3 The receiver operating characteristic curve. A: Training set; B: Validation set. AUC: Area under the curve.
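
The AUC summarized by an ROC curve has a simple interpretation: it is the probability that a randomly chosen patient with delayed healing receives a higher predicted risk than a randomly chosen patient without it. The Python sketch below computes it by direct pairwise comparison; the function name auc_from_scores and the example scores are hypothetical, and this is not the authors' R implementation.

```python
def auc_from_scores(positive_scores, negative_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correctly ranked pair.
    """
    if not positive_scores or not negative_scores:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for pos in positive_scores:
        for neg in negative_scores:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical predicted risks for delayed (D) and non-delayed (N) patients.
d_scores = [0.9, 0.8, 0.4]
n_scores = [0.7, 0.3, 0.2]
print(round(auc_from_scores(d_scores, n_scores), 3))
```

An AUC of 0.5 corresponds to the random model used as the reference in the significance tests, and 1.0 corresponds to perfect separation of the two groups.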

The accuracy, recall, and F1 index of the three prediction models were calculated from the confusion matrices, together with the AUC. According to the DeLong test, the AUC of every model was significantly higher than that of the random model (P < 0.05) in both sets, indicating that the prediction performance was statistically significant. In the training set, the AUC of the logistic regression model was 0.924 [95% confidence interval (CI): 0.897-0.950], the AUC of the decision tree was 0.962 (95%CI: 0.944-0.979), and the AUC of the support vector machine was 0.749 (95%CI: 0.697-0.802); the DeLong test gave a P value < 0.001 for each of the three models against the random model. In the validation set, the AUC of the logistic regression model was 0.937 (95%CI: 0.900-0.970), the AUC of the decision tree was 0.951 (95%CI: 0.920-0.979), and the AUC of the support vector machine was 0.773 (95%CI: 0.685-0.855); the DeLong test again gave a P value < 0.001 for each of the three models against the random model.

We also used the DeLong test to compare the AUCs of the models with each other. In the training set, the P value was 0.423 for the decision tree vs logistic regression, and less than 0.001 both for the decision tree vs the support vector machine and for logistic regression vs the support vector machine. In the validation set, the P value was 0.001 for the decision tree vs logistic regression, and less than 0.001 both for the decision tree vs the support vector machine and for logistic regression vs the support vector machine (Table 4).

Table 4 Comparison of the three machine learning models.

Datasets	Prediction models	Precision	Accuracy	Recall	F1 index	Area under the receiver operating characteristic curve (95%CI)	P value¹	P value²
Training set	Decision tree	0.951	0.917	0.937	0.944	0.962 (0.944-0.979)	4.00 × 10^-37	0.423³
	Logistic regression	0.848	0.823	0.930	0.887	0.924 (0.897-0.950)	1.40 × 10^-30	< 0.001⁵
	SVM	0.861	0.845	0.944	0.901	0.749 (0.697-0.802)	1.21 × 10^-08	< 0.001⁴
Validation set	Decision tree	0.940	0.901	0.932	0.936	0.951 (0.920-0.979)	2.72 × 10^-21	0.001³
	Logistic regression	0.869	0.855	0.957	0.911	0.937 (0.900-0.970)	4.86 × 10^-20	< 0.001⁵
	SVM	0.890	0.875	0.958	0.922	0.773 (0.685-0.855)	7.30 × 10^-07	< 0.001⁴

¹P value for the comparison of the area under the receiver operating characteristic curve of each model with that of the random model.

²P value for the pairwise comparison of the area under the receiver operating characteristic curve between models.

³Comparison between the decision tree and logistic regression.

⁴Comparison between the decision tree and the support vector machine.

⁵Comparison between logistic regression and the support vector machine.

SVM: Support vector machine; CI: Confidence interval.
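
The F1 index in Table 4 is the harmonic mean of precision and recall, so it can be recomputed from the other two columns. The Python sketch below does this for the six rows of the table; the function name f1_index and the dictionary layout are hypothetical, and small differences in the third decimal are expected because the published precision and recall are themselves rounded.

```python
def f1_index(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        raise ValueError("F1 is undefined when precision and recall are both 0")
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from Table 4.
table4 = {
    ("training", "decision tree"): (0.951, 0.937, 0.944),
    ("training", "logistic regression"): (0.848, 0.930, 0.887),
    ("training", "SVM"): (0.861, 0.944, 0.901),
    ("validation", "decision tree"): (0.940, 0.932, 0.936),
    ("validation", "logistic regression"): (0.869, 0.957, 0.911),
    ("validation", "SVM"): (0.890, 0.958, 0.922),
}

recomputed = {
    key: f1_index(precision, recall)
    for key, (precision, recall, _reported) in table4.items()
}
for key, value in recomputed.items():
    print(key, round(value, 3), "reported:", table4[key][2])
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a model cannot reach a high F1 index by maximizing recall alone.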

In both the training and validation sets, the precision, accuracy, F1 index, and AUC of the decision tree model were superior to those of the logistic regression and support vector machine models, while its recall was comparable, indicating that the decision tree generalized well when used to construct a risk prediction model for delayed healing of the surgical site after gastric cancer radical surgery. In addition, in both sets the accuracy, recall, and F1 index of the support vector machine model were better than those of logistic regression, but its AUC was lower, so the support vector machine classified better at its default threshold while discriminating less well across thresholds. These results are shown in Table 4.

In addition, we analyzed the performance metrics of each model at the clinical threshold that maximized the Youden index (Table 5). At these thresholds, the decision tree model achieved the highest recall in both sets. In the training set, the maximized Youden index of the decision tree model was 0.822 at a threshold of 0.130, at which the precision was 0.657, the recall was 1.000, and the F1 index was 0.793, indicating a strong capability to detect delayed wound healing. In the validation set, the maximized Youden index of the decision tree model was 0.856 at a threshold of 0.219, at which the recall was 1.000 and the F1 index was 0.800.

Table 5 Comparison of performance metrics under the clinical threshold corresponding to the maximized Youden index of the three machine learning models.

Datasets	Prediction models	Precision	Accuracy	Recall	F1 index	Youden index	Best threshold
Training set	Decision tree	0.657	0.867	1.000	0.793	0.822	0.130
	Logistic regression	0.641	0.856	0.989	0.778	0.800	0.192
	SVM	0.641	0.856	0.989	0.778	0.800	0.173
Validation set	Decision tree	0.667	0.888	1.000	0.800	0.856	0.219
	Logistic regression	0.611	0.855	0.971	0.750	0.793	0.168
	SVM	0.750	0.921	0.971	0.846	0.877	0.195

SVM: Support vector machine.
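
The metrics in Table 5 all derive from the four cells of a confusion matrix, with delayed healing as the positive class. The Python sketch below shows the arithmetic. The function name threshold_metrics is hypothetical, and the cell counts are an assumption: they are back-calculated from the validation-set sizes in Table 3 (34 delayed, 118 non-delayed) and the decision tree row of Table 5, not reported by the authors.

```python
def threshold_metrics(tp, fp, tn, fn):
    """Precision, accuracy, recall, F1 and Youden index from confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # sensitivity
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * recall / (precision + recall)
    youden = recall + specificity - 1
    return {
        "precision": precision,
        "accuracy": accuracy,
        "recall": recall,
        "f1": f1,
        "youden": youden,
    }


# Assumed counts for the decision tree on the validation set (threshold 0.219):
# all 34 delayed cases detected, 17 of 118 non-delayed cases flagged.
validation_tree = threshold_metrics(tp=34, fp=17, tn=101, fn=0)
for name, value in validation_tree.items():
    print(name, round(value, 3))
```

Under these assumed counts, the rounded values agree with the decision tree validation row of Table 5. The example also shows the trade-off at a Youden-optimal threshold: recall of 1.000 is reached by accepting false positives, which lowers precision.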

In the training set, the decision curves of the three models (decision tree, logistic regression, and support vector machine) showed that each model provided a greater net benefit in predicting delayed wound healing than the treat-all or treat-none strategies when the threshold probability was 0%-89%, 0%-72%, and 0%-85%, respectively (Figure 4A-C). Similarly, in the validation set, each model provided a greater net benefit than the treat-all or treat-none strategies when the threshold probability was 0%-77%, 0%-73%, and 0%-95%, respectively (Figure 4D-F).

Figure 4 Decision curve of the three models. A-C: In the training set: Decision tree (A); Logistic regression (B); Support vector machine (C). The X-axis shows the threshold probability. The Y-axis represents net benefit. “None” is the assumption that no patient developed delayed wound healing and “All” refers to the assumption that all patients developed delayed wound healing. When the threshold probability was 0%-89%, 0%-72% and 0%-85%, respectively, using the models to predict delayed wound healing adds a greater benefit; D-F: In the validation set: Decision tree (D); Logistic regression (E); Support vector machine (F). The X-axis shows the threshold probability. The Y-axis represents net benefit. “None” is the assumption that no patient developed delayed wound healing and “All” refers to the assumption that all patients developed delayed wound healing. When the threshold probability was 0%-77%, 0%-73% and 0%-95%, respectively, using the models to predict delayed wound healing adds a greater benefit.
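
The net benefit plotted on the Y-axis of a decision curve is conventionally computed as the true-positive rate per patient minus the false-positive rate per patient weighted by the odds of the threshold probability. The Python sketch below applies that standard formula. The function name net_benefit is hypothetical, and the counts are an assumption back-calculated from Tables 3 and 5 for the decision tree on the validation set; they are not reported by the authors.

```python
def net_benefit(tp, fp, n, threshold):
    """Net benefit = TP/n - FP/n * threshold / (1 - threshold)."""
    if not 0 <= threshold < 1:
        raise ValueError("threshold probability must be in [0, 1)")
    return tp / n - (fp / n) * threshold / (1 - threshold)


threshold = 0.219  # Youden-optimal threshold of the decision tree (Table 5)
n_patients = 152   # validation set size (Table 3)

# Assumed model counts: 34 true positives and 17 false positives.
nb_model = net_benefit(tp=34, fp=17, n=n_patients, threshold=threshold)
# "All": every patient is treated as high risk, so all 118 non-delayed
# patients become false positives.
nb_treat_all = net_benefit(tp=34, fp=118, n=n_patients, threshold=threshold)
# "None": no patient is treated, so the net benefit is zero by definition.
nb_treat_none = 0.0

print(round(nb_model, 3), round(nb_treat_all, 3), nb_treat_none)
```

A model is clinically useful at a given threshold probability only when its net benefit exceeds both reference strategies, which is the comparison summarized by the threshold ranges in the caption above.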

DISCUSSION

Over the past decade, artificial intelligence (AI) has attracted wide attention both inside and outside the scientific community, with numerous publications on machine learning and deep learning[9]. The performance of machine learning and AI applications in academic research and industry has also improved significantly[10]. Building on these advances, the application of machine learning in medicine has achieved encouraging results[11]. In many cases, the complexity and unpredictability of human physiology have proven to be better described by machine learning algorithms[12].

This study reviewed the relevant clinical data of patients who underwent radical gastrectomy for gastric cancer in our hospital, and ultimately identified five risk factors, namely sex (female), elderly, duration of abdominal drainage, preoperative WBC count, and preoperative absolute value of neutrophils. Gastric cancer is the third most common cause of cancer death, posing a huge threat to human health and survival[1]. To date, radical surgery for gastric cancer remains the preferred treatment method[13]. Due to the unique nature of the surgical site, postoperative complications such as delayed healing and infection may occur. A study found that elderly patients are more prone to delayed healing at the surgical site due to a decrease in the robustness and repair ability of damaged tissue[14].

Advanced age is a well-documented risk factor for delayed wound healing. The physiological changes associated with aging, such as decreased collagen production, reduced blood flow, and impaired immune function, contribute to slower and less effective tissue repair[15]. Additionally, elderly patients often have comorbid conditions such as diabetes and hypertension, which further complicate the healing process. Although the exact mechanisms are not fully understood, gender differences in wound healing have been observed. Hormonal differences, such as lower estrogen levels in postmenopausal women, may contribute to slower healing rates. Additionally, differences in immune response and inflammatory markers between males and females could play a role in the observed gender disparities in wound healing[16]. Moreover, prolonged abdominal drainage is often associated with postoperative complications. The presence of an abdominal drain can disrupt the normal healing process by introducing foreign bodies and potential pathogens into the wound site[17]. The findings of Hajibandeh et al[17] were consistent with these results.

Elevated preoperative WBC count is indicative of an ongoing inflammatory or infectious process. Inflammation is a natural part of the wound healing process, but excessive or chronic inflammation can impair tissue repair and lead to delayed healing[18]. Elevated WBC count may also reflect the presence of systemic infection or sepsis, which can further complicate postoperative recovery[19]. Besides, neutrophils are key players in the immune response and are crucial for the early stages of wound healing. An elevated neutrophil count can indicate an ongoing infection or inflammation, which may disrupt the normal healing process[20]. The findings of Heuer et al[20] were consistent with these results.

The results of this study indicated that decision tree, logistic regression, and support vector machine models constructed with machine learning algorithms could accurately predict the risk of delayed healing at the surgical site after radical gastrectomy for gastric cancer. A comparison of the three models showed that the decision tree model outperformed the logistic regression and support vector machine models in accuracy, F1 index, recall, and AUC in both the training and validation sets. In addition, the support vector machine model had better accuracy, recall, and F1 index than logistic regression, but a lower AUC. The model with the best overall predictive ability was the decision tree model.
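For readers who wish to reproduce this kind of comparison, the four metrics can be computed as in the following minimal sketch. It uses the usual textbook definitions and a rank-based (Mann-Whitney) estimate of the AUC; the function names and toy data are hypothetical and are not taken from the study.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, recall, precision and F1 from binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "recall": recall,
            "precision": precision, "f1": f1}


def auc(y_true, y_score):
    """AUC as the probability that a positive case outranks a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: true labels and predicted risks for eight patients.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_score = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in y_score]  # 0.5 cut-off is an assumption

metrics = classification_metrics(y_true, y_pred)
print(metrics, auc(y_true, y_score))
```

Accuracy, recall, and F1 depend on the chosen cut-off, whereas the AUC summarizes ranking quality across all cut-offs, so one model can lead on the first three while trailing on the AUC.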

Furthermore, we analyzed the performance of the three models in detail using the Youden index. The Youden index (sensitivity + specificity - 1) evaluates overall discrimination, but clinical applications prioritize balancing precision against the risk of missed or incorrect diagnoses. The decision tree model achieved a recall of 1.000 in both the training and validation sets, indicating “zero false negatives” in positive case identification, a critical requirement for disease screening and diagnosis. In contrast, the logistic regression and support vector machine models had slightly lower recall, indicating a potential risk of missed diagnoses.
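The Youden index calculation can be sketched as follows. The code scans candidate cut-offs and returns the one that maximizes sensitivity + specificity - 1; the helper names and toy data are hypothetical, and the study does not report how its cut-offs were selected.

```python
def sensitivity_specificity(y_true, y_score, threshold):
    """Sensitivity and specificity when score >= threshold is called positive."""
    tp = sum(1 for t, s in zip(y_true, y_score) if t == 1 and s >= threshold)
    fn = sum(1 for t, s in zip(y_true, y_score) if t == 1 and s < threshold)
    tn = sum(1 for t, s in zip(y_true, y_score) if t == 0 and s < threshold)
    fp = sum(1 for t, s in zip(y_true, y_score) if t == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def youden_index(sensitivity, specificity):
    """Youden index J = sensitivity + specificity - 1."""
    return sensitivity + specificity - 1


def best_youden_threshold(y_true, y_score):
    """Return (threshold, J) for the observed score that maximizes J."""
    best_threshold, best_j = None, float("-inf")
    for threshold in sorted(set(y_score)):
        j = youden_index(*sensitivity_specificity(y_true, y_score, threshold))
        if j > best_j:
            best_threshold, best_j = threshold, j
    return best_threshold, best_j


# Hypothetical toy data: 1 = delayed healing, 0 = normal healing.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_score = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]

threshold, j = best_youden_threshold(y_true, y_score)
sens, spec = sensitivity_specificity(y_true, y_score, threshold)
print(threshold, round(j, 3), sens, spec)
```

In this toy example the selected cut-off yields a sensitivity (recall) of 1.0, that is, no false negatives, at the cost of one false positive, which mirrors the trade-off discussed above.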

In terms of algorithm characteristics, the support vector machine is a widely used machine learning algorithm for classification and regression problems. It is suitable for high-dimensional data and is based on the principle of structural risk minimization, with good generalization ability. However, it requires complete input data and is sensitive to missing data, which may reduce model performance, and it has difficulty processing large-scale datasets, which limits its use on them[21]. The logistic regression algorithm is easy to understand and implement and offers high computational efficiency and strong interpretability, but it is easily affected by outliers. Moreover, when the model is more complex or there is too little training data, overfitting is prone to occur[22]. The decision tree algorithm is a nonparametric supervised learning method mainly applied to classification and regression problems[23]. It is presented as a tree diagram that is easy to understand and interpret, and it automatically selects the features with the greatest impact on the target variable for splitting, which helps to reveal key features in the data. In addition, the decision tree algorithm is robust and is not sensitive to missing values in the input data; it can handle missing values to a certain extent as well as various types of data.
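How a decision tree selects the feature with the greatest impact for splitting can be shown with a minimal sketch. It assumes a CART-style criterion (weighted Gini impurity); the article does not specify which decision tree variant or splitting criterion was used, and the feature names and values below are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of binary labels (0 or 1)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_split(rows, labels):
    """Find the (impurity, feature, threshold) with the lowest weighted Gini.

    rows is a list of dicts mapping feature name -> numeric value.
    Candidate thresholds are midpoints between consecutive observed values.
    """
    best = None
    for feature in rows[0]:
        values = sorted({row[feature] for row in rows})
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            left = [y for row, y in zip(rows, labels) if row[feature] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[feature] > threshold]
            impurity = (len(left) * gini(left)
                        + len(right) * gini(right)) / len(labels)
            if best is None or impurity < best[0]:
                best = (impurity, feature, threshold)
    return best


# Hypothetical toy data: 1 = delayed healing, 0 = normal healing.
rows = [
    {"age": 55, "drainage_days": 4},
    {"age": 72, "drainage_days": 5},
    {"age": 60, "drainage_days": 9},
    {"age": 68, "drainage_days": 10},
    {"age": 49, "drainage_days": 3},
    {"age": 75, "drainage_days": 11},
]
labels = [0, 0, 1, 1, 0, 1]

impurity, feature, threshold = best_split(rows, labels)
print(feature, threshold, impurity)
```

In this toy dataset drainage duration separates the two classes completely, so it is chosen as the first split; a full tree would apply the same search recursively to each resulting subset.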

This study has certain limitations. First, it used internal validation only, so the results may have regional limitations. Second, this is a retrospective study and there may be selection bias. In the future, the sample size should be expanded, multicenter studies should be conducted, and the practicality of the model should be tested in clinical practice, so that the results can provide a reliable reference for preventing delayed healing at the surgical site in gastric cancer patients undergoing radical surgery.

CONCLUSION

This study applied machine learning algorithms to predict delayed wound healing at the surgical site in gastric cancer patients undergoing radical surgery. Among the three models, the decision tree showed the best performance and generalization ability in constructing a risk prediction model for delayed healing, and it can provide a better reference for clinical decision-making.

ACKNOWLEDGEMENTS

All authors contributed to the design, analysis, critical interpretation of the data, and critical revision of the manuscript. The authors thank the patients who agreed to participate in this study and all researchers and clinical staff who supported this study.

Footnotes

Provenance and peer review: Unsolicited article; Externally peer reviewed.

Peer-review model: Single blind

Specialty type: Gastroenterology and hepatology

Country of origin: China

Peer-review report’s classification

Scientific Quality: Grade A, Grade B

Novelty: Grade B, Grade B

Creativity or Innovation: Grade B, Grade B

Scientific Significance: Grade B, Grade B

P-Reviewer: Chen XY, PhD, Professor, China; S-Editor: Fan M; L-Editor: A; P-Editor: Zhang L

References

1.	Zhang H, Liang F, Wang F, Xu Q, Qiu Y, Lu X, Jiang L, Jian K. miR-148-3p inhibits gastric cancer cell malignant phenotypes and chemotherapy resistance by targeting Bcl2. Bioengineered. 2024;15:2005742.

2.	Li GX. [Research progress and prospect of gastric cancer surgery in 2021]. Zhonghua Wei Chang Wai Ke Za Zhi. 2022;25:15-21.

3.	Farreras N, Artigas V, Cardona D, Rius X, Trias M, González JA. Effect of early postoperative enteral immunonutrition on wound healing in patients undergoing surgery for gastric cancer. Clin Nutr. 2005;24:55-65.

4.	Sharma A, Lysenko A, Jia S, Boroevich KA, Tsunoda T. Advances in AI and machine learning for predictive medicine. J Hum Genet. 2024;69:487-497.

5.	Al Mudawi N, Alazeb A. A Model for Predicting Cervical Cancer Using Machine Learning Algorithms. Sensors (Basel). 2022;22:4132.

6.	Tardáguila-García A, García-Morales E, García-Alamino JM, Álvaro-Afonso FJ, Molines-Barroso RJ, Lázaro-Martínez JL. Metalloproteinases in chronic and acute wounds: A systematic review and meta-analysis. Wound Repair Regen. 2019;27:415-420.

7.	Weir CB, Jan A. BMI Classification Percentile and Cut Off Points. 2023 Jun 26. In: StatPearls [Internet]. Treasure Island (FL): StatPearls Publishing; 2025 Jan-.

8.	Šverko Z, Vrankić M, Vlahinić S, Rogelj P. Complex Pearson Correlation Coefficient for EEG Connectivity Analysis. Sensors (Basel). 2022;22:1477.

9.	Choi RY, Coyner AS, Kalpathy-Cramer J, Chiang MF, Campbell JP. Introduction to Machine Learning, Neural Networks, and Deep Learning. Transl Vis Sci Technol. 2020;9:14.

10.	Raschka S, Kaufman B. Machine learning and AI-based approaches for bioactive ligand discovery and GPCR-ligand recognition. Methods. 2020;180:89-110.

11.	Giarnieri E, Carico E, Scarpino S, Ricci A, Bruno P, Scardapane S, Giansanti D. Bringing AI to Clinicians: Simplifying Pleural Effusion Cytology Diagnosis with User-Friendly Models. Diagnostics (Basel). 2025;15:1240.

12.	Chen X, Pan Y, Tang T, Fu J, Chen X, Bao C. Machine learning-guided one-step fabrication of targeted emodin liposomes via novel micromixer for ulcerative colitis therapy. Nano Res. 2025.

13.	Wang S, Xu L, Wang Q, Li J, Bai B, Li Z, Wu X, Yu P, Li X, Yin J. Postoperative complications and prognosis after radical gastrectomy for gastric cancer: a systematic review and meta-analysis of observational studies. World J Surg Oncol. 2019;17:52.

14.	Vu R, Jin S, Sun P, Haensel D, Nguyen QH, Dragan M, Kessenbrock K, Nie Q, Dai X. Wound healing in aged skin exhibits systems-level alterations in cellular composition and cell-cell communication. Cell Rep. 2022;40:111155.

15.	Davan-Wetton CSA, Pessolano E, Perretti M, Montero-Melendez T. Senescence under appraisal: hopes and challenges revisited. Cell Mol Life Sci. 2021;78:3333-3354.

16.	Ahn S, Chantre CO, Ardoña HAM, Gonzalez GM, Campbell PH, Parker KK. Biomimetic and estrogenic fibers promote tissue repair in mice and human skin via estrogen receptor β. Biomaterials. 2020;255:120149.

17.	Hajibandeh S, Hajibandeh S, Raza SS, Bartlett D, Dasari BVM, Sutcliffe RP. Abdominal drainage is contraindicated after uncomplicated hepatectomy: Results of a meta-analysis of randomized controlled trials. Surgery. 2023;173:401-411.

18.	Patel PP, Weller JH, Westermann CR, Cappiello C, Garcia AV, Rhee DS. Appendectomy and Cholecystectomy Outcomes for Pediatric Cancer Patients with Leukopenia: A NSQIP-Pediatric Study. J Surg Res. 2021;267:556-562.

19.	Shi X, Lin L, Sun J. The Value of Continuous Closed Negative Pressure Drainage Combined with Antibacterial Biofilm Dressing in Postoperative Wound Healing for Severe Pancreatitis. Altern Ther Health Med. 2023;29:375-379.

20.	Heuer A, Stiel C, Elrod J, Königs I, Vincent D, Schlegel P, Trochimiuk M, Appl B, Reinshagen K, Raluy LP, Boettcher M. Therapeutic Targeting of Neutrophil Extracellular Traps Improves Primary and Secondary Intention Wound Healing in Mice. Front Immunol. 2021;12:614347.

21.	Yan Y, Wang Y, Lei Y. Micro Learning Support Vector Machine for Pattern Classification: A High-Speed Algorithm. Comput Intell Neurosci. 2022;2022:4707637.

22.	Ghavamipour AR, Turkmen F, Jiang X. Privacy-preserving logistic regression with secret sharing. BMC Med Inform Decis Mak. 2022;22:89.

23.	Becker T, Rousseau AJ, Geubbelmans M, Burzykowski T, Valkenborg D. Decision trees and random forests. Am J Orthod Dentofacial Orthop. 2023;164:894-897.