Copyright
©The Author(s) 2025.
World J Gastroenterol. Nov 14, 2025; 31(42): 112180
Published online Nov 14, 2025. doi: 10.3748/wjg.v31.i42.112180
Table 1 Baseline characteristics of colorectal cancer patients, n (%)
| Characteristic | Overall (n = 765) | Training set (n = 612) | Validation set (n = 153) | P value |
| Age, median (95%CI) | 62.26 (61.58-62.96) | 62.12 (61.36-62.88) | 62.84 (61.16-64.51) | 0.422 |
| BSA, median (95%CI) | 1.71 (1.69-1.72) | 1.71 (1.69-1.72) | 1.71 (1.68-1.74) | 0.996 |
| BMI, median (95%CI) | 22.94 (22.70-23.17) | 22.99 (22.72-23.26) | 22.75 (22.23-23.27) | 0.428 |
| ALB, median (95%CI) | 37.53 (37.27-37.80) | 37.48 (37.19-37.50) | 37.74 (37.10-38.37) | 0.453 |
| CEA, median (95%CI) | 182.30 (125.72-246.75) | 178.10 (115.79-240.41) | 199.10 (53.75-344.45) | 0.770 |
| CA19-9, median (95%CI) | 169.03 (113.66-237.21) | 177.95 (108.12-247.78) | 133.33 (0.11-266.55) | 0.571 |
| CA125, median (95%CI) | 22.20 (19.20-25.77) | 21.93 (18.30-25.56) | 23.30 (13.48-33.12) | 0.804 |
| Gender | | | | 0.607 |
| Male | 449 (58.7) | 362 (59.15) | 87 (56.86) | |
| Female | 316 (41.3) | 250 (40.85) | 66 (43.14) | |
| Smoking | | | | 0.572 |
| Yes | 275 (35.9) | 223 (36.44) | 52 (33.99) | |
| No | 490 (64.1) | 389 (63.56) | 101 (66.01) | |
| Diabetes | | | | 0.529 |
| Yes | 128 (16.7) | 105 (17.16) | 23 (15.03) | |
| No | 637 (83.2) | 507 (82.84) | 130 (84.97) | |
| Hypertension | | | | 0.417 |
| Yes | 307 (40.1) | 250 (40.85) | 57 (37.25) | |
| No | 458 (59.8) | 362 (59.15) | 96 (62.75) | |
| T | | | | 0.491 |
| 1 | 18 (2.4) | 15 (2.45) | 3 (1.96) | |
| 2 | 84 (11.0) | 67 (10.95) | 17 (11.11) | |
| 3 | 399 (52.2) | 323 (52.78) | 76 (49.67) | |
| 4 | 264 (34.5) | 207 (33.82) | 57 (37.25) | |
| N | | | | 0.255 |
| 0 | 162 (21.2) | 126 (20.59) | 36 (23.53) | |
| 1 | 329 (43.0) | 261 (42.65) | 68 (44.44) | |
| 2 | 270 (35.3) | 222 (36.27) | 48 (31.37) | |
| 3 | 4 (0.5) | 3 (0.49) | 1 (0.65) | |
| M | | | | 0.103 |
| 0 | 355 (46.4) | 293 (47.88) | 62 (40.52) | |
| 1 | 410 (53.5) | 319 (52.12) | 91 (59.48) | |
| Staging | | | | 0.180 |
| I | 15 (2.0) | 11 (1.8) | 4 (2.61) | |
| II | 66 (8.6) | 56 (9.15) | 10 (6.54) | |
| III | 267 (34.9) | 219 (35.78) | 48 (31.37) | |
| IV | 417 (54.5) | 326 (53.27) | 91 (59.48) | |
| Position | | | | 0.687 |
| Ascending colon | 148 (19.3) | 118 (19.28) | 30 (19.61) | |
| Transverse colon | 21 (2.7) | 15 (2.45) | 6 (3.92) | |
| Descending colon | 27 (3.5) | 24 (3.92) | 3 (1.96) | |
| Sigmoid colon | 209 (27.3) | 164 (26.8) | 45 (29.41) | |
| Rectum | 360 (47.1) | 291 (47.55) | 69 (45.1) | |
| Hepatic metastasis | | | | 0.195 |
| Yes | 261 (34.1) | 202 (33.01) | 59 (38.56) | |
| No | 504 (65.8) | 410 (66.99) | 94 (61.44) | |
| Lung metastasis | | | | 0.064 |
| Yes | 214 (27.9) | 162 (26.47) | 52 (33.99) | |
| No | 551 (72.0) | 450 (73.53) | 101 (66.01) | |
| Peritoneum metastasis | | | | 0.201 |
| Yes | 47 (6.1) | 41 (6.7) | 6 (3.92) | |
| No | 718 (93.8) | 571 (93.3) | 147 (96.08) | |
| KRAS | | | | 0.961 |
| Yes | 124 (16.2) | 99 (16.18) | 25 (16.34) | |
| No | 641 (83.7) | 513 (83.82) | 128 (83.66) | |
| BRAF | | | | 0.051 |
| Yes | 15 (1.9) | 15 (2.45) | 0 (0) | |
| No | 750 (98) | 597 (97.55) | 153 (100) | |
| TP53 | | | | 0.286 |
| Yes | 61 (7.9) | 52 (8.5) | 9 (5.88) | |
| No | 704 (92) | 560 (91.5) | 144 (94.12) | |
| Myelosuppression | | | | 0.336 |
| Yes | 250 (32.6) | 195 (31.86) | 55 (35.95) | |
| No | 515 (67.3) | 417 (68.14) | 98 (64.05) | |
| Chemotherapy cycles | | | | 0.572 |
| 1-2 cycles | 315 (41.2) | 258 (42.16) | 57 (37.25) | |
| 3-4 cycles | 237 (31.0) | 182 (29.74) | 55 (35.95) | |
| 5 or more cycles | 213 (27.8) | 172 (28.1) | 41 (26.8) | |
| Chemotherapy regimens | | | | 0.248 |
| CapeOx | 571 (74.6) | 463 (75.65) | 108 (70.59) | |
| FOLFOX | 85 (11.1) | 63 (10.29) | 22 (14.38) | |
| FOLFIRI | 109 (14.2) | 86 (14.05) | 23 (15.03) |
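The P values in Table 1 compare the training and validation sets. The table does not say which test was used for the categorical rows, so the following is only a sketch under the assumption that it was a Pearson chi-square test without continuity correction. The function name `chi_square_2x2` is hypothetical and introduced here for illustration; the counts are taken from the Gender row.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Assumption: this is the test behind the categorical P values in Table 1;
    the source does not state it.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Gender row: male/female counts in the training set (362, 250)
# and in the validation set (87, 66).
chi2, p = chi_square_2x2(362, 250, 87, 66)
print(f"chi2 = {chi2:.3f}, P = {p:.3f}")
```

Under that assumption the computed P value for the Gender row rounds to the 0.607 reported in the table. Rows with more than two categories would need a general r x c chi-square test, which this sketch does not cover.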
Table 2 Performance of 10 machine learning models for predicting myelosuppression after first-line chemotherapy for colorectal cancer
| Model | AUC (95%CI) | AUPRC (95%CI) | Accuracy | Sensitivity | Specificity | F1 score | PPV | NPV |
| Training set | | | | | | | | |
| Adaptive boosting | 0.88 (0.85-0.91) | 0.79 (0.73-0.84) | 0.79 | 0.49 | 0.94 | 0.61 | 0.81 | 0.78 |
| Artificial neural network | 0.96 (0.95-0.97) | 0.94 (0.91-0.96) | 0.73 | 0.53 | 0.83 | 0.57 | 0.61 | 0.78 |
| Decision tree | 1.00 (1.00-1.00) | 1.00 (1.00-1.00) | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| Extra trees | 1.00 (1.00-1.00) | 1.00 (1.00-1.00) | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| Gradient boosting machine | 0.99 (0.99-1.00) | 0.99 (0.99-1.00) | 0.98 | 0.93 | 0.99 | 0.96 | 0.99 | 0.97 |
| K-Nearest neighbors | 0.91 (0.89-0.93) | 0.79 (0.75-0.84) | 0.90 | 0.85 | 0.92 | 0.85 | 0.85 | 0.92 |
| Logistic regression | 0.75 (0.71-0.79) | 0.57 (0.50-0.65) | 0.69 | 0.25 | 0.92 | 0.35 | 0.61 | 0.70 |
| Random forest | 1.00 (0.99-1.00) | 1.00 (0.99-1.00) | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| Support vector machine | 0.87 (0.83-0.90) | 0.79 (0.73-0.84) | 0.68 | 0.05 | 1.00 | 0.10 | 1.00 | 0.67 |
| Extreme gradient boosting | 1.00 (0.99-1.00) | 1.00 (0.99-1.00) | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| Validation set | | | | | | | | |
| Adaptive boosting | 0.83 (0.76-0.89) | 0.72 (0.60-0.85) | 0.78 | 0.35 | 0.95 | 0.47 | 0.71 | 0.79 |
| Artificial neural network | 0.69 (0.60-0.78) | 0.61 (0.48-0.74) | 0.65 | 0.42 | 0.75 | 0.40 | 0.39 | 0.77 |
| Decision tree | 0.70 (0.62-0.78) | 0.52 (0.41-0.64) | 0.82 | 0.81 | 0.83 | 0.72 | 0.65 | 0.92 |
| Extra trees | 0.94 (0.89-0.97) | 0.90 (0.82-0.96) | 0.87 | 0.72 | 0.93 | 0.76 | 0.79 | 0.89 |
| Gradient boosting machine | 0.92 (0.86-0.97) | 0.90 (0.84-0.95) | 0.83 | 0.67 | 0.89 | 0.69 | 0.71 | 0.88 |
| K-Nearest neighbors | 0.75 (0.67-0.83) | 0.62 (0.48-0.74) | 0.80 | 0.70 | 0.85 | 0.67 | 0.64 | 0.88 |
| Logistic regression | 0.67 (0.59-0.76) | 0.49 (0.38-0.65) | 0.68 | 0.20 | 0.86 | 0.27 | 0.38 | 0.74 |
| Random forest | 0.96 (0.93-0.98) | 0.93 (0.87-0.97) | 0.88 | 0.72 | 0.95 | 0.78 | 0.84 | 0.90 |
| Support vector machine | 0.75 (0.67-0.83) | 0.65 (0.52-0.77) | 0.75 | 0.12 | 1.00 | 0.21 | 1.00 | 0.74 |
| Extreme gradient boosting | 0.97 (0.94-0.99) | 0.92 (0.83-0.99) | 0.88 | 0.79 | 0.92 | 0.79 | 0.79 | 0.92 |
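The F1 score column in Table 2 can be related to the PPV and Sensitivity columns, since F1 is by definition the harmonic mean of precision (PPV) and recall (sensitivity). The snippet below is a minimal sketch of that relationship using three validation-set rows. The helper name `f1_score` is chosen here for illustration and is not taken from the source. Because the table reports values rounded to two decimals, recomputed scores can differ from the reported ones in the last digit for some rows.

```python
def f1_score(ppv, sensitivity):
    """Harmonic mean of PPV (precision) and sensitivity (recall)."""
    if ppv + sensitivity == 0:
        return 0.0
    return 2 * ppv * sensitivity / (ppv + sensitivity)

# Validation-set rows of Table 2: (PPV, sensitivity, reported F1 score).
rows = {
    "Adaptive boosting": (0.71, 0.35, 0.47),
    "Random forest": (0.84, 0.72, 0.78),
    "Extreme gradient boosting": (0.79, 0.79, 0.79),
}

for name, (ppv, sens, reported) in rows.items():
    computed = f1_score(ppv, sens)
    print(f"{name}: computed {computed:.2f}, reported {reported:.2f}")
```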
Table 3 Logistic regression based on 10 machine learning models for predicting myelosuppression
| Model | Univariate OR (95%CI) | Univariate P value | Multivariate OR (95%CI) | Multivariate P value |
| Adaptive boosting | 49.558 (15.807-155.376) | < 0.001 | 0.083 (0.003-2.486) | 0.151 |
| Artificial neural network | 54.444 (13.032-227.465) | < 0.001 | 1.572 (0.043-57.273) | 0.805 |
| Decision tree | 10.587 (5.084-22.047) | < 0.001 | 0.563 (0.102-3.109) | 0.510 |
| Extra trees | 390.471 (75.734-2013.192) | < 0.001 | 31.948 (2.468-413.586) | 0.008 |
| Gradient boosting machine | 104.831 (32.579-337.322) | < 0.001 | 2.169 (0.176-26.731) | 0.546 |
| K-Nearest neighbors | 24.992 (9.652-64.711) | < 0.001 | 3.081 (0.419-22.650) | 0.269 |
| Logistic regression | 279.116 (24.997-3116.614) | < 0.001 | 0.225 (0.000-129.448) | 0.646 |
| Random forest | 404.139 (88.973-1835.710) | < 0.001 | 94.621 (1.178-7597.788) | 0.042 |
| Support vector machine | 4.180 (0.332-52.698) | 0.269 | 23.780 (0.434-1303.403) | 0.121 |
| Extreme gradient boosting | 131.875 (39.641-438.709) | < 0.001 | 2.669 (0.170-41.930) | 0.485 |
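The odds ratios, confidence intervals, and P values in Table 3 are linked: if the intervals are Wald intervals on the log-odds scale (an assumption, since the table does not state how they were computed), the standard error and the two-sided P value can be recovered from the OR and its 95%CI alone. The sketch below illustrates this with the multivariate Extra trees and Random forest rows. The function name `wald_p_from_or_ci` and the critical value 1.96 are illustrative choices, not taken from the source.

```python
import math

def wald_p_from_or_ci(odds_ratio, lower, upper, z_crit=1.96):
    """Recover a two-sided Wald P value from an odds ratio and its 95% CI.

    Assumption: the interval is exp(log(OR) +/- z_crit * SE).
    """
    log_or = math.log(odds_ratio)
    # The CI width on the log scale equals 2 * z_crit * SE.
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = log_or / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))

# Multivariate rows of Table 3: OR (95%CI).
p_extra_trees = wald_p_from_or_ci(31.948, 2.468, 413.586)
p_random_forest = wald_p_from_or_ci(94.621, 1.178, 7597.788)
print(f"Extra trees:   P = {p_extra_trees:.3f}")
print(f"Random forest: P = {p_random_forest:.3f}")
```

Under this assumption the recovered values round to the 0.008 and 0.042 reported for these two models. The approach breaks down when a bound is printed as 0.000, as in the multivariate Logistic regression row, because its logarithm is undefined.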
Table 4 Performance comparison of extreme gradient boosting, clinic nomogram, and clinic-machine learning nomogram
| Model | AUC (95%CI) | AUPRC (95%CI) | Accuracy | Sensitivity | Specificity | F1 score | PPV | NPV |
| Training set | | | | | | | | |
| Extreme gradient boosting | 1.00 (1.00-1.00) | 1.00 (0.99-1.00) | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| Clinic nomogram | 0.68 (0.63-0.72) | 0.49 (0.43-0.56) | 0.69 | 0.25 | 0.92 | 0.69 | 0.61 | 0.71 |
| Clinic-machine learning nomogram | 0.99 (0.99-1.00) | 0.99 (0.99-1.00) | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| Validation set | | | | | | | | |
| Extreme gradient boosting | 0.97 (0.94-0.99) | 0.92 (0.83-0.99) | 0.88 | 0.79 | 0.92 | 0.79 | 0.79 | 0.92 |
| Clinic nomogram | 0.61 (0.52-0.71) | 0.48 (0.36-0.62) | 0.68 | 0.21 | 0.86 | 0.27 | 0.38 | 0.74 |
| Clinic-machine learning nomogram | 0.96 (0.92-0.98) | 0.93 (0.87-0.98) | 0.90 | 0.77 | 0.95 | 0.80 | 0.85 | 0.91 |
Table 5 Performance of clinic-machine learning nomogram in the testing set
| Model | AUC (95%CI) | AUPRC (95%CI) | Accuracy | Sensitivity | Specificity | F1 score | PPV | NPV |
| Clinic-machine learning nomogram | 0.95 (0.93-0.99) | 0.83 (0.71-0.92) | 0.81 | 0.52 | 0.97 | 0.65 | 0.89 | 0.79 |
- Citation: Liu YM, Du YY, Song Y, Xiong HT, Yu HB, Li BH, Cai L, Ma SS, Gao J, Zhang HY, Fang RY, Cai R, Zheng HG. Predicting chemotherapy-induced myelosuppression in colorectal cancer: An interpretable, machine learning-based nomogram. World J Gastroenterol 2025; 31(42): 112180
- URL: https://www.wjgnet.com/1007-9327/full/v31/i42/112180.htm
- DOI: https://dx.doi.org/10.3748/wjg.v31.i42.112180
