Liu W, Wu HY, Lin JX, Qu ST, Gu YJ, Zhu JZ, Xu CF. Combining lymph node ratio to develop prognostic models for postoperative gastric neuroendocrine neoplasm patients. World J Gastrointest Oncol 2024; 16(8): 3507-3520 [PMID: 39171165 DOI: 10.4251/wjgo.v16.i8.3507]
Corresponding Author of This Article
Chun-Fang Xu, PhD, Chief Physician, Professor, Department of Gastroenterology, The First Affiliated Hospital of Soochow University, No. 188 Shizi Street, Suzhou 215006, Jiangsu Province, China. xuchunfang@suda.edu.cn
Research Domain of This Article
Gastroenterology & Hepatology
Article-Type of This Article
Retrospective Study
Open-Access Policy of This Article
This article is an open-access article which was selected by an in-house editor and fully peer-reviewed by external reviewers. It is distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/
Wen Liu, Department of Gastroenterology, Changzhou Hospital of Traditional Chinese Medicine, Changzhou 213000, Jiangsu Province, China
Hong-Yu Wu, Jia-Xi Lin, Shu-Ting Qu, Jin-Zhou Zhu, Chun-Fang Xu, Department of Gastroenterology, The First Affiliated Hospital of Soochow University, Suzhou 215006, Jiangsu Province, China
Yi-Jie Gu, Department of Gastroenterology, Suzhou Ninth Hospital Affiliated to Soochow University, Suzhou 215200, Jiangsu Province, China
Author contributions: Liu W and Wu HY wrote the first draft of the manuscript; Qu ST and Gu YJ contributed to clinical data collection; Liu W, Lin JX, and Zhu JZ contributed to data analysis and results interpretation; Zhu JZ and Xu CF revised the manuscript; and all authors had checked and approved the final manuscript.
Supported by the Science and Technology Plan of Suzhou City, No. SKY2021038.
Institutional review board statement: This study was approved by the Ethics Committee of the First Affiliated Hospital of Soochow University (Number: 2024-145).
Informed consent statement: Patients were not required to give informed consent to the study because the Ethics Committee approved an exemption from informed consent. This study had no direct contact with the subjects and only collected clinical baseline data from outpatient and inpatient medical records. The study results will remove any information that identifies the subjects to ensure that personal privacy is not disclosed. Therefore, there is no risk to the subjects.
Conflict-of-interest statement: All the authors report no relevant conflicts of interest for this article.
Data sharing statement: Data will be made available on reasonable request.
Received: March 28, 2024 Revised: May 14, 2024 Accepted: June 12, 2024 Published online: August 15, 2024 Processing time: 132 Days and 16 Hours
Abstract
BACKGROUND
Lymph node ratio (LNR) was demonstrated to play a crucial role in the prognosis of many tumors. However, research concerning the prognostic value of LNR in postoperative gastric neuroendocrine neoplasm (NEN) patients was limited.
AIM
To explore the prognostic value of LNR in postoperative gastric NEN patients and to combine LNR to develop prognostic models.
METHODS
A total of 286 patients from the Surveillance, Epidemiology, and End Results database were divided into the training set and validation set at a ratio of 8:2. 92 patients from the First Affiliated Hospital of Soochow University in China were designated as a test set. Cox regression analysis was used to explore the relationship between LNR and disease-specific survival (DSS) of gastric NEN patients. Random survival forest (RSF) algorithm and Cox proportional hazards (CoxPH) analysis were applied to develop models to predict DSS respectively, and compared with the 8th edition American Joint Committee on Cancer (AJCC) tumor-node-metastasis (TNM) staging.
RESULTS
Multivariate analyses indicated that LNR was an independent prognostic factor for postoperative gastric NEN patients and a higher LNR was accompanied by a higher risk of death. The RSF model exhibited the best performance in predicting DSS, with the C-index in the test set being 0.769 [95% confidence interval (CI): 0.691-0.846] outperforming the CoxPH model (0.744, 95%CI: 0.665-0.822) and the 8th edition AJCC TNM staging (0.723, 95%CI: 0.613-0.833). The calibration curves and decision curve analysis (DCA) demonstrated the RSF model had good calibration and clinical benefits. Furthermore, the RSF model could perform risk stratification and individual prognosis prediction effectively.
CONCLUSION
A higher LNR indicated a lower DSS in postoperative gastric NEN patients. The RSF model outperformed the CoxPH model and the 8th edition AJCC TNM staging in the test set, showing potential in clinical practice.
Core Tip: The prognostic value of lymph node ratio (LNR) in postoperative gastric neuroendocrine neoplasm (NEN) patients was explored in this research. A higher LNR indicated a lower disease-specific survival (DSS) in postoperative gastric NEN patients. In addition, we combined LNR to develop prognostic models to predict DSS for postoperative gastric NEN patients, using the random survival forest (RSF) algorithm and Cox proportional hazards (CoxPH) analysis. The RSF model outperformed the CoxPH model and the 8th edition American Joint Committee on Cancer tumor-node-metastasis staging in the test set. Also, the RSF model demonstrated value in risk stratification and individual prognosis prediction.
Neuroendocrine neoplasms (NENs) are rare tumors originating from secretory cells of the diffuse endocrine system, which typically produce bioactive amines or peptide hormones[1]. A majority of NENs occur in the gastroenteropancreatic system, and gastric NENs account for 6.9% to 23% of all gastroenteropancreatic NENs[2-5]. In recent years, with the popularization of physical examination and the improvement of detection methods (computed tomography and endoscopy), the detection rate of gastric NENs has increased 7-fold to 10-fold[6]. Based on the 2019 revision of the World Health Organization (WHO) classification criteria for digestive system tumors, gastric NENs are divided into three types: Well-differentiated neuroendocrine tumor (NET), poorly-differentiated neuroendocrine carcinoma (NEC), and mixed neuroendocrine-non-NEN. Outcomes differ significantly between gastric NET and gastric NEC. The latter shows more aggressive behavior and shorter survival, and is prone to relapse after surgery[7-9]. Based on the 5th edition Japanese gastric cancer treatment guidelines[10], a standard surgical procedure combined with adjuvant chemotherapy is the common treatment strategy for gastric NEC. With the increasing detection rate of gastric NENs, more researchers have begun to explore the risk factors that affect the prognosis of gastric NENs, including tumor classification, tumor-node-metastasis (TNM) stage, and tumor size.
Lymph node ratio (LNR) has been proven to be an important prognostic factor in many tumors[11-13]. LNR is defined as the ratio of the number of positive lymph nodes to the number of examined lymph nodes. Because it incorporates the number of lymph nodes examined during the operation, LNR can separate the prognosis of patients who have the same number of involved lymph nodes. Many studies have demonstrated that the prognostic value of LNR is superior to that of the absolute number of involved lymph nodes[14,15]. Because gastric NENs are an extremely rare type of gastric neoplasm, research on the prognostic value of LNR in these patients is limited.
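The definition above can be illustrated with a minimal Python sketch. The function name and the two example patients are hypothetical and are not taken from the study data; the point is only that two patients with the same number of positive nodes can have very different LNR values.

```python
def lymph_node_ratio(positive_nodes: int, examined_nodes: int) -> float:
    """Return LNR = positive lymph nodes / examined lymph nodes."""
    if examined_nodes <= 0:
        raise ValueError("at least one lymph node must be examined")
    if not 0 <= positive_nodes <= examined_nodes:
        raise ValueError("positive nodes must lie between 0 and examined nodes")
    return positive_nodes / examined_nodes


# Two hypothetical patients with the same number of positive nodes (4)
# but different numbers of examined nodes.
patient_a = lymph_node_ratio(4, 6)    # few nodes examined
patient_b = lymph_node_ratio(4, 24)   # many nodes examined
print(round(patient_a, 3), round(patient_b, 3))  # 0.667 0.167
```

A staging rule based only on the positive-node count would treat these two patients identically, whereas the ratio does not.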
Artificial intelligence (AI) has undergone rapid development over the past decade, and its application in medicine is currently a hot topic. Machine learning (ML), a subfield of AI, has strengths in handling larger sample sizes and high-dimensional data compared with traditional statistical methods[16]. Random forest (RF), one of the ML algorithms, has shown good performance in oncology because it is suitable for the medium-sized datasets common in clinical practice[17]. Random survival forest (RSF), a combination of RF and traditional survival analysis, was first proposed in 2008[18]. RSF can be applied to analyze right-censored survival data. Numerous studies have demonstrated that the RSF model can predict cancer prognosis effectively and outperform the traditional Cox proportional hazards (CoxPH) model, for example in pancreatic cancer, breast cancer, and cervical adenocarcinoma[19-21]. Our research aims to explore the relationship between LNR and disease-specific survival (DSS) in postoperative gastric NEN patients and to establish and validate prognostic models using the RSF algorithm and CoxPH analysis.
MATERIALS AND METHODS
Study population
We obtained patients’ data from the Surveillance, Epidemiology, and End Results (SEER) database between 2000 and 2019. Cases were selected based on the primary site code (C16.0-C16.9, stomach) and the International Classification of Diseases for Oncology, third edition (ICD-O-3) histology codes (8013, large cell NEC; 8153, gastrinoma; 8240, carcinoid tumor; 8241, enterochromaffin cell carcinoid; 8242, enterochromaffin-like cell tumor; 8244, mixed adenoneuroendocrine carcinoma (MANEC); 8245, adenocarcinoid tumor; 8246, NEC; 8249, atypical carcinoid tumor; 8156, somatostatinoma). Exclusion criteria included: (1) Cases without histopathological evidence; (2) Cases with a history of other malignancies; (3) Cases without detailed clinical data: Lack of differentiation grade, size, or TNM stage; (4) Cases without information on survival months or with death within one month; (5) Cases in which no surgery or only local surgery was performed; and (6) Cases without information on the number of examined lymph nodes and positive lymph nodes. Finally, 286 cases with gastric NENs were included. Moreover, 92 gastric NEN patients from the First Affiliated Hospital of Soochow University were enrolled in the study as the test set. These patients were diagnosed by histopathology from 2011 to 2020 and underwent the same screening process as above. We excluded patients who died within one month because these patients died due to postoperative complications rather than recurrent disease[22]. The detailed screening process is shown in Figure 1. The ethics committee of the First Affiliated Hospital of Soochow University approved this retrospective study (Number: 2024-145). We conducted postoperative follow-ups for our hospital patients, including outpatient reviews, inpatient medical reviews, and telephone interviews. The last follow-up time was December 2022. DSS was calculated from the date of surgery to the date of last contact or death caused by the primary tumor.
Figure 1 The flow diagram of the patients’ selection.
GNEN: Gastric neuroendocrine neoplasms; SEER: Surveillance, Epidemiology, and End Results; TNM: Tumor-node-metastasis; NEN: Neuroendocrine neoplasm.
Variables definition and selection
The following information was identified from the SEER dataset: Sex, age at diagnosis, race, marital status, primary site, differentiation grade, tumor size, American Joint Committee on Cancer (AJCC) T stage, AJCC N stage, AJCC M stage, the number of regional nodes positive, the number of regional nodes examined, surgery for primary site, radiotherapy recode, chemotherapy recode, cause-specific death classification, and survival months. LNR was defined as the ratio of positive lymph nodes to the total number of examined lymph nodes. We converted LNR from a continuous variable into a categorical variable with the X-tile software. We classified separated, divorced, unmarried, domestic partner, single (never married), and widowed patients into the unmarried group[23]. Based on the latest 2019 5th WHO classification for tumors of the digestive system, the gastroenteropancreatic NENs were divided into three types: NET, NEC, and MANEC. NETs are defined as well differentiated tumors and NECs as poorly differentiated tumors. In this research, we defined well and moderately differentiated gastric NENs as ‘NET’ and poorly differentiated and undifferentiated gastric NENs as ‘NEC’[24]. Considering the partial overlap between the N stage and LNR, we did not include the N stage in the univariate analysis or in the construction of the RSF model. Moreover, we restaged the TNM stage in the SEER database based on the latest 8th edition of the AJCC staging system. Based on the SEER Program Coding and Staging Manual 2021, codes 20 to 27 in the ‘surgery primary site’ column were defined as local tumor excision, and codes 30 to 80 were defined as open surgery[25]. We only included patients undergoing open surgery, to ensure the accuracy of the regional nodes examined.
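The recoding rules in this subsection can be sketched as a few small Python helpers. This is an illustration under stated assumptions, not the authors' code: the function names and the grade label strings are hypothetical, and the default LNR cutoff of 0.20 is the X-tile value reported in the Results. MANEC is identified by histology code (8244) rather than by grade, so it is not handled by the grade mapping.

```python
def lnr_group(lnr: float, cutoff: float = 0.20) -> str:
    """Map a continuous LNR to one of three groups around the cutoff."""
    if not 0.0 <= lnr <= 1.0:
        raise ValueError("LNR must lie in [0, 1]")
    if lnr == 0:
        return "LNR = 0"
    if lnr <= cutoff:
        return f"0 < LNR <= {cutoff:g}"
    return f"LNR > {cutoff:g}"


def histology_group(grade: str) -> str:
    """Collapse differentiation grade into NET or NEC (labels are assumed)."""
    grade = grade.strip().lower()
    if grade in {"well differentiated", "moderately differentiated"}:
        return "NET"
    if grade in {"poorly differentiated", "undifferentiated"}:
        return "NEC"
    raise ValueError(f"unrecognized grade: {grade!r}")


def surgery_type(code: int) -> str:
    """Classify a SEER 'surgery primary site' code as described in the text."""
    if 20 <= code <= 27:
        return "local excision"
    if 30 <= code <= 80:
        return "open surgery"
    return "other"


print(lnr_group(0.167), "|", histology_group("Poorly differentiated"), "|", surgery_type(30))
```

Only records for which `surgery_type` returns "open surgery" would pass the inclusion rule described above.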
The construction of prognostic models
We randomly split patients from the SEER database into the training set and validation set at a ratio of 8:2. Patients from our hospital were designated as a test set. The training set was applied to develop models. The validation set was used for tuning hyperparameters of the RSF model. The test set was applied to confirm the optimal predictive model.
For the CoxPH model, we first conducted univariable and multivariable regression analyses to select independent risk factors. The selected independent risk factors were then applied to develop the CoxPH model and presented in the format of a nomogram. The criteria for variable inclusion in univariable and multivariate analysis were both P values less than 0.05. The scaled Schoenfeld residual test was used to check the PH assumption.
For the RSF model, we directly admitted all variables. Optuna was applied to determine the hyperparameters of the RSF model. Optuna presented advantages in adopting an effective sampling and pruning strategy to construct hyperparameter search space and could achieve good performance under limited resources[26]. For the RSF model, we explored the hyperparameters’ combination of the number of estimators (from 100 to 1000, range 10), the minimum of samples split (from 1 to 29, range 2), and the minimum of samples leaf (from 1 to 29, range 2).
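To give a sense of the size of this search space, the following Python sketch enumerates it with the standard library only. It assumes that "range 10" and "range 2" in the text denote step sizes; the dictionary keys are illustrative names, and the sketch does not call Optuna or fit any model.

```python
# Assumption: "range 10" / "range 2" in the text are read as step sizes.
SEARCH_SPACE = {
    "n_estimators": range(100, 1001, 10),   # 100, 110, ..., 1000
    "min_samples_split": range(1, 30, 2),   # 1, 3, ..., 29
    "min_samples_leaf": range(1, 30, 2),    # 1, 3, ..., 29
}


def grid_size(space) -> int:
    """Number of distinct hyperparameter combinations in the space."""
    size = 1
    for values in space.values():
        size *= len(values)
    return size


def in_space(space, **params) -> bool:
    """Check whether a hyperparameter combination lies on the grid."""
    return all(params[name] in space[name] for name in space)


print(grid_size(SEARCH_SPACE))  # 20475
# The combination reported in the Results (330, 5, 1) lies on this grid.
print(in_space(SEARCH_SPACE, n_estimators=330,
               min_samples_split=5, min_samples_leaf=1))  # True
```

Under this reading, an exhaustive grid search would require 20475 model fits, which is why a sampling-based tuner that evaluates only a subset of combinations is attractive when resources are limited.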
The assessment and explanation of the developed models
We evaluated the performances of the CoxPH model and the RSF model in the test set. The C-index and areas under the receiver operating characteristic curve (AUCs) of 1-, 3-, and 5-year were used to evaluate the discrimination ability of models. The calibration curves were used to evaluate the model’s calibration. The DCA was adopted to calculate the clinical net benefit of the models.
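For readers unfamiliar with the C-index, a minimal pure-Python sketch of Harrell's concordance index for right-censored data is given below. It is a simplified illustration (tied survival times are ignored) and not the implementation used in the study; the toy data are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data.

    A pair (i, j) is comparable when patient i has the shorter time and
    experienced the event. The pair is concordant when patient i also has
    the higher predicted risk; ties in risk count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical follow-up times (months), event flags (1 = death), risk scores.
times = [5, 10, 12, 20, 30]
events = [1, 0, 1, 1, 0]
risk = [0.9, 0.4, 0.7, 0.8, 0.1]
c_index = concordance_index(times, events, risk)
print(round(c_index, 3))  # 0.857
```

Here six of the seven comparable pairs are ranked correctly; a value of 0.5 corresponds to random ranking and 1.0 to perfect ranking.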
The SHapley Additive exPlanations (SHAP) plot, which adopts a game theoretic approach, was applied to explain the RSF model. SHAP can either explain the impact of each variable on the model output as a whole or locally analyze how each feature affects the outcome of a single patient.
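The game-theoretic idea behind SHAP can be sketched with an exact Shapley value computation on a toy model. This is only an illustration: the risk function, feature names, and baseline below are hypothetical and are not the RSF model, and practical SHAP implementations rely on approximations or tree-specific algorithms rather than enumerating every feature ordering.

```python
from itertools import permutations


def shapley_values(predict, instance, baseline):
    """Exact Shapley values: average each feature's marginal contribution
    over all feature orderings (feasible only for a handful of features)."""
    names = list(instance)
    contrib = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)            # start from the reference patient
        previous = predict(current)
        for name in order:
            current[name] = instance[name]  # reveal one feature at a time
            value = predict(current)
            contrib[name] += value - previous
            previous = value
    return {name: total / len(orderings) for name, total in contrib.items()}


# Hypothetical risk model with an interaction term; not the paper's RSF.
def toy_risk(x):
    return 2.0 * x["lnr"] + 1.0 * x["m1"] + 3.0 * x["lnr"] * x["nec"]


baseline = {"lnr": 0.0, "m1": 0, "nec": 0}
patient = {"lnr": 0.5, "m1": 1, "nec": 1}
phi = shapley_values(toy_risk, patient, baseline)
print(phi)
```

The contributions sum to the difference between the patient's prediction and the baseline prediction, which is the property that lets a local plot decompose a single patient's output feature by feature.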
The risk stratification of patients
We calculated each patient’s risk score with the optimal model. Then, we stratified patients into a low-risk group, a medium-risk group, and a high-risk group with the X-tile tool. Kaplan-Meier survival analysis was applied to estimate survival rates in the different risk groups, which were compared with the log-rank test.
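The Kaplan-Meier estimate used for each risk group can be sketched in a few lines of pure Python. This is a minimal illustration on hypothetical data, not the R "survival" package routine used in the study, and it omits confidence intervals and the log-rank test.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    S(t) is the product over event times t_i <= t of (1 - d_i / n_i),
    where d_i is the number of events at t_i and n_i the number at risk.
    """
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        if deaths == 0:
            continue  # censoring alone does not change the estimate
        at_risk = sum(1 for ti in times if ti >= t)
        survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve


def survival_at(curve, t):
    """Read the step function: survival probability at time t."""
    prob = 1.0
    for time, s in curve:
        if time > t:
            break
        prob = s
    return prob


# Hypothetical follow-up (months) and event flags (1 = disease-specific death).
times = [6, 12, 12, 24, 36, 60]
events = [1, 1, 0, 1, 0, 0]
curve = kaplan_meier(times, events)
print([(t, round(s, 3)) for t, s in curve])
print(round(survival_at(curve, 36), 3))  # 0.444
```

Reading the curve at 12, 36, and 60 months gives the 1-, 3-, and 5-year survival rates for a group.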
The individual prediction
The individual prediction was composed of the survival probability curve and local SHAP plot. The survival probability curve could exhibit the survival probability at each time point. The local SHAP plot could present the contribution of each admitted variable to the outcome of the individual.
Statistical analysis
R software (version 4.1.0) and Python (Version 3.8.8) were applied to perform statistical analysis. We expressed the continuous variables in the format of mean ± SD. Categorical variables were reported as numbers and percentages. Kaplan-Meier survival analysis and log-rank test were used to explore the relationship between LNR and DSS of gastric NENs. Two-tailed P values less than 0.05 were considered statistically significant. The “survival” package of R software (Version 4.1.0) was used for Kaplan-Meier survival analysis. The “survminer” package of R software was used for plotting Kaplan-Meier survival curves and performing the log-rank test. The “rms” package of R software was used for CoxPH regression and calibration curves. The “dcurves” package of R software was used for plotting DCA curves. The “scikit-survival” module in Python (Version 3.8.8) was applied to establish the RSF model. The ‘optuna’ module in Python was used to tune hyperparameters.
RESULTS
Baseline characteristics of patients
According to the inclusion and exclusion criteria, 286 patients from the SEER database and 92 patients from the First Affiliated Hospital of Soochow University were enrolled in the research. We randomly split patients from the SEER database into the training set (n = 233) and validation set (n = 53) at a ratio of 8:2. Patients from our hospital were assigned to the test set. The detailed demographic and clinical information of these patients is summarized in Table 1. The median follow-up periods in the training, validation, and test sets were 51, 40, and 23 months, respectively.
Table 1 Demographic and clinical characteristics of patients in the Surveillance, Epidemiology, and End Results dataset and our hospital.
| Variables | Training set (n = 233) | Validation set (n = 53) | Test set (n = 92) |
| --- | --- | --- | --- |
| Age, mean (SD) | 60.7 (13.5) | 61.4 (13.2) | 65.62 (8.78) |
| Sex, n (%) | | | |
| Female | 120 (51.5) | 21 (39.6) | 17 (18.48) |
| Male | 113 (48.5) | 32 (60.4) | 75 (81.52) |
| Marital, n (%) | | | |
| Unmarried | 84 (36.1) | 16 (30.2) | 0 (0.00) |
| Married | 141 (60.5) | 37 (69.8) | 92 (100) |
| Unknown | 8 (3.43) | 0 (0.00) | 0 (0.00) |
| Race, n (%) | | | |
| White | 41 (17.6) | 7 (13.2) | 0 (0.00) |
| Black | 167 (71.7) | 41 (77.4) | 0 (0.00) |
| Asian and others | 25 (10.7) | 5 (9.43) | 92 (100) |
| Primary site, n (%) | | | |
| Upper 1/3 | 44 (18.9) | 13 (24.5) | 57 (61.96) |
| Middle 1/3 | 89 (38.2) | 21 (39.6) | 11 (11.96) |
| Lower 1/3 | 50 (21.5) | 12 (22.6) | 18 (19.57) |
| Overlapping | 50 (21.5) | 7 (13.2) | 6 (6.52) |
| Histology, n (%) | | | |
| NET | 150 (64.4) | 33 (62.3) | 6 (6.52) |
| NEC | 71 (30.5) | 16 (30.2) | 62 (67.39) |
| MANEC | 12 (5.15) | 4 (7.55) | 24 (26.09) |
| Size (cm), n (%) | | | |
| ≤ 2 | 92 (39.5) | 19 (35.8) | 7 (7.61) |
| > 2 and ≤ 5 | 82 (35.2) | 21 (39.6) | 53 (57.61) |
| > 5 | 59 (25.3) | 13 (24.5) | 32 (34.78) |
| T stage, n (%) | | | |
| Tis | 3 (1.29) | 1 (1.89) | 0 (0.00) |
| T1 | 41 (17.6) | 5 (9.43) | 4 (4.35) |
| T2 | 77 (33.0) | 20 (37.7) | 10 (10.87) |
| T3 | 67 (28.8) | 12 (22.6) | 59 (64.13) |
| T4 | 45 (19.3) | 15 (28.3) | 19 (20.65) |
| N stage, n (%) | | | |
| N0 | 111 (47.6) | 27 (50.9) | 23 (25.00) |
| N1/N2/N3 | 122 (52.4) | 26 (49.1) | 69 (75.00) |
| M stage, n (%) | | | |
| M0 | 196 (84.1) | 43 (81.1) | 79 (85.87) |
| M1 | 37 (15.9) | 10 (18.9) | 13 (14.13) |
| LNR, n (%) | | | |
| 0 | 111 (47.6) | 27 (50.9) | 24 (26.09) |
| ≤ 0.2 | 47 (20.2) | 11 (20.8) | 35 (38.04) |
| > 0.2 | 75 (32.2) | 15 (28.3) | 33 (35.87) |
| Radiation, n (%) | | | |
| No | 213 (91.4) | 48 (90.6) | 86 (93.48) |
| Yes | 20 (8.58) | 5 (9.43) | 6 (6.52) |
| Chemotherapy, n (%) | | | |
| No | 183 (78.5) | 40 (75.5) | 41 (44.57) |
| Yes | 50 (21.5) | 13 (24.5) | 51 (55.43) |
| Status, n (%) | | | |
| Alive | 168 (72.1) | 36 (67.9) | 53 (57.61) |
| Dead | 65 (27.9) | 17 (32.1) | 39 (42.39) |
| Time (months), median (range) | 51 (1, 189) | 40 (1, 186) | 23 (1, 148) |
The univariable and multivariate regression analysis
Using the X-tile tool, we determined that the optimal cutoff value for LNR was 0.20, and patients were stratified into three groups (LNR = 0, 0 < LNR ≤ 0.2, LNR > 0.2). Then, we conducted univariable and multivariate regression analyses. The univariable regression analysis showed that age, primary site, histology type, size, M stage, LNR, radiation, and chemotherapy were risk factors. The multivariate regression analysis showed that primary site, histology type, size, M stage, and LNR were independent risk factors for gastric NENs. Figure 2 shows the forest plot of the multivariate Cox regression analysis.
Figure 2 Forest plot of multivariate Cox regression analysis of disease-specific survival.aP < 0.05, bP < 0.001.
We found that a higher LNR was accompanied by a higher risk of death [LNR = 0 vs 0 < LNR ≤ 0.2: Odds ratio (OR) = 2.214, 95% confidence interval (CI): 1.005-4.880, P = 0.0485; LNR = 0 vs LNR > 0.2: OR = 4.774, 95%CI: 2.369-9.620, P < 0.001]. Kaplan-Meier survival curves showed that DSS differed significantly among the LNR groups in both the training set and the test set (Figure 3A and B). The log-rank test P values were less than 0.05 in both sets. The 1-, 3-, and 5-year DSS rates of the different LNR groups in the training set were 95% vs 87% vs 84%, 81% vs 57% vs 51%, 55% vs 34% vs 29%, respectively. The 1-, 3-, and 5-year DSS rates of the different LNR groups in the test set were 92% vs 77% vs 52%, 67% vs 29% vs 12%, 46% vs 17% vs 6%, respectively.
Figure 3 Kaplan-Meier survival curves of different lymph node ratio groups in the training set and the test set.
A: Training set; B: Test set. LNR: Lymph node ratio.
The construction of prognostic models
CoxPH model: We used the above independent risk factors to develop the CoxPH model and visualized it as a nomogram (Supplementary Figure 1). The χ2 test of the Schoenfeld residuals demonstrated that the Cox model satisfied the PH assumption, with P values of all variables above 0.05 (Supplementary Figure 2).
RSF model: We treated age as a continuous variable and gender, marital status, race, primary site, histology type, size, T stage, M stage, LNR, radiotherapy, and chemotherapy as categorical variables, and entered them directly into the RSF model. Then, we used Optuna to identify the optimal hyperparameter combination of the RSF model: 330 estimators, a minimum samples split of 5, and a minimum samples leaf of 1. Supplementary Figure 3A depicts the optimization history, Supplementary Figure 3B depicts the hyperparameter importance, and Supplementary Figure 3C depicts the high-dimensional parameter relationships.
The performance of the developed models
CoxPH model: The C-index of the CoxPH model in the training set, validation set, and the test set were 0.834 (95%CI: 0.789-0.879), 0.871 (95%CI: 0.802-0.940), and 0.744 (95%CI: 0.665-0.822) respectively. AUCs for 1-, 3-, and 5-year DSS in the training set were 0.848 (95%CI: 0.763-0.930), 0.881 (95%CI: 0.831-0.932), and 0.875 (95%CI: 0.822-0.927). AUCs for 1-, 3-, and 5-year DSS in the validation set were 0.843 (95%CI: 0.717-0.969), 0.948 (95%CI: 0.892-1.000), and 0.990 (95%CI: 0.969-1.000). AUCs for 1-, 3-, and 5-year DSS in the test set were 0.786 (95%CI: 0.622-0.889), 0.834 (95%CI: 0.735-0.934), and 0.810 (95%CI: 0.688-0.931). The 1-, 3-, and 5-year calibration curves of the CoxPH model in the test set (Supplementary Figure 4A-C) indicated a good calibration.
RSF model: The C-index of the RSF model in the training set, validation set, and the test set were 0.940 (95%CI: 0.924-0.956), 0.870 (95%CI: 0.818-0.921), and 0.769 (95%CI: 0.691-0.846) respectively. AUCs for 1-, 3-, and 5-year DSS in the training set were 0.962 (95%CI: 0.938-0.989), 0.979 (95%CI: 0.963-0.995), and 0.971 (95%CI: 0.951-0.992). AUCs for 1-, 3-, and 5-year DSS in the validation set were 0.867 (95%CI: 0.761-0.973), 0.955 (95%CI: 0.899-1.000), and 0.986 (95%CI: 0.960-1.000). AUCs for 1-, 3-, and 5-year DSS in the test set were 0.803 (95%CI: 0.608-0.891), 0.895 (95%CI: 0.814-0.976), and 0.869 (95%CI: 0.769-0.970). The 1-, 3-, and 5-year calibration curves of the RSF model (Figure 4A-C) in the test set also indicated a good calibration. We summarized the performance of the CoxPH model and the RSF model in Table 2.
Figure 4 Calibration curves of the random survival forest model for 1-year, 3-year, and 5-year disease-specific survival in the test set.
A: Calibration curves of the random survival forest (RSF) model for 1-year disease-specific survival (DSS); B: Calibration curves of the RSF model for 3-year DSS; C: Calibration curves of the RSF model for 5-year DSS. DSS: Disease-specific survival.
Table 2 The performance of the Cox proportional hazards model and the random survival forest model.
| Model | Dataset | C-index (95%CI) | 1-year AUC (95%CI) | 3-year AUC (95%CI) | 5-year AUC (95%CI) |
| --- | --- | --- | --- | --- | --- |
| CoxPH | Training | 0.834 (0.789-0.879) | 0.848 (0.763-0.930) | 0.881 (0.831-0.932) | 0.875 (0.822-0.927) |
| CoxPH | Validation | 0.871 (0.802-0.940) | 0.843 (0.717-0.969) | 0.948 (0.892-1.000) | 0.990 (0.969-1.000) |
| CoxPH | Test | 0.744 (0.665-0.822) | 0.786 (0.622-0.889) | 0.834 (0.735-0.934) | 0.810 (0.688-0.931) |
| RSF | Training | 0.940 (0.924-0.956) | 0.962 (0.938-0.989) | 0.979 (0.963-0.995) | 0.971 (0.951-0.992) |
| RSF | Validation | 0.870 (0.818-0.921) | 0.867 (0.761-0.973) | 0.955 (0.899-1.000) | 0.986 (0.960-1.000) |
| RSF | Test | 0.769 (0.691-0.846) | 0.803 (0.608-0.891) | 0.895 (0.814-0.976) | 0.869 (0.769-0.970) |
Comparison between the AJCC TNM staging, the CoxPH model, and the RSF model
We compared the performance of the 8th AJCC TNM staging, the CoxPH model, and the RSF model in the test set. The C-indexes of the 8th AJCC TNM staging, the CoxPH model, and the RSF model in the test set were 0.723 (95%CI: 0.613-0.833), 0.744 (95%CI: 0.665-0.822), and 0.769 (95%CI: 0.691-0.846), respectively. The RSF model had the highest C-index. Figure 5 depicts the receiver operating characteristic curves of 1-, 3-, and 5-year DSS in the test set. The 1-, 3-, and 5-year AUCs of the AJCC TNM staging were 0.690, 0.769, and 0.77. The 1-, 3-, and 5-year AUCs of the CoxPH model were 0.786, 0.834, and 0.810. The 1-, 3-, and 5-year AUCs of the RSF model were 0.803, 0.895, and 0.869. The RSF model thus also had higher 1-, 3-, and 5-year AUCs than the other two. Furthermore, we plotted the 1-, 3-, and 5-year DCA curves of the three models in the test set (Figure 6). The RSF model had higher clinical net benefits at 3 and 5 years.
Figure 5 The 1-year, 3-year, and 5-year receiver operating characteristic curves for the American Joint Committee on Cancer tumor-node-metastasis staging, the Cox proportional hazards model, and the random survival forest model in the test set.
A: The 1-year receiver operating characteristic (ROC) curves; B: The 3-year ROC curves; C: The 5-year ROC curves. TNM: Tumor-node-metastasis; RSF: Random survival forest; CoxPH: Cox proportional hazards.
Figure 6 The 1-year, 3-year, and 5-year decision curve analysis curves for the American Joint Committee on Cancer tumor-node-metastasis staging, the Cox proportional hazards model, and the random survival forest model in the test set.
A: The 1-year decision curve analysis (DCA) curves; B: The 3-year DCA curves; C: The 5-year DCA curves. TNM: Tumor-node-metastasis; RSF: Random survival forest; CoxPH: Cox proportional hazards.
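The net benefit plotted in a decision curve can be sketched as follows. This is a simplified illustration that treats the outcome as binary at a fixed horizon and ignores censoring, so it is not the calculation performed for time-to-event data in the study; the function names and the toy data are hypothetical.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit = TP/n - FP/n * pt / (1 - pt) at threshold probability pt."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: classify every patient as high risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical outcomes (1 = death by the horizon) and predicted risks.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.2, 0.1, 0.1]
nb_model = net_benefit(y_true, y_prob, 0.5)
nb_all = net_benefit_treat_all(y_true, 0.5)
print(round(nb_model, 3), round(nb_all, 3))  # 0.2 -0.2
```

A decision curve repeats this calculation over a range of thresholds; a model is useful at a given threshold when its net benefit exceeds both the treat-all reference and zero (treat none).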
The interpretation for the RSF model
We interpreted the RSF model with a SHAP plot (Figure 7). A dot on the SHAP plot represents a sample. Y-axis variables are sorted according to the variables’ importance. Histology type was the most important variable, followed by LNR, T stage, M stage, and so on. The X-axis represents the influence of variables on the model output. The left side of X = 0.0 represents the negative effect of variables on the outcome, and the right side of X = 0.0 represents the positive effect of variables on the outcome. Taking LNR as an example, the higher the LNR level, the worse the prognosis of patients.
Figure 7 The SHapley Additive exPlanations plot of the random survival forest model.
A dot on the SHapley Additive exPlanations plot represents a sample. Y-axis variables are sorted according to the variables’ importance. The X-axis represents the influence of variables on the model output outcome. The left side of X = 0.0 represents the negative effect of variables on the outcome, and the right side of X = 0.0 represents the positive effect of variables on the outcome. Taking lymph node ratio as an example, the higher the lymph node ratio level, the worse the prognosis of patients. LNR: Lymph node ratio; SHAP: SHapley Additive exPlanations.
Patients’ risk stratification
We computed patients’ risk scores based on the RSF model and divided them into a low-risk group (risk score < 9.61), a medium-risk group (9.61 ≤ risk score ≤ 26.10), and a high-risk group (risk score > 26.10) with the X-tile tool (Supplementary Figure 5). The Kaplan-Meier survival analysis in the training set, internal validation set, and test set demonstrated that DSS differed significantly among the high-risk, medium-risk, and low-risk groups (Figure 8). The log-rank test P values were all less than 0.05. The 1-, 3-, and 5-year DSS rates of the different risk groups in the training set were 99% vs 87% vs 45%, 84% vs 48% vs 3%, 57% vs 20% vs 0%, respectively. The 1-, 3-, and 5-year DSS rates of the different risk groups in the validation set were 96% vs 68% vs 50%, 79% vs 32% vs 0%, 64% vs 5% vs 0%, respectively. The 1-, 3-, and 5-year DSS rates of the different risk groups in the test set were 94% vs 80% vs 33%, 81% vs 31% vs 0%, 62% vs 16% vs 0%, respectively.
Figure 8 Kaplan-Meier survival curves of the high-, medium-, low-risk group in the training set, validation set, and the test set.
A: Training set; B: Validation set; C: Test set.
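The stratification rule above reduces to two cutoffs, which the following minimal Python sketch applies. The cutoffs are the X-tile values reported in the text; the function name and the example scores are hypothetical.

```python
LOW_CUTOFF = 9.61    # below this: low risk
HIGH_CUTOFF = 26.10  # above this: high risk


def risk_group(score: float) -> str:
    """Assign a risk group from a model risk score using the two cutoffs."""
    if score < LOW_CUTOFF:
        return "low"
    if score <= HIGH_CUTOFF:
        return "medium"
    return "high"


# Hypothetical risk scores, including both boundary values.
scores = [3.2, 9.61, 15.0, 26.10, 26.11, 40.5]
groups = [risk_group(s) for s in scores]
print(groups)
```

Both boundary values fall into the medium-risk group, matching the inequalities stated in the text.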
Individual prediction
We randomly selected three patients from the test set and drew the survival probability curve and local SHAP plot for each of them. Figure 9A shows the three patients’ survival probabilities at each time point. Figure 9B-D explains the prognosis of the three patients in detail through each variable’s contribution to the outcome.
Figure 9 The individual prediction for postoperative gastric neuroendocrine neoplasm patients.
A: The survival probability at each time point of patient #1, patient #2, and patient #3; B: The local SHapley Additive exPlanations (SHAP) plot of the patient #1; C: The local SHAP plot of the patient #2; D: The local SHAP plot of the patient #3. The red color illustrated the variable was positively correlated with the outcome of the patient and the blue color illustrated the variable was negatively correlated with the outcome of the patient. LNR: Lymph node ratio; SHAP: SHapley Additive exPlanations.
Patient #1: Age: 63 years, sex: Male, married: Yes, race: Asian, primary site: Middle 1/3, histology: NEC, tumor size: 4 cm, T stage: T4, M stage: M1, LNR: 0.667, radiation: No, chemotherapy: Yes. The follow-up time was 19 months. The status was dead at the deadline of follow-up.
Patient #2: Age: 61 years, sex: Male, married: Yes, race: Asian, primary site: Upper 1/3, histology: MANEC, tumor size: 6 cm, T stage: T3, M stage: M0, LNR: 0.167, radiation: Yes, chemotherapy: No. The follow-up time was 42 months. The status was dead at the deadline of follow-up.
Patient #3: Age: 76 years, sex: Male, married: Yes, race: Asian, primary site: Upper 1/3, histology: NEC, tumor size: 2.8 cm, T stage: T2, M stage: M0, LNR: 0, radiation: No, chemotherapy: No. The follow-up time was 127 months. The status was alive at the deadline of follow-up.
DISCUSSION
With the increasing incidence of gastric NENs, more attention was paid to the prognosis of the tumor. Multiple researches had confirmed that lymph node metastasis plays a crucial role in the prognosis of gastric NENs[27-29]. Recently, the relationship between metastatic lymph node number and LNR has been analyzed in gastric cancer[30]. Compared with the factor of lymph node metastasis, the factor of LNR is less influenced by the number of lymph nodes resected and may be more suitable for inclusion in prognosis evaluation[31]. However, researches about the influence of LNR on the prognosis of gastric NENs are still limited due to the rarity of the disease. In this research, we collected patients from the SEER dataset and the First Affiliated Hospital of Soochow University to explore the relationship between LNR and the prognosis of postoperative gastric NENs patients. Moreover, we respectively developed the CoxPH model and the RSF model to predict DSS of postoperative gastric NEN patients.
LNR has been shown to be crucial for prognostic evaluation in many cancers, such as gastric, colon, rectal, and breast cancer[32-34]. A large retrospective single-institution study divided gastric cancer patients into two groups using an LNR cut-off value of 0.25 and demonstrated that LNR was negatively associated with overall survival[32]. The prognostic value of LNR has also been demonstrated for NENs at other sites, such as the lung, small intestine, pancreas, and colon[35-38]. We found two articles concerning the effect of LNR on gastric NENs. However, owing to small sample sizes or the lack of an external validation set, these studies are not sufficiently persuasive[31,39]. In our study, we selected 286 patients from the SEER database as the training and validation sets and 92 patients from our hospital as an external test set. The optimal cut-off value for LNR was determined to be 0.20 using X-tile software, and patients were divided into three groups (LNR = 0, 0 < LNR ≤ 0.2, LNR > 0.2). Our results show that DSS differed significantly among the three LNR groups in both the training set and the test set. The higher the LNR, the lower the DSS rate.
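The three-group LNR stratification described above can be illustrated with a minimal Python sketch. It assumes the usual definition of LNR as the number of metastatic lymph nodes divided by the number of examined lymph nodes; the function names and group labels are illustrative and are not taken from the study's own code.

```python
LNR_CUTOFF = 0.20  # cut-off reported in the text (selected with X-tile)


def lymph_node_ratio(positive_nodes: int, examined_nodes: int) -> float:
    """Return LNR, assumed here to be metastatic nodes / examined nodes."""
    if examined_nodes <= 0:
        raise ValueError("examined_nodes must be positive")
    if not 0 <= positive_nodes <= examined_nodes:
        raise ValueError("positive_nodes must lie between 0 and examined_nodes")
    return positive_nodes / examined_nodes


def lnr_group(lnr: float) -> str:
    """Assign an LNR value to one of the three groups used in the text."""
    if not 0.0 <= lnr <= 1.0:
        raise ValueError("LNR must lie between 0 and 1")
    if lnr == 0:
        return "LNR = 0"
    if lnr <= LNR_CUTOFF:
        return "0 < LNR <= 0.2"
    return "LNR > 0.2"


# LNR values of the three example patients described in the figure legend.
for patient, lnr in [("#1", 0.667), ("#2", 0.167), ("#3", 0.0)]:
    print(f"Patient {patient}: LNR = {lnr} -> group '{lnr_group(lnr)}'")
```

Under these assumptions, patient #1 falls into the LNR > 0.2 group, patient #2 into the 0 < LNR ≤ 0.2 group, and patient #3 into the LNR = 0 group.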
Researchers have previously built various prognostic models for gastric NENs using traditional Cox regression analysis[31,40-42]. Cao et al[41] developed a nomogram to predict DSS for gastric NENs, with a higher C-index than the traditional AJCC TNM staging (C-index: 0.899 vs 0.864). Song et al[42] constructed a nomogram to predict DSS for gastric NEC, which also outperformed the traditional AJCC TNM staging. However, a significant drawback is that Cox regression analysis is limited by the PH assumption and can only capture linear relationships between variables and the outcome, even though the relationship between variables and survival outcomes is sometimes complex and non-linear[43].
The rapid development of AI has facilitated the establishment of the required nonlinear prognostic models. The RSF model, a novel nonlinear ML model for survival analysis, adopts an ensemble tree method to analyze censored survival data[18]. The RSF model does not need to assume that the influence of all variables on the risk function is linear, and it applies internal cross-validation to avoid overfitting, which supports high predictive accuracy[44]. Lin et al[19] developed an RSF model to predict DSS in pancreatic cancer that outperformed the Cox regression model (C-index: 0.723 vs 0.670). Kar et al[45] also found that the RSF model performed better than the CoxPH model in predicting relapse in stage I non-small cell lung cancer patients. In our study, the RSF model showed favorable performance in predicting the DSS of postoperative gastric NEN patients, with a C-index in the external test set of 0.769 (95%CI: 0.691-0.846), superior to that of the CoxPH model (0.744; 95%CI: 0.665-0.822).
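Because the model comparisons above rest on the C-index, a simplified sketch of Harrell's concordance index may help readers interpret the values. This is a generic textbook version and not the implementation used in the study: pairs with tied follow-up times are skipped, and a higher risk score is assumed to mean a worse predicted prognosis. The follow-up times and statuses are those of the three example patients, with death treated as the event for illustration; the risk scores are hypothetical.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Simplified Harrell's C-index.

    times: follow-up times; events: 1 if the event was observed, 0 if censored;
    risk_scores: higher values mean a worse predicted prognosis (assumption).
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # tied follow-up times are skipped in this simplified version
        # Let a be the patient with the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter follow-up is censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # the patient who died earlier had the higher risk score
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Follow-up (months) and status of patients #1-#3; the risk scores are hypothetical.
times = [19, 42, 127]
events = [1, 1, 0]
hypothetical_scores = [30.0, 20.0, 5.0]
print(harrell_c_index(times, events, hypothetical_scores))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of comparable pairs, which is the scale on which the reported values of 0.769 and 0.744 should be read.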
In our study, we used the SHAP plot to rank the importance of variables in the RSF model. Histology, LNR, and T stage ranked as the top three. Compared with well-differentiated NET, poorly differentiated NEC and MANEC tend to behave more aggressively and have a worse prognosis[46]. Gastric MANEC is defined as a tumor containing both gastric adenocarcinoma and NEC cells, with each component accounting for at least 30%[24]. Several studies have evaluated the prognosis of gastric MANEC. Fernandes et al[47] concluded that the prognosis of gastric MANEC is determined by the more aggressive component. Chen et al[48] concluded that a higher proportion of the NEC component indicates a poorer prognosis. LNR ranked second, which demonstrates that LNR plays a crucial role both in the construction of the RSF model and in evaluating patients’ prognosis. The higher the LNR, the lower the DSS of gastric NEN patients. T stage represents the depth of tumor infiltration, and many studies have demonstrated that T stage is an independent risk factor for gastric NENs[42,49,50].
In addition, we used the RSF model for risk stratification and individualized survival prediction. Risk stratification makes it easier for physicians to identify high-risk patients (risk score > 26.10) and provide timely clinical intervention. For individualized survival prediction, we combined survival probability curves with local SHAP plots. The survival curves dynamically reflect the change in a patient’s survival probability over time, which makes them flexible and intuitive. The local SHAP plot clearly shows the impact of each included variable on an individual patient’s outcome and is easy to understand.
We are also aware of the potential limitations of this study. First, our study was retrospective and may be subject to selection bias[51]. Second, some important histopathological information, such as the Ki-67 index, vascular invasion, and neural invasion, which are important for evaluating tumor prognosis, is not available in the SEER database[52,53]. Third, the training set and the internal validation set were both derived from the SEER database, and the study lacks multicenter external validation sets.
CONCLUSION
In conclusion, our study demonstrated that LNR was negatively correlated with the prognosis of postoperative gastric NEN patients. The newly constructed RSF model performed better than the CoxPH model and the 8th AJCC TNM staging in predicting the DSS of postoperative gastric NEN patients. Furthermore, we performed risk stratification and individual prognosis prediction with the RSF model. The RSF algorithm showed potential for clinical practice.
Footnotes
Provenance and peer review: Unsolicited article; Externally peer reviewed.
Peer-review model: Single blind
Specialty type: Oncology
Country of origin: China
Peer-review report’s classification
Scientific Quality: Grade A
Novelty: Grade A
Creativity or Innovation: Grade A
Scientific Significance: Grade B
P-Reviewer: Vinh-Hung V, Martinique; S-Editor: Wang JJ; L-Editor: A; P-Editor: Xu ZH
O'Connor JM, Marmissolle F, Bestani C, Pesce V, Belli S, Dominichini E, Mendez G, Price P, Giacomi N, Pairola A, Loria FS, Huertas E, Martin C, Patane K, Poleri C, Rosenberg M, Cabanne A, Kujaruk M, Caino A, Zamora V, Mariani J, Dioca M, Parma P, Podesta G, Andriani O, Gondolesi G, Roca E. Observational study of patients with gastroenteropancreatic and bronchial neuroendocrine tumors in Argentina: Results from the large database of a multidisciplinary group clinical multicenter study. Mol Clin Oncol. 2014;2:673-684.
Niederle MB, Hackl M, Kaserer K, Niederle B. Gastroenteropancreatic neuroendocrine tumours: the current incidence and staging based on the WHO and European Neuroendocrine Tumour Society classification: an analysis based on prospectively collected parameters. Endocr Relat Cancer. 2010;17:909-918.
Iwasaki K, Barroga E, Enomoto M, Tsurui K, Shimoda Y, Matsumoto M, Miyoshi K, Ota Y, Matsubayashi J, Nagakawa Y. Long-term surgical outcomes of gastric neuroendocrine carcinoma and mixed neuroendocrine-non-neuroendocrine neoplasms. World J Surg Oncol. 2022;20:165.
Nitti D, Marchet A, Olivieri M, Ambrosi A, Mencarelli R, Belluco C, Lise M. Ratio between metastatic and examined lymph nodes is an independent prognostic factor after D2 resection for gastric cancer: analysis of a large European monoinstitutional experience. Ann Surg Oncol. 2003;10:1077-1085.
Occhionorelli S, Andreotti D, Vallese P, Morganti L, Lacavalla D, Forini E, Pascale G. Evaluation on prognostic efficacy of lymph nodes ratio (LNR) and log odds of positive lymph nodes (LODDS) in complicated colon cancer: the first study in emergency surgery. World J Surg Oncol. 2018;16:186.
Lu YJ, Lin PC, Lin CC, Wang HS, Yang SH, Jiang JK, Lan YT, Lin TC, Liang WY, Chen WS, Lin JK, Chang SC. The impact of the lymph node ratio is greater than traditional lymph node status in stage III colorectal cancer patients. World J Surg. 2013;37:1927-1933.
Khan M, Ali M, Najeh T, Gamil Y. Computational prediction of workability and mechanical properties of bentonite plastic concrete using multi-expression programming. Sci Rep. 2024;14:6105.
Ouyang D, Shi M, Wang Y, Luo L, Huang L. Prognostic analysis of pT1-T2aN0M0 cervical adenocarcinoma based on random survival forest analysis and the generation of a predictive nomogram. Front Oncol. 2022;12:1049097.
Akiba T, Sano S, Yanase T, Ohta T, Koyama M. Optuna. Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining; 2019 Aug 4-8; Anchorage, United States. New York, Association for Computing Machinery, 2019.
Lin J, Zhao Y, Zhou Y, Tian Y, He Q, Lin J, Hao H, Zou B, Jiang L, Zhao G, Lin W, Xu Y, Li Z, Xue F, Li S, Fu W, Li Y, Xu Z, Li Y, Chen J, Zhou X, Zhu Z, Cai L, Li E, Li H, Zheng C, Li P, Huang C, Xie J. Comparison of Survival and Patterns of Recurrence in Gastric Neuroendocrine Carcinoma, Mixed Adenoneuroendocrine Carcinoma, and Adenocarcinoma. JAMA Netw Open. 2021;4:e2114180.
Wang L. Prognostic Values of MLNn and MLNr for Gastric Cancer Patients Receiving Chemoradiotherapy and Lesser Curvature Resection. Clin Lab. 2022;68.
He Z, Li D, Xu Y, Wang H, Gao J, Zhang Z, Chen K. Prognostic significance of metastatic lymph node ratio in patients with gastric cancer after curative gastrectomy: a single-center retrospective study. Scand J Gastroenterol. 2022;57:832-841.
Xiong L, Jiang Y, Hu T. Prognostic nomograms for lung neuroendocrine carcinomas based on lymph node ratio: a SEER database analysis. J Int Med Res. 2022;50:3000605221115160.
Grillo F, Albertelli M, Malandrino P, Dotto A, Pizza G, Cittadini G, Colao A, Faggiano A. Prognostic Effect of Lymph Node Metastases and Mesenteric Deposits in Neuroendocrine Tumors of the Small Bowel. J Clin Endocrinol Metab. 2022;107:3209-3221.
Song X, Xie Y, Lou Y. A novel nomogram and risk stratification system predicting the cancer-specific survival of patients with gastric neuroendocrine carcinoma: a study based on SEER database and external validation. BMC Gastroenterol. 2023;23:238.
Kar İ, Kocaman G, İbrahimov F, Enön S, Coşgun E, Elhan AH. Comparison of deep learning-based recurrence-free survival with random survival forest and Cox proportional hazard models in Stage-I NSCLC patients. Cancer Med. 2023;12:19272-19278.
Xie J, Zhao Y, Zhou Y, He Q, Hao H, Qiu X, Zhao G, Xu Y, Xue F, Chen J, Su G, Li P, Zheng CH, Huang CM. Predictive Value of Combined Preoperative Carcinoembryonic Antigen Level and Ki-67 Index in Patients With Gastric Neuroendocrine Carcinoma After Radical Surgery. Front Oncol. 2021;11:533039.
Tian FX, Cai YQ, Zhuang LP, Chen MF, Xiu ZB, Zhang Y, Liu H, Liu ZH, Liu GP, Zeng C, Lin FL, Liu J, Huang ST, Zhang LZ, Lin HY. Clinicopathological features and prognosis of patients with gastric neuroendocrine tumors: A population-based study. Cancer Med. 2018;7:5359-5369.
Xu Y, Yan L, Chen T, Hu P, Bai J, Ye T, Long Q, Tang Q. Prognosis of patients with poorly differentiated gastric neuroendocrine neoplasms: a multi-center study in China. Future Oncol. 2022;18:2465-2473.