Published online Feb 7, 2024. doi: 10.3748/wjg.v30.i5.450
Peer-review started: October 16, 2023
First decision: December 6, 2023
Revised: December 19, 2023
Accepted: January 12, 2024
Article in press: January 12, 2024
Published online: February 7, 2024
Processing time: 106 Days and 20.3 Hours
Colorectal cancer (CRC) is a serious threat worldwide. Although early screening is considered the most effective method to prevent and control CRC, the current state of early CRC screening remains unsatisfactory. In China, the incidence of CRC in the Yangtze River Delta region is increasing dramatically, but few studies have been conducted there. Therefore, it is necessary to develop a simple and efficient early screening model for CRC.
To develop and validate an early-screening nomogram model to identify individuals at high risk of CRC.
Data of 64448 participants obtained from Ningbo Hospital, China between 2014 and 2017 were retrospectively analyzed. Of these, 530 were excluded due to missing or incorrect data. Of the remaining 63918 participants, 7607 (11.9%) were considered to be at high risk for CRC, and 56311 (88.1%) were not. The participants were randomly allocated to a training set (n = 44743) or a validation set (n = 19175). The discriminatory ability, predictive accuracy, and clinical utility of the model were evaluated by constructing and analyzing receiver operating characteristic (ROC) curves and calibration curves and by decision curve analysis. Finally, the model was validated internally using a bootstrap resampling technique.
Seven variables, including demographic, lifestyle, and family history information, were examined. Multifactorial logistic regression analysis revealed that age [odds ratio (OR): 1.03, 95% confidence interval (CI): 1.02-1.03, P < 0.001], body mass index (BMI) (OR: 1.07, 95%CI: 1.06-1.08, P < 0.001), waist circumference (WC) (OR: 1.03, 95%CI: 1.02-1.03, P < 0.001), lifestyle (OR: 0.45, 95%CI: 0.42-0.48, P < 0.001), and family history (OR: 4.28, 95%CI: 4.04-4.54, P < 0.001) were the most significant predictors of high-risk CRC. Healthy lifestyle was a protective factor, whereas family history was the most significant risk factor. The area under the curve was 0.734 (95%CI: 0.723-0.745) for the final validation set ROC curve and 0.735 (95%CI: 0.728-0.742) for the training set ROC curve. The calibration curve demonstrated a high correlation between the CRC high-risk population predicted by the nomogram model and the actual CRC high-risk population.
The early-screening nomogram model for CRC prediction in high-risk populations developed in this study based on age, BMI, WC, lifestyle, and family history exhibited high accuracy.
Core Tip: This was the first large-scale study to investigate early screening for detection of colorectal cancer (CRC) in Ningbo, China, which was part of the national early screening CRC program. The study focused on collecting information on the general population who attended annual health checks. Our findings showed that the area under the curve was 0.734 for the final validation set receiver operating characteristic (ROC) curve and 0.735 for the training set ROC curve. Therefore, we developed an early screening model with high accuracy for CRC.
- Citation: Xu LL, Lin Y, Han LY, Wang Y, Li JJ, Dai XY. Development and validation of a prediction model for early screening of people at high risk for colorectal cancer. World J Gastroenterol 2024; 30(5): 450-461
- URL: https://www.wjgnet.com/1007-9327/full/v30/i5/450.htm
- DOI: https://dx.doi.org/10.3748/wjg.v30.i5.450
Colorectal cancer (CRC) has become the third most common cancer worldwide[1]. The incidence of CRC in China increased from 42.74/100000 in 1990 to 8.95/100000 in 2019 and has been increasing annually in the Yangtze River Delta region[2]. A cross-sectional study on CRC knowledge and awareness in the Caribbean in 2020 revealed that only 54.7% of people were aware of the risk factors for CRC[3]. In addition, a questionnaire survey of diagnostic delays and their predictive factors in 303 CRC patients in southern China in 2020 found that the incidence of prolonged diagnostic delays was 57.8%[4]. The study found that the diagnostic delays were attributed to a variety of factors, such as a lack of knowledge of the risk factors for CRC and a reluctance to undergo CRC screening. This suggests that awareness of high-risk factors for CRC in China is insufficient, and colonoscopic screening has yet to become popular. Therefore, it is critical to identify individuals with high-risk CRC at an early stage.
In 2022, a study conducted at Memorial Sloan-Kettering Cancer Center found that altered bowel habits accounted for 24.7% of the common symptoms of CRC[5]. Some people recognize such symptoms and visit a hospital for treatment; however, a unified approach has yet to be developed in China for early screening of people at high risk of CRC. Previously, CRC screening was based on colonoscopy, fecal occult blood test, and abdominal computed tomography[6], which have some limitations, such as high cost and poor compliance. Therefore, it is important to utilize risk factors that are easily obtained in screening settings to develop simple and convenient early screening models capable of identifying those at high risk of CRC.
Several CRC risk-prediction models have been constructed, including a risk-prediction model for advanced CRC in asymptomatic adults in the United States established by Imperiale et al[7], a risk-prediction model for CRC in Caucasian patients in Poland established by Kaminski et al[8], and a risk-prediction model for advanced CRC in Germany established by Tao et al[9]. Most of these models based their predictions on demographic information. In contrast, most of the CRC risk-prediction models developed in China were based on genomics, lifestyle habits, and dietary habits, such as those developed by Wong et al[10] and Sung et al[11]. Cai et al[12] constructed a CRC risk-prediction model in a retro
CRC has many causes; therefore, assessing and screening high-risk groups for CRC can be complicated. It requires examination of dietary habits[13], lifestyle habits[14], genetic factors[15], environmental factors[16], and emotional and psychological factors[17]. High-calorie, high-fat, and high-protein diets; consumption of pickled foods; and unhealthy lifestyle habits, such as staying up late, smoking, and drinking alcohol, can cause CRC[18]. China is a huge country with 56 ethnic groups, and the large variations in lifestyles, dietary habits, and environmental factors in the country mean that constructing a prediction model for early screening of CRC in high-risk groups may be complex.
Here, we collected the demographic information, living habits, dietary habits, and family history of a large CRC early screening cohort in the China Urban Cancer Early Diagnosis and Treatment Program, Ningbo, from 2014 to 2017. These data were examined via a backward Wald logistic regression analysis to identify the risk factors for CRC. These risk factors were used to establish a prediction model for screening groups at high risk for CRC at an early stage. We believe that this model will provide a basis for the accurate identification of high-risk groups in future CRC screening efforts.
This retrospective study was conducted in Ningbo Hospital, China, from 2014 to 2017. Of the 64448 participants, 530 were excluded due to missing or incorrect data. Of the remaining participants, 7607 (11.9%) were considered high risk for CRC, and 56311 (88.1%) were not. The inclusion criteria were as follows: (1) Permanent household registration in the city (living in the local area for > 3 years); (2) age 40-74 years; and (3) ability to sign the informed consent form unaided. The exclusion criteria were as follows: (1) An abnormal identifier number; and (2) a previous CRC diagnosis. All the participants provided written informed consent. The study protocol complied with the Declaration of Helsinki and was approved by the Ethical Review Board of Ningbo No. 2 Hospital (approval number: YJ-NBEY-KY-2023-060-01).
Demographic information, dietary habits, living habits, and family history of the participants in the validation and training sets were obtained via questionnaires. The questionnaires collected details of dietary habits, including usual food intake, food preference, and dietary behavior, using the food frequency questionnaire. The questionnaires were administered by uniformly trained and qualified investigators through face-to-face questioning. We selected variables based on prior knowledge of the underlying biology and epidemiology of CRC and relevant predictors. This yielded seven variables that covered basic information, lifestyle, and family history.
Basic information comprised age, sex, ethnicity, body mass index (BMI), and waist circumference (WC). Age was categorized according to the United Nations New Standard for the Classification of Human Ages: young adults ≤ 65 years and middle-aged and older adults > 65 years[19]. Minors < 18 years of age were not included in the study. Sex was categorized as male or female. Ethnicity was divided into Han nationality and other ethnic groups, according to the results of the questionnaire. BMI was based on the BMI classification standard for Chinese adults[20]: Underweight < 18.50 kg/m2, normal weight 18.50-23.99 kg/m2, overweight 24.0-27.99 kg/m2, and obese ≥ 28.0 kg/m2. The median WC was 80 cm (range: 50-184 cm).
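The BMI cut-offs above can be expressed as a short classification sketch. This is only an illustration of the thresholds stated in the text; the function names `bmi_from_measurements` and `classify_bmi` are hypothetical and do not come from the study, and the example weight and height are made up.

```python
def bmi_from_measurements(weight_kg: float, height_m: float) -> float:
    """Body mass index in kg/m2: weight divided by height squared."""
    return weight_kg / height_m ** 2


def classify_bmi(bmi: float) -> str:
    """Map a BMI value to the categories for Chinese adults quoted in the text."""
    if bmi < 18.5:
        return "underweight"    # < 18.50 kg/m2
    if bmi < 24.0:
        return "normal weight"  # 18.50-23.99 kg/m2
    if bmi < 28.0:
        return "overweight"     # 24.0-27.99 kg/m2
    return "obese"              # >= 28.0 kg/m2


# Illustrative (made-up) participant: 70 kg, 1.75 m
example_category = classify_bmi(bmi_from_measurements(70.0, 1.75))
```

Half-open comparisons are used so that values falling between the printed two-decimal bounds (for example 23.995) are still assigned to exactly one category.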
Lifestyle included dietary habits and living habits. With reference to the standards of Dietary Guidelines for Chinese Residents (2007 edition) and the Dietary Reference Intake of Nutrients for Chinese Residents (2013 edition) and based on the discussion involving a group of relevant experts and scholars, we used the following definitions to determine the intake of substances and their frequencies (dietary habits) and the specific behaviors (living habits) in the questionnaire survey.
Dietary habits included taste, oil consumption, frequency of pickled and sun-cured food intake, and weekly consumption of fresh vegetables, fresh fruits, meat, and coarse grains. Taste was classified into three levels based on salt consumption: double-salt taste (> 5 g/d), moderate-salt taste (5 g/d), and low-salt taste (< 5 g/d). Oil consumption was classified into three levels based on the amount of cooking oil ingested per day: High oil consumption (> 30 g/d), moderate oil consumption (25-30 g/d), and low oil consumption (< 25 g/d). The frequency of pickled and sun-cured food intake was classified into the following three levels: never, < 3 times/wk (i.e., sometimes), and ≥ 3 times/wk (i.e., often). Weekly consumption of fresh vegetables, fresh fruits, meat, and coarse grains was classified as follows: fresh vegetables (0 kg/wk, < 2.5 kg/wk, or ≥ 2.5 kg/wk); fresh fruits (0 kg/wk, < 1.25 kg/wk, or ≥ 1.25 kg/wk); meat (0 kg/wk, < 0.35 kg/wk, or ≥ 0.35 kg/wk), and coarse grains (0 kg/wk, < 0.5 kg/wk, or ≥ 0.5 kg/wk).
Living habits included smoking, alcohol consumption, and physical activity. Smoking was defined as having smoked > 1 cigarette/d for > 6 consecutive or cumulative months, and smoking cessation was defined as not having smoked for ≥ 2 years. Thus, smoking was classified into three levels: never smoker (no), current smoker (yes), and ever smoker but currently not a smoker (quit smoking). Drinking alcohol was defined as having consumed an average of at least 1 drink/wk for > 6 mo, and abstinence was defined as not having had a drink for ≥ 1 year. Thus, alcohol consumption was classified into three levels: never drinker (no), current drinker (yes), and ever drinker but currently abstaining (quit drinking). Physical activity was defined as effective physical activity of > 30 min/session, with an average of ≥ 3 sessions/wk, and was categorized into two levels: ≤ 3 times/wk (no) and > 3 times/wk (yes). Family history was categorized as no family history of CRC (no) and a family history of CRC (yes).
The diagnosis of participants with a high risk of CRC was made by at least two experienced anorectal surgeons who were experts in the field, based on the following conditions: (1) A positive test for fecal occult blood[21]; (2) a first-degree relative with a history of CRC[22]; (3) a history of intestinal polyps or adenomas[23]; (4) a history of cancer or other malignancies[24]; (5) a change in bowel habits[25]; and (6) any two of the following conditions: chronic diarrhea, chronic constipation, mucus bloody stools, a history of chronic appendicitis or appendectomy, a history of chronic cholecystitis or cholecystectomy, and chronic mental depression[24]. Participants were diagnosed as having a high risk of CRC if they had any one of the conditions from 1 to 5 and any two of the conditions listed in 6. After a series of evaluations, those without any of the above conditions were not considered high risk for CRC.
We used χ2 tests to assess the characteristic differences in baseline data, 2014-2017 separately, between the high-risk and non-high-risk groups. Cluster analysis was used to categorize the participants based on their dietary and living habits as either having a healthy or an unhealthy lifestyle. A healthy lifestyle was considered an intake of fresh vegetables, fresh fruits, and coarse grains and participation in physical activity. An unhealthy lifestyle was considered an intake of meat, pickled and sun-cured food, oily food, and double-salted food; smoking; and alcohol consumption. A random sampling method was used to allocate the participants to a training set or a validation set in the ratio of 7:3. Each participant was considered as a randomized unit with the same probability of being selected. We performed a multifactorial logistic regression analysis by introducing variables with P < 0.05 as independent predictor variables into the training set. The strength of the association between predictors and participants with a high risk of CRC was assessed by calculating the ORs and 95%CIs. Meaningful variables were selected based on a backward Wald logistic regression analysis and were used to construct a nomogram model. The discriminative ability, predictive accuracy, and clinical value of the model were evaluated by constructing and analyzing the ROC and calibration curves and by performing decision curve analysis (DCA). Five hundred bootstrap resamples were used to reduce overfitting bias. All of the statistical analyses were conducted using R (version 4.3.0) and SPSS (version 25.0), and P < 0.05 was considered to indicate statistical significance.
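The 7:3 random allocation can be sketched as follows. This is a minimal illustration, not the authors' R or SPSS procedure: the function name `split_train_validation`, the seed, and the use of integer participant indices are all assumptions made for the example. With 63918 participants, rounding 70% gives the reported set sizes.

```python
import random


def split_train_validation(ids, train_fraction=0.7, seed=2024):
    """Shuffle participant IDs and cut them into training and validation sets.

    Every participant has the same probability of landing in either set.
    The seed is an arbitrary choice made only so the sketch is reproducible.
    """
    rng = random.Random(seed)
    shuffled = list(ids)
    rng.shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical participant indices 0..63917 stand in for the real cohort
train_ids, validation_ids = split_train_validation(range(63918))
```

Here `len(train_ids)` is 44743 and `len(validation_ids)` is 19175, matching the set sizes reported in the study.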
Table 1 shows the basic information, dietary habits, living habits, and family histories of both groups in 2014-2017. Compared with the CRC non-high-risk group, the basic information revealed that the CRC high-risk group had more men and the participants were older, and had higher BMI and WC. The dietary habits data revealed that the CRC high-risk group had a lower weekly intake of fresh vegetables, fresh fruits, and coarse grains; had a higher weekly intake of meat; had more participants with double-salt taste and high oil consumption; and consumed pickled and sun-cured foods more frequently. Furthermore, participants in the CRC high-risk group were more likely to smoke, drink alcohol, and not perform physical activity. The family history data showed that the CRC high-risk group typically had a family history of the disease.
Variable | Category | 2014: Non-high risk group, n (%) | 2014: High risk group, n (%) | P value | 2015: Non-high risk group, n (%) | 2015: High risk group, n (%) | P value | 2016: Non-high risk group, n (%) | 2016: High risk group, n (%) | P value | 2017: Non-high risk group, n (%) | 2017: High risk group, n (%) | P value |
Basic information | |||||||||||||
Sex | Male | 8514 (43.24) | 888 (43.94) | 0.545 | 14277 (53.80) | 1486 (57.64) | 0.0002 | 2359 (54.82) | 711 (54.61) | 0.892 | 3276 (56.68) | 899 (52.70) | 0.004 |
Female | 11177 (56.76) | 1133 (56.06) | 12260 (46.20) | 1092 (42.36) | 1944 (45.18) | 591 (45.39) | 2504 (43.32) | 807 (47.30) | |||||
Age (yr) | ≤ 65 | 16311 (82.83) | 1573 (77.83) | < 0.001 | 22960 (86.52) | 2139 (82.97) | < 0.001 | 3186 (74.04) | 979 (75.19) | 0.405 | 4499 (77.84) | 1287 (75.44) | 0.038 |
> 65 | 3380 (17.17) | 448 (22.17) | 3577 (13.48) | 439 (17.03) | 1117 (25.96) | 323 (24.81) | 1281 (22.16) | 419 (24.56) | |||||
Ethnicity | The Han nationality | 19641 (99.75) | 2015 (99.70) | 0.717 | 26474 (99.76) | 2572 (99.77) | 0.963 | 4298 (99.88) | 1302 (100) | 0.219 | 5766 (99.76) | 1701 (99.71) | 0.714 |
Other | 50 (0.25) | 6 (0.30) | 63 (0.24) | 6 (0.23) | 5 (0.12) | 0 (0.00) | 14 (0.24) | 5 (0.29) | |||||
BMI (kg/m2) | < 18.50 | 541 (2.75) | 46 (2.28) | < 0.001 | 644 (2.43) | 59 (2.29) | < 0.001 | 104 (2.42) | 26 (2.00) | < 0.001 | 132 (2.28) | 33 (1.93) | < 0.001 |
18.50-23.99 | 10988 (55.80) | 869 (43.00) | 16050 (60.48) | 1211 (46.97) | 2497 (58.03) | 694 (53.30) | 3376 (58.41) | 852 (49.94) | |||||
24-27.99 | 6899 (35.04) | 773 (38.25) | 8586 (32.35) | 1011 (39.22) | 1502 (34.91) | 444 (34.10) | 1971 (34.10) | 650 (38.10) | |||||
≥ 28.00 | 1263 (6.41) | 333 (16.48) | 1257 (4.74) | 297 (11.52) | 200 (4.65) | 138 (10.60) | 301 (5.21) | 171 (10.02) | |||||
WC (cm) | ≤ 80 | 10038 (50.98) | 762 (37.70) | < 0.001 | 14239 (53.66) | 1139 (44.18) | < 0.001 | 2225 (51.71) | 650 (49.92) | 0.259 | 2951 (51.06) | 830 (48.65) | 0.081 |
> 80 | 9653 (49.02) | 1259 (62.30) | 12298 (46.34) | 1439 (55.82) | 2078 (48.29) | 652 (50.08) | 2829 (48.94) | 876 (51.35) | |||||
Dietary habit | |||||||||||||
Fresh vegetables (kg/wk) | 0 | 79 (0.40) | 20 (0.99) | < 0.001 | 151 (0.57) | 18 (0.70) | < 0.001 | 22 (0.51) | 10 (0.77) | < 0.001 | 16 (0.28) | 14 (0.82) | < 0.001 |
< 2.5 | 8839 (44.89) | 1265 (62.59) | 14860 (56.00) | 1727 (66.99) | 2386 (55.45) | 952 (73.12) | 3329 (57.60) | 1188 (69.64) | |||||
≥ 2.5 | 10773 (54.71) | 736 (36.42) | 11526 (43.43) | 833 (32.31) | 1895 (44.04) | 340 (26.11) | 2435 (42.13) | 504 (29.54) | |||||
Fresh fruits (kg/wk) | 0 | 472 (2.40) | 71 (3.51) | < 0.001 | 457 (1.72) | 228 (8.84) | < 0.001 | 61 (1.42) | 39 (3.00) | < 0.001 | 62 (1.07) | 51 (2.99) | < 0.001 |
< 1.25 | 12209 (62.00) | 1496 (74.02) | 17073 (64.34) | 1793 (69.55) | 2722 (63.26) | 998 (76.65) | 4080 (70.59) | 1312 (76.91) | |||||
≥ 1.25 | 7010 (35.60) | 454 (22.46) | 9007 (33.94) | 557 (21.61) | 1520 (35.32) | 265 (20.35) | 1638 (28.34) | 343 (20.11) | |||||
Meat (kg/wk) | 0 | 641 (3.26) | 103 (5.10) | < 0.001 | 427 (1.61) | 49 (1.90) | < 0.001 | 54 (1.25) | 17 (1.31) | 0.843 | 64 (1.11) | 22 (1.29) | < 0.001 |
< 0.35 | 12542 (63.69) | 1047 (51.81) | 17393 (65.54) | 1179 (45.73) | 2781 (64.63) | 830 (63.75) | 4123 (71.33) | 1075 (63.01) | |||||
≥ 0.35 | 6508 (33.05) | 871 (43.10) | 8717 (32.85) | 1350 (52.37) | 1468 (34.12) | 455 (34.95) | 1593 (27.56) | 609 (35.70) | |||||
Coarse grains (kg/wk) | 0 | 1090 (5.54) | 203 (10.04) | < 0.001 | 1309 (4.93) | 437 (16.95) | < 0.001 | 296 (6.88) | 227 (17.43) | < 0.001 | 164 (2.84) | 133 (7.80) | < 0.001 |
< 0.5 | 13140 (66.73) | 1535 (75.95) | 18394 (69.31) | 1714 (66.49) | 2937 (68.25) | 947 (72.73) | 4277 (74.00) | 1314 (77.02) | |||||
≥ 0.5 | 5461 (27.73) | 283 (14.00) | 6834 (25.75) | 427 (16.56) | 1070 (24.87) | 128 (9.83) | 1339 (23.17) | 259 (15.18) | |||||
Taste | Double salt | 2492 (12.66) | 522 (25.83) | < 0.001 | 3073 (11.58) | 985 (38.21) | < 0.001 | 965 (22.43) | 754 (57.91) | < 0.001 | 803 (13.89) | 620 (36.34) | < 0.001 |
Moderate | 13520 (68.66) | 1183 (58.54) | 19823 (74.70) | 1239 (48.06) | 2905 (67.51) | 482 (37.02) | 4351 (75.28) | 882 (51.70) | |||||
Light | 3679 (18.68) | 316 (15.64) | 3641 (13.72) | 354 (13.73) | 433 (10.06) | 66 (5.07) | 626 (10.83) | 204 (11.96) | |||||
Oil consumption | High oil consumption | 1597 (8.11) | 536 (26.52) | < 0.001 | 2311 (8.71) | 981 (38.05) | < 0.001 | 604 (14.04) | 585 (44.93) | < 0.001 | 620 (10.73) | 639 (37.46) | < 0.001 |
Moderate oil consumption | 15323 (77.82) | 1271 (62.89) | 21196 (79.87) | 1364 (52.91) | 3318 (77.11) | 657 (50.46) | 4699 (81.30) | 921 (53.99) | |||||
Low oil consumption | 2771 (14.07) | 214 (10.59) | 3030 (11.42) | 233 (9.04) | 381 (8.85) | 60 (4.61) | 461 (7.98) | 146 (8.56) | |||||
Pickled and sun-cured food | Never | 1805 (9.17) | 161 (7.97) | < 0.001 | 1964 (7.40) | 138 (5.35) | < 0.001 | 290 (6.74) | 48 (3.69) | < 0.001 | 550 (9.52) | 63 (3.69) | < 0.001 |
Sometimes | 15686 (79.66) | 1385 (68.53) | 20894 (78.74) | 1430 (55.47) | 3114 (72.37) | 561 (43.09) | 4480 (77.51) | 1098 (64.36) | |||||
Often | 2200 (11.17) | 475 (23.50) | 3679 (13.86) | 1010 (39.18) | 899 (20.89) | 693 (53.23) | 750 (12.98) | 545 (31.95) | |||||
Living habit | |||||||||||||
Smoking | No | 14265 (72.44) | 1358 (67.19) | < 0.001 | 21116 (79.57) | 1682 (65.24) | < 0.001 | 3180 (73.90) | 810 (62.21) | < 0.001 | 4338 (75.05) | 1047 (61.37) | < 0.001 |
Yes | 4401 (22.35) | 521 (25.78) | 4445 (16.75) | 724 (28.08) | 839 (19.50) | 364 (27.96) | 1127 (19.50) | 482 (28.25) | |||||
Quit smoking | 1025 (5.21) | 142 (7.03) | 976 (3.68) | 172 (6.67) | 284 (6.60) | 128 (9.83) | 315 (5.45) | 177 (10.38) | |||||
Alcohol drinking | No | 14190 (72.06) | 1287 (63.68) | < 0.001 | 20691 (77.97) | 1523 (59.08) | < 0.001 | 3148 (73.16) | 789 (60.60) | < 0.001 | 4352 (75.29) | 986 (57.80) | < 0.001 |
Yes | 4885 (24.81) | 647 (32.01) | 5177 (19.51) | 903 (35.03) | 997 (23.17) | 445 (34.18) | 1247 (21.57) | 619 (36.28) | |||||
Quit drinking | 616 (3.13) | 87 (4.30) | 668 (2.52) | 152 (5.90) | 158 (3.67) | 68 (5.22) | 181 (3.13) | 101 (5.92) | |||||
Physical activities | No | 9188 (46.66) | 1094 (54.13) | < 0.001 | 11132 (41.95) | 1477 (57.29) | < 0.001 | 1917 (44.55) | 845 (64.90) | < 0.001 | 2865 (49.57) | 917 (53.75) | 0.002 |
Yes | 10503 (53.34) | 927 (45.87) | 15405 (58.05) | 1101 (42.71) | 2386 (55.45) | 457 (35.10) | 2915 (50.43) | 789 (46.25) | |||||
Family history | No | 14438 (73.32) | 1032 (51.06) | < 0.001 | 21524 (81.11) | 1174 (45.54) | < 0.001 | 2960 (68.79) | 527 (40.48) | < 0.001 | 3660 (63.32) | 486 (28.49) | < 0.001 |
Yes | 5253 (26.68) | 989 (48.94) | 5013 (18.89) | 1404 (54.46) | 1343 (31.21) | 775 (59.52) | 2120 (36.68) | 1220 (71.51) |
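The χ2 comparisons in Table 1 can be illustrated with the 2014 sex distribution (male: 8514 non-high-risk vs 888 high-risk; female: 11177 vs 1133). The sketch below computes a Pearson χ2 statistic for a 2 × 2 table without continuity correction; whether the authors applied a correction is not stated, so this is an assumption, and the function name `chi_square_2x2` is hypothetical. For one degree of freedom, the P value equals erfc(sqrt(χ2/2)).

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-square test for the 2 x 2 table [[a, b], [c, d]].

    No continuity correction is applied (an assumption for this sketch).
    Returns the statistic and the P value for 1 degree of freedom.
    """
    n = a + b + c + d
    statistic = n * (a * d - b * c) ** 2 / (
        (a + b) * (c + d) * (a + c) * (b + d)
    )
    # Survival function of the chi-square distribution with 1 df
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# 2014 sex row of Table 1: rows = male/female, columns = non-high-risk/high-risk
statistic, p_value = chi_square_2x2(8514, 888, 11177, 1133)
```

For these counts the statistic is about 0.37 and the P value about 0.545, in line with the value listed for sex in 2014.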
A cluster analysis of the dietary and living habits of all the participants was performed to categorize them as having a healthy or unhealthy lifestyle. A healthy lifestyle was considered an intake of fresh vegetables, fresh fruits, and coarse grains and participation in physical activities. An unhealthy lifestyle was considered an intake of meat and pickled and sun-cured foods, high oil consumption, a double-salt taste, smoking, and alcohol consumption. The analysis revealed that 39134 (61.2%) participants had a healthy lifestyle and 24783 (38.8%) had an unhealthy lifestyle. All of the participants were then randomly divided into a training set (n = 44743) and a validation set (n = 19175) in a 7:3 ratio.
Table 2 shows the results of univariate and multivariate analyses of the training set. The univariate analysis indicated that BMI and WC were significantly associated with a high risk of CRC, as were lifestyle and family history. The backward Wald logistic regression model, after excluding variables with P > 0.05, demonstrated that there were five predictors associated with a high risk of CRC: age, BMI, WC, lifestyle, and family history (Figure 1).
Variables | Univariate OR | Univariate 95%CI | Univariate P value | Multivariate OR | Multivariate 95%CI | Multivariate P value |
Age (≤ 65 yr vs > 65 yr) | 1.03 | 1.02-1.03 | < 0.001 | 1.03 | 1.02-1.03 | < 0.001 |
Sex (male vs female) | 0.98 | 0.93-1.04 | 0.540 | |||
Ethnicity (Han nationality vs other) | 0.90 | 0.48-1.67 | 0.731 | |||
BMI (< 18.50 kg/m2, 18.50-23.99 kg/m2, 24-27.99 kg/m2 vs ≥ 28.00 kg/m2) | 1.07 | 1.06-1.08 | < 0.001 | 1.05 | 1.04-1.06 | < 0.001 |
WC (≤ 80 cm vs > 80 cm) | 1.03 | 1.02-1.03 | < 0.001 | 1.02 | 1.01-1.02 | < 0.001 |
Family history (no vs yes) | 4.28 | 4.04-4.54 | < 0.001 | 4.30 | 4.04-4.56 | < 0.001 |
Lifestyle (unhealthy lifestyle vs healthy lifestyle) | 0.45 | 0.42-0.48 | < 0.001 | 0.44 | 0.41-0.47 | < 0.001 |
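The relation between the odds ratios in Table 2 and a predicted probability can be sketched as follows. In a logistic regression model, each coefficient is the natural logarithm of the corresponding OR, and the predicted probability is the logistic transform of the linear predictor. The study does not report the model intercept or the exact coding of each variable, so `HYPOTHETICAL_INTERCEPT`, the per-unit coding of age, BMI, and WC, and the example participant are assumptions for illustration only; the resulting probabilities are not estimates from the published model.

```python
import math

# Multivariate odds ratios as listed in Table 2
MULTIVARIATE_OR = {
    "age": 1.03,
    "bmi": 1.05,
    "wc": 1.02,
    "family_history": 4.30,
    "healthy_lifestyle": 0.44,
}

# Not reported in the study; chosen arbitrarily for this sketch
HYPOTHETICAL_INTERCEPT = -6.0


def to_log_odds(odds_ratios):
    """Convert odds ratios to logistic regression coefficients (beta = ln OR)."""
    return {name: math.log(value) for name, value in odds_ratios.items()}


def predicted_probability(intercept, coefficients, values):
    """Logistic transform of intercept + sum(beta_i * x_i)."""
    linear_predictor = intercept + sum(
        coefficients[name] * values[name] for name in coefficients
    )
    return 1.0 / (1.0 + math.exp(-linear_predictor))


coefficients = to_log_odds(MULTIVARIATE_OR)

# Made-up participant; variable coding is assumed, not taken from the study
person = {"age": 60, "bmi": 25.0, "wc": 85, "family_history": 0, "healthy_lifestyle": 1}
p_without = predicted_probability(HYPOTHETICAL_INTERCEPT, coefficients, person)
p_with = predicted_probability(
    HYPOTHETICAL_INTERCEPT, coefficients, {**person, "family_history": 1}
)
```

Whatever intercept is chosen, switching family history from no to yes multiplies the odds p/(1 - p) by 4.30, which is why family history spans the longest axis among the binary predictors in a nomogram built from such a model.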
A nomogram for the early screening of individuals at high risk of CRC was constructed based on a logistic regression model (Figure 1). To estimate the probability of high-risk CRC individuals being detected during early screening, each predictor value was assigned a certain number of points by drawing a vertical line up to the points axis. The sum of the points for all variables corresponded to the probability of an individual being identified during early screening as having a high risk of CRC. Finally, we analyzed the 500 resamples using the bootstrap method and deter
Our findings indicated that age, BMI, WC, family history, and lifestyle significantly contributed to the prediction of individuals at high risk for CRC. Therefore, these variables were used to construct and validate an early screening model for individuals at high risk for CRC, and the model demonstrated potential clinical utility.
Compared with the current study, a multicenter study combining genetic and environmental risk scores for risk stratification of early-onset CRC[16] placed more emphasis on scores for polygene variants, environmental factors, and lifestyle. The results showed that an increase in the lifestyle score was associated with an increase in the relative risk of early-onset CRC, in line with our findings, which demonstrated a significant association between lifestyle and high risk for CRC. However, unlike this multicenter study, our study incorporated dietary habits in addition to living habits and family history to assess lifestyle. We found five factors to be significantly associated with high risk for CRC, two of which were non-modifiable variables (age and family history), in line with previous findings[22,26]. A healthy lifestyle had an OR of 0.44 (95%CI: 0.41-0.47) in individuals at high risk of CRC, indicating that a healthy lifestyle is a protective factor.
A similar study found that in addition to a high intake of red and processed meat, sex, ethnicity, sedentary lifestyle, and inflammatory bowel disease were associated with a low, intermediate, and high risk of early-onset CRC[27]. The study was a meta-analysis of the literature retrieved from PubMed and Web of Science. Eighteen articles were screened for inclusion, and 10 were ultimately reviewed to afford baseline data on the case group (n = 32843) and control group (n = 25806408), which were used to construct a risk assessment model. Risk factors associated with early-onset CRC in the baseline data were screened by meta-analysis, and those with P < 0.05 were included in the final model. The model differed from ours in that it categorized the population into low-, intermediate-, and high-risk groups based on risk trend scores. Moreover, their participants were < 50 years of age, thus predominantly comprising young adults. In contrast, the current study consisted of a wider range of age groups, although primarily focusing on those aged 40-74 years. It aimed to develop a prediction model that would enable accurate primary screening of groups at high risk for CRC.
One of the major strengths of our study was that it integrated multiple factors including age, sex, ethnicity, BMI, WC, healthy and unhealthy lifestyle, and family history. Second, our sample size, which was obtained from the population in Ningbo, Zhejiang, China, was large. Third, compared with other models, our model enabled better and earlier screening of CRC in high-risk populations in Ningbo. In addition, previous studies have demonstrated that CRC development can be prevented by altering the modifiable risk factors[28]. In our model, all of the risk factors, except age and family history, could be modified by improving lifestyle habits, changing dietary habits, and increasing physical activities. Age and family history are non-modifiable variables and must be carefully examined in early screening of CRC, especially because they are inextricably linked to CRC development[22,26]. Therefore, the combination of modifiable and non-modifiable risk factors in our model can help facilitate early screening of individuals at high risk for CRC. Furthermore, our model makes it easier to screen patients for CRC risk than colonoscopy, which can be painful and complex. The Colorectal Cancer Early Screening Model can also be convenient for clinicians and may help them improve the rate of clinical diagnosis and reduce the rate of underdiagnosis of CRC in high-risk populations.
However, this study had some limitations. First, given that the study population was from a single region in China (Ningbo, Zhejiang), the model lacked generalizability. China is a large country with differences in living environments, lifestyles and habits, diets, and cultures. Second, although we examined four risk factors, there are many other factors associated with dietary habits, living habits, and family history, and our study design could not incorporate them all. Therefore, future studies should include more variables to further validate our model. Third, our model did not include emotional and psychological factors. It is worth noting that a relationship between mental trauma and CRC has been reported in previous studies[29]. Although the exact mechanism underlying this relationship remains unclear, it may be because great mental trauma leads to neurological dysfunction, resulting in bowel disorders and stress ulcers, ultimately leading to the development of malignant intestinal lesions. Alternatively, excessive mental stress weakens the immune system, thereby increasing the susceptibility to disorders of the intestinal flora and the risk of developing CRC[30]. Therefore, future studies must examine the effect of emotional and psychological factors on CRC risk.
This study showed that older age, a high BMI, a large WC, an unhealthy lifestyle, and a family history of CRC are significantly associated with a high risk of CRC. A CRC risk-prediction model was also developed for accurate primary screening of groups with a high risk of CRC. This model could enable clinicians to develop early CRC screening strategies and may support public health campaigns for reducing CRC deaths and disease burden.
Establishing an early screening model for individuals at high risk of colorectal cancer (CRC) may offer a new method for early screening. Unlike traditional invasive screening, it is noninvasive, simple, and rapid. Although there are many studies on early screening models for high risk of CRC, large-sample studies and clinical validation are still lacking. Our study focused on collecting information from the general population who attended annual health checks. This is also the first large-sample study of early screening for CRC in Ningbo, China, and it was part of the national early CRC screening program.
Constructing an early screening model for CRC high-risk groups from basic information such as lifestyle has gradually become a major topic in CRC early screening research. It is mainly aimed at simplifying the relatively complicated methods currently used to screen groups at high risk of CRC.
This study aimed to establish an efficient early screening model to identify individuals at high risk of CRC, and reduce CRC prevalence and mortality.
This retrospective study included data from the health screening population in Ningbo Hospital, China from 2014 to 2017 to analyze the basic information, living habits and dietary habits, so that the early screening model of CRC was constructed and conducted for internal verification.
Retrospective analysis of 63918 individuals eligible for health screening, comprising studies with seven variables. The area under the curve was 0.734 [95% confidence interval (CI): 0.723-0.745] for the final validation set receiver operating characteristic curve (ROC) and 0.735 (95%CI: 0.728-0.742) for the training set ROC curve. The calibration curve demonstrated a high correlation between the CRC high-risk population predicted by the nomogram model and the actual CRC high-risk population.
This study developed an early screening model for individuals at high risk of CRC based on basic population information, lifestyle, and family history.
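To illustrate how a logistic model of this kind combines its predictors, the sketch below multiplies the odds ratios reported in the abstract to compare one hypothetical individual with another. It is a simplified illustration, not the authors' nomogram. The model intercept is not given here, so only relative odds are computed and no absolute risk. The units (per year of age, per kg/m² of BMI, per cm of waist circumference) and the 0/1 coding of lifestyle and family history are assumptions, as are the function name `relative_odds` and both example individuals.

```python
import math

# Odds ratios for the five significant predictors, as reported in the abstract.
# ASSUMPTION: continuous predictors are per one-unit increase (year, kg/m^2, cm);
# healthy_lifestyle and family_history are coded 0 (no) or 1 (yes).
ODDS_RATIOS = {
    "age": 1.03,
    "bmi": 1.07,
    "wc": 1.03,
    "healthy_lifestyle": 0.45,
    "family_history": 4.28,
}


def relative_odds(person, reference):
    """Odds of being classified high risk for `person` relative to `reference`.

    In a logistic model the log-odds is linear in the predictors, so the
    log odds ratio between two people is the sum of log(OR) times the
    difference in each predictor.
    """
    log_ratio = sum(
        math.log(ODDS_RATIOS[name]) * (person[name] - reference[name])
        for name in ODDS_RATIOS
    )
    return math.exp(log_ratio)


# Hypothetical individuals, for illustration only.
reference = {"age": 50, "bmi": 22.0, "wc": 80.0,
             "healthy_lifestyle": 1, "family_history": 0}
person = {"age": 60, "bmi": 26.0, "wc": 90.0,
          "healthy_lifestyle": 0, "family_history": 1}

print(f"Relative odds vs. reference: {relative_odds(person, reference):.2f}")
```

Because the contributions add on the log-odds scale, a nomogram can assign each predictor a point score and sum the points, which is what makes such models simple to apply by hand.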
This model has the potential to improve primary screening by identifying groups at high risk of developing CRC.
The authors would like to thank the participants and participating doctors at Ningbo No. 2 Hospital; Ningbo Institute of Life and Health Science, University of Chinese Academy of Sciences; and students of Medical College of Soochow University, all of whom were staff members of this study.
Provenance and peer review: Unsolicited article; Externally peer reviewed.
Peer-review model: Single blind
Specialty type: Gastroenterology and hepatology
Country/Territory of origin: China
Peer-review report’s scientific quality classification
Grade A (Excellent): 0
Grade B (Very good): B
Grade C (Good): C
Grade D (Fair): 0
Grade E (Poor): 0
P-Reviewer: Alvarez-Bañuelos MT, Mexico; Bordonaro M, United States
S-Editor: Gong ZM
L-Editor: A
P-Editor: Cai YX