Published online Mar 21, 2026. doi: 10.3748/wjg.v32.i11.116220
Revised: December 4, 2025
Accepted: January 8, 2026
Processing time: 131 Days and 6.2 Hours
Acute suppurative cholecystitis (ASC) is a critical stage in the progression of acute cholecystitis. ASC indicates an escalation of local inflammation in the gallbladder from mild to significant. Once this stage is reached, the surgical difficulty and mortality of laparoscopic cholecystectomy increase significantly.
To develop a model integrating clinical characteristics and computed tomography (CT) radiomics features to improve the prediction of ASC.
Patients diagnosed with acute cholecystitis were retrospectively recruited from three independent centers. Patients were grouped into purulent and non-purulent phases based on the results of percutaneous cholecystostomy or laparoscopic cholecystectomy. A clinical model was established from visually assessed radiologic features combined with clinical information. Radiomics features were extracted from CT images, and a radiomics model was constructed from these features. A fusion model was then built using a stacking ensemble strategy to integrate the clinical and radiomics models.
A total of 311 patients were included (mean ± SD age, 66 ± 15 years; 154 men; center 1, training and validation dataset; centers 2 and 3, test dataset; training dataset, n = 150; validation dataset, n = 61; test dataset, n = 100). Model performance was evaluated with the area under the receiver operating characteristic curve (AUC). SHapley Additive exPlanations (SHAP) were used to assess the importance of radiomics features. In the test dataset, the fusion model predicted ASC better than the clinical model and the radiomics model (AUC = 0.82 vs 0.75 vs 0.76, P < 0.05), with similar specificity (83.1% vs 87.7% vs 73.9%) and higher sensitivity (71.4% vs 45.7% vs 62.9%). In addition, SHAP analysis identified logarithm glszm ZoneEntropy as the main predictor for the radiomics model.
The clinical-radiomics model constructed based on the stacking ensemble strategy could significantly improve ASC predictive accuracy.
Core Tip: This multi-center study developed and validated a fusion model to preoperatively predict acute suppurative cholecystitis (ASC). By integrating clinical characteristics and computed tomography radiomics features using a stacking ensemble strategy, the fusion model achieved an area under the receiver operating characteristic curve (AUC) of 0.82 on the external test dataset, significantly outperforming the clinical (AUC = 0.75) and radiomics (AUC = 0.76) models alone. It also showed higher sensitivity (71.4%) while maintaining high specificity (83.1%). The study concludes that this clinical-radiomics model can significantly improve the predictive accuracy for ASC, aiding in better surgical planning and risk assessment.
- Citation: Chen GD, Chen BQ, Ge YH, Liu JL, Cheng KW, Xiao HW, Long HY, Xie F. Explainable machine learning model integrating clinical and radiomic features for predicting acute suppurative cholecystitis. World J Gastroenterol 2026; 32(11): 116220
- URL: https://www.wjgnet.com/1007-9327/full/v32/i11/116220.htm
- DOI: https://dx.doi.org/10.3748/wjg.v32.i11.116220
The purulent phase is a critical stage in the progression of acute cholecystitis (AC). It indicates an escalation of local inflammation in the gallbladder (GB) from mild to significant, and the severity grade also escalates from mild to moderate[1]. The surgical difficulty, the rate of conversion to open surgery, and the 30-day mortality of laparoscopic cholecystectomy (LC) increase significantly[2-5]. Given this, biliary drainage should be considered a more important treatment option, rather than one reserved for patients who cannot withstand surgery. However, high-level evidence comparing the efficacy of the two approaches in the purulent phase is lacking. This is because the diagnosis of acute suppurative cholecystitis (ASC) is usually based on intraoperative observation or biliary drainage, and there are currently no effective noninvasive methods to characterize ASC prior to these interventions.
Ultrasound[6,7], computed tomography (CT)[8], and magnetic resonance imaging[9] can help diagnose the purulent phase, in which pus or purulent bile and pericholecystic abscess are direct manifestations of ASC. However, these manifestations are not specific on any of these imaging modalities. Pus within the GB resembles sludge[9], making the two difficult to differentiate, while the appearance of pericholecystic abscess is complex and varied. AC progresses in three distinct phases after cystic duct obstruction[10]. The first phase is characterized by inflammation and manifests as GB wall congestion and edema. The second phase is characterized by hemorrhage and necrosis of the GB wall, which may lead to perforation at the site of ischemic gangrene. The third phase is the purulent phase. This indicates that suppuration may sometimes coexist with necrosis and perforation. Therefore, describing the imaging features of ASC is challenging.
Non-enhanced CT is commonly used as the initial diagnostic tool for AC[11,12] and to assess complications of AC[8]. It also serves as the first-line diagnostic option for acute abdominal pain[13-15]. Therefore, improving its efficacy in diagnosing ASC will bring wide-ranging benefits and cost savings. However, even when combined with laboratory parameters, current diagnostic efficacy remains unsatisfactory[16].
Radiomics is likely to change this by enabling high-throughput mining of quantitative image features from standard-of-care medical imaging, capturing imaging characteristics that are difficult or impossible to characterize by the human eye[17-20]. Thus, this study aimed to evaluate the diagnostic performance of non-enhanced CT radiomics in predicting ASC, using samples obtained during percutaneous cholecystostomy (PC) and LC as a reference standard.
This retrospective multicenter study was approved by the local institutional review boards of the People’s Hospital of Liaoning Province (No. 2023K047), the Institutional Review Board of Panjin Liaohe Oilfield Gem Flower Hospital (No. LLSC-2025-LW-01), and the Institutional Review Board of Nanchong Central Hospital (No. 2025-125), and the re
AC patients who underwent their first PC or LC between January 2020 and January 2023 were considered for inclusion in this study. The diagnosis of AC relied on clinical manifestations and radiological studies[16]. Figure 1 shows the inclusion and exclusion process. The exclusion criteria were: (1) Concurrent or secondary pancreatitis and pancreatic trauma; (2) Bloody, mucinous, or unclassifiable bile; (3) Lack of CT images and laboratory values within 48 hours before PC or LC; and (4) Poor image quality. A total of 823 patients were initially screened at the three participating centers. First, 106 patients were excluded due to concurrent or secondary pancreatitis and pancreatic trauma. Next, 20 patients were excluded because their bile appeared bloody, mucinous, or unclassifiable. Subsequently, 304 patients were excluded due to lack of CT images and laboratory values within 48 hours before PC or LC. Finally, 82 patients were excluded due to poor image quality. After these exclusions, 311 patients were ultimately included in the study. Among them, 211 patients from center 1 were randomly divided into the training dataset (n = 150) and validation dataset (n = 61). The remaining 100 patients from centers 2 (n = 46) and 3 (n = 54) comprised the independent test dataset. The training dataset was used for model construction and all parameter optimization via internal cross-validation. The validation dataset was used to provide an internal, independent assessment of the final selected model. The test dataset from centers 2 and 3 was completely held out during model development and used exclusively as an independent external validation cohort for final performance assessment of the locked model.
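The patient flow described above can be checked with simple arithmetic. The following is a minimal sketch that uses only the counts reported in the text; the variable names are illustrative and not taken from the study.

```python
# Counts as reported in the text; names are illustrative.
screened = 823
exclusions = [
    ("concurrent or secondary pancreatitis and pancreatic trauma", 106),
    ("bloody, mucinous, or unclassifiable bile", 20),
    ("no CT images and laboratory values within 48 h before PC or LC", 304),
    ("poor image quality", 82),
]

# Apply the exclusions in the order given in the text.
remaining = screened
for reason, n_excluded in exclusions:
    remaining -= n_excluded
    print(f"excluded {n_excluded:>3} ({reason}); {remaining} remain")

# Dataset sizes: center 1 is split into training and validation,
# centers 2 and 3 together form the external test dataset.
datasets = {"training": 150, "validation": 61, "test": 46 + 54}
total_included = sum(datasets.values())
print(f"included: {total_included}")
```

The running total after the four exclusion steps and the sum of the three dataset sizes should both equal the 311 patients reported.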
Interventional radiologists performed PC and observed the bile sample obtained intraoperatively. General surgeons performed LC and observed the intraoperative GB specimens. The diagnostic criterion for ASC was the observation of purulent bile samples and/or pericholecystic abscesses during PC/LC.
The clinical characteristics included sex, age, body mass index (BMI), and the most recent laboratory indicators and CT imaging features within 48 hours before PC/LC. Detailed CT scan parameters used by each center are provided in Supplementary material (scan parameters of CT). Two radiologists independently documented every radiologic feature. The details of the features are described in Supplementary material (radiologic feature analysis of CT). Upon completion, any disagreement on the features of a case was jointly reviewed, and the final classification was made by consensus of two other senior radiologists. For all image reviewing, radiologists were blinded to clinical information and pathology results.
To prevent model overfitting from excessive variables while capturing key aspects of the systemic inflammatory response, biliary obstruction, and local anatomical changes of the GB, imaging and clinical features were selected based on the core pathophysiological mechanisms of AC and the 2018 Tokyo Guidelines. The final set of features included in the analysis was: Age, sex, BMI, white blood cells (WBC), neutrophil granulocytes (NE), alanine aminotransferase, serum total bilirubin, unconjugated bilirubin, stones in the cystic duct or GB neck, GB stones, stratification of bile in the lumen, gas within the GB lumen, necrosis of the GB wall, pericholecystic exudation or fluid, and GB wall thickness.
An abdominal radiologist (reader 1) manually drew volume-of-interest (VOI) regions layer by layer around the GB on non-enhanced CT images using 3D Slicer software (version 5.2.2; http://www.slicer.org/). The method is detailed in Supplementary material (image segmentation). After 1 month, 20 patients were randomly selected from the training dataset. Their VOI regions were resegmented by reader 1 and by another radiologist (reader 2) using the same method to construct two resegmentation datasets. Radiomics features were extracted using PyRadiomics software (version 3.0.1; pyradiomics community). Before feature extraction, segmented images were preprocessed to minimize the influence of contrast and brightness variations on texture features: Images were spatially resampled to 3 mm × 3 mm × 3 mm using the SimpleITK sitkNearestNeighbor interpolator; signal intensity values were discretized with a bin width of 25 and relative intensity rescaling. Radiomic features were extracted from both the original images and filtered versions processed with various algorithms, including wavelet (eight directions), logarithm, square, local binary pattern-3D (three variants), gradient, exponential, and square root filters. Feature categories included first-order statistics, shape features (extracted only from the original images), and texture features. A total of 1595 radiomic features were generated per patient. Intra- and interobserver reproducibility was evaluated using correlation coefficients. Although some features showed low correlation coefficients, they were retained due to their potential biological relevance. The entire feature extraction workflow is illustrated in Figure 2.
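The preprocessing and filter choices described above map onto PyRadiomics settings roughly as follows. This is a sketch under assumptions, not the authors' configuration: only the resampling spacing, interpolator, bin width, and filter list come from the text. How "relative intensity rescaling" was configured is not stated, so it is deliberately left out, and the file paths in the usage note are hypothetical.

```python
# Settings stated in the text (PyRadiomics setting names).
settings = {
    "resampledPixelSpacing": [3, 3, 3],     # 3 mm x 3 mm x 3 mm
    "interpolator": "sitkNearestNeighbor",  # SimpleITK interpolator name
    "binWidth": 25,                         # gray-level discretization
}

# PyRadiomics image-type names for the original image and the listed filters.
image_types = [
    "Original", "Wavelet", "Logarithm", "Square",
    "LBP3D", "Gradient", "Exponential", "SquareRoot",
]


def build_extractor(settings, image_types):
    """Return a configured extractor; requires the pyradiomics package."""
    # Imported lazily so the configuration can be inspected without pyradiomics.
    from radiomics import featureextractor

    extractor = featureextractor.RadiomicsFeatureExtractor(**settings)
    extractor.disableAllImageTypes()
    for name in image_types:
        extractor.enableImageTypeByName(name)
    return extractor
```

With pyradiomics installed, a call such as `build_extractor(settings, image_types).execute("ct.nrrd", "gb_voi.nrrd")` would return the feature vector for one patient, where both file names are placeholders.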
After standardizing continuous variables in the training dataset (Z-score), we used a univariate t-test or rank sum test for continuous variables and a χ2 test for categorical variables to preliminarily screen for statistically significant variables (P ≤ 0.05). The screened variables were then entered into a multivariate logistic regression model, and forward stepwise logistic regression was used to determine the variables for model construction.
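Z-score standardization can be sketched as below. The text states only that continuous variables in the training dataset were standardized; reusing the training mean and standard deviation for the other datasets is a common convention assumed here, and the example values are made up.

```python
from statistics import mean, stdev


def fit_zscore(train_values):
    """Estimate (mean, sample SD) from the training values only."""
    return mean(train_values), stdev(train_values)


def apply_zscore(values, params):
    """Standardize values with previously fitted parameters."""
    mu, sd = params
    return [(v - mu) / sd for v in values]


# Hypothetical NE (%) values, for illustration only.
train_ne = [80.0, 85.0, 90.0]
params = fit_zscore(train_ne)
train_z = apply_zscore(train_ne, params)
new_z = apply_zscore([95.0], params)  # a new patient, scaled with training statistics
```

Note that `statistics.stdev` uses the sample (n - 1) denominator; some libraries default to the population denominator, which gives slightly different values in small samples.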
The radiomics features extracted from the training dataset were standardized (Z-score) and preliminarily screened using a univariate t-test (P < 0.01) to remove insignificant variables. Then, redundant features (|ρ| ≥ 0.9) were removed using Spearman correlation analysis. Finally, key radiomic features for model construction were selected using least absolute shrinkage and selection operator (LASSO) regression, using 5-fold cross-validation with area under the receiver operating characteristic curve (AUC) as the performance metric to select the optimal regularization parameter. SHapley Additive exPlanations (SHAP) values were subsequently calculated based on the final LASSO-logistic regression model to interpret the contribution of each selected feature to model predictions.
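The redundancy step (removing features with |ρ| ≥ 0.9) can be sketched in plain Python. Which member of a correlated pair is dropped is not stated in the text; the greedy rule below (keep the feature that appears first) is an assumption, and the feature names are placeholders.

```python
from statistics import mean


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(a, b):
    """Pearson correlation; undefined (division by zero) for constant input."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def spearman(a, b):
    """Spearman rho is the Pearson correlation of the ranks."""
    return pearson(ranks(a), ranks(b))


def drop_redundant(features, threshold=0.9):
    """Keep a feature only if |rho| < threshold against every feature kept so far."""
    kept = []
    for name, values in features.items():
        if all(abs(spearman(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Placeholder features: "b" is a monotone transform of "a", so rho(a, b) = 1.
features = {
    "a": [1, 2, 3, 4, 5],
    "b": [2, 4, 6, 8, 11],
    "c": [5, 1, 4, 2, 3],
}
kept = drop_redundant(features)
```

In this toy input, "b" is removed because it is perfectly rank-correlated with "a", while "c" is retained.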
This study employs a stacking ensemble strategy, using the clinical model and the radiomics model as the base learners. The construction methods for the two base learners mirror those detailed in the preceding clinical model and radiomics model construction sections. Both base learners utilize logistic regression and output probabilities. To train the meta-learner, we implemented a 5-fold out-of-fold (OOF) prediction strategy on the training cohort. This strategy requires that, in each iteration of the 5-fold cross-validation, the base learners follow their respective model construction procedures and generate unbiased predictions only for the data subset left out of training in the current fold. These OOF predicted probabilities are then concatenated to form the complete fused feature matrix. The secondary model (meta-learner) is a logistic regression model, which learns how to optimally combine the predictions of the base models by fitting the OOF fused feature matrix. Finally, the fused model is saved for subsequent validation and application.
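The out-of-fold stacking procedure can be illustrated with a small, dependency-free sketch. It is a simplification under stated assumptions: the base learners here are plain logistic regressions fitted by gradient descent, without the per-fold feature selection the text describes; the folds are random rather than stratified; and refitting the base learners on the full training data before prediction is a common convention that the text does not confirm. All function names are hypothetical.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=500):
    """Logistic regression by batch gradient descent; returns (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(model, X):
    w, b = model
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]


def oof_predictions(X, y, k=5, seed=0):
    """Each sample is predicted by a model that never saw it during fitting."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    oof = [None] * len(X)
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        model = fit_logistic([X[i] for i in train], [y[i] for i in train])
        for i, p in zip(held_out, predict_proba(model, [X[i] for i in held_out])):
            oof[i] = p
    return oof


def fit_stacking(X_clin, X_rad, y, k=5):
    """Fit the meta-learner on OOF probabilities, then refit base learners on all data."""
    meta_X = [[c, r] for c, r in zip(oof_predictions(X_clin, y, k),
                                     oof_predictions(X_rad, y, k))]
    meta = fit_logistic(meta_X, y)
    return fit_logistic(X_clin, y), fit_logistic(X_rad, y), meta


def predict_stacking(models, X_clin, X_rad):
    base_clin, base_rad, meta = models
    fused = [[c, r] for c, r in zip(predict_proba(base_clin, X_clin),
                                    predict_proba(base_rad, X_rad))]
    return predict_proba(meta, fused)
```

The key property is in `oof_predictions`: the meta-learner is trained only on probabilities that the base learners produced for samples they were not fitted on, which limits leakage of training labels into the fused features.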
Statistical analysis was conducted using R software (version 3.6.3; R Foundation for Statistical Computing), SPSS statistics (version 24.0; IBM), and Python (version 3.10.9; Python Software Foundation). Continuous variables that followed a normal distribution were analyzed using independent samples t-tests, while those that did not follow a normal distribution were analyzed using Mann-Whitney U tests. Categorical variables were analyzed using χ2 tests or Fisher’s exact tests. The association between continuous variables was assessed using Spearman’s rank correlation coefficient. Model performance was evaluated using the AUC and decision curve analysis. To enhance the stability and accuracy of the tests, model comparisons were performed using DeLong’s tests based on the predicted probability distributions. These distributions were obtained from 2000 bootstrap resamples, which were used solely to estimate confidence intervals and assess the stability of performance metrics, not to train the models themselves. In addition, the calibration of all three models was assessed across all cohorts using calibration plots and the Brier score to evaluate the agreement between predicted probabilities and observed outcomes.
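The AUC, the Brier score, and a bootstrap confidence interval can be computed as in the sketch below. The percentile method for the interval is an assumption, since the text does not say how the 2000 resamples were turned into confidence limits, and the four-sample input is a toy example rather than study data.

```python
import random


def auc(y_true, y_score):
    """Probability that a random positive scores above a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def bootstrap_auc_ci(y_true, y_score, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC; y_true must contain both classes."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain a single class
            stats.append(auc(ys, [y_score[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy example: three of the four positive-negative pairs are ranked correctly.
y_true = [0, 0, 1, 1]
y_prob = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc(y_true, y_prob)      # 0.75
toy_brier = brier(y_true, y_prob)  # 0.158125
```

A lower Brier score indicates better agreement between predicted probabilities and observed outcomes, while the AUC measures ranking only and is unaffected by miscalibration.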
A total of 311 AC patients were included (Table 1 and Figure 1), of whom 114 patients (36.7%) were diagnosed with ASC. In the training, validation, and test datasets, 60 cases (40.0%), 19 cases (31.1%), and 35 cases (35.0%) were diagnosed with ASC, respectively. With the exception of stratification of bile in the lumen (P = 0.046), no statistically significant differences in clinical characteristics were observed among the datasets.
| Variable | Training set (n = 150) | Validation set (n = 61) | Test set (n = 100) | P value |
| --- | --- | --- | --- | --- |
| Age (years), median (IQR) | 67.00 (56.50, 78.00) | 69.00 (57.50, 78.50) | 68.00 (55.25, 78.00) | 0.883 |
| Sex (male) | 76 (50.7) | 30 (49.2) | 48 (48.0) | 0.917 |
| WBC (× 10⁹/L), median (IQR) | 10.36 (7.50, 14.72) | 9.85 (6.55, 14.84) | 9.95 (6.66, 13.34) | 0.544 |
| NE (%), median (IQR) | 84.75 (72.35, 90.83) | 83.40 (70.00, 88.35) | 82.40 (71.42, 90.92) | 0.309 |
| ALT (U/L), median (IQR) | 29.40 (16.30, 53.58) | 25.00 (16.50, 42.60) | 25.40 (15.65, 50.75) | 0.406 |
| STB (μmol/L), median (IQR) | 20.35 (13.20, 32.58) | 17.20 (11.40, 27.00) | 21.15 (14.45, 35.00) | 0.120 |
| UCB (μmol/L), median (IQR) | 13.05 (9.20, 20.05) | 11.00 (6.70, 16.80) | 13.45 (7.60, 20.38) | 0.065 |
| GB wall thickness (mm), median (IQR) | 3.20 (2.60, 4.20) | 3.00 (2.30, 3.85) | 3.20 (2.33, 4.00) | 0.390 |
| GB stones | 93 (62.0) | 36 (59.0) | 65 (65.0) | 0.799¹ |
| Stones in the cystic duct or GB neck | 65 (43.3) | 23 (37.7) | 40 (40.0) | 0.723 |
| Stratification of bile in the lumen | 17 (11.3) | 5 (8.2) | 3 (3.0) | 0.046¹ |
| Gas within the GB lumen | 4 (2.7) | 0 (0.0) | 6 (6.0) | 0.093 |
| Necrosis of the GB wall | 31 (20.7) | 14 (23.0) | 19 (19.0) | 0.834 |
| Pericholecystic exudation or fluid | 66 (44.0) | 25 (41.0) | 32 (32.0) | 0.159 |
| Pus | 60 (40.0) | 19 (31.1) | 35 (35.0) | 0.441 |
Univariate analysis of 14 clinical characteristics (Table 2) identified 7 predictors significantly associated with ASC: Age, WBC, NE, GB wall thickness, gas within the GB lumen, necrosis of the GB wall, and pericholecystic exudation or fluid. A multivariate logistic regression model was constructed via forward stepwise selection (significance level α = 0.05), ultimately including NE [odds ratio (OR) = 2.456, 95% confidence interval (CI): 1.520-3.969; P < 0.001] and necrosis of the GB wall (OR = 5.255, 95%CI: 2.091-13.206; P < 0.001) as independent predictive factors. The model achieved AUC values of 0.784, 0.745, and 0.746 in the training, validation, and test datasets, respectively (Figure 3).
| Variable | ASC (n = 60) | Non-ASC (n = 90) | Univariate P value | Multivariate P value | Multivariate OR (95%CI) |
| --- | --- | --- | --- | --- | --- |
| Age (years), median (IQR) | 70.5 (59.00, 81.50) | 65.50 (50.00, 75.25) | 0.041 | | |
| Sex (male) | 34 (56.7) | 42 (46.7) | 0.23 | | |
| BMI, median (IQR) | 24.17 (21.38, 27.13) | 24.00 (21.59, 27.08) | 0.844 | | |
| WBC (× 10⁹/L), median (IQR) | 11.77 (8.31, 15.65) | 9.44 (6.77, 14.19) | 0.016 | | |
| NE (%), median (IQR) | 90.35 (81.33, 93.50) | 80.60 (66.70, 88.75) | < 0.001 | < 0.001 | 2.456 (1.520-3.969) |
| ALT (U/L), median (IQR) | 28.00 (16.00, 60.03) | 30.25 (19.48, 50.60) | 0.524 | | |
| STB (μmol/L), median (IQR) | 23.00 (14.30, 37.83) | 18.65 (12.70, 27.80) | 0.107 | | |
| UCB (μmol/L), median (IQR) | 13.60 (9.33, 22.88) | 12.60 (9.00, 19.83) | 0.322 | | |
| GB wall thickness (mm), median (IQR) | 3.65 (2.70, 4.50) | 3.10 (2.43, 3.85) | 0.017 | | |
| GB stones | 41 (68.3) | 55 (61.1) | 0.367 | | |
| Stones in the cystic duct or GB neck | 29 (48.3) | 36 (40) | 0.313 | | |
| Stratification of bile in the lumen | 4 (6.7) | 13 (14.4) | 0.141 | | |
| Gas within the GB lumen | 4 (6.7) | 0 (0.0) | 0.024¹ | | |
| Necrosis of the GB wall | 22 (36.7) | 9 (10) | < 0.001 | < 0.001 | 5.255 (2.091-13.206) |
| Pericholecystic exudation or fluid | 38 (63.3) | 28 (31.1) | < 0.001 | | |
| Time between CT and intervention (days) | 0 (0-1) | 0 (0-1) | 0.672 | | |
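The odds ratios and confidence intervals reported for the multivariate model relate to the logistic regression coefficients through OR = exp(β) and, for a Wald interval, exp(β ± 1.96 × SE). The sketch below back-calculates β and SE from the reported values for NE. The Wald form of the interval is an assumption, as the text does not say how the intervals were computed.

```python
import math


def odds_ratio_ci(beta, se, z=1.96):
    """Odds ratio and Wald 95% CI from a log-odds coefficient and its SE."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def recover_beta_se(odds_ratio, lower, upper, z=1.96):
    """Invert the Wald interval: beta = ln(OR), SE = ln(upper / lower) / (2z)."""
    beta = math.log(odds_ratio)
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, se


# Reported values for NE: OR = 2.456, 95%CI: 1.520-3.969.
beta_ne, se_ne = recover_beta_se(2.456, 1.520, 3.969)
or_ne, lower_ne, upper_ne = odds_ratio_ci(beta_ne, se_ne)
print(f"beta = {beta_ne:.3f}, SE = {se_ne:.3f}")
print(f"OR = {or_ne:.3f} ({lower_ne:.3f}-{upper_ne:.3f})")
```

The reconstructed interval agrees with the reported one to within rounding, which is consistent with a symmetric interval on the log-odds scale. Because continuous variables were Z-score standardized before modeling, the OR for NE would plausibly be interpreted per one standard deviation increase rather than per percentage point.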
A total of 1595 radiomics features were initially extracted from non-contrast CT images of the GB. Following initial screening using independent samples t-tests, 333 features were retained. Redundant features were subsequently removed using Spearman correlation analysis, reducing the feature set to 42. LASSO regression was then applied to select 11 optimal radiomics features. The pairwise correlations among these selected features were all below 0.7 (Supplementary Figure 2). The radiomics model achieved AUC values of 0.804, 0.781, and 0.763 in the training, vali
To elucidate the contribution of individual features to model predictions, the SHAP algorithm was applied, and a SHAP beeswarm plot was constructed (Figure 4). The results revealed that the two most influential features were logarithm glszm ZoneEntropy and wavelet-LLH gldm dependence nonuniformity normalized, both exhibiting positive SHAP values, suggesting a positive association with the risk of ASC. In contrast, features such as square root glszm size zone nonuniformity and lbp-3D-k glszm gray level nonuniformity exhibited negative SHAP values, indicating a potential association with decreased risk.
Figure 5 presents four representative cases and their corresponding SHAP force plots, illustrating how individual features contribute positively or negatively to the prediction outcome. The baseline value in each plot represents the probability from the baseline model, while f(x) denotes the final predicted probability.
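For a linear model such as the LASSO-logistic regression used here, SHAP values have a simple closed form when features are treated as independent: each feature contributes its coefficient times its deviation from the background mean, and the contributions sum from the base value to f(x). The sketch below works on the log-odds scale; whether the study's force plots use log-odds or probability is not stated, and the coefficients and data are made up.

```python
def linear_shap(weights, intercept, background, x):
    """SHAP values for a linear model under feature independence.

    Returns (base_value, contributions), both on the model's linear
    (log-odds) scale, with base_value + sum(contributions) == f(x).
    """
    n = len(background)
    means = [sum(row[j] for row in background) / n for j in range(len(weights))]
    base_value = intercept + sum(w * m for w, m in zip(weights, means))
    contributions = [w * (xj - m) for w, xj, m in zip(weights, x, means)]
    return base_value, contributions


# Hypothetical two-feature model and background data.
weights, intercept = [0.5, -1.0], 0.2
background = [[0.0, 1.0], [2.0, 3.0]]
x = [3.0, 1.0]

base_value, contributions = linear_shap(weights, intercept, background, x)
f_x = intercept + sum(w * xj for w, xj in zip(weights, x))
```

Here the base value is the model output at the mean of the background data, and both features push the prediction upward: the first because it is above its mean with a positive coefficient, the second because it is below its mean with a negative coefficient.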
The fusion model was constructed by integrating the prediction probabilities from both the clinical model and the radiomics model using logistic regression. The AUC values for the training, validation, and test datasets were 0.848, 0.840, and 0.826, respectively (Figure 3A). In instances where the two models yielded conflicting predictions (e.g., high probability from the clinical model but low probability from the radiomics model), the fusion model leveraged dynamic weight allocation to mitigate misclassifications. This adaptive integration enhanced the robustness and accuracy of the overall predictive performance (Figure 6).
Compared with the clinical and radiomics models, the fusion model consistently achieved higher AUC values across all datasets (Figure 3A), with statistically significant differences observed in the training and test datasets (Table 3). Decision curve analysis further demonstrated that the fusion model yielded a greater net benefit than either individual model across almost the entire range of threshold probabilities in the training cohort (Figure 3B). In the validation cohort, the fusion model showed superior net benefit primarily within the 0.2-0.8 threshold range, although the advantage over the other models was minimal in the 0.4-0.6 interval. In the test cohort, the fusion model provided the highest net benefit in the clinically relevant 0.4-0.7 threshold range. Calibration analysis using calibration plots and Brier scores indicated that the fusion model also exhibited better agreement between predicted probabilities and observed outcomes across the training, validation, and test cohorts (Supplementary Table 1 and Figure 3C). Taking into account multiple evaluation metrics, including sensitivity, specificity, and overall accuracy, as well as generalizability, the fusion model demonstrated the best performance among the three models.
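The net benefit plotted in a decision curve is conventionally defined as TP/n − FP/n × pt/(1 − pt), where pt is the threshold probability. The sketch below implements that standard definition together with the "treat all" reference strategy; the four-patient input is a toy example, not study data.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted probability >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: treat every patient regardless of the model."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy example with two positives and two negatives.
y_true = [1, 1, 0, 0]
y_prob = [0.9, 0.6, 0.7, 0.2]

nb_model = net_benefit(y_true, y_prob, 0.5)      # 2/4 - 1/4 * 1 = 0.25
nb_all = net_benefit_treat_all(y_true, 0.5)      # 0.5 - 0.5 * 1 = 0.0
```

A decision curve repeats this calculation over a grid of thresholds for each model; a model is useful at a given threshold when its net benefit exceeds both "treat all" and "treat none" (net benefit 0).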
| Model | Dataset | AUC (95%CI) | Sensitivity (%) | Specificity (%) | Accuracy (%) | DeLong test P vs clinical | DeLong test P vs radiomics |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Clinical | Training | 0.7841 (0.7079-0.8557) | 53.3 | 86.7 | 73.3 | | |
| Radiomics | Training | 0.8043 (0.7224-0.8724) | 63.3 | 85.5 | 76.7 | 0.686 | |
| Fusion | Training | 0.8478 (0.7773-0.9070) | 65.0 | 85.6 | 77.3 | 0.046 | 0.039 |
| Clinical | Validation | 0.7450 (0.5915-0.8800) | 57.9 | 92.9 | 82.0 | | |
| Radiomics | Validation | 0.7807 (0.6405-0.9021) | 63.2 | 83.3 | 77.1 | 0.713 | |
| Fusion | Validation | 0.8396 (0.7214-0.9385) | 63.2 | 90.5 | 82.0 | 0.140 | 0.192 |
| Clinical | Test | 0.7459 (0.6345-0.8497) | 45.7 | 87.7 | 73.0 | | |
| Radiomics | Test | 0.7631 (0.6658-0.8515) | 62.9 | 73.9 | 70.0 | 0.794 | |
| Fusion | Test | 0.8264 (0.7327-0.9063) | 71.4 | 83.1 | 79.0 | 0.049 | 0.047 |
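The test-dataset rows of the table can be cross-checked against the cohort composition (35 ASC and 65 non-ASC patients). The sketch below infers the number of true positives and true negatives from the reported percentages and recomputes accuracy. The counts are reconstructions by rounding, not values reported by the authors.

```python
# Test dataset composition from the text: 35 ASC, 65 non-ASC.
n_asc, n_non_asc = 35, 65

# Reported test-set sensitivity, specificity, accuracy (%), by model.
reported = {
    "clinical": (45.7, 87.7, 73.0),
    "radiomics": (62.9, 73.9, 70.0),
    "fusion": (71.4, 83.1, 79.0),
}


def implied_counts(sensitivity_pct, specificity_pct):
    """Infer true-positive and true-negative counts by rounding."""
    tp = round(sensitivity_pct / 100 * n_asc)
    tn = round(specificity_pct / 100 * n_non_asc)
    return tp, tn


implied_accuracy = {}
for model, (sens, spec, acc) in reported.items():
    tp, tn = implied_counts(sens, spec)
    implied_accuracy[model] = 100 * (tp + tn) / (n_asc + n_non_asc)
    print(f"{model}: TP = {tp}/{n_asc}, TN = {tn}/{n_non_asc}, "
          f"accuracy = {implied_accuracy[model]:.1f}% (reported {acc}%)")
```

Under this reconstruction the implied accuracies match the reported ones for all three models, which suggests the sensitivity, specificity, and accuracy columns are mutually consistent at a single operating point per model.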
Our study aimed to evaluate the value of radiomics in improving the diagnostic performance of non-enhanced CT for ASC. This study first constructed a clinical model incorporating laboratory indicators and radiologic features, and then independently developed a machine learning model based on CT radiomics features. In the test dataset, both base models demonstrated moderate diagnostic performance with no significant difference (P = 0.80). The further constructed fusion model, by integrating the two types of features, significantly improved diagnostic performance compared to single models (P < 0.05). This indicates that combining radiomics features with conventional clinical indicators can effectively enhance the diagnostic efficacy of non-enhanced CT for ASC, suggesting that radiomics features contain important incremental diagnostic information. Moreover, in the test cohort, decision curve analysis showed that the fusion model provided a clear net benefit over both the clinical and radiomics models. This benefit was most pronounced within the clinically relevant 0.4-0.7 threshold probability range, highlighting its usefulness for guiding decisions in patients with intermediate risk.
In this study, we systematically collected two types of data to construct a clinical model. One type was conventional demographic characteristics and laboratory indicators. The other type was CT imaging manifestations collected based on the pathological development patterns of AC. These imaging features were classified into two categories: (1) Etiology-related indicators (e.g., GB stones, cystic duct stones); and (2) Disease severity assessment indicators (e.g., pericholecystic exudation, necrosis of the GB wall, which directly reflect the severity of inflammation). The study design deliberately incorporated the phased nature of AC disease progression. The pathological process of AC indicates that cystic duct obstruction and the passage of time are determining factors in the progression of AC to the purulent phase[10]. However, in clinical practice, there are significant individual differences in the duration of illness reported by patients, especially in the elderly population, where the correlation between symptoms and actual disease duration is weak. Therefore, this study did not include the subjective onset time of symptoms in the variable collection scope. Without accurate timing, the value of etiological indicators such as cystic duct obstruction may also be significantly reduced. Univariate analysis confirmed this, with no statistically significant association found between ASC and etiological indicators such as GB stones (P > 0.05). The clinical model included NE (OR = 2.456, 95%CI: 1.520-3.969; P < 0.001) and necrosis of the GB wall (OR = 5.255, 95%CI: 2.091-13.206; P < 0.001). Both of these features directly reflect the degree of inflammation, with the former reflecting systemic inflammation and the latter reflecting local inflammation.
The advantage of radiologic features that directly reflect the severity of inflammation is that they more accurately reveal the role of temporal factors. A previous study reported a higher proportion of pericholecystic exudate or effusion and thicker GB walls on CT performed within 48 hours before PC than on CT performed more than 48 hours before PC[16]. This is also the basis for excluding patients lacking CT within 48 hours before PC/LC in this study. Even so, there was an interval of about 24 hours between CT and PC/LC, which may be an important factor preventing CT from reaching its full potential.
To further enhance the diagnostic utility of non-contrast CT in AC, we developed a radiomics-based model. Given that radiomics models are often perceived as “black boxes” in clinical decision-making, we applied the SHAP method to conduct interpretability analysis. By visualizing both global and individual SHAP values, we quantitatively assessed the contribution of each feature to the model’s predictions[21]. As illustrated in the SHAP beeswarm plot, the most influential features included logarithm glszm ZoneEntropy, wavelet-LLH gldm dependence nonuniformity normalized, and lbp-3D-k first order 10Percentile. Features derived from the logarithm and wavelet domains effectively captured high- and low-frequency information in the grayscale texture of CT images, thereby revealing subtle heterogeneity caused by inflammatory changes. SHAP values for logarithm glszm ZoneEntropy and wavelet-LLH gldm dependence nonuniformity normalized were predominantly positive, indicating a strong association with increased risk of ASC. In contrast, features such as square root glszm size zone nonuniformity and lbp-3D-k glszm gray level nonuniformity showed negative SHAP contributions, suggesting a potential protective role in identifying low-risk cases. Interestingly, original shape maximum two-dimensional diameter row was the only shape feature retained, suggesting that two-dimensional GB enlargement may reflect morphological alterations during suppuration. Overall, radiomics features, with their quantitative, multidimensional, and multiscale imaging representations, enable identification of microstructural changes often undetectable by conventional imaging, thus offering substantial complementary value in clinical diagnosis[19,22].
Although the clinical and radiomics models showed similar performance (AUC: 0.746 vs 0.763), they are complementary. This is because they focus on different biological levels: The former reflects systemic and local inflammatory response, while the latter captures local microstructural alterations in the GB. This intrinsic complementarity motivated the construction of a fusion model, which achieved superior diagnostic performance (AUC = 0.826) on the test cohort compared to the clinical (AUC = 0.746) and radiomics (AUC = 0.763) models. This improvement underscores the added value of multimodal data integration in enhancing the diagnostic potential of non-contrast CT for ASC. Furthermore, the incorporation of a dynamic weighting mechanism allowed the fusion model to resolve inconsistencies between the two individual models (e.g., high clinical probability but low radiomics probability), thereby mitigating potential misclassifications. These findings highlight the synergistic and non-redundant contributions of clinical and radiomics features in diagnosis of ASC.
This study has several limitations. First, there were some differences in scanning protocols and baseline characteristics among the centers, which may affect the radiomics features. Second, due to differences in laboratory testing items among these centers, this study was unable to include some laboratory indicators with potential diagnostic value, such as C-reactive protein.
In conclusion, our study showed that the fusion model, constructed by integrating the clinical model and the radiomics model with a stacking ensemble strategy, could accurately predict ASC.
| 1. | Yokoe M, Hata J, Takada T, Strasberg SM, Asbun HJ, Wakabayashi G, Kozaka K, Endo I, Deziel DJ, Miura F, Okamoto K, Hwang TL, Huang WS, Ker CG, Chen MF, Han HS, Yoon YS, Choi IS, Yoon DS, Noguchi Y, Shikata S, Ukai T, Higuchi R, Gabata T, Mori Y, Iwashita Y, Hibi T, Jagannath P, Jonas E, Liau KH, Dervenis C, Gouma DJ, Cherqui D, Belli G, Garden OJ, Giménez ME, de Santibañes E, Suzuki K, Umezawa A, Supe AN, Pitt HA, Singh H, Chan ACW, Lau WY, Teoh AYB, Honda G, Sugioka A, Asai K, Gomi H, Itoi T, Kiriyama S, Yoshida M, Mayumi T, Matsumura N, Tokumura H, Kitano S, Hirata K, Inui K, Sumiyama Y, Yamamoto M. Tokyo Guidelines 2018: diagnostic criteria and severity grading of acute cholecystitis (with videos). J Hepatobiliary Pancreat Sci. 2018;25:41-54. |
| 2. | Okamoto K, Suzuki K, Takada T, Strasberg SM, Asbun HJ, Endo I, Iwashita Y, Hibi T, Pitt HA, Umezawa A, Asai K, Han HS, Hwang TL, Mori Y, Yoon YS, Huang WS, Belli G, Dervenis C, Yokoe M, Kiriyama S, Itoi T, Jagannath P, Garden OJ, Miura F, Nakamura M, Horiguchi A, Wakabayashi G, Cherqui D, de Santibañes E, Shikata S, Noguchi Y, Ukai T, Higuchi R, Wada K, Honda G, Supe AN, Yoshida M, Mayumi T, Gouma DJ, Deziel DJ, Liau KH, Chen MF, Shibao K, Liu KH, Su CH, Chan ACW, Yoon DS, Choi IS, Jonas E, Chen XP, Fan ST, Ker CG, Giménez ME, Kitano S, Inomata M, Hirata K, Inui K, Sumiyama Y, Yamamoto M. Tokyo Guidelines 2018: flowchart for the management of acute cholecystitis. J Hepatobiliary Pancreat Sci. 2018;25:55-72. |
| 3. | Ambe PC, Jansen S, Macher-Heidrich S, Zirngibl H. Surgical management of empyematous cholecystitis: a register study of over 12,000 cases from a regional quality control database in Germany. Surg Endosc. 2016;30:5319-5324. |
| 4. | Griffiths EA, Hodson J, Vohra RS, Marriott P; CholeS Study Group, Katbeh T, Zino S, Nassar AHM; West Midlands Research Collaborative. Utilisation of an operative difficulty grading scale for laparoscopic cholecystectomy. Surg Endosc. 2019;33:110-121. |
| 5. | Nugent JP, Li J, Pang E, Harris A. What's new in the hot gallbladder: the evolving radiologic diagnosis and management of acute cholecystitis. Abdom Radiol (NY). 2023;48:31-46. |
| 6. | Charalel RA, Jeffrey RB, Shin LK. Complicated cholecystitis: the complementary roles of sonography and computed tomography. Ultrasound Q. 2011;27:161-170. |
| 7. | Sagrini E, Pecorelli A, Pettinari I, Cucchetti A, Stefanini F, Bolondi L, Piscaglia F. Contrast-enhanced ultrasonography to diagnose complicated acute cholecystitis. Intern Emerg Med. 2016;11:19-30. |
| 8. | Shakespear JS, Shaaban AM, Rezvani M. CT findings of acute cholecystitis and its complications. AJR Am J Roentgenol. 2010;194:1523-1529. |
| 9. | Watanabe Y, Nagayama M, Okumura A, Amoh Y, Katsube T, Suga T, Koyama S, Nakatani K, Dodo Y. MR imaging of acute biliary disorders. Radiographics. 2007;27:477-495. |
| 10. | Gallaher JR, Charles A. Acute Cholecystitis: A Review. JAMA. 2022;327:965-975. [RCA] [PubMed] [DOI] [Full Text] [Cited by in Crossref: 27] [Cited by in RCA: 233] [Article Influence: 58.3] [Reference Citation Analysis (0)] |
| 11. | Wertz JR, Lopez JM, Olson D, Thompson WM. Comparing the Diagnostic Accuracy of Ultrasound and CT in Evaluating Acute Cholecystitis. AJR Am J Roentgenol. 2018;211:W92-W97. [RCA] [PubMed] [DOI] [Full Text] [Cited by in Crossref: 62] [Cited by in RCA: 70] [Article Influence: 8.8] [Reference Citation Analysis (0)] |
| 12. | Martellotto S, Dohan A, Pocard M. Evaluation of the CT Scan as the First Examination for the Diagnosis and Therapeutic Strategy for Acute Cholecystitis. World J Surg. 2020;44:1779-1789. [RCA] [PubMed] [DOI] [Full Text] [Cited by in Crossref: 5] [Cited by in RCA: 14] [Article Influence: 2.3] [Reference Citation Analysis (0)] |
| 13. | Lee D, Appel S, Nunes L. CT findings and outcomes of acute cholecystitis: is additional ultrasound necessary? Abdom Radiol (NY). 2021;46:5434-5442. [RCA] [PubMed] [DOI] [Full Text] [Cited by in Crossref: 1] [Cited by in RCA: 7] [Article Influence: 1.4] [Reference Citation Analysis (0)] |
| 14. | Min JH, Shin KS, Lee JE, Choi SY, Ahn S. Combination of CT findings can reliably predict radiolucent common bile duct stones: a novel approach using a CT-based nomogram. Eur Radiol. 2019;29:6447-6457. [RCA] [PubMed] [DOI] [Full Text] [Cited by in Crossref: 1] [Cited by in RCA: 4] [Article Influence: 0.6] [Reference Citation Analysis (0)] |
| 15. | Expert Panel on Gastrointestinal Imaging:, Scheirey CD, Fowler KJ, Therrien JA, Kim DH, Al-Refaie WB, Camacho MA, Cash BD, Chang KJ, Garcia EM, Kambadakone AR, Lambert DL, Levy AD, Marin D, Moreno C, Noto RB, Peterson CM, Smith MP, Weinstein S, Carucci LR. ACR Appropriateness Criteria(®) Acute Nonlocalized Abdominal Pain. J Am Coll Radiol. 2018;15:S217-S231. [RCA] [PubMed] [DOI] [Full Text] [Cited by in Crossref: 31] [Cited by in RCA: 58] [Article Influence: 7.3] [Reference Citation Analysis (0)] |
| 16. | Chen BQ, Xie F, Chen GD, Li X, Mao X, Jia B. Value of nonenhanced CT combined with laboratory examinations in the diagnosis of acute suppurative cholecystitis treated with percutaneous cholecystostomy: a retrospective study. BMC Gastroenterol. 2022;22:155. [RCA] [PubMed] [DOI] [Full Text] [Full Text (PDF)] [Cited by in RCA: 4] [Reference Citation Analysis (0)] |
| 17. | Lambin P, Leijenaar RTH, Deist TM, Peerlings J, de Jong EEC, van Timmeren J, Sanduleanu S, Larue RTHM, Even AJG, Jochems A, van Wijk Y, Woodruff H, van Soest J, Lustberg T, Roelofs E, van Elmpt W, Dekker A, Mottaghy FM, Wildberger JE, Walsh S. Radiomics: the bridge between medical imaging and personalized medicine. Nat Rev Clin Oncol. 2017;14:749-762. [RCA] [PubMed] [DOI] [Full Text] [Cited by in Crossref: 1825] [Cited by in RCA: 3991] [Article Influence: 443.4] [Reference Citation Analysis (0)] |
| 18. | Lafata KJ, Wang Y, Konkel B, Yin FF, Bashir MR. Radiomics: a primer on high-throughput image phenotyping. Abdom Radiol (NY). 2022;47:2986-3002. [RCA] [PubMed] [DOI] [Full Text] [Cited by in Crossref: 15] [Cited by in RCA: 62] [Article Influence: 15.5] [Reference Citation Analysis (0)] |
| 19. | Liu Z, Wang S, Dong D, Wei J, Fang C, Zhou X, Sun K, Li L, Li B, Wang M, Tian J. The Applications of Radiomics in Precision Diagnosis and Treatment of Oncology: Opportunities and Challenges. Theranostics. 2019;9:1303-1322. [RCA] [PubMed] [DOI] [Full Text] [Full Text (PDF)] [Cited by in Crossref: 558] [Cited by in RCA: 681] [Article Influence: 97.3] [Reference Citation Analysis (0)] |
| 20. | Sohn JH, Fields BKK. Radiomics and Deep Learning to Predict Pulmonary Nodule Metastasis at CT. Radiology. 2024;311:e233356. [RCA] [PubMed] [DOI] [Full Text] [Cited by in RCA: 20] [Reference Citation Analysis (0)] |
| 21. | Li MD, Cheng MQ, Chen LD, Hu HT, Zhang JC, Ruan SM, Huang H, Kuang M, Lu MD, Li W, Wang W. Reproducibility of radiomics features from ultrasound images: influence of image acquisition and processing. Eur Radiol. 2022;32:5843-5851. [RCA] [PubMed] [DOI] [Full Text] [Cited by in Crossref: 2] [Cited by in RCA: 21] [Article Influence: 5.3] [Reference Citation Analysis (0)] |
| 22. | Gillies RJ, Kinahan PE, Hricak H. Radiomics: Images Are More than Pictures, They Are Data. Radiology. 2016;278:563-577. [RCA] [PubMed] [DOI] [Full Text] [Full Text (PDF)] [Cited by in Crossref: 4541] [Cited by in RCA: 6081] [Article Influence: 608.1] [Reference Citation Analysis (7)] |
