Copyright: ©Author(s) 2026.
Figure 1 Construction of the development, external validation, and exploratory cohorts.
The development cohort was derived from two tertiary hospitals in Zhejiang Province, and an independent biopsy-confirmed cohort from the First Affiliated Hospital of Zhejiang University served as external validation. A population-based exploratory cohort was obtained from the National Health and Nutrition Examination Survey 2011-2020 and included individuals with evidence of hepatitis B virus exposure. Because liver biopsy is unavailable in the National Health and Nutrition Examination Survey, fibrosis status was approximated using a conservative algorithm based on the aspartate aminotransferase to platelet ratio index and fibrosis-4. After data harmonization and exclusion of participants with missing key variables or other causes of liver disease, the final cohorts were used for model development and evaluation. HBV: Hepatitis B virus; NHANES: National Health and Nutrition Examination Survey; APRI: Aspartate aminotransferase to platelet ratio index; AST: Aspartate aminotransferase; FIB-4: Fibrosis-4; ALT: Alanine aminotransferase; PLT: Platelet.
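For readers unfamiliar with the two indices behind the surrogate label, the sketch below computes APRI and FIB-4 from their standard published definitions. It does not reproduce the study's conservative labeling algorithm, whose cut-offs are not given in this legend. The function names and the default AST upper limit of normal (40 U/L) are illustrative assumptions.

```python
import math


def fib4(age_years, ast_u_l, alt_u_l, platelets_10e9_l):
    """FIB-4 = (age x AST) / (platelets x sqrt(ALT)).

    Age in years, AST and ALT in U/L, platelets in 10^9/L.
    """
    return (age_years * ast_u_l) / (platelets_10e9_l * math.sqrt(alt_u_l))


def apri(ast_u_l, platelets_10e9_l, ast_uln_u_l=40.0):
    """APRI = (AST / upper limit of normal) / platelets x 100.

    The default upper limit of normal (40 U/L) is an assumption;
    laboratories differ, so pass the local value where it is known.
    """
    return (ast_u_l / ast_uln_u_l) / platelets_10e9_l * 100.0


if __name__ == "__main__":
    # Hypothetical patient values, for illustration only.
    print(fib4(age_years=50, ast_u_l=40, alt_u_l=25, platelets_10e9_l=200))
    print(apri(ast_u_l=40, platelets_10e9_l=200))
```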
Figure 2 Performance and reclassification of the model within the fibrosis-4 indeterminate zone.
A: Distribution of model-assigned risk strata among patients within the fibrosis-4 indeterminate (gray) zone (n = 129). Among these individuals, 54 (41.9%) were classified as lower risk, 39 (30.2%) remained in an intermediate range, and 36 (27.9%) were classified as higher risk. The observed prevalence of significant fibrosis (S ≥ 2) increased across strata (13.0%, 35.9%, and 75.0%, respectively); B: Receiver operating characteristic curve of the model restricted to the fibrosis-4 indeterminate subgroup, demonstrating preserved discrimination (area under the curve = 0.821, 95% confidence interval: 0.737-0.895). The dashed diagonal line represents no discrimination. AUC: Area under the curve; CI: Confidence interval; FIB-4: Fibrosis-4.
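The stratum shares reported in panel A can be checked directly from the stated counts. The short sketch below recomputes them; only the three counts and the subgroup size come from the legend, and the variable names are illustrative.

```python
# Counts per model-assigned risk stratum, as stated in the legend.
strata = {"lower": 54, "intermediate": 39, "higher": 36}

# Subgroup size is the sum of the three strata.
total = sum(strata.values())

# Share of the FIB-4 indeterminate subgroup in each stratum, in percent.
shares = {name: round(100 * count / total, 1) for name, count in strata.items()}

if __name__ == "__main__":
    print(total)
    print(shares)
```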
Figure 3 Model performance across cohorts.
Receiver operating characteristic curves in the development cohort, the biopsy-confirmed external cohort (First Affiliated Hospital of Zhejiang University), and the population-based exploratory cohort (National Health and Nutrition Examination Survey, surrogate-labeled). The model demonstrated good discrimination across all datasets. In the National Health and Nutrition Examination Survey cohort, discrimination was evaluated against fibrosis status defined by surrogate criteria based on the aspartate aminotransferase to platelet ratio index and fibrosis-4 rather than by histological confirmation. The dashed diagonal line represents no discrimination. AUC: Area under the curve; CI: Confidence interval.
Figure 4 Calibration curves.
Calibration curves for the ensemble model in the development cohort and the external validation cohort (First Affiliated Hospital of Zhejiang University). Observed event proportions are plotted against mean predicted probabilities. Calibration slope, intercept, and Brier score are shown for each cohort. The dashed line indicates perfect calibration. FAHZU: First Affiliated Hospital of Zhejiang University.
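As a rough illustration of what a calibration plot of this kind displays, the sketch below computes the Brier score and the binned points (mean predicted probability against observed event proportion). The equal-width binning, the number of bins, and the function names are assumptions rather than the authors' procedure. Calibration slope and intercept are not computed here.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome (0/1)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def calibration_points(y_true, y_prob, n_bins=5):
    """Return (mean predicted probability, observed proportion) per non-empty bin.

    Equal-width bins on [0, 1] are an assumption; quantile bins are a
    common alternative.
    """
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        # A probability of exactly 1.0 goes into the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((y, p))

    points = []
    for members in bins:
        if members:
            mean_pred = sum(p for _, p in members) / len(members)
            observed = sum(y for y, _ in members) / len(members)
            points.append((mean_pred, observed))
    return points


if __name__ == "__main__":
    # Toy data, for illustration only.
    outcomes = [0, 1, 1, 1]
    probabilities = [0.1, 0.1, 0.9, 0.9]
    print(brier_score(outcomes, probabilities))
    print(calibration_points(outcomes, probabilities, n_bins=2))
```

Points lying on the diagonal correspond to perfect calibration, which is what the dashed line in the figure marks.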
Figure 5 Decision curve analysis of the machine-learning model in the development and external validation cohorts.
Decision curve analysis comparing the ensemble model with aspartate aminotransferase to platelet ratio index, fibrosis-4, and the “treat-all” and “treat-none” strategies in the development cohort and external validation cohort. Across clinically relevant threshold probabilities, the ensemble model showed comparable or higher net benefit, supporting its potential clinical utility for identifying patients with significant fibrosis. The shaded region indicates the prespecified interpretive threshold range (0.20-0.50) used for primary interpretation of net benefit. FAHZU: First Affiliated Hospital of Zhejiang University; APRI: Aspartate aminotransferase to platelet ratio index; FIB-4: Fibrosis-4.
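The net benefit plotted in a decision curve follows the standard definition: true positives per patient minus false positives per patient, with the false positives weighted by the odds of the threshold probability. The sketch below is a minimal version of that calculation, not the authors' code. The function names and toy data are illustrative, and "treat-none" has a net benefit of zero by definition.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted risk is >= threshold.

    NB = TP/n - (FP/n) * threshold / (1 - threshold), for 0 < threshold < 1.
    """
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the 'treat-all' strategy at the given threshold."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


if __name__ == "__main__":
    # Toy data, for illustration only.
    outcomes = [1, 1, 0, 0]
    risks = [0.9, 0.6, 0.4, 0.1]
    for pt in (0.2, 0.5):
        print(pt, net_benefit(outcomes, risks, pt), net_benefit_treat_all(outcomes, pt))
```

Evaluating these functions over a grid of thresholds, such as the 0.20-0.50 range shaded in the figure, yields the curves being compared.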
Figure 6 Model explainability using SHapley Additive exPlanations (surrogate model).
SHapley Additive exPlanations analysis illustrating the relative importance and directional effects of key predictors in the ensemble model. Platelet count, albumin/globulin ratio, aspartate aminotransferase/alanine aminotransferase ratio, and age were the major contributors, with feature effects showing clinically consistent patterns. A: Global feature importance (surrogate model); B: SHapley Additive exPlanations summary (red = higher values, blue = lower values). SHAP: SHapley Additive exPlanations; AST: Aspartate aminotransferase; ALT: Alanine aminotransferase.
- Citation: Wang TT, Chu YL, Lou YQ, Yang RY, Pu MM, Shan LJ, Huang L, Chen SS, Huang HJ. Routine laboratory model for identifying significant fibrosis in chronic hepatitis B. World J Hepatol 2026; 18(6): 119005
- URL: https://www.wjgnet.com/1948-5182/full/v18/i6/119005.htm
- DOI: https://dx.doi.org/10.4254/wjh.119005