Copyright
©The Author(s) 2026.
World J Gastroenterol. Jan 14, 2026; 32(2): 113059
Published online Jan 14, 2026. doi: 10.3748/wjg.v32.i2.113059

| Ref. | Type of study | Population | Number of patients | AI technique employed | Main results |
| --- | --- | --- | --- | --- | --- |
| Ruan et al[43] | Retrospective, multi-center | HBV | 508 | MSTNet DL | High accuracy in detecting both moderate (≥ F2) and advanced (F4) liver fibrosis, outperforming conventional clinical tools (APRI, FIB-4 and Forns) and human sonographers |
| Song et al[42] | Retrospective, single-center | HBV | 93 | ANNs DL | Excellent predictive capability to stage liver fibrosis and superior to serum fibrosis tests |
| Zhang et al[44] | Retrospective, multi-center | HBV | 1500 | CNNs DL | High-frequency images outperformed low-frequency ones across all trained CNN models, as well as FIB-4, APRI and SWE in staging liver fibrosis |
| Duan et al[45] | Retrospective, two-center | CLD | 434 | GAN model DL | Good performance in staging liver fibrosis and good predictive accuracy in identifying liver cirrhosis |
| Miura et al[55] | Retrospective, single-center | CLD | 517 | CNNs DL | Higher diagnostic accuracy than human scoring for detecting significant fibrosis (≥ F2) |
| Li et al[40] | Prospective, single-center | Chronic HBV infection | 144 | Adaptive boosting, random forest, SVM ML | ML algorithms improve the accuracy of liver fibrosis assessment. Combining conventional radiomics, ORF and CEMF data with ML algorithms enhances accuracy in detecting significant liver fibrosis |
| Durot et al[46] | Retrospective | CLD or elevated liver enzymes | 204 | SVM ML | SVM ML algorithm demonstrated excellent diagnostic accuracy in distinguishing significant liver fibrosis (≥ F2) when applied to both p-SWE and 2D-SWE data from two different systems, compared with MRE |
| Gatos et al[47] | Prospective, single-center | 54 healthy patients, 31 with CLD | 85 | ML | Good accuracy in distinguishing healthy individuals from patients with CLD |
| Gatos et al[48] | Retrospective | 56 healthy patients, 70 with CLD | 126 | ML | Good accuracy in distinguishing healthy individuals from patients with CLD combining different cluster features |
| Destrempes et al[49] | Retrospective, cross-sectional | CLD (HBV, HCV, NAFLD, AIH) | 82 | ML | Combining QUS and p-SWE in an ML model enhanced accuracy in staging fibrosis, inflammation and steatosis |
| Wang et al[51] | Prospective, multi-center | HBV | 398 | DLRE | DLRE outperformed 2D-SWE in detecting cirrhosis and advanced fibrosis. It was more reliable than biomarkers (FIB-4, APRI) in identifying all fibrosis stages |
| Lu et al[52] | Retrospective, multi-center | CLD | 807 | DLRE2.0 | DLRE2.0 achieved a higher AUC than DLRE for significant fibrosis, but the difference was not statistically significant |
| Kagadis et al[50] | Retrospective | 88 healthy individuals, 112 with CLD | 200 | GoogLeNet, AlexNet, VGG16, ResNet50, DenseNet201 DL | All pre-trained DL networks achieved good to excellent performance in staging liver fibrosis, outperforming radiologists. ResNet50 and DenseNet201 showed high accuracy across all fibrosis stages |
| Xue et al[54] | Retrospective | Local liver lesions treated by partial hepatectomy | 466 | Inception-V3 network (DL), TL | Gray-scale US images and 2D-SWE images analyzed with Inception-V3 (DL) using TL achieved excellent performance in staging liver fibrosis |
| Brattain et al[53] | Retrospective | NAFLD | 328 | Random forest, SVM ML; CNN DL | CNN demonstrated the highest performance in distinguishing liver fibrosis as significant or not |
| Zhou et al[57] | Retrospective | 94 patients with liver fibrosis; 143 patients with liver fibrosis and liver steatosis | 237 | iANN, DL | Radiomics with iANN-based homodyned-K US imaging outperformed both the standalone iANN method and radiomics on uncompressed US data for liver fibrosis assessment |
| Park et al[110] | Retrospective, multi-center | Patients who underwent liver biopsy or hepatectomy | 933 | DL (VGGNet, ResNet, DenseNet, EfficientNet, ViT) | Deep CNNs accurately staged liver fibrosis by METAVIR score from B-mode US images. EfficientNet showed the best performance among models |
| Lee et al[111] | Retrospective, multi-center | Healthy individuals and patients with CLD | 838 | DCNN, DL | DCNN accurately assessed METAVIR score from US images and outperformed radiologists in diagnosing cirrhosis in simulated US examination |

| Ref. | Type of study | Population | Number of patients | AI technique employed | Main results |
| --- | --- | --- | --- | --- | --- |
| Fujii et al[112] | Prospective, cross-sectional | MASLD | 486 | DL (U-net) | DL-based segmentation reliably identified the surface irregularity of the liver |
| Drazinos et al[113] | Retrospective, monocentric | MASLD | 112 | DL (Inception-V3, MobileNetV2, ResNet50, DenseNet201 and NASNet mobile) | DenseNet201 achieved the highest overall performance, while Inception-V3 showed superior accuracy in the binary classification of steatosis |
| Chou et al[114] | Retrospective | Healthy patients and patients with liver steatosis | 2070 | DL | DL models achieved 88.7% sensitivity for mild steatosis and consistent accuracy across the other grades (normal 91.8%, moderate 77.3%, severe 84.4%) |
| Vianna et al[115] | Retrospective | Healthy patients and patients with liver steatosis | 199 | DL (VGG16, ResNet50 and Inception-V3) | DL-based analysis of B-mode US images demonstrated diagnostic performance comparable to expert human readers in both the detection and grading of hepatic steatosis |
| Vianna et al[116] | Retrospective, multi-center | Datasets of patients with suspected hepatic steatosis | Not specified | DL | Diagnostic AUC for steatosis detection increased from 0.78 to 0.97. Test-time adaptation improved the robustness and generalizability of DL models applied to B-mode US |
| Cao et al[117] | Prospective, cross-sectional | Healthy patients and patients with liver steatosis | 240 | DL | The methods showed a good ability (AUC > 0.7) to identify steatosis, particularly in distinguishing moderate from severe (AUC = 0.958) |
| Han et al[73] | Prospective | Healthy individuals and patients with NAFLD | 204 | CNN DL | Accurate diagnosis of NAFLD and fat quantification using US radiofrequency signals |
| Byra et al[74] | Prospective | Steatosis and/or obese patients | 55 | DL (Inception ResNet-v2) | The AI-based model performed best (AUC = 0.977) outperforming the hepatorenal sonographic index (not significant) and grey-level co-occurrence matrix (significant difference) |
| Constantinescu et al[75] | Retrospective | Healthy patients and patients with liver steatosis | 60 | DL (Inception-V3 and VGG-16) | DL algorithms demonstrated excellent diagnostic performance, achieving accuracy rates exceeding 90% |
| Jeon et al[68] | Prospective | Suspected steatosis | 173 | DL | DL algorithm combining QUS parametric maps with B-mode imaging accurately estimated hepatic fat fraction and reliably diagnosed hepatic steatosis |
| Gómez-Gavara et al[118] | Prospective | Livers from brain-dead donors, evaluated during the procurement phase | 192 livers | ML | Integrating ML with texture and color analysis of smartphone liver images enables highly accurate estimation of hepatic steatosis severity |
| Santoro et al[119] | Prospective, cross-sectional | Healthy patients and patients with liver steatosis | 134 | ML | AI application enhances both the diagnostic accuracy and efficiency of US in the assessment of hepatic steatosis |
| Kaffas et al[120] | Retrospective, single center | Healthy patients and patients with liver steatosis | 403 | DL | This DL algorithm achieved accurate estimation of hepatic fat fraction and reliable diagnosis of hepatic steatosis |
| Destrempes et al[49] | Prospective | CLD | 82 | ML (random forest) | Random forest integration of QUS and SWE markedly enhanced diagnostic performance vs SWE alone, particularly for steatosis assessment, increasing AUC by 25%-50% |
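
Many of the results summarized above are reported as AUC values. As a reminder of what that metric measures, the following is a minimal sketch, not code from any of the cited studies: it computes the AUC as the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case, with ties counted as one half. The function name `auc` and the example scores are illustrative assumptions and do not come from the tables.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison.

    Returns the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as half a win.
    """
    if not scores_pos or not scores_neg:
        raise ValueError("need at least one positive and one negative score")
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical model outputs, NOT taken from any study in the tables.
steatosis_scores = [0.9, 0.8, 0.6, 0.55]
healthy_scores = [0.7, 0.4, 0.3, 0.2]
print(auc(steatosis_scores, healthy_scores))  # 14 of 16 pairs ranked correctly: 0.875
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which values such as 0.78, 0.958 or 0.977 in the rows above should be read.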

Table 3 Comparative summary of the main artificial intelligence models applied to liver ultrasound, outlining their key features, strengths, limitations, and representative clinical applications

| Model type | Main features | Strengths | Limitations | Typical clinical applications |
| --- | --- | --- | --- | --- |
| Convolutional neural networks | Deep-learning models extracting hierarchical image features from B-mode or SWE data | High accuracy in fibrosis staging; automatic feature extraction; excellent for large datasets | Require large training datasets; limited interpretability (“black box”) | Fibrosis staging, steatosis grading, lesion detection |
| Support vector machines | Supervised ML classifier using kernel-based separation of data | Robust for small datasets; interpretable decision boundaries | Lower performance for complex, high-dimensional data | Early fibrosis detection, ML radiomics, feature selection |
| Random forest | Ensemble ML algorithm combining multiple decision trees | Handles mixed data (imaging + clinical); resistant to overfitting | Limited ability to capture image texture; less suitable for pixel-level analysis | Integration of US features with clinical and laboratory data |
| Generative adversarial networks | DL models using generator-discriminator structure | Effective for data augmentation; improves synthetic image realism and model generalizability | Computationally demanding; risk of instability during training | Image synthesis, dataset expansion, quality enhancement |
| Hybrid/multimodal models | Combine DL image-based features with ML classifiers or clinical variables | Capture complementary information; improve diagnostic precision | Require harmonized data and complex implementation | Comprehensive multiparametric liver assessment (fibrosis + steatosis) |
- Citation: Viceconti N, Andaloro S, Paratore M, Miliani S, D’Acunzo G, Cerniglia G, Mancuso F, Melita E, Gasbarrini A, Riccardi L, Garcovich M. Harnessing artificial intelligence for the assessment of liver fibrosis and steatosis via multiparametric ultrasound. World J Gastroenterol 2026; 32(2): 113059
- URL: https://www.wjgnet.com/1007-9327/full/v32/i2/113059.htm
- DOI: https://dx.doi.org/10.3748/wjg.v32.i2.113059
