Published online Jun 14, 2022. doi: 10.3748/wjg.v28.i22.2494
Peer-review started: December 18, 2021
First decision: January 23, 2022
Revised: February 3, 2022
Accepted: April 22, 2022
Article in press: April 22, 2022
Published online: June 14, 2022
Processing time: 174 Days and 2.5 Hours
Hepatic steatosis is a major cause of chronic liver disease. Two-dimensional (2D) ultrasound is the most widely used non-invasive tool for screening and monitoring, but associated diagnoses are highly subjective.
To develop a scalable deep learning (DL) algorithm for quantitative scoring of liver steatosis from 2D ultrasound images.
Using multi-view ultrasound data from 3310 patients, 19513 studies, and 228075 images from a retrospective cohort of patients who received elastography, we trained a DL algorithm to diagnose steatosis stages (healthy, mild, moderate, or severe) from clinical ultrasound diagnoses. Performance was validated on two multi-scanner histology-proven cohorts, one unblinded and one blinded (initially to the DL developer), of 147 and 112 patients, with histopathology fatty cell percentage diagnoses and a subset with FibroScan diagnoses. We also quantified reliability across scanners and viewpoints. Results were evaluated using Bland-Altman and receiver operating characteristic (ROC) analysis.
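As an illustration of the Bland-Altman analysis named above, the following is a minimal sketch of how agreement between two scanners could be summarized as a bias and 95% limits of agreement. The function name `bland_altman` and the example scores are hypothetical and are not taken from the study; the 1.96 multiplier assumes approximately normally distributed differences.

```python
from statistics import mean, stdev


def bland_altman(a, b):
    """Return (bias, lower_loa, upper_loa) for paired measurements a and b."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length sequences with at least 2 pairs")
    # Paired differences between the two measurement methods.
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    # 95% limits of agreement, assuming roughly normal differences.
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical steatosis scores for the same five patients on two scanners
# (illustrative values only, not study data).
scanner_a = [0.10, 0.35, 0.50, 0.72, 0.90]
scanner_b = [0.12, 0.30, 0.55, 0.70, 0.88]
bias, lower, upper = bland_altman(scanner_a, scanner_b)
```

A bias near zero with narrow limits of agreement would indicate that the two scanners yield interchangeable scores.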
The DL algorithm demonstrated repeatable measurements with a moderate number of images (three for each viewpoint) and high agreement across three premium ultrasound scanners. High diagnostic performance was observed across all viewpoints: areas under the ROC curve (AUC) for classifying mild, moderate, and severe steatosis grades were 0.85, 0.91, and 0.93, respectively. The DL algorithm outperformed or performed at least comparably to the FibroScan controlled attenuation parameter (CAP), with statistically significant improvements for all levels on the unblinded histology-proven cohort and for “≥ severe” steatosis on the blinded histology-proven cohort.
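The per-grade ROC results above can be sketched as follows, assuming each grade is evaluated as an at-or-above cut-off on the ordinal reference grade (an assumption about the evaluation, not a confirmed detail of the study). The helper names `roc_auc` and `auc_per_threshold` and the example data are hypothetical. The area under the curve is computed here through its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one.

```python
GRADES = ["healthy", "mild", "moderate", "severe"]


def roc_auc(scores, labels):
    """AUC as P(positive score > negative score); ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def auc_per_threshold(scores, grades):
    """Binarize the ordinal grade at each cut-off and compute one AUC per cut-off."""
    ranks = [GRADES.index(g) for g in grades]
    return {
        f">= {GRADES[k]}": roc_auc(scores, [r >= k for r in ranks])
        for k in (1, 2, 3)
    }


# Hypothetical continuous scores and reference grades (illustrative only).
scores = [0.05, 0.28, 0.30, 0.25, 0.60, 0.55, 0.90, 0.80]
grades = ["healthy", "healthy", "mild", "mild",
          "moderate", "moderate", "severe", "severe"]
aucs = auc_per_threshold(scores, grades)
```

The pairwise comparison is quadratic in the number of cases, which is acceptable for cohorts of a few hundred patients; a rank-based implementation would be preferable for larger data.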
The DL algorithm provides a reliable quantitative steatosis assessment across views and scanners on two multi-scanner cohorts. Diagnostic performance was high, and comparable to or better than that of CAP.
Core Tip: Ultrasound is widely used to evaluate liver steatosis, but it is subjective. We developed a deep learning algorithm for quantitative steatosis scoring from ultrasound. The algorithm was trained on > 200000 images acquired with different scanners and from viewpoints of both hepatic lobes. High diagnostic performance was measured across all viewpoints in separate histology-proven groups and was comparable to or better than that of the controlled attenuation parameter. We demonstrated high agreement across scanners and viewpoints. Thus, our deep learning algorithm provides a quantitative assessment with high performance and reliability.