Chen Y, Zhang Q, Zhang MY. Deep learning techniques for using computed tomography imaging for hepatocellular carcinoma diagnosis, treatment and prognosis. World J Gastroenterol 2026; 32(5): 113592 [DOI: 10.3748/wjg.v32.i5.113592]
Corresponding Author of This Article
Ming-Yang Zhang, MD, Doctor, School of Basic Medical Sciences, Nanchang University, No. 461 Bayi Avenue, Nanchang 330006, Jiangxi Province, China. zmmyipuyuan@163.com
Research Domain of This Article
Computer Science, Artificial Intelligence
Article-Type of This Article
Review
Open-Access Policy of This Article
This article is an open-access article which was selected by an in-house editor and fully peer-reviewed by external reviewers. It is distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/
Journal Information of This Article
Publication Name
World Journal of Gastroenterology
ISSN
1007-9327
Publisher of This Article
Baishideng Publishing Group Inc, 7041 Koll Center Parkway, Suite 160, Pleasanton, CA 94566, USA
Co-corresponding authors: Qiang Zhang and Ming-Yang Zhang.
Author contributions: Zhang MY designed the study; Chen Y and Zhang Q extracted data and wrote the original draft; Zhang MY and Zhang Q reviewed the manuscript; all authors have read and approved the final manuscript.
Conflict-of-interest statement: The authors declare that they have no conflict of interest.
Received: August 29, 2025 Revised: November 4, 2025 Accepted: December 22, 2025 Published online: February 7, 2026 Processing time: 152 Days and 16.3 Hours
Abstract
Hepatocellular carcinoma (HCC), the predominant form of primary liver cancer, poses a significant threat to global health. Despite considerable advances in diagnostic and therapeutic approaches in recent years, the prognosis for patients with HCC remains unsatisfactory. The emergence of artificial intelligence (AI), particularly deep learning technologies, offers new hope for improving the diagnosis and treatment of HCC. Researchers have extensively explored ways to integrate deep learning models into the clinical management of HCC patients, which provides a valuable foundation for developing more personalized treatment strategies. Compared with other detection methods, computed tomography (CT) has attracted significant research interest because of its comprehensive advantages, including wide availability and high resolution, making it well suited for AI-powered analysis. This review systematically integrates deep learning technologies for HCC based on CT imaging, while focusing primarily on tumor diagnosis, segmentation, treatment response prediction, and patient prognosis prediction. Moreover, we review popular deep learning networks in various fields and describe the advantages of these prevalent deep learning models for different applications. Furthermore, we discuss the outstanding challenges in applying deep learning to extract information from CT images for the diagnosis and treatment of HCC patients. These insights could provide guidance for subsequent studies.
Core Tip: This review systematically integrates deep learning technologies for hepatocellular carcinoma (HCC) based on computed tomography (CT) imaging, with a primary focus on tumor diagnosis, segmentation, predicting treatment response, and forecasting patient prognosis. Moreover, we reviewed popular deep learning networks in various fields and described the advantages of these prevalent deep learning models for different applications. Furthermore, we discussed the outstanding challenges in applying deep learning to extract information from CT images for the diagnosis and treatment of HCC patients. These insights could provide guidance for subsequent studies.
Liver cancer ranks as the sixth most prevalent malignancy and the third leading cause of cancer-related death globally, thus posing a significant threat to human health[1]. According to projections by the American Cancer Society, approximately 42240 new liver cancer cases and 30090 cancer-attributable deaths are anticipated in the United States during 2025[2]. Hepatocellular carcinoma (HCC) is the predominant histopathological subtype and accounts for 90% of primary liver cancer cases[3]. Multiple etiological factors contribute to HCC, including chronic hepatitis B/C viral infection, heavy alcohol consumption, obesity, and nonalcoholic fatty liver disease[4] (Figure 1). HCC frequently presents with nonspecific clinical manifestations in its early stages, which delays diagnosis for most patients until advanced stages with concomitant intrahepatic or extrahepatic metastases and limits treatment options[5-7]. Current therapeutic strategies for treating HCC include surgical resection[7], liver transplantation (for selected patients)[7], radiotherapy[5], immunotherapy[8], transarterial chemoembolization (TACE)[9], targeted therapy[10] and systemic chemotherapy[11] (Figure 2). Although numerous treatment methods are available for HCC, the prognosis remains unsatisfactory, with up to 70% of patients experiencing recurrence within 5 years after treatment. Promoting early diagnosis of liver cancer, optimizing treatment strategies, and enhancing recurrence surveillance would improve the prognosis of patients with HCC.
Figure 2 Diagnostic and therapeutic approaches for hepatocellular carcinoma.
CT: Computed tomography; MRI: Magnetic resonance imaging; cf-DNAs: Cell free DNAs; CTCs: Circulating tumor cells; ctDNA: Circulating tumor DNAs; circRNAs: Circular RNAs; mRNA: Messenger RNA; lncRNAs: Long non-coding RNAs; miRNAs: MicroRNAs; EVs: Extracellular vesicles; IHC: Immunohistochemistry; HE: Hematoxylin and eosin staining; CTLA-4: Cytotoxic T lymphocyte-associated antigen-4; PD-1: Programmed death-1; PD-L1: Programmed death ligand 1; TCR: T cell antigen receptor; MHC I: Major histocompatibility complex class I; CD80: Cluster of differentiation 80. The image materials are sourced from Biorender (https://biorender.com) and The Human Protein Atlas (https://www.proteinatlas.org) (Supplementary material).
Imaging techniques [computed tomography (CT), magnetic resonance imaging (MRI), and ultrasonography (US)] are pivotal in the diagnosis, formulation of treatment strategies, and prognostic assessment of HCC[12] (Figure 2). US provides inexpensive, real-time, and noninvasive detection for patients with HCC, which has established it as the primary screening modality[13]. However, its operator dependence and limited sensitivity constrain diagnostic accuracy[14]. MRI is the gold standard for radiological imaging of HCC and can significantly improve early-stage HCC detection efficacy[15]. Although MRI has superior diagnostic performance[16,17], the prolonged acquisition times associated with its enhanced soft-tissue contrast[18,19], combined with operator dependence and elevated costs, limit its clinical applicability. CT is another gold standard for radiological imaging of HCC and has a high spatial resolution and a short scan duration; thus, CT has emerged as a more pragmatic option for routine clinical practice[18].
However, conventional CT image analysis relies on manual interpretation, which suffers from high subjectivity and poor reproducibility and fails to meet the growing demands for precision medicine in clinical practice[20]. In recent years, deep learning (DL), a prominent subfield of artificial intelligence (AI), has propelled the intelligent transformation of medical image analysis through its exceptional capabilities in image recognition, feature extraction, and classification prediction. DL models have been extensively implemented across liver imaging processing workflows. These implementations encompass HCC segmentation, detection and prognostic prediction, while also providing clinical decision support for therapeutic interventions, thereby significantly increasing both precision and efficiency across the diagnostic, therapeutic management, and prognostic assessment continuum in HCC care.
This review systematically integrates DL technologies based on CT imaging for detecting HCC, with a primary focus on tumor diagnosis[21-31], segmentation[32-49], treatment response prediction[50-56], and patient prognosis prediction[57-70]. Moreover, we review popular DL networks in various fields and describe their advantages for different applications. Furthermore, we discuss the outstanding challenges in applying DL to extract information from CT images for diagnosing and treating HCC patients. These insights could provide guidance for subsequent studies.
CONVOLUTIONAL NEURAL NETWORKS AND THEIR DERIVATIVE MODELS
Classical convolutional neural networks
Convolutional neural networks (CNNs) are among the most important neural network architectures in DL and are well suited for image data processing and analysis[71]. The design of CNNs is inspired by the receptive field mechanism in biological neural systems, where each neuron connects only to a local region of the feature map from the previous layer[72]. A typical CNN consists of multiple convolutional layers, activation functions [such as rectified linear unit (ReLU)], pooling layers, normalization layers, and fully connected layers, thus forming an end-to-end automated learning system[73] (Figure 3A)[74]. Multiple convolutional layers in the early part of the network progressively capture low-, mid-, and high-level features of images, such as edges, textures, and shapes[75]. The subsequent pooling layers compress the feature maps by downsampling, thereby improving the model’s robustness to transformations such as translation and scaling[76].
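The convolution, activation, and pooling steps described above can be illustrated with a minimal NumPy sketch. It is not taken from any cited study: the 6 × 6 toy image and the hand-set vertical-edge filter are assumptions chosen only to make the behaviour visible, whereas a real CNN learns its filters from data.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and sum the products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """Rectified linear unit: negative responses are set to zero."""
    return np.maximum(x, 0.0)

def max_pool2x2(x):
    """Non-overlapping 2 x 2 max pooling; odd trailing rows/columns are dropped."""
    h, w = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2
    return x[:h, :w].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

# Toy 6 x 6 "image" with a vertical edge between columns 2 and 3 (assumption).
image = np.zeros((6, 6))
image[:, 3:] = 1.0
# Hypothetical hand-set filter that responds to dark-to-bright vertical edges.
edge_kernel = np.array([[-1.0, 0.0, 1.0]] * 3)

features = max_pool2x2(relu(conv2d_valid(image, edge_kernel)))
```

The 4 × 4 convolution output is reduced to a 2 × 2 map by pooling; the edge response survives the downsampling, which is the robustness property the text attributes to pooling layers.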
The U-shaped network (UNet) is a classical CNN architecture specifically designed for medical image segmentation and has been widely applied in the automated identification and localization of lesions[77]. Its structure consists of a symmetric encoder path and decoder path: The encoder extracts deep semantic features through multilayer convolution and pooling operations, while the decoder gradually restores spatial resolution by upsampling[78] (Figure 3B)[79]. Skip connections between the encoder and decoder integrate shallow spatial information with deep semantics, thereby significantly improving segmentation accuracy and preserving boundary details[80].
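A skip connection of the kind described above amounts to concatenating an encoder feature map with the upsampled decoder map of the same spatial size. The sketch below shows only this data flow; the nearest-neighbour upsampling is an assumed stand-in for a learned up-convolution, and the array sizes are arbitrary.

```python
import numpy as np

def downsample(x):
    """2 x 2 max pooling over the spatial axes of a (channels, H, W) array."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).max(axis=(2, 4))

def upsample(x):
    """Nearest-neighbour upsampling by 2 (stand-in for a learned up-convolution)."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

rng = np.random.default_rng(0)
encoder_features = rng.random((4, 8, 8))    # shallow, high-resolution features
bottleneck = downsample(encoder_features)   # deeper, lower-resolution features
decoder_features = upsample(bottleneck)     # spatial size restored to 8 x 8

# Skip connection: concatenate encoder and decoder maps along the channel axis.
merged = np.concatenate([encoder_features, decoder_features], axis=0)
```

After concatenation the decoder has access to both the fine spatial detail of the encoder map and the coarser context of the upsampled map, which is why boundary detail is better preserved.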
Residual network
Residual network (ResNet) was first proposed by He et al[81] to address the issues of vanishing gradients and degradation in deep network training, thereby enabling effective training of much deeper networks. The core innovation of ResNet lies in its introduction of the residual block[82], which consists of a stack of convolutional layers, batch normalization layers, and a nonlinear activation function (ReLU)[83] (Figure 3C)[84]. A substantial body of empirical evidence demonstrates that residual connections can significantly alleviate the difficulty of fitting training samples when training deep neural networks, while maintaining strong generalization performance on test samples[85].
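The idea behind the residual block can be sketched as computing a correction F(x) and adding the input back before the final activation. The example below is a simplified illustration, not the published architecture: it uses small linear maps in place of convolutions and omits batch normalization.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """Return relu(F(x) + x), where F is two linear maps with a ReLU in between."""
    fx = relu(x @ w1) @ w2      # the residual branch F(x)
    return relu(fx + x)         # identity shortcut added before the activation

x = np.array([[0.5, 1.0, 2.0]])
zero = np.zeros((3, 3))

# With all-zero weights the residual branch vanishes and the block passes its
# (non-negative) input through unchanged.
identity_out = residual_block(x, zero, zero)
```

Because the shortcut lets a block fall back to the identity mapping, adding more blocks does not have to make the training fit worse, which is the degradation problem the text refers to.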
Densely connected convolutional network
Densely connected convolutional network (DenseNet) was proposed by Huang et al[86]. The main advantage of this network is that it has fewer parameters than a standard convolutional network does[87]. Each convolutional layer in the network receives additional input from all preceding layers and then passes its feature maps to all subsequent layers[88] (Figure 3D)[84]. Each layer of DenseNet can directly receive gradients from the loss function and the original input signal, which provides implicit deep supervision, alleviates the vanishing gradient problem, and facilitates the training of deeper network architectures[89].
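Dense connectivity can be sketched as a loop in which every layer reads the concatenation of the block input and all earlier outputs. In the example below, the random 1 × 1 channel mixing is a hypothetical stand-in for the real normalization, activation, and convolution sequence, and the channel counts are arbitrary assumptions.

```python
import numpy as np

def dense_block(x, num_layers, growth_rate, rng):
    """Each layer sees the concatenation of the input and all earlier layer outputs."""
    features = [x]                                   # list of (channels, H, W) arrays
    for _ in range(num_layers):
        stacked = np.concatenate(features, axis=0)
        # Hypothetical stand-in for BN-ReLU-conv: a random 1 x 1 channel mixing.
        weights = rng.standard_normal((growth_rate, stacked.shape[0]))
        new_maps = np.maximum(np.einsum("oc,chw->ohw", weights, stacked), 0.0)
        features.append(new_maps)
    return np.concatenate(features, axis=0)

rng = np.random.default_rng(0)
x = rng.random((16, 8, 8))
out = dense_block(x, num_layers=4, growth_rate=12, rng=rng)
```

Each layer adds only `growth_rate` new channels, so the output here has 16 + 4 × 12 = 64 channels; reusing earlier feature maps instead of recomputing them is what keeps the parameter count low.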
Visual geometry group network
The visual geometry group (VGG) network (VGGNet) architecture, which has a moderately deep yet uniform structure, was designed to achieve high accuracy in image classification tasks[90]. Its design is similar to that of AlexNet, and, like AlexNet, VGGNet has a large number of parameters[91]. The network comprises six distinct CNN configurations, namely VGG11, VGG11 (local response normalization), VGG13, VGG16 (with 1 × 1 convolutions), VGG16, and VGG19[90,91]. Taking VGG16 as an example, its 16 weight layers comprise 13 convolutional layers and three fully connected layers; the convolutional layers are arranged as two blocks of two 3 × 3 convolutions followed by three blocks of three 3 × 3 convolutions, and each block ends with 2 × 2 max pooling[92] (Figure 3E)[93]. Because every convolution is followed by a ReLU nonlinearity, the stacked blocks give the network significantly enhanced discriminative capability, which is one of its key advantages[92]. Another advantage of the block design is that a stack of two 3 × 3 convolutions applies two ReLU activations where a single convolution with a larger kernel would apply only one[92].
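To make the parameter count of VGGNet concrete, the following sketch tallies the weights and biases of the standard VGG16 configuration, which has 13 convolutional layers and three fully connected layers. The 224 × 224 input size and the 1000 output classes are assumptions taken from the usual ImageNet setting and are not stated in this review.

```python
# (in_channels, out_channels) for the 13 convolutional layers, grouped by block.
VGG16_BLOCKS = [
    [(3, 64), (64, 64)],
    [(64, 128), (128, 128)],
    [(128, 256), (256, 256), (256, 256)],
    [(256, 512), (512, 512), (512, 512)],
    [(512, 512), (512, 512), (512, 512)],
]
# Fully connected layers, assuming a 224 x 224 input (7 x 7 x 512 after five
# 2 x 2 poolings) and 1000 output classes.
FC_LAYERS = [(7 * 7 * 512, 4096), (4096, 4096), (4096, 1000)]

def conv_params(c_in, c_out, k=3):
    """Weights plus biases of one k x k convolutional layer."""
    return k * k * c_in * c_out + c_out

def fc_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out

conv_layers = [layer for block in VGG16_BLOCKS for layer in block]
conv_total = sum(conv_params(i, o) for i, o in conv_layers)
fc_total = sum(fc_params(i, o) for i, o in FC_LAYERS)
total = conv_total + fc_total
```

Under these assumptions the total is about 138 million parameters, of which roughly 124 million sit in the fully connected layers rather than in the convolutional blocks.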
Transformer network
The transformer network is a DL architecture based on the attention mechanism[94]. The model is structured as a stack of multiple identical layers, each consisting of two key components: A multi-head self-attention (MSA) module and a position-wise feedforward network[95] (Figure 3F)[96]. The MSA mechanism, which is central to the transformer’s operation[97], computes the response at a given pixel position by weighting information from all other positions, thereby enabling dynamic measurement of the correlation with pixels at all other locations in the image[98]. This model boasts several advantages, including a strong ability to learn long-range dependencies, powerful multimodal fusion, and good interpretability[89]. It has demonstrated remarkable achievements in tasks such as image classification and segmentation.
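The weighting of information from all positions can be sketched with single-head scaled dot-product self-attention. This is a minimal illustration under assumed toy dimensions (five tokens with eight features each) and random, untrained projection matrices; a multi-head module runs several such heads in parallel.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)   # subtract the max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens, w_q, w_k, w_v):
    """Scaled dot-product self-attention over a (num_tokens, dim) array."""
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    weights = softmax(scores, axis=-1)         # row i: how much token i attends to each token
    return weights @ v, weights

rng = np.random.default_rng(0)
tokens = rng.standard_normal((5, 8))           # e.g. 5 image patches, 8 features each
w_q, w_k, w_v = (rng.standard_normal((8, 8)) for _ in range(3))
output, weights = self_attention(tokens, w_q, w_k, w_v)
```

Each row of `weights` sums to one and spans every position, so a token can draw on distant parts of the image in a single layer, which is the long-range dependency property noted above.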
Efficient network
The efficient network (EfficientNet) architecture family can provide an appropriate method to scale CNNs for improved accuracy and efficiency[99]. The first model introduced was EfficientNet-B0, which employs a compound scaling method that uses a fixed set of coefficients to uniformly scale network width, depth, and resolution; this model contains 5.3 million parameters and takes 224 × 224 images as input[100] (Figure 3G)[101]. This approach allowed the authors to generate a highly efficient CNN architecture. Furthermore, they applied the same compound scaling method to EfficientNet-B0 to obtain the scaled variants EfficientNet-B1 to B7[102]. Compared with traditional models, EfficientNets are smaller in size, faster in inference speed, and highly accurate[103].
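Compound scaling can be sketched as raising three fixed base coefficients to a shared exponent. The values below (1.2 for depth, 1.1 for width, 1.15 for resolution) are the coefficients reported for EfficientNet and are treated here as assumptions, since this review does not list them; the released B1 to B7 variants need not match these multipliers exactly.

```python
# Base coefficients reported for EfficientNet; treated here as fixed assumptions.
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15   # depth, width, resolution bases

def compound_scale(phi):
    """Return (depth, width, resolution) multipliers for compound coefficient phi."""
    return ALPHA ** phi, BETA ** phi, GAMMA ** phi

def flops_multiplier(phi):
    """FLOPs grow roughly with depth * width^2 * resolution^2."""
    d, w, r = compound_scale(phi)
    return d * w ** 2 * r ** 2

base = compound_scale(0)      # phi = 0 leaves the baseline network unchanged
scaled = compound_scale(2)    # phi = 2 scales all three dimensions together
```

The coefficients are chosen so that depth × width² × resolution² is close to 2, meaning each unit increase in `phi` roughly doubles the computational cost while keeping the three dimensions in balance.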
AIDING HCC DIAGNOSIS VIA CT-BASED DL
CT is widely used in the diagnosis of HCC[104,105]. However, conventional CT image analysis techniques are time-consuming and subjective[106]. Distinguishing HCC from other hepatic lesions such as intrahepatic cholangiocarcinoma, focal nodular hyperplasia, and metastases can also be challenging[106]. Recently, DL models have been widely employed in HCC diagnosis because they can automatically detect subtle morphological alterations in hepatic lesions and enhance image resolution, thereby significantly improving the sensitivity and accuracy of HCC diagnosis[26,107-110].
Among the studies covered in this review, modified ResNet and its integrations with other networks were widely used DL approaches for HCC diagnosis based on CT imaging[21-24] (Table 1). Among all the network models, the AI system proposed by Wang et al[21] stands out because of the large number of patients in its training set. This AI system consists of NoduleNet and HCCNet, and both networks are built upon the ResNet architecture[21]. Moreover, NoduleNet, which can identify nodule images, was designed to function as an auxiliary model[21]. These advantages contribute to the system’s robust performance; for example, this system achieved areas under the curve (AUCs) of 0.88 and 0.89 on internal and external validation sets, respectively[21]. To fully use the spatial information in CT images, some researchers have used three-dimensional (3D) ResNet architectures to extract features from these images[22,23]. To compare the performance between nonenhanced phase CT and contrast-enhanced phase CT scans within a 3D ResNet-based image analysis model, Ling et al[22] proposed both a base model and an enhanced model built upon the 3D ResNet architecture. In the base model, the authors used nonenhanced phase CT, whereas in the enhanced model, they employed contrast-enhanced phase CT scans[22]. The results demonstrated that the enhanced model significantly outperformed the base model[22]. To improve diagnostic performance, the authors introduced patient sex and age into the model and proposed the minimum extra information about patients and lesions model[22]. Achieving an AUC of 0.96, the model demonstrated significantly improved sensitivity and specificity[22]. Moreover, other researchers have explored alternative approaches to enhance the performance of 3D ResNet models in extracting and diagnosing HCC-related information from CT scans. Guo et al[23] proposed ALARM, which integrates DL with the aMAP HCC risk score.
This risk score, which is calculated using five common clinical variables (age, male sex, albumin and bilirubin concentrations, and platelet count), was developed to predict the likelihood of HCC occurrence in patients with chronic hepatitis[111]. Achieving AUCs of 0.90 and 0.92 on the internal and external validation sets, respectively, ALARM can effectively predict the short-term development of HCC in patients with liver cirrhosis[23].
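Because the studies above are compared mainly by AUC, it may help to recall what the metric measures: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes it by direct pairwise comparison; the labels and scores are invented toy values, not data from any cited study.

```python
def auc_from_scores(labels, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5          # ties count as half
    return wins / (len(positives) * len(negatives))

# Invented toy data: 1 = HCC, 0 = no HCC; scores are hypothetical model outputs.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = auc_from_scores(labels, scores)
```

In this toy example eight of the nine positive-negative pairs are ranked correctly. An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives context to the values of 0.88 to 0.96 quoted above.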
Table 1 Characteristics of deep learning networks for diagnosis of hepatocellular carcinoma from computed tomography images, mean ± SD.
Modified UNet and its integration with other networks are also frequently used for diagnosing HCC on the basis of CT images[23,24,26,27] (Table 1). Differentiating between primary HCC and metastatic cancer is particularly critical for determining a patient’s subsequent treatment plan. However, as mentioned previously, distinguishing HCC from other liver lesions poses certain challenges[106]. To address this challenge, Zossou et al[26] proposed a model that connects a residual attention UNet (RA-UNet) and a 9-layer CNN classifier in series. RA-UNet is used for segmenting CT images, and both the segmented results and the original images are fed into the CNN[26]. This approach achieved a classification accuracy of 93.97%[26]. Similarly employing a cascaded network architecture, Chen et al[27] proposed a successive encoder-decoder (SED) model to detect HCC. The difference was that they employed a cascaded framework of UNet and Dense UNet[27]. SED-1 (UNet) is primarily used to remove unwanted voxels and organs and extract the liver’s location from CT images[27]. SED-2 (Dense UNet) uses the output from SED-1 to further segment the lesions[27]. This model achieved an accuracy of 0.99 and an AUC of 0.95 in extracting HCC from CT images[27]. To provide a more intuitive visualization of the shape and location of the HCC, the authors also reconstructed the two-dimensional (2D) segmentation results from the SED model into a 3D format[27], which is highly beneficial for clinical practice.
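The data flow of a two-stage cascade followed by 3D reconstruction can be sketched as follows. The two stage functions are hypothetical threshold-based stand-ins for the trained SED-1 and SED-2 networks, and the thresholds and toy volume are assumptions; only the structure (stage 2 restricted to the stage 1 output, then 2D masks stacked into a volume) reflects the description above.

```python
import numpy as np

def stage1_liver_mask(ct_slice, low=0.3):
    """Hypothetical stand-in for SED-1: keep pixels that look like liver."""
    return ct_slice >= low

def stage2_lesion_mask(ct_slice, liver_mask, high=0.5):
    """Hypothetical stand-in for SED-2: flag darker pixels, but only inside the liver."""
    return (ct_slice < high) & liver_mask

def segment_volume(volume):
    """Run both stages slice by slice and stack the 2D masks into a 3D volume."""
    masks = []
    for ct_slice in volume:
        liver = stage1_liver_mask(ct_slice)
        masks.append(stage2_lesion_mask(ct_slice, liver))
    return np.stack(masks, axis=0)

rng = np.random.default_rng(0)
volume = rng.random((4, 16, 16))     # toy volume: 4 slices of intensities in [0, 1)
lesion_volume = segment_volume(volume)
```

Restricting the second stage to the first stage's mask removes irrelevant organs from consideration, and stacking the per-slice masks yields a volume that can be rendered in 3D.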
AIDING HCC THERAPY PLANNING VIA CT-BASED DL
Pretreatment HCC segmentation
Accurate HCC segmentation prior to treatment is critical in HCC therapy. However, HCC segmentation is particularly challenging because of the similar tissue intensity between the liver and adjacent organs and the high variability in tumor morphology and size[48,112,113]. Current HCC segmentation is accomplished primarily through three technical approaches: Manual, semiautomatic, and automatic[114]. Because automatic segmentation reduces interobserver variability and decreases reliance on operator experience, it is gaining increasing attention in clinical practice. Automatic segmentation is achieved primarily through two technical approaches: Machine learning and DL[115]. Although machine learning has substantially increased the precision of automatic segmentation, the continued need for manual feature engineering compromises system robustness[116]. Owing to its superior feature learning capabilities, DL is being increasingly adopted by researchers for automated medical image analysis, including liver tumor segmentation.
Among the studies covered in this review, modified UNets are the most prevalent DL approaches for HCC segmentation based on CT imaging (Table 2). To further improve the segmentation efficiency of UNet, researchers have enhanced its architecture, either through cascaded UNet networks or through the integration of specialized modules[34,35,37-41]. Ouhmich et al[34] designed a three-stage cascaded neural network system based on UNet to perform liver tissue segmentation. The basic process is as follows: The initial network module processes input CT scans to segment liver regions; the second network uses the segmentation results from the first network to segment liver tumor lesions; and the third network segments the necrotic tissue within the lesion[34]. This cascaded architecture performed exceptionally in automated liver segmentation[34]. Moreover, researchers have explored the integration of specialized modules into the UNet architecture to increase its segmentation accuracy[35,39]. Residual and dense blocks are the most frequently used blocks in modified UNet[35,37,38,40]. The former can enhance detailed feature extraction from images while effectively addressing segmentation deficiencies in low-contrast regions and weakly defined boundaries[117]. Additionally, it can effectively address the vanishing gradient problem that occurs with increasing network depth[117]. The latter can concatenate feature maps and perform dimension-level merging, thereby reducing the number of required input feature maps[37]. It not only reduces computational complexity but also mitigates gradient vanishing[37]. To reduce training losses and improve segmentation performance, Khan et al[35] proposed the residual multiscale UNet (RMS-UNet) network. They incorporated dilated convolution and residual modules into the UNet architecture. Dilated convolution with multiple dilation rates can enlarge the receptive field over input CT images without any change in kernel size[35].
However, adding more convolutional layers could introduce training errors; thus, residual modules were introduced to avoid overfitting[35]. Both qualitative and quantitative results demonstrate the superior performance of RMS-UNet[35]. To obtain sufficient image features and parameters, reduce runtime, and minimize interference from irrelevant regions, Chen et al[37] designed a novel residual-dense-attention (RDA) UNet network by introducing the following into a UNet network: A residual block from a residual neural network, a dense block from DenseNet and an attention gate module from an attention UNet model. The proposed RDA UNet achieved high accuracy while reducing runtime by 28%[37]. Nevertheless, the aforementioned studies focused primarily on 2D UNet architectures, thus inherently limiting the full use of spatial information. To fully exploit spatial information during the image segmentation process, Çiçek et al[118] proposed the 3D UNet architecture. However, compared with 2D UNet, 3D UNet requires more memory and has a longer computational time[119]. To address these challenges, Li et al[38] formulated a novel hybrid densely connected UNet, which integrated 3D DenseUNet with 2D DenseUNet via an auto-context mechanism; CT intra-slice and inter-slice features were jointly optimized through a hybrid feature fusion layer[38]. Experimental results on public medical imaging datasets demonstrated that the proposed method performs competently in HCC segmentation[38]. Wang et al[39] also designed a hybrid 3D UNet network to segment HCC. In designing the network, which they designated as a multiscale attention and deep supervision-based 3D UNet, the authors incorporated multiscale attention and deep supervision mechanisms into the 3D UNet architecture[39]. The introduction of the attention mechanism increased the network model’s ability to use multiscale contextual spatial information[39]. The incorporation of deep supervision mechanisms improved the model’s overall accuracy[39].
The proposed model, which the authors evaluated on public datasets, demonstrated superior computational efficiency and performance in liver cancer segmentation[39].
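The statement that dilated convolution widens the receptive field without changing the kernel size can be checked with a small one-dimensional sketch. A kernel of size k with dilation rate d spans k + (k − 1)(d − 1) input positions. The signal and kernel values below are arbitrary assumptions used only for illustration.

```python
import numpy as np

def effective_kernel_size(kernel_size, dilation):
    """Span covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)

def dilated_conv1d(signal, kernel, dilation):
    """Valid 1D dilated convolution (cross-correlation form, stride 1)."""
    span = effective_kernel_size(len(kernel), dilation)
    out_len = len(signal) - span + 1
    return np.array([
        sum(kernel[t] * signal[i + t * dilation] for t in range(len(kernel)))
        for i in range(out_len)
    ])

signal = np.arange(10, dtype=float)
kernel = np.array([1.0, 1.0, 1.0])                    # same 3 weights in both cases
out_d1 = dilated_conv1d(signal, kernel, dilation=1)   # each output spans 3 positions
out_d2 = dilated_conv1d(signal, kernel, dilation=2)   # each output spans 5 positions
```

With the same three weights, a dilation rate of 2 covers five input positions and a rate of 4 covers nine, so combining several rates gives multiscale context at no extra parameter cost.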
Table 2 Characteristics of deep learning networks for segmentation of hepatocellular carcinoma from computed tomography images, mean ± SD.
Beyond UNet, various other CNN architectures have been successfully employed for HCC segmentation tasks (Table 2). The hybrid ResNet and VGGNet configuration represents a frequently used approach[44,45]. ResNet can extract deep features, and VGGNet can effectively utilize spatial information[44,45]. On the basis of this theoretical framework, d’Albenzio et al[44] proposed a novel dual-encoder double concatenation network in which two parallel encoders (VGG19 and ResNet) were used to extract features[44]. These features were passed to a single decoder[44]. This network demonstrated superior HCC segmentation capability. Subsequently, Singh et al[45] designed FasNet, which integrates VGG16 and ResNet50. Similarly, this model performed exceptionally well in HCC segmentation[45].
Treatment response prediction
Surgical resection and liver transplantation remain the primary potentially curative modalities for treating HCC[120]. However, some patients have already lost the opportunity for surgical treatment by the time they are diagnosed[121]. In cases of unresectable HCC at the early to intermediate stages, locoregional therapies now constitute the primary treatment modality[122,123]. Locoregional therapies primarily consist of the following three modalities: Radiofrequency ablation, TACE, and transarterial radioembolization. For patients with advanced-stage HCC, systemic therapy may provide clinical benefit[124]. Systemic therapies for HCC include chemotherapy, targeted therapy, and immunotherapy[125,126]. For decades, chemotherapy has remained the standard systemic treatment option for advanced HCC, but patient benefits remain limited[127]. In recent decades, novel molecular-targeted and immunotherapy drugs have been developed[126]; these drugs have revolutionized the management of advanced HCC and offered new hope where only limited options had previously existed[128]. Although both locoregional therapies and emerging targeted immunotherapies provide varying degrees of clinical benefit, therapeutic responses exhibit significant interpatient heterogeneity, particularly for targeted agents and immunotherapies, where treatment efficacy often depends on the patient’s genetic profile[129-131]. Accurate prediction of patient treatment responsiveness will facilitate optimal therapeutic strategy development by clinicians and promote personalized medicine. Imaging techniques have consistently served as cornerstone modalities for assessing treatment response in patients with HCC[132-134]. Manual analysis of medical images relies on experienced radiologists and is susceptible to subjective variability and other limitations.
Consequently, Lambin et al[135] pioneered the concept of radiomics to employ high-throughput feature extraction algorithms to derive quantitative characteristics from medical images. The emergence of machine learning and DL has further accelerated the advancement of medical imaging analysis technologies[136]. The integration of radiomics, machine learning, and DL has substantially increased the utilization efficiency of medical imaging information[137]. In predicting treatment response in patients with HCC, researchers frequently integrate DL, machine learning, and radiomics to increase predictive efficiency[51,53]. Some investigators have further incorporated clinical data into analytical networks to further improve performance[54,56].
Among the studies covered in this review, modified ResNet is the most widely adopted CNN architecture for predicting treatment response in patients with HCC and is frequently integrated with radiomics data and clinical variables[50,52-56] (Table 3). Liao et al[54] and Lin et al[55] used ResNet18-based models to predict treatment response in patients with HCC receiving combination lenvatinib and immune checkpoint inhibitor (ICI) therapy and ICI therapy, respectively. Three additional studies utilized ResNet50-based models to predict treatment response in patients with HCC who receive either TACE therapy or TACE-hepatic arterial infusion chemotherapy (HAIC) combined with a programmed cell death 1 (PD-1) inhibitor and tyrosine kinase inhibitor (TKI) therapy[50,53,56]. ResNet18 is an 18-layer architecture that employs residual connections to mitigate the vanishing gradient problem[138]. It has lower computational costs while maintaining the benefits of deeper models[81]. Liao et al[54] used ResNet18 to integrate information from multiphase CT images to predict treatment response in patients with HCC who receive combination lenvatinib and ICI therapy, achieving an AUC of 0.802[54]. To improve accuracy, Lin et al[55] employed an integrated model that combined features from ResNet18, radiomics, and clinical data to predict treatment response in patients with HCC who receive ICI therapy[55]. Predictions made by this model were clearly superior to those made using radiomics or clinical data alone[55]. Compared with ResNet18, ResNet50 comprises more layers, thus enabling it to capture more complex features[81]. The incorporation of residual connections enhances trainability at such depths[81]. ResNet50 delivers significantly superior performance, particularly on large datasets and more complex image processing tasks, thereby enhancing the classification accuracy of DL models[81].
Owing to these advantages, several studies have used it to predict the treatment response in patients with HCC[50,53,56]. Peng et al[50] used ResNet50 to predict treatment response in patients with HCC who receive TACE therapy. This network achieved accuracies of 85.1% and 82.8% in the two validation sets[50]. With respect to predicting the response of patients with HCC to TACE-HAIC combined with PD-1 inhibitors and TKI therapy, Yin et al[56] proposed a combined model that integrates ResNet50, radiomics, and clinical data to improve accuracy. This model achieved a remarkable accuracy of 89.5% in the training cohort, which is highly encouraging[56].
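One simple way to combine deep features, radiomics features, and clinical variables, as the combined models above do, is feature-level fusion: each block is standardized and the blocks are concatenated into one vector per patient before classification. The sketch below is an assumption about the general pattern, not the method of any cited study; the feature dimensions are arbitrary and the logistic weights are untrained placeholders.

```python
import numpy as np

def standardize(x):
    """Zero-mean, unit-variance scaling per feature column."""
    return (x - x.mean(axis=0)) / (x.std(axis=0) + 1e-8)

def fuse_features(deep, radiomics, clinical):
    """Feature-level fusion: standardize each block, then concatenate per patient."""
    return np.concatenate(
        [standardize(deep), standardize(radiomics), standardize(clinical)], axis=1
    )

def predict_response(fused, weights, bias):
    """Logistic model on the fused vector; returns a probability per patient."""
    return 1.0 / (1.0 + np.exp(-(fused @ weights + bias)))

rng = np.random.default_rng(0)
n_patients = 6
deep = rng.standard_normal((n_patients, 32))       # e.g. pooled CNN features
radiomics = rng.standard_normal((n_patients, 10))  # e.g. texture and shape features
clinical = rng.standard_normal((n_patients, 4))    # e.g. age and laboratory values
fused = fuse_features(deep, radiomics, clinical)

weights = rng.standard_normal(fused.shape[1]) * 0.1  # untrained placeholder weights
probabilities = predict_response(fused, weights, bias=0.0)
```

Standardizing each block first keeps a large block, such as the deep features, from dominating purely because of its scale. In practice the classifier would be fitted on a training cohort and evaluated on held-out validation sets.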
Table 3 Characteristics of deep learning networks for treatment response of hepatocellular carcinoma from computed tomography images.
As previously stated, a variety of curative and palliative therapeutic approaches exist for HCC. Despite this, the prognosis for HCC patients remains poor[139]. Curative treatment options for early-stage HCC provide a 5-year survival rate of more than 70%[4]. However, up to 70% of patients with HCC experience tumor recurrence within 5 years after undergoing resection or ablation therapy[140]. For patients with advanced liver cancer who have undergone systemic therapy, the median survival is only 1 year to 1.5 years[4]. Monitoring a patient’s disease status and identifying those with a poor prognosis inform clinical decision-making and enable timely intervention. The molecular and pathological profiles of HCC are strongly correlated with biological aggressiveness and have significant prognostic value[141]. However, assessing these factors requires tissue samples, which are not readily available for most HCC patients[139]. Fortunately, imaging techniques can depict entire tumors and offer noninvasive insights into their biology and heterogeneity[139]. Advances in radiomics and DL have greatly enhanced the use of CT imaging data, improved the accuracy of prognosis prediction, and provided valuable insights for clinical treatment[142].
Recurrence
Tumor recurrence is a major determinant of overall survival (OS) in patients with HCC[143]. HCC recurrence is classified into early and late recurrence. Early recurrence, defined as recurrence occurring within two years of HCC treatment, is primarily attributed to occult intrahepatic dissemination from the original tumor and is correlated with a high initial tumor burden[144]. In contrast, late recurrence (≥ 2 years) is not related to the primary lesion but rather to the development of de novo HCC[144]. Compared with patients with late recurrence, patients with early recurrent HCC have significantly shorter post-recurrence survival[145]. Therefore, predicting HCC recurrence and its pattern (early vs late) will assist clinicians in formulating both post-treatment surveillance strategies and corresponding treatment plans. In using CT imaging to develop prognostic models for HCC recurrence, researchers have increasingly integrated clinical variables with DL-based radiographic analysis within a multimodal framework[57-62].
Among the studies covered in this review, modified ResNet is the CNN architecture most frequently used to predict HCC recurrence from CT images[57,58,62] (Table 4). Furthermore, all of these studies integrated the modified ResNet model with clinical data to improve its predictive accuracy[57,58,62]. To maximize the use of information from every phase of contrast-enhanced CT and improve predictive accuracy, Wang et al[57] embedded an attention mechanism into the ResNet network. Intra-phase attention focuses on important information across different channels and spatial locations within a single phase, whereas inter-phase attention focuses on salient features across different contrast phases[57]. When combined with clinical data, the model achieved an accuracy of 81.2% in predicting HCC recurrence[57]. To increase prediction accuracy, Lv et al[58] adopted a multimodal fusion approach that integrates radiomic data, clinical variables, and the ResNet50 model, yielding an AUC of 0.83. Notably, to select the optimal network for constructing a prediction model, Zhang et al[59] compared the predictive performance of four architectures (DenseNet121, ResNet101, ResNet50, and VGG19) on separate training, validation, and test sets. Owing to its superior stability and ability to minimize data loss, the authors ultimately selected VGG19 to construct the prediction model. Their results demonstrated that the 2.5D DL model was superior to the 3D model[59]. Integrating the 2.5D model with clinical data further enhanced the predictive performance, achieving an AUC of 0.804 in the external validation set[59].
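The inter-phase attention idea described above can be illustrated with a small sketch: each contrast phase contributes a feature vector, a score is computed per phase, and the softmax-normalized scores weight the phases before they are summed. This is a simplified illustration under stated assumptions, not the implementation of Wang et al[57]; the phase names, the fixed query vector (which would be learned in a real network), and all values are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def inter_phase_attention(phase_features, query):
    """Weight each phase's feature vector by its attention score, then sum."""
    # Dot-product score between each phase vector and the query vector.
    scores = [sum(f * q for f, q in zip(feat, query)) for feat in phase_features]
    weights = softmax(scores)
    fused = [
        sum(w * feat[i] for w, feat in zip(weights, phase_features))
        for i in range(len(query))
    ]
    return weights, fused


# Hypothetical per-phase feature vectors (e.g. pooled CNN outputs).
phases = {
    "arterial": [1.0, 0.0, 2.0],
    "portal_venous": [0.5, 0.5, 0.5],
    "delayed": [0.0, 1.0, 0.0],
}
query = [1.0, 0.0, 1.0]  # assumed fixed here; learned during training in practice

weights, fused = inter_phase_attention(list(phases.values()), query)
for name, w in zip(phases, weights):
    print(f"{name}: {w:.3f}")
```

In this toy setting the arterial phase receives the largest weight because its features align best with the query. Intra-phase attention follows the same weighting principle but operates over channels and spatial locations within one phase.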
Table 4 Characteristics of deep learning networks for prognosis of hepatocellular carcinoma from computed tomography images, mean ± SD.
OS and progression-free survival (PFS) are the primary endpoints for measuring the efficacy of HCC treatment[146]. As previously mentioned, a variety of treatment modalities are currently available for HCC, and patient survival has improved markedly in some respects[147]. However, patients with HCC undergoing the same treatment show significant variation in survival benefit[148]. The inherent heterogeneity of HCC dictates divergent responses to therapy and ultimately affects patient survival[149]. Therefore, pretreatment evaluation of post-therapeutic survival for individual patients by using diverse information contributes to the formulation of more personalized treatment approaches and ultimately leads to prolonged patient survival. Imaging information has long been an instrumental tool for assessing patient prognosis[139,150]. With the integration of machine learning and DL into radiology, a wealth of information concealed within medical images is being uncovered, thus dramatically increasing the predictive accuracy for patient survival[151].
In the context of CT-based survival prediction, modified ResNet remained the predominant model among the studies covered in this review[63,64,66,67] (Table 4). Notably, before constructing their combined models, both Chen et al[66] and Ren et al[67] compared the performance of ResNet50 with that of other convolutional models, and their results consistently demonstrated the superiority of ResNet50 over alternative architectures. Chen et al[66] integrated hand-crafted radiomic features, DL features, and clinical data to predict the 2-year survival of HCC patients after stereotactic body radiation therapy; their model achieved an AUC of 0.86[66]. Ren et al[67] developed another model that integrated ResNet50, a feature selector based on mutual information maximization, and a nearest-neighbors machine learning algorithm to predict the OS of HCC patients receiving a combination therapy of TACE and TKI; the model achieved an accuracy of 0.92[67]. Dai et al[64], in contrast, used a 3D ResNeXt architecture. ResNeXt, an architecture that builds upon ResNet, integrates the “split-transform-merge” strategy of the Inception network with a stacked building block approach[152]. ResNeXt has garnered significant attention because of its performance improvements on image classification tasks[153]. Dai et al[64] integrated hepatitis B surface antigen, five radiomics signatures, and a 3D ResNeXt architecture into a novel predictive model, which achieved an AUC of 0.89 in the validation cohort[64].
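The “split-transform-merge” strategy can be sketched with plain vectors: the input is sent to several parallel low-dimensional branches (split), each branch applies its own transformation (transform), and the branch outputs are summed and added back to the input through a residual connection (merge). The sketch below is a conceptual illustration only. Real ResNeXt blocks use grouped convolutions on image tensors, and the weights here are hypothetical rather than taken from Dai et al[64].

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def relu(vector):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in vector]


def resnext_block(x, branches):
    """Compute y = x + sum_i up_i(relu(down_i(x))) over all branches."""
    merged = [0.0] * len(x)
    for down, up in branches:
        low = relu(matvec(down, x))   # split and transform in a low-dimensional space
        out = matvec(up, low)         # project back to the input width
        merged = [m + o for m, o in zip(merged, out)]
    # Residual connection inherited from ResNet.
    return [xi + mi for xi, mi in zip(x, merged)]


# Hypothetical input and two identical branches (cardinality 2).
x = [1.0, -2.0, 0.5, 3.0]
down = [[1.0, 0.0, 0.0, 0.0]]          # 4 -> 1 projection
up = [[1.0], [0.0], [0.0], [0.0]]      # 1 -> 4 projection
y = resnext_block(x, [(down, up), (down, up)])
print(y)
```

The number of parallel branches is usually called the cardinality. Increasing it widens the block without deepening the network.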
In addition to modified ResNet, modified EfficientNet is a widely adopted model for predicting the survival period of patients with HCC after treatment by using CT imaging information[68,69] (Table 4). The scalability and efficiency of EfficientNet render it highly suitable for medical image analysis applications, particularly those requiring high precision in resource-limited environments[154]. Notably, EfficientNet has been applied to predict the prognosis of patients with HCC after immunotherapy[68,69]. Building upon EfficientNet, Xia et al[68] proposed a convolutional-recurrent neural network architecture, which can decode spatial features from CT images and capture the temporal dynamics of tumor evolution between baseline and follow-up scans. When integrated with clinical data, the proposed model achieved an AUC of 0.839 for predicting 2-year survival in HCC patients treated with immunotherapy[68]. To improve the prediction of OS and PFS in HCC patients receiving ICI therapy, Xu et al[69] developed an integrated system that combined EfficientNet, a semi-supervised learning framework, a CNN-Transformer model, and clinical data. By fully using diverse data types, the model achieved an AUC of 0.84 in predicting 2-year patient survival[69].
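Because most of the studies above report the AUC, a short worked example may help interpret the figures. The AUC equals the probability that a randomly chosen positive case (here, a patient who died within 2 years) receives a higher predicted risk than a randomly chosen negative case, with ties counted as one half. The labels and scores below are hypothetical and are not taken from any cited study.

```python
def roc_auc(labels, scores):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    # Count positive-negative pairs ranked correctly; ties count as half.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical example: 1 = died within 2 years, 0 = survived.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.7, 0.6, 0.4, 0.3, 0.2]  # predicted risk from some model

auc = roc_auc(labels, scores)
print(round(auc, 3))
```

Here 8 of the 9 positive-negative pairs are ranked correctly, so the AUC is 8/9. An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation.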
FUTURE
As previously mentioned, DL has significantly improved the utilization efficiency of CT images for early diagnosis, treatment decision-making, and prognostic assessment of patients with HCC. Nevertheless, many challenges remain. First, publication bias should be noted. Studies with negative or suboptimal results are rarely published, and this imbalance may lead to an overly optimistic assessment of the potential of DL in clinical applications. Researchers should therefore be encouraged to preregister their studies so that all research outcomes can be found irrespective of their publication status, and specialized journals for publishing negative results should be established. Second, in terms of experimental design, most current studies are single-center and retrospective. Single-center studies carry the risk of significant regional bias, and a retrospective design precludes real-time validation. Therefore, there is a clear need to initiate multicenter prospective studies and even randomized controlled trials. Third, as mentioned earlier, researchers have used either contrast-enhanced or non-contrast CT scans. The substantial difference in image quality and characteristics between these protocols presents a significant preprocessing challenge. Effectively mitigating this heterogeneity to ensure model generalizability remains a critical and unresolved issue in current research. Fourth, even within the same application domain, different researchers have used a variety of single-network and fusion models with varying performance to extract information from CT images. However, because these models are trained and evaluated on non-standardized, heterogeneous datasets, their results are often not directly comparable.
This lack of consistency makes it difficult for clinical practitioners to select the optimal model for deployment. Sharing image data or using public datasets for model validation is therefore imperative. Fifth, AI tools are often described as black boxes[155], and DL is no exception[156]. This leads to poor interpretability of the results. The interpretability of a model’s output is crucial for clinicians to make corresponding decisions regarding diagnosis, treatment, and monitoring. Designing models with inherent interpretability, rather than creating methods to explain black-box models, might be a more reliable approach to mitigate these issues[157]. However, this approach still requires further exploration. In addition to the technical issues mentioned above, the application of DL presents ethical and regulatory challenges. Adherence to regulatory and ethical frameworks is an essential prerequisite for integrating DL-based imaging into HCC clinical workflows. This process is inherently resource intensive and requires considerable effort from researchers. Currently, no DL models have been approved by the United States Food and Drug Administration for image-guided treatments of HCC[158]. However, some Food and Drug Administration-approved AI-based software devices already exist for thoracic radiology[159]. This precedent paves the way for the integration of DL-based imaging into the clinical management of HCC. A growing number of DL-based imaging technologies are expected to be integrated into HCC clinical practice in the near future.
CONCLUSION
Over the past few years, DL has made remarkable progress in extracting information from CT images to aid clinicians in the early diagnosis, treatment response monitoring, and prognosis assessment of patients with HCC. Researchers often integrate multiple DL models to extract information from CT images. This approach compensates for the drawbacks of individual networks and increases their ability to extract information. In addition, researchers have integrated traditional radiomic and clinical data with DL models, which further enhances model performance. These advancements demonstrate that DL holds great promise for extracting information from CT images and advancing health care. However, several significant challenges remain: reducing CT image heterogeneity during preprocessing, enabling data sharing, comparing the performance of different methods, and interpreting model outputs. Therefore, the true integration of DL into clinical practice will require ongoing and dedicated efforts from researchers.
Footnotes
Provenance and peer review: Invited article; Externally peer reviewed.
Peer-review model: Single blind
Specialty type: Gastroenterology and hepatology
Country of origin: China
Peer-review report’s classification
Scientific Quality: Grade A, Grade B
Novelty: Grade A, Grade C
Creativity or Innovation: Grade A, Grade B
Scientific Significance: Grade A, Grade C
P-Reviewer: Li HG, PhD, China; Vaithiyam V, MD, Assistant Professor, India
S-Editor: Fan M
L-Editor: A
P-Editor: Yu HG
Pan Y, Chen H, Zhang X, Liu W, Ding Y, Huang D, Zhai J, Wei W, Wen J, Chen D, Zhou Y, Liang C, Wong N, Man K, Cheung AH, Wong CC, Yu J. METTL3 drives NAFLD-related hepatocellular carcinoma and is a therapeutic target for boosting immunotherapy. Cell Rep Med. 2023;4:101144.
Nahon P, Najean M, Layese R, Zarca K, Segar LB, Cagnot C, Ganne-Carrié N, N'Kontchou G, Pol S, Chaffaut C, Carrat F, Ronot M, Audureau E, Durand-Zaleski I; ANRS CO12 CirVir; ANRS CO22 Hepather; Scientific Committee – Voting members; CIRRAL groups. Early hepatocellular carcinoma detection using magnetic resonance imaging is cost-effective in high-risk patients with cirrhosis. JHEP Rep. 2022;4:100390.
Guo L, Hao X, Chen L, Qian Y, Wang C, Liu X, Fan X, Jiang G, Zheng D, Gao P, Bai H, Wang C, Yu Y, Dai W, Gao Y, Liang X, Liu J, Sun J, Tian J, Wang H, Hou J, Fan R. Early warning of hepatocellular carcinoma in cirrhotic patients by three-phase CT-based deep learning radiomics model: a retrospective, multicentre, cohort study. EClinicalMedicine. 2024;74:102718.
Rocha BA, Ferreira LC, Vianna LGR, Ferreira LGG, Ciconelle ACM, Da Silva Noronha A, Cortez Filho JM, Nogueira LSL, Leite JMRS, da Silva Filho MRM, da Costa Leite C, de Maria Felix M, Gutierrez MA, Nomura CH, Cerri GG, Carrilho FJ, Ono SK. Contrast phase recognition in liver computer tomography using deep learning. Sci Rep. 2022;12:20315.
Lee IC, Tsai YP, Lin YC, Chen TC, Yen CH, Chiu NC, Hwang HE, Liu CA, Huang JG, Lee RC, Chao Y, Ho SY, Huang YH. A hierarchical fusion strategy of deep learning networks for detection and segmentation of hepatocellular carcinoma from computed tomography images. Cancer Imaging. 2024;24:43.
Lin Z, Wang W, Yan Y, Ma Z, Xiao Z, Mao K. A deep learning-based clinical-radiomics model predicting the treatment response of immune checkpoint inhibitors (ICIs)-based conversion therapy in potentially convertible hepatocelluar carcinoma patients: a tumor marker prognostic study. Int J Surg. 2025;111:3342-3355.
Xu Y, Liu Z, Li Y, Hou H, Cao Y, Zhao Y, Guo W, Cui L. Feature data processing: Making medical data fit deep neural networks. Future Gener Comp Sy. 2020;109:149-157.
Siddique N, Paheding S, Elkin CP, Devabhaktuni V. U-Net and Its Variants for Medical Image Segmentation: A Review of Theory and Applications. IEEE Access. 2021;9:82031-82057.
Ronneberger O, Fischer P, Brox T. U-Net: Convolutional Networks for Biomedical Image Segmentation. Available from: arXiv:1505.04597.
Pravitasari A, Asnawi M, Nugraha F, Darmawan G, Hendrawati T. Enhancing 3D Lung Infection Segmentation with 2D U-Shaped Deep Learning Variants. Appl Sci. 2023;13:11640.
Hasanah SA, Pravitasari AA, Abdullah AS, Yulita IN, Asnawi MH. A Deep Learning Review of ResNet Architecture for Lung Disease Identification in CXR Image. Appl Sci. 2023;13:13111.
Huang G, Liu Z, Van Der Maaten L, Weinberger KQ. Densely Connected Convolutional Networks. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR); 2017 Jul 21-26; Honolulu, HI, United States. IEEE, 2017: 2261-2269.
Zakaria N, Mohmad Hassim YM. A Review Study of the Visual Geometry Group Approaches for Image Classification. J Appl Sci Tech Comput. 2024;1:14-28.
Ajit A, Acharya K, Samanta A. A Review of Convolutional Neural Networks. Proceedings of the 2020 International Conference on Emerging Trends in Information Technology and Engineering (ic-ETITE); 2020 Feb 24-25; Vellore, India. IEEE, 2020: 1-5.
Nayak GHH, Alam W, Singh KN, Avinash G, Ray M, Kumar RR. Modelling monthly rainfall of India through transformer-based deep learning architecture. Model Earth Syst Environ. 2024;10:3119-3136.
Zamir SW, Arora A, Khan S, Hayat M, Khan FS, Yang M. Restormer: Efficient Transformer for High-Resolution Image Restoration. Available from: arXiv:2111.09881.
Gehlot M, Gandhi GC. “EffiNet-TS”: A deep interpretable architecture using EfficientNet for plant disease detection and visualization. J Plant Dis Prot. 2023;130:413-430.
Ali K, Shaikh ZA, Khan AA, Laghari AA. Multiclass skin cancer classification using EfficientNets – a first step towards preventing skin cancer. Neurosci Inform. 2022;2:100034.
Meng Z, Du X, Yao X, He L, Lin L. Classification of Camellia oleifera using a dual recognition strategy based on deep learning. Multimed Tools Appl. 2024;84:12219-12240.
Nansamba B, Nakatumba-Nabende J, Katumba A, Kateete DP. A Systematic Review on Application of Multimodal Learning and Explainable AI in Tuberculosis Detection. IEEE Access. 2025;13:62198-62221.
Narita K, Nakamura Y, Higaki T, Kondo S, Honda Y, Kawashita I, Mitani H, Fukumoto W, Tani C, Chosa K, Tatsugami F, Awai K. Iodine maps derived from sparse-view kV-switching dual-energy CT equipped with a deep learning reconstruction for diagnosis of hepatocellular carcinoma. Sci Rep. 2023;13:3603.
Fan R, Papatheodoridis G, Sun J, Innes H, Toyoda H, Xie Q, Mo S, Sypsa V, Guha IN, Kumada T, Niu J, Dalekos G, Yasuda S, Barnes E, Lian J, Suri V, Idilman R, Barclay ST, Dou X, Berg T, Hayes PC, Flaherty JF, Zhou Y, Zhang Z, Buti M, Hutchinson SJ, Guo Y, Calleja JL, Lin L, Zhao L, Chen Y, Janssen HLA, Zhu C, Shi L, Tang X, Gaggar A, Wei L, Jia J, Irving WL, Johnson PJ, Lampertico P, Hou J. aMAP risk score predicts hepatocellular carcinoma development in patients with chronic hepatitis. J Hepatol. 2020;73:1368-1378.
Xu L, Zhu Y, Zhang Y, Yang H. Liver segmentation based on region growing and level set active contour model with new signed pressure force function. Optik. 2020;202:163705.
Kim T, Behdinan K. Advances in machine learning and deep learning applications towards wafer map defect recognition and classification: a review. J Intell Manuf. 2023;34:3215-3247.
Yao LQ, Chen ZL, Feng ZH, Diao YK, Li C, Sun HY, Zhong JH, Chen TH, Gu WM, Zhou YH, Zhang WG, Wang H, Zeng YY, Wu H, Wang MD, Xu XF, Pawlik TM, Lau WY, Shen F, Yang T. Clinical Features of Recurrence After Hepatic Resection for Early-Stage Hepatocellular Carcinoma and Long-Term Survival Outcomes of Patients with Recurrence: A Multi-institutional Analysis. Ann Surg Oncol. 2022;.
Llovet JM, Villanueva A, Marrero JA, Schwartz M, Meyer T, Galle PR, Lencioni R, Greten TF, Kudo M, Mandrekar SJ, Zhu AX, Finn RS, Roberts LR; AASLD Panel of Experts on Trial Design in HCC. Trial Design and Endpoints in Hepatocellular Carcinoma: AASLD Consensus Conference. Hepatology. 2021;73 Suppl 1:158-191.
Cao J, Zhang H, Ren W. Improved YOLOv3 model based on ResNeXt for target detection. Proceedings of the 2021 IEEE International Conference on Power, Intelligent Computing and Systems (ICPICS); 2021 Jul 29-31; Shenyang, Liaoning Province, China. IEEE, 2021: 709-713.
Mienye ID, Swart TG, Obaido G, Jordan M, Ilono P. Deep Convolutional Neural Networks in Medical Image Analysis: A Review. Inform. 2025;16:195.