Review
Copyright © The Author(s) 2022.
World J Gastroenterol. Nov 28, 2022; 28(44): 6230-6248
Published online Nov 28, 2022. doi: 10.3748/wjg.v28.i44.6230
Table 1 Summary of machine learning articles studying virus and inflammatory-related liver damage
Ref. | Objective | Subjects | Variables | ML model | Performance | Observations/remarks
Fialoke et al[63] | To predict NASH in NAFLD patients | n = 108139, NASH and healthy (non-NASH) populations | Demographic data, type 2 diabetes status, and blood biomarkers | RF, XGBoosting, DT, LR | AUROC of 88% by XGBoosting | The average and maximum values of ALT were the most important variables
Ma et al[64] | To predict NAFLD in the general population | n = 10508, subjects who attended a health examination | Age, blood biomarkers, and anthropometric data | LR, RF, SVM, bagging, DT, LR, KNN, BN, hidden NB, AdaBoosting, AODE | 83% accuracy, 0.878 specificity, 0.675 sensitivity, and 0.655 F-measure score by BN | BMI, TG, GGT, ALT and uric acid were the top five predictors
Yip et al[65] | To detect NAFLD in the general population | n = 500, involving NAFLD patients and healthy subjects | Demographic and clinical data, and blood biomarkers | LR, RIDGE regression, AdaBoosting, DT | AUROC of 90% by AdaBoosting | ALT, HDL-c, TG, HbA1c and white blood cells were the top predictors
Pei et al[66] | To identify FLD in general patients | n = 3419, patients, of which 845 had FLD | Age, anthropometric data, and blood biomarkers | RF, ANN, KNN, XGBoosting, LDA | 0.9415 accuracy, 0.9306 AUC, and 0.9091 sensitivity by XGBoosting | Uric acid, BMI, and TG were the top three risk factors
Choi et al[67] | To stage liver fibrosis | n = 7461, patients with pathologically confirmed liver fibrosis | Age, sex, clinical data, CT images, and liver fibrosis stage | CNN | Overall staging accuracy of 79.4% and AUROCs of 0.96, 0.97, and 0.95 for diagnosing significant fibrosis, advanced fibrosis, and cirrhosis, respectively | The model outperformed the radiologist’s interpretation, APRI, and FIB-4 index
Chen et al[68] | To stage liver fibrosis in patients with CHB | n = 513, patients with confirmed liver fibrosis | Age, sex, CT liver images | RF, KNN, SVM, NB | 0.8118-0.9125 accuracy by RF for all stages | The adopted classifiers significantly outperformed the liver fibrosis index method
Jeong et al[69] | To classify susceptible individuals for adjuvant treatment in patients with ICC after resection | n = 1421, ICC patients | Age, sex, clinical data, and blood biomarkers | DNN | AUC of 0.78 | The model was found to be more accurate than the traditional AJCC stage classifier
Wübbolding et al[70] | To identify immune profiles for the prediction of early virological relapse | n = 284, patients with CHB treated with NA antivirals | Age, sex, and analytical and blood biomarkers | KNN, RF, LR | AUC of 0.89 | The combination of IL-2, MIG/CCL9, RANTES/CCL5, SCF, and TRAIL was reliable in predicting viral relapse
Hong et al[71] | To predict esophageal varices in patients with HBV-related cirrhosis | n = 197, patients with HBV-related cirrhosis | PLT count, spleen width, and portal vein diameter | ANN | Sensitivity of 96.5%, specificity of 60.4%, accuracy of 86.8% | The model obtained a positive predictive value of 90.00% and a negative predictive value of 80.85%
Zhong et al[72] | To compare the prognostic performance of ALBI and CTP grades for HCC treated with TACE combined with sorafenib as an initial treatment | n = 504, HCC patients | ALBI and CTP grades, BCLC stage, clinical data, and plasma α-fetoprotein | ANN | - | The ALBI grade had higher importance in survival prediction than the CTP grade
Shi et al[73] | To predict in-hospital mortality after primary liver cancer surgery | n = 22926, HCC surgery patients | Age, sex, clinical, and hospital data | ANN, LR | Accuracy of 97.28% and AUROC of 84.67% by ANN | The ANN model had higher overall performance indices and accurately predicted in-hospital mortality
Shi et al[74] | To predict 5-yr mortality after surgery for HCC | n = 22926, HCC surgery patients | Age, sex, clinical, and hospital data | ANN, LR | Accuracy of 96.57% and AUROC of 88.51% by ANN | Surgeon volume was the top predictor
Patnaik et al[75] | To predict liver function-related scores (MELD, APRI, CTP) using breath biomarkers | n = 28, healthy patients compared to n = 17, liver patients | Age, anthropometric data, blood biomarkers, breath analysis | LR, RF, SVR, ETR | R² values of 0.78, 0.82, and 0.85 for CTP score, APRI score, and MELD, respectively, by ETR | Isoprene, limonene and dimethyl sulfide can be potential biomarkers for liver disease
Butt et al[85] | To diagnose the stage of hepatitis C | n = 968, patients with HCV | Age, anthropometric data, blood biomarkers, and histological staging | ANN, RF, SVM, XGBoosting | 98.89% precision by ANN | The model performed better than previously presented models by other authors
Wei et al[87] | To predict HBV and HCV-related hepatic fibrosis | n = 490, HBV patients; n = 254 and n = 230, HCV patients | Age, BMI, analytical data (FIB-4 score), and liver biopsy | GB, DT, RF | AUROC of 0.918 by GB | GB outperformed the FIB-4 predictive score
Barakat et al[89] | To predict and stage hepatic fibrosis in children with HCV | n = 166, children with CHC | Analytical data (APRI and FIB-4 scores) | RF | AUC of 0.903 for any type of fibrosis | RF outperformed the FIB-4 and APRI predictive scores
Konerman et al[88] | To predict progression of HCV | n = 72683, veterans with CHC | Age, BMI, demographic data, and blood biomarkers (APRI score) | CS and LGT Cox and boosting | AUROC of 0.830 and sensitivity of 0.77 by the LGT boosting model for 1-yr follow-up | APRI and PLT count were the top predictors in the LGT boosting model
Wong et al[86] | To predict HCC in patients with CVH | n = 86804, CVH patients, of which 6821 had HCC | Age, sex, clinical data, and blood biomarkers | LR, RIDGE regression, AdaBoosting, RF, DT | AUROC of 0.992 and 0.837 by RF in the training and validation cohorts, respectively | ML models obtained better AUROCs than traditional HCC risk scores
Feldman et al[91] | To predict DAA therapy duration in hepatitis C | n = 3943, HCV patients with sofosbuvir/ledipasvir as the first course of DAA, of which n = 240 received the prolonged DAA treatment | Age, sex, and clinical data (including hepatitis C record data) | XGBoosting, RF, SVM | AUC of 0.745 by XGBoosting | Results showed age, comorbidity burden, and type 2 diabetes status as new predictors for DAA therapy duration
Kamboj et al[92] | To predict repurposed drugs for HCV | n = 17968, HCV molecular fingerprints | Experimentally validated small molecules from the ChEMBL database with bioactivity against HCV NS3, NS3/A4, NS5A and NS5B proteins | SVM, ANN, KNN, RF | R² value of 0.92 by SVM | Results identified more than 8 repurposed anti-HCV treatments
Tian et al[93] | To predict HBsAg seroclearance | n = 2235, patients with CHB, of which 106 achieved HBsAg seroclearance | Age, BMI, demographic and clinical data, and blood biomarkers | LR, RF, DT, XGBoosting | AUC of 0.891 by XGBoosting | Level of HBsAg followed by age and HBV DNA were the top predictors
Chen et al[94] | To predict HBV-induced HCC using quasispecies patterns of HBV | n = 307, CHB patients; n = 237, HBV-related HCC patients | rt nucleic acid and rt/s amino acid sequences | SVM, RF, KNN, LR | AUC of 0.96 and accuracy of 0.90 by RF | HBV rt gene features can efficiently discriminate HCC from CHB
Mueller-Breckenridge et al[95] | To classify HBeAg status in HBV patients using virus full-length genome quasispecies | n = 352, CHB untreated patients | Matrix of allele frequencies (0.1-0.99) and the associated HBeAg status | RF | Balanced accuracy range of 0.8-1 | n1896GA, n1934AT, n1753TC mutants were the highest-ranking variables
Kayvanjoo et al[96] | To predict HCV interferon/ribavirin therapy outcome based on viral nucleotide attributes | n = 76, gene attributes | HCV nucleotide attributes | DT, SVM, NB, DNN | Accuracy of 84.17% by SVM in responder vs relapser of subtype 1b sequences | Dinucleotides UA and UU were top predictors in the combination treatment outcome
Li et al[98] | To distinguish influenza from COVID-19 patients | n = 398, COVID-19 and influenza cases | Age, sex, blood biomarkers, clinical data, and CT and X-ray scans | XGBoosting, RF, and LASSO and RIDGE regression models | AUC of 0.990, sensitivity of 92.5%, and specificity of 97.9% by XGBoosting | Age, CT scan result, and temperature were the top three predictors
Bhargava et al[99] | To detect novel COVID-19 and discriminate it from pneumonia | n = 31454, images acquired from nine distinct datasets of COVID-19 patients | CT or X-ray scans | KNN, SRC, ANN, SVM | Accuracy of 99.14% by SVM | The SVM model classified the images as normal, pneumonia, and COVID-19 positive with the highest recognition rate
Bennett et al[97] | To predict early severity and clinically characterize COVID-19 patients | n = 174568, patients with a positive lab test for COVID-19 | Age, sex, demographic, anthropometric and clinical data, and blood biomarkers | RF, LR, XGBoosting | AUROC of 0.87 by XGBoosting | Age, oxygen respiratory rate, and blood urea nitrogen were ranked as the top predictors for severity outcome
Günster et al[100] | To identify independent risk factors for 180-d all-cause mortality in COVID-19 patients | n = 8679, hospitalized COVID-19 patients | Age, sex, BMI, and clinical data | LR | AUC of 0.81 | A high BMI and age were strong risk factors for 180-d all-cause mortality, while female sex was protective
Deng et al[101] | To identify clinical indicators for COVID-19 | n = 379, patients, 62 with COVID-19 and 317 with pneumonia | Age, sex, demographic and clinical data, and blood biomarkers | EBM | AUC of 0.948 | Variables grouped under liver function were the top predictor category for COVID-19 prediction
Lipták et al[102] | To identify gastrointestinal predictors for the risk of COVID-19-related hospitalization | n = 680, patients | Age, sex, clinical data, and blood biomarkers | RF | AUC of 0.799 | AST was the top predictor for hospitalization
Elemam et al[103] | To identify immunological and clinical predictors of COVID-19 severity and sequelae | n = 37, COVID-19 patients; n = 40, controls | Age, sex, BMI, clinical data, and blood biomarkers | Stepwise linear regression | AUC of 0.93 for cytokines as predictors; AUC of 0.98 for biochemical markers as predictors | IL-6 and granzyme B were the top potential predictors of liver injury in COVID-19 patients
Mashraqi et al[104] | To predict adverse effects on liver functions of COVID-19 ICU patients | n = 140, COVID-19 patients admitted to ICU | Blood biomarkers and existence of liver damage | SVM, KNN, ANN, NB, DT | AUC of 0.857 and precision of 0.95 by SVM | AST and ALT were top predictors of liver damage in these patients
Soltan et al[106] | To evaluate a laboratory-free COVID-19 triage for emergency care | n = 114957, emergency presentations prior to the global COVID-19 pandemic and n = 437, COVID-19 positive | Blood biomarkers, blood gas, and vital signs | LR, XGBoosting, RF | AUROC range of 0.9-0.94 by XGBoosting across datasets | The model could effectively triage patients presenting to hospital for COVID-19 without lab results
Gao et al[111] | To predict mortality in patients with alcoholic hepatitis | n = 210, alcoholic hepatitis patients | Age, clinical data, blood biomarkers, and omics data sets (metagenomics, lipidomics, and metabolomics) | GB, LR, SVM, RF | AUC of 0.87 by GB for 30-d mortality prediction using the dataset combining clinical data, bacteria and MetaCyc pathways, and for 90-d mortality prediction using the fungi dataset | The model performed better than the currently used MELD score
Table 2 Summary of the most repeated inputs of the machine learning models with the most repeated predictor outcomes for the four main inflammatory-related liver conditions
Inflammatory-related liver condition | Inputs | Most repeated predictors
FLD | Age, sex, blood biomarkers, and demographic, anthropometric, and clinical data | BMI, uric acid, TG, and ALT levels
Liver fibrosis | Age, sex, and CT images | Better diagnosis compared to classical methods like APRI and FIB-4 indexes
Virus-induced hepatitis | Age, sex, blood biomarkers, and demographic, anthropometric, and clinical data | AST, PLT levels, APRI index, and age
COVID-19 | Age, sex, blood biomarkers, CT images, and demographic, anthropometric, and clinical data | Age, BMI, CT images, oxygen rate, AST, and ALT levels