Copyright
©The Author(s) 2022.
World J Gastroenterol. Nov 28, 2022; 28(44): 6230-6248
Published online Nov 28, 2022. doi: 10.3748/wjg.v28.i44.6230
Ref. | Objective | Subjects | Variables | ML model | Performance | Observations/remarks |
Fialoke et al[63] | To predict NASH in NAFLD patients | n = 108139, NASH and healthy (non-NASH) populations | Demographic data, type 2 diabetes status, and blood biomarkers | RF, XGBoosting, DT, LR | AUROC of 88% by XGBoosting | The average and maximum ALT values were the most important variables |
Ma et al[64] | To predict NAFLD in the general population | n = 10508, subjects who attended a health examination | Age, blood biomarkers, and anthropometric data | LR, RF, SVM, bagging, DT, LR, KNN, BN, hidden NB, AdaBoosting, AODE | 83% accuracy, 0.878 specificity, 0.675 sensitivity, and 0.655 F-measure score by BN | BMI, TG, GGT, ALT and uric acid were the top five predictors |
Yip et al[65] | To detect NAFLD in the general population | n = 500, involving NAFLD patients and healthy subjects | Demographic, clinical data and blood biomarkers | LR, RIDGE regression, AdaBoosting, DT | AUROC of 90% by AdaBoosting | ALT, HDL-c, TG, HbA1c and white blood cells were the top predictors |
Pei et al[66] | To identify FLD in general patients | n = 3419, patients of which 845 had FLD | Age, anthropometric, and blood biomarkers | RF, ANN, KNN, XGBoosting, LDA | 0.9415 accuracy, 0.9306 AUC, and 0.9091 sensitivity by XGBoosting | Uric acid, BMI, and TG were the top three risk factors |
Choi et al[67] | To stage liver fibrosis | n = 7461, patients with pathologically confirmed liver fibrosis | Age, sex, clinical data, CT images, and liver fibrosis stage | CNN | Overall staging accuracy of 79.4% and an AUROC of 0.96, 0.97, and 0.95 for diagnosing significant and advanced fibrosis, and cirrhosis, respectively | The model outperformed the radiologist’s interpretation, APRI, and FIB-4 index |
Chen et al[68] | To stage liver fibrosis in patients with CHB | n = 513, patients with confirmed liver fibrosis | Age, sex, CT liver images | RF, KNN, SVM, NB | 0.8118-0.9125 accuracy by RF for all stages | The adopted classifiers significantly outperformed the liver fibrosis index method |
Jeong et al[69] | To classify susceptible individuals for adjuvant treatment in patients with ICC after resection | n = 1421, ICC patients | Age, sex, clinical data, and blood biomarkers | DNN | AUC of 0.78 | The model was found to be more accurate than the traditional AJCC stage classifier |
Wübbolding et al[70] | To identify immune profiles for the prediction of early virological relapse | n = 284, patients with CHB and treated with NA antivirals | Age, sex, and analytical and blood biomarkers | KNN, RF, LR | AUC of 0.89 | The combination of IL-2, MIG/CXCL9, RANTES/CCL5, SCF, and TRAIL was reliable in predicting viral relapse |
Hong et al[71] | To predict esophageal varices in patients with HBV-related cirrhosis | n = 197, patients with HBV-related cirrhosis | PLT count, spleen width, and portal vein diameter | ANN | Sensitivity of 96.5%, specificity of 60.4%, accuracy of 86.8% | The model obtained a positive predictive value of 90.00% and a negative predictive value of 80.85% |
Zhong et al[72] | To compare the prognostic performance of ALBI and CTP grades for HCC treated with TACE combined with sorafenib as an initial treatment | n = 504, HCC patients | ALBI and CTP grades, BCLC stage, clinical data, and plasma α-fetoprotein | ANN | - | The ALBI grade had higher importance in survival prediction than the CTP grade |
Shi et al[73] | To predict in-hospital mortality after primary liver cancer surgery | n = 22926, HCC surgery patients | Age, sex, clinical, and hospital data | ANN, LR | Accuracy of 97.28% and AUROC of 84.67% by ANN | The ANN model had higher overall performance indices and accurately predicted in-hospital mortality |
Shi et al[74] | To predict 5-yr mortality after surgery for HCC | n = 22926, HCC surgery patients | Age, sex, clinical, and hospital data | ANN, LR | Accuracy of 96.57% and AUROC of 88.51% by ANN | Surgeon volume was the top predictor |
Patnaik et al[75] | To predict liver function-related scores (MELD, APRI, CTP) using breath biomarkers | n = 28, healthy subjects compared to n = 17, liver patients | Age, anthropometric data, blood biomarkers, breath analysis | LR, RF, SVR, ETR | R² values of 0.78, 0.82, and 0.85 for CTP score, APRI score, and MELD, respectively, by ETR | Isoprene, limonene and dimethyl sulfide are potential biomarkers for liver disease |
Butt et al[85] | To diagnose the stage of hepatitis C | n = 968, patients with HCV | Age, anthropometric data, blood biomarkers, and histological staging | ANN, RF, SVM, XGBoosting | 98.89% precision by ANN | The model performed better than previously presented models by other authors |
Wei et al[87] | To predict HBV and HCV-related hepatic fibrosis | n = 490, HBV patients; n = 254 and 230, HCV patients | Age, BMI, analytical data (FIB-4 score), and liver biopsy | GB, DT, RF | AUROC of 0.918 by GB | GB outperformed the FIB-4 predictive score |
Barakat et al[89] | To predict and stage hepatic fibrosis in children with HCV | n = 166, children with CHC | Analytical data (APRI and FIB-4 scores) | RF | AUC of 0.903 for any type of fibrosis | RF outperformed the FIB-4 and APRI predictive scores |
Konerman et al[88] | To predict progression of HCV | n = 72683, veterans with CHC | Age, BMI, demographic, and blood biomarkers (APRI score) | CS and LGT Cox and boosting | AUROC of 0.830 and 0.77 sensitivity by LGT boosting model for 1 yr follow-up | APRI and PLT count were top predictors in the LGT boosting model |
Wong et al[86] | To predict HCC in patients with CVH | n = 86804, CVH patients, of whom 6821 had HCC | Age, sex, clinical data, and blood biomarkers | LR, RIDGE regression, AdaBoosting, RF, DT | AUROC of 0.992 and 0.837 by RF in training and validation cohorts, respectively | ML models obtained better AUROCs than traditional HCC risk scores |
Feldman et al[91] | To predict DAA therapy duration in hepatitis C | n = 3943, HCV patients with sofosbuvir/ledipasvir as the first course of DAA, of whom n = 240 received the prolonged DAA treatment | Age, sex, and clinical data (including hepatitis C record data) | XGBoosting, RF, SVM | AUC of 0.745 by XGBoosting | Results showed age, comorbidity burden, and type 2 diabetes status as new predictors for DAA therapy duration |
Kamboj et al[92] | To predict repurposed drugs for HCV | n = 17968, HCV molecular fingerprints | Experimentally validated small molecules from the ChEMBL database with bioactivity against HCV NS3, NS3/4A, NS5A and NS5B proteins | SVM, ANN, KNN, RF | R² value of 0.92 by SVM | Results identified more than 8 repurposed anti-HCV treatments |
Tian et al[93] | To predict HBsAg seroclearance | n = 2235, patients with CHB, of whom 106 achieved HBsAg seroclearance | Age, BMI, demographic and clinical data, and blood biomarkers | LR, RF, DT, XGBoosting | AUC of 0.891 by XGBoosting | HBsAg level, followed by age and HBV DNA, were the top predictors |
Chen et al[94] | To predict HBV-induced HCC using quasispecies patterns of HBV | n = 307, CHB patients; n = 237, HBV-related HCC patients | rt nucleic acid and rt/s amino acid sequences | SVM, RF, KNN, LR | AUC of 0.96, and accuracy of 0.90 by RF | HBV rt gene features can efficiently discriminate HCC from CHB |
Mueller-Breckenridge et al[95] | To classify HBeAg status in HBV patients using virus full-length genome quasispecies | n = 352, CHB untreated patients | Matrix of allele frequencies (0.1-0.99) and the associated HBeAg status | RF | Balanced accuracy range of 0.8-1 | n1896GA, n1934AT, n1753TC mutants were the highest-ranking variables |
Kayvanjoo et al[96] | To predict HCV interferon/ribavirin therapy outcome based on viral nucleotide attributes | n = 76, gene attributes | HCV nucleotide attributes | DT, SVM, NB, DNN | Accuracy of 84.17% by SVM in responder vs relapser of subtype 1b sequences | Dinucleotides UA and UU were top predictors in the combination treatment outcome |
Li et al[98] | To distinguish influenza from COVID-19 patients | n = 398, COVID-19 and influenza cases | Age, sex, blood biomarkers, clinical data, and CT and X-ray scans | XGBoosting, RF, and LASSO and RIDGE regression models | AUC of 0.990, sensitivity of 92.5%, and specificity of 97.9% by XGBoosting | Age, CT scan result, and temperature were the top three predictors |
Bhargava et al[99] | To detect novel COVID-19 and discriminate it from pneumonia | n = 31454, images acquired from nine distinct datasets of COVID-19 patients | CT or X-ray scans | KNN, SRC, ANN, SVM | Accuracy of 99.14% by SVM | The SVM model classified the images as normal, pneumonia, and COVID-19 positive with the highest recognition rate |
Bennett et al[97] | To predict early severity and clinically characterize COVID-19 patients | n = 174568, patients with a positive lab test for COVID-19 | Age, sex, demographic, anthropometric and clinical data, and blood biomarkers | RF, LR, XGBoosting | AUROC of 0.87 by XGBoosting | Age, oxygen respiratory rate, and blood urea nitrogen were ranked as the top predictors for severity outcome |
Günster et al[100] | To identify independent risk factors for 180-d all-cause mortality in COVID-19 patients | n = 8679, hospitalized COVID-19 patients | Age, sex, BMI, and clinical data | LR | AUC of 0.81 | A high BMI and age were strong risk factors for 180-d all-cause mortality, while female sex was protective |
Deng et al[101] | To identify clinical indicators for COVID-19 | n = 379, patients, 62 with COVID-19 and 317 with pneumonia | Age, sex, demographic and clinical data, and blood biomarkers | EBM | AUC of 0.948 | Variables grouped under liver function were the top predictor category for COVID-19 prediction |
Lipták et al[102] | To identify gastrointestinal predictors for the risk of COVID-19-related hospitalization | n = 680, patients | Age, sex, clinical data, and blood biomarkers | RF | AUC of 0.799 | AST was the top predictor for hospitalization |
Elemam et al[103] | To identify immunological and clinical predictors of COVID-19 severity and sequelae | n = 37, COVID-19 patients; n = 40, controls | Age, sex, BMI, clinical data, and blood biomarkers | Stepwise linear regression | AUC of 0.93 for cytokines as predictors. AUC of 0.98 for biochemical markers as predictors | IL-6 and granzyme B were top potential predictors of liver injury in COVID-19 patients |
Mashraqi et al[104] | To predict adverse effects on liver function in COVID-19 ICU patients | n = 140, COVID-19 patients admitted to ICU | Blood biomarkers and existence of liver damage | SVM, KNN, ANN, NB, DT | AUC of 0.857 and precision of 0.95 by SVM | AST and ALT were the top predictors of liver damage in these patients |
Soltan et al[106] | To evaluate a laboratory-free COVID-19 triage for emergency care | n = 114957, emergency presentations prior to the global COVID-19 pandemic and n = 437, COVID-19 positive | Blood biomarkers, blood gas, and vital signs | LR, XGBoosting, RF | AUROC range of 0.9-0.94 by XGBoosting for datasets | The model could effectively triage patients presenting to hospital for COVID-19 without lab results |
Gao et al[111] | To predict mortality in patients with alcoholic hepatitis | n = 210, alcoholic hepatitis patients | Age, clinical data, blood biomarkers, and omics data sets (metagenomics, lipidomics, and metabolomics) | GB, LR, SVM, RF | AUC of 0.87 by GB for 30-d mortality prediction using the dataset combining clinical data, bacteria and MetaCyc pathways, and for 90-d mortality prediction using the fungi dataset | The model performed better than the currently used MELD score |
Inflammatory-related liver condition | Inputs | Most repeated predictors |
FLD | Age, sex, blood biomarkers, and demographic, anthropometric, and clinical data | BMI, uric acid, TG, and ALT levels |
Liver fibrosis | Age, sex, and CT images | Better diagnosis compared to classical methods like APRI and FIB-4 indexes |
Virus-induced hepatitis | Age, sex, blood biomarkers, and demographic, anthropometric, and clinical data | AST, PLT levels, APRI index, and age |
COVID-19 | Age, sex, blood biomarkers, CT images, and demographic, anthropometric, and clinical data | Age, BMI, CT images, oxygen rate, AST, and ALT levels |
- Citation: Martínez JA, Alonso-Bernáldez M, Martínez-Urbistondo D, Vargas-Nuñez JA, Ramírez de Molina A, Dávalos A, Ramos-Lopez O. Machine learning insights concerning inflammatory and liver-related risk comorbidities in non-communicable and viral diseases. World J Gastroenterol 2022; 28(44): 6230-6248
- URL: https://www.wjgnet.com/1007-9327/full/v28/i44/6230.htm
- DOI: https://dx.doi.org/10.3748/wjg.v28.i44.6230