Review
Copyright ©The Author(s) 2025.
World J Gastroenterol. Dec 14, 2025; 31(46): 111176
Published online Dec 14, 2025. doi: 10.3748/wjg.v31.i46.111176
Table 1 Artificial intelligence-based studies on drug-induced liver injury prediction, highlighting their methodological frameworks and key performance outcomes
| Ref. | Methodology | Results |
| --- | --- | --- |
| Mostafa et al[61], 2024 | RF & MLP on large human DILI datasets, externally validated on failed drug candidates | RF accuracy 63%, MLP MCC 0.245; models flagged failed drugs in external test set |
| Lesiński et al[62], 2021 | RF combining gene expression & molecular descriptors | AUC approximately 0.73 (high- vs low-risk classification) |
| Liu et al[63], 2022 | Gene-expression cascade modeling preceding DILI histopathology | Mechanistic insights into pathways & TFs |
| Wang et al[64], 2022 | ML on microarray data | AUC > 0.80 for genes DDIT3, GADD45A, SLC3A2, RBM24 |
| Rao et al[65], 2023 | SVM, RF, ANN on physicochemical & off-target features for small molecules | AUC 0.88; sensitivity 0.73; specificity 0.90 |
| Li et al[66], 2021 | DeepDILI: Deep learning combining coupled ML + Mold2 descriptors | MCC 0.331; outperformed conventional ML (RF, SVM) |
| Li et al[67], 2020 | 8-layer deep neural network on human cell-line transcriptomics (L1000) | Training/IV AUC 0.802/0.798; balanced accuracies approximately 0.74 |
| Xiao et al[68], 2024 | XGBoost, RF, LASSO for TB treatment DILI prediction with SHAP interpretability | AUROC 0.89 in validation; strong model interpretability |
| Lee and Yoo[69], 2024 | InterDILI interpretable RF model on multi-dataset integration (substructures, descriptors) | AUROC 0.88-0.97; AUPRC 0.81-0.95; feature insights |
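Several entries in Table 1 report threshold-based metrics (accuracy, sensitivity, specificity, balanced accuracy, MCC). As a reading aid, the sketch below shows how these quantities follow from a binary confusion matrix. The function name and the counts are hypothetical; the counts were chosen only so that sensitivity and specificity come out at 0.73 and 0.90, and they are not data from any cited study.

```python
import math


def confusion_metrics(tp, fp, tn, fn):
    """Derive common binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    balanced_accuracy = (sensitivity + specificity) / 2
    # Matthews correlation coefficient; defined as 0 when any margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "balanced_accuracy": balanced_accuracy,
        "mcc": mcc,
    }


# Hypothetical counts: 100 DILI-positive and 100 DILI-negative compounds.
metrics = confusion_metrics(tp=73, fp=10, tn=90, fn=27)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which is one reason it is often reported for imbalanced DILI datasets; an MCC near 0 indicates performance close to chance even when accuracy looks moderate.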
Table 2 Artificial intelligence-augmented vs human-only diagnostic accuracy: Current evidence
| Task | AI model/dataset | AI performance | Comparator | Outcome |
| --- | --- | --- | --- | --- |
| CT-based HCC detection[6] | CNN on CT (deep segmentation, auto segment) | Sensitivity approximately 92%, specificity approximately 97% | Radiologists | Outperformed (AI Sn/Sp 92/98 vs 82.5/96.5); supports workflow |
| PLAN-B-DF (internal/external validation)[70] | Auto segmentation + clinical data | C-index 0.91; 0.89 | Traditional risk scores | Outperformed |
| Ultrasound focal lesion detection[81] | DL on B-mode US | AUC approximately 0.93 | Sonographers | Comparable performance |
| Radiomics MVI in HCC[82] | Deep learning (large meta-analysis) | AUC approximately 0.97 | Non-DL ML (AUC 0.82) | DL superior |
| Histopathology slide review[38] | DL assistance | Accuracy approximately 0.885 | Pathologists | Assisted improvements but risks of misguidance noted |
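The AUC values in the tables above can be read as the probability that a model scores a randomly chosen positive case higher than a randomly chosen negative one. The sketch below computes AUC directly from that pairwise definition. The function name and the scores are hypothetical and purely illustrative; they are not taken from any cited study.

```python
def auc_pairwise(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    correct = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(scores_pos) * len(scores_neg))


# Hypothetical model scores for lesion-positive and lesion-negative cases.
positive_scores = [0.9, 0.8, 0.4]
negative_scores = [0.7, 0.3, 0.2]
auc = auc_pairwise(positive_scores, negative_scores)
print(f"AUC = {auc:.3f}")
```

The C-index reported for survival-type risk models rests on the same idea of concordant pairs, extended to time-to-event outcomes with censoring.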