Diagnostic accuracy and quality of artificial intelligence models in irritable bowel syndrome: A systematic review

doi:10.3748/wjg.v31.i23.106836

Journal Information
Publication Name: World Journal of Gastroenterology
ISSN: 1007-9327
Publisher: Baishideng Publishing Group Inc, 7041 Koll Center Parkway, Suite 160, Pleasanton, CA 94566, USA

Systematic Reviews

World J Gastroenterol. Jun 21, 2025; 31(23): 106836
Published online Jun 21, 2025. doi: 10.3748/wjg.v31.i23.106836

Table 1 Characteristics of the included studies

Ref.	Journal	Country	Sample size	Sex (male:female)	IBS	Control	Biomarker	AI used	Specificity	Sensitivity	Accuracy	AUC
Shepherd et al[17], 2014	J Breath Res	United Kingdom	80		34	46	Gas chromatography	Artificial neural network analyses			0.54
Aggio et al[18], 2017	Aliment Pharmacol Ther	United Kingdom	69	26:43	28	41	Gas chromatography	ML (SVM)			0.91
Mao et al[19], 2020	Hum Brain Mapp	China	68	34:34	34	34	Neuroimaging, HF	ML (SVM)	HF: 0.671, 0.806, 0.529; all: 0.750	HF: 0.626, 0.438, 0.668; all: 0.679	HF: 0.649, 0.619, 0.599; all: 0.715	HF: 0.708, 0.659, 0.61; all: 0.776
Fukui et al[8], 2020	J Clin Med	Japan	111	46:65	85	26	Gut microbiome	ML (random forest model)	> 0.90	> 0.80		0.846
Su et al[9], 2022	Nat Commun	China	1038		145	893	Gut microbiome	ML (random forest, k-nearest neighbors, multi-layer perceptron, and SVM)	0.98	0.94	0.98	0.99
Tanaka et al[20], 2023	Front Microbiol	Japan	70	70:0	35	35	Protease activity, C-terminal residue of K, R, S, or G, all probes, microbiome, and metabolome	ML (random forest model)		Protease activity: 0.727, and all probes: 0.909	Protease activity: 0.81, and all probes: 0.905	Protease activity: 0.83, all probes: 0.92, microbiome: 0.58, metabolome: 0.67

IBS: Irritable bowel syndrome; AI: Artificial intelligence; AUC: Area under the curve; HF: Habenula function; ML: Machine learning; SVM: Support vector machine.

Table 2 Quality assessment of the included studies

Ref.	Diagnostic accuracy				Machine learning
Ref.	Patient selection¹	Index test¹	Reference standard¹	Flow & timing¹	Predictors²	Outcomes²	Analysis²	Data processing³	Model specification³	Training/validation³	Performance metrics³	Transparency³
Shepherd et al[17]	0 (clinic-based sample)	1 (ANN model)	1 (Rome II)	0 (no external validation)	1 (ANN)	1 (Rome II)	1 (cross-validation used)	1 (time binning and normalization)	1 (ANN model with hidden layers)	1 (4-fold cross-validation)	1 (sensitivity, specificity calculated)	0 (limited code transparency)
Aggio et al[18]	1 (diverse control)	1 (SVM and PLS pipeline)	1 (CRP and WCC levels)	1 (partial external validation)	1 (SVM)	1 (definition)	1 (multiple CV methods for robustness)	1 (normalized gas values)	1 (SVM with PLS setup)	1 (Monte Carlo and 10-fold cross-validation)	1 (ROC, sensitivity)	0 (no full code access)
Mao et al[19]	0 (specific IBS subtypes)	1 (multi-class SVM based on ROIs)	1 (Rome IV)	0 (no external validation)	1 (SVM)	1 (Rome IV)	0 (limited test sets)	1 (SPM preprocessing for ROIs)	1 (SVM for IBS classification)	1 (10-fold cross-validation)	1 (AUC, sensitivity, specificity)	0 (limited data sharing)
Fukui et al[8]	1 (multicenter approach)	1 (RF and KNN models)	1 (Rome IV and histological standards)	0 (no external validation)	1 (adjusted predictors)	1 (Rome IV and histological standards)	0 (no external testing)	1 (batch effect adjustment)	1 (RF and KNN classifiers)	1 (nested CV)	1 (AUC and AUPR)	1 (full settings provided, partial sharing)
Su et al[9]	1 (matched control)	1 (RF model)	1 (standard enzyme-linked diagnosis)	1 (robust cross-validation)	1 (enzyme activity focus)	1 (enzyme-based diagnosis)	1 (5-fold cross-validation)	1 (normalization for enzyme analysis)	1 (RF model with grid search)	1 (5-fold cross-validation)	1 (comprehensive ROC analysis)	1 (standard software in R)
Tanaka et al[20]	1 (broad sample selection)	1 (RF validated with Bray-Curtis)	1 (Rome IV and microbial standards)	1 (rigorous cross-validation)	1 (RF)	1 (Rome IV for microbial analysis)	1 (external validation)	1 (Bray-Curtis dissimilarity for microbiome)	1 (RF validated externally)	1 (nested CV with external testing)	1 (AUROC and AUPR)	1 (code and dataset on GitHub)

¹Quality assessment of diagnostic accuracy studies-2.

²Prediction model risk of bias assessment tool-artificial intelligence.

³Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis-artificial intelligence.

1 point if the criterion is met (indicating low risk of bias or high methodological rigor) and 0 points if the criterion is not met. ANN: Artificial neural network; SVM: Support vector machine; PLS: Partial least squares; CRP: C-reactive protein; WCC: White cell count; CV: Cross-validation; ROC: Receiver operating characteristic curve; IBS: Irritable bowel syndrome; ROI: Region of interest; RF: Random forest; KNN: K-nearest neighbors; SPM: Statistical parametric mapping; AUC: Area under the curve; AUPR: Area under the precision-recall curve; AUROC: Area under the receiver operating characteristic curve.
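The 0/1 scoring described in the footnote can be tallied per study and per tool. The sketch below is only an illustration: the item scores are transcribed from Table 2, but summing them into subtotals and a 12-point total is an assumption made here, not a summary score reported in the table, and the names `SCORES`, `TOOLS`, and `tally` are hypothetical.

```python
# Item scores transcribed from Table 2, in column order:
# QUADAS-2 (4 items), PROBAST-AI (3 items), TRIPOD-AI (5 items).
SCORES = {
    "Shepherd et al[17]": [0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0],
    "Aggio et al[18]":    [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0],
    "Mao et al[19]":      [0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 0],
    "Fukui et al[8]":     [1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1],
    "Su et al[9]":        [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    "Tanaka et al[20]":   [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
}

# Column ranges covered by each assessment tool (hypothetical helper mapping).
TOOLS = {
    "QUADAS-2": slice(0, 4),
    "PROBAST-AI": slice(4, 7),
    "TRIPOD-AI": slice(7, 12),
}


def tally(items):
    """Return per-tool subtotals and the overall total for one study."""
    result = {tool: sum(items[cols]) for tool, cols in TOOLS.items()}
    result["total"] = sum(items)
    return result


# Illustrative aggregation only; the table itself reports item-level scores.
totals = {study: tally(items) for study, items in SCORES.items()}

if __name__ == "__main__":
    for study, t in totals.items():
        print(
            f"{study}: QUADAS-2 {t['QUADAS-2']}/4, "
            f"PROBAST-AI {t['PROBAST-AI']}/3, "
            f"TRIPOD-AI {t['TRIPOD-AI']}/5, total {t['total']}/12"
        )
```

A tally like this makes it easier to see where points are lost; the zeros cluster in the patient selection, flow and timing, analysis, and transparency columns.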

Citation: Bhagavathula AS, Al Qady AM, Aldhaleei WA. Diagnostic accuracy and quality of artificial intelligence models in irritable bowel syndrome: A systematic review. World J Gastroenterol 2025; 31(23): 106836
URL: https://www.wjgnet.com/1007-9327/full/v31/i23/106836.htm
DOI: https://dx.doi.org/10.3748/wjg.v31.i23.106836