Deep learning approaches for image-based snoring sound analysis in the diagnosis of obstructive sleep apnea-hypopnea syndrome: A systematic review

doi:10.4329/wjr.v17.i9.109116


Journal Information of This Article

Publication Name: World Journal of Radiology
ISSN: 1949-8470
Publisher: Baishideng Publishing Group Inc, 7041 Koll Center Parkway, Suite 160, Pleasanton, CA 94566, USA

Systematic Reviews

World J Radiol. Sep 28, 2025; 17(9): 109116
Published online Sep 28, 2025. doi: 10.4329/wjr.v17.i9.109116

Table 1 Summary of the database information

Dataset name	Subject sample	Age range	Sampling rate	Recording environment
Snoring-detection[22]	1000 audio clips (500 snoring + 500 non-snoring)	Not specified	16 kHz	Various background environments
ICSD[23]	Infant crying and snoring recordings	0–2 years (infants)	Various	Indoor environments
PSG-audio corpus[24]	212 subjects	23–85 years	48 kHz	Sleep laboratories
SSBPR dataset[25]	20 patients for body position recognition	26–57 years	32 kHz	Hospital environment

PSG: Polysomnography.


Table 2 Performance of snoring sound detection across studies

Ref.	Image type	Model	Main results
Hong et al[56]	Log-Mel spectrogram	Vision Transformer-based deep learning model	Sen: 89.8%, Spe: 91.3%, Acc: 95.9%
Romero et al[58]	Bottleneck features	Deep autoencoder, auditory model	F1: 94.75%
Liu et al[59]	Time-domain waveform, spectrogram, Mel-spectrogram	MobileNetV2 CNN	Acc: 95.00%
Ye et al[57]	Spectrogram, Mel-spectrogram, CWT	CNN, multi-channel spectrogram	Acc: 94.18%
Lim et al[44]	Time-domain waveform, spectrogram, Mel-spectrogram	RNN	Acc: 98.9%
Jiang et al[60]	Time-domain waveform, spectrum, spectrogram, Mel-spectrogram, CQT-spectrogram	CNNs-DNNs, CNNs-LSTMs-DNNs	Acc: 95.00%
Li et al[61]	Spectrogram	1D CNN, 2D CNN (visibility graph)	Acc: 89.3%, Sen: 89.7%, Spe: 88.5%
Xie et al[62]	Spectrogram	CNN, RNN	Acc: 95.3%, Sen: 92.2%, Spe: 97.7%
González-Martínez et al[63]	Harmonic spectrogram	CNN	AUC: 0.89

Acc: Accuracy; Sen: Sensitivity; Spe: Specificity; AUC: Area under the receiver operating characteristic curve; CNN: Convolutional neural network; DNN: Deep neural network; RNN: Recurrent neural network; LSTM: Long short-term memory; CWT: Continuous wavelet transform; CQT: Constant-Q transform.
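
For readers comparing the figures above, the following is a minimal sketch of how accuracy, sensitivity, and specificity are conventionally derived from binary confusion-matrix counts. The function name `binary_metrics` and the example counts are hypothetical and are not taken from any of the reviewed studies; the individual works may have computed or averaged their metrics differently.

```python
import math


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute Acc, Sen, and Spe from confusion-matrix counts.

    tp: snoring events correctly detected (true positives)
    fp: non-snoring events labelled as snoring (false positives)
    tn: non-snoring events correctly rejected (true negatives)
    fn: snoring events that were missed (false negatives)
    """
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("at least one sample is required")

    positives = tp + fn  # all true snoring events
    negatives = tn + fp  # all true non-snoring events

    return {
        "acc": (tp + tn) / total,
        # Sensitivity and specificity are undefined when a class is empty.
        "sen": tp / positives if positives else math.nan,
        "spe": tn / negatives if negatives else math.nan,
    }


# Hypothetical counts, for illustration only (not from the reviewed papers).
metrics = binary_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Because accuracy pools both classes, it depends on the ratio of snoring to non-snoring samples in the test set, which is one reason accuracy values from different datasets are not directly comparable.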


Table 3 Performance of snoring sound classification in patients with obstructive sleep apnea-hypopnea syndrome

Ref.	Image type	Model	Classification	Classification results
Song et al[55]	Mel-spectrogram	XGBoost, CNN, ResNet	OSAHS snoring vs simple snoring	Acc: 83.44%, Sen: 85.27%
Ding et al[46]	Mel-spectrogram	VGG19 + LSTM	Simple snoring vs OSAHS snoring	Acc: 85.21%
Cheng et al[65]	MFCC, Fbanks, LPC	LSTM	Apnea vs normal snoring	Acc: 95.3%
Li et al[66]	Spectrogram, Mel-spectrogram	CNN	OSAHS detection	Acc: 92.5%, Sen: 93.9%, Spe: 91.2%
Serrano et al[67]	Mel-spectrogram	VGGish + bi-LSTM	Apnea vs non-apnea	Acc: 95%

Acc: Accuracy; Sen: Sensitivity; Spe: Specificity; CNN: Convolutional neural network; LSTM: Long short-term memory; OSAHS: Obstructive sleep apnea-hypopnea syndrome; MFCC: Mel-frequency cepstral coefficients; LPC: Linear predictive coding.


Citation: Ding L, Peng JX, Song YJ. Deep learning approaches for image-based snoring sound analysis in the diagnosis of obstructive sleep apnea-hypopnea syndrome: A systematic review. World J Radiol 2025; 17(9): 109116
URL: https://www.wjgnet.com/1949-8470/full/v17/i9/109116.htm
DOI: https://dx.doi.org/10.4329/wjr.v17.i9.109116