Published online Sep 19, 2025. doi: 10.5498/wjp.v15.i9.108359
Revised: May 16, 2025
Accepted: June 30, 2025
Published online: September 19, 2025
Processing time: 136 Days and 13.3 Hours
Optical coherence tomography (OCT) enables high-resolution, non-invasive visualization of retinal structures. Recent evidence suggests that retinal layer alterations may reflect central nervous system changes associated with psychiatric disorders such as schizophrenia (SZ).
To develop an advanced deep learning model to classify OCT images and distinguish patients with SZ from healthy controls using retinal biomarkers.
A novel convolutional neural network, Self-AttentionNeXt, was designed by integrating grouped self-attention mechanisms, residual and inverted bottleneck blocks, and a final 1 × 1 convolution for feature refinement. The model was trained and tested on both a custom OCT dataset collected from patients with SZ and a publicly available OCT dataset (OCT2017).
Self-AttentionNeXt achieved 97.0% accuracy on the collected SZ OCT dataset and over 95% accuracy on the public OCT2017 dataset. Gradient-weighted class activation mapping visualizations confirmed the model’s attention to clinically relevant retinal regions, suggesting effective feature localization.
Self-AttentionNeXt effectively combines transformer-inspired attention mechanisms with convolutional neural networks architecture to support the early and accurate detection of SZ using OCT images. This approach offers a promising direction for artificial intelligence-assisted psychiatric diagnostics and clinical decision support.
Core Tip: This study presents Self-AttentionNeXt, a novel deep learning architecture that integrates self-attention mechanisms from transformer models with convolutional neural networks for the classification of optical coherence tomography images. By focusing on retinal image regions most relevant to schizophrenia, the model achieves high diagnostic accuracy. The integration of inverted bottleneck attention blocks and residual connections enhances both feature representation and training stability. Self-AttentionNeXt demonstrates that combining attention mechanisms with convolutional neural networks can offer a powerful tool for supporting ophthalmic evaluation in patients with schizophrenia.