Published online May 15, 2025. doi: 10.4251/wjgo.v17.i5.103804
Revised: February 20, 2025
Accepted: February 26, 2025
Processing time: 162 Days and 23.6 Hours
Gastric cancer (GC) has a poor prognosis, and the accurate prediction of patient survival remains a significant challenge in oncology. Machine learning (ML) has emerged as a promising tool for survival prediction, though concerns regarding model interpretability, reliance on retrospective data, and variability in performance persist.
This review aimed to evaluate ML applications in predicting GC survival and to highlight key limitations of current methods.
A comprehensive search of PubMed and Web of Science in November 2024 identified 16 relevant studies published after 2019. The most frequently used ML models were deep learning (37.5% of studies), random forests (37.5%), support vector machines (31.25%), and ensemble methods (18.75%). Dataset sizes ranged from 134 to 14177 patients, and nine studies included external validation.
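As a quick arithmetic check, the reported percentages can be converted back into study counts. The sketch below assumes each percentage is a share of the 16 included studies; the variable names are illustrative only. Under that assumption the counts sum to more than 16, which would be consistent with some studies evaluating more than one model type.

```python
# Assumption: each percentage is a share of the 16 included studies.
n_studies = 16
shares_percent = {
    "deep learning": 37.5,
    "random forests": 37.5,
    "support vector machines": 31.25,
    "ensemble methods": 18.75,
}

# Convert each percentage into a whole number of studies.
counts = {model: round(pct / 100 * n_studies) for model, pct in shares_percent.items()}
total_mentions = sum(counts.values())

for model, count in counts.items():
    print(f"{model}: {count} of {n_studies} studies")
print(f"total model mentions: {total_mentions}")
```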
The reported area under the curve values were 0.669–0.980 for overall survival, 0.920–0.960 for cancer-specific survival, and 0.710–0.856 for disease-free survival. These results highlight the potential of ML-based models to improve clinical practice by enabling personalized treatment planning and risk stratification.
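To make the area under the curve (AUC) metric concrete, the following minimal sketch computes it for a binary survival outcome at a fixed follow-up horizon. It is an illustration only: the data are hypothetical, the function name `roc_auc` is ours, and the calculation ignores censoring, which the time-dependent AUC variants commonly used in survival analysis account for. The review does not state how each included study computed its AUC.

```python
def roc_auc(labels, scores):
    """ROC AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (Mann-Whitney formulation; ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = died within the follow-up horizon, 0 = survived.
died = [1, 1, 1, 0, 0, 0, 0, 1]
# Hypothetical model output: predicted risk of death for each patient.
risk = [0.9, 0.8, 0.35, 0.4, 0.2, 0.1, 0.3, 0.7]

auc = roc_auc(died, risk)
print(f"AUC = {auc:.4f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of patients who died from those who survived, which gives a scale for reading the ranges reported above.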
Despite their reliance on retrospective data and their limited interpretability, ML models show promise for GC survival prediction; prospective trials and multidimensional data integration are recommended to improve their clinical applicability.
Core Tip: Machine learning offers significant promise for predicting survival in patients with gastric cancer, but challenges such as data quality, model interpretability, and generalizability must be addressed. This review highlights the importance of integrating diverse data types, robust data preprocessing, and advanced feature-selection techniques to improve prediction accuracy. While open-access and private datasets each have their advantages, ensuring that data are timely and relevant is essential for developing clinically applicable models.
