Gu RT, Li X, Cheng W, Wang XW, Jin H, Liu T. Machine-learning models integrating preoperative clinical factors and circulating tumor DNA features predict lymph node metastasis in esophageal carcinoma. World J Gastrointest Oncol 2026; 18(6): 117851 [DOI: 10.4251/wjgo.v18.i6.117851]
Corresponding Author of This Article
Tao Liu, MD, Department of Thoracic Surgery, Peking University First Hospital, No. 8 Xishiku Street, Xicheng District, Beijing 100034, China. liu-ta0@outlook.com
Research Domain of This Article
Oncology
Article-Type of This Article
research-article
Open-Access Policy of This Article
This article is an open-access article which was selected by an in-house editor and fully peer-reviewed by external reviewers. It is distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/
Baishideng Publishing Group Inc, 7041 Koll Center Parkway, Suite 160, Pleasanton, CA 94566, USA
World J Gastrointest Oncol. Jun 15, 2026; 18(6): 117851 Published online Jun 15, 2026. doi: 10.4251/wjgo.v18.i6.117851
Machine-learning models integrating preoperative clinical factors and circulating tumor DNA features predict lymph node metastasis in esophageal carcinoma
Ren-Tong Gu, Xin Li, Wen Cheng, Xiao-Wei Wang, Hai Jin, Tao Liu
Ren-Tong Gu, Department of Thoracic Surgery, Eastern Hepatobiliary Surgery Hospital, Naval Medical University, Shanghai 201800, China
Xin Li, Xiao-Wei Wang, Hai Jin, Department of Thoracic Surgery, Changhai Hospital, Naval Medical University, Shanghai 200433, China
Wen Cheng, Department of Thoracic Surgery, Shanghai Fourth People’s Hospital, School of Medicine, Tongji University, Shanghai 200434, China
Tao Liu, Department of Thoracic Surgery, Peking University First Hospital, Beijing 100034, China
Co-first authors: Ren-Tong Gu and Xin Li.
Co-corresponding authors: Hai Jin and Tao Liu.
Author contributions: Gu RT and Li X contributed equally to the experimental design and data interpretation as co-first authors; Gu RT, Li X, Cheng W, and Wang XW were involved in data curation, formal analysis, and writing the original draft; Jin H and Liu T were responsible for supervision and for reviewing and editing the manuscript as co-corresponding authors; all authors read and approved the final version of the manuscript to be published.
Institutional review board statement: This study complied with all relevant national regulations and institutional policies, was conducted in accordance with the tenets of the Helsinki Declaration (as revised in 2013), and was approved by the Institutional Review Board of Changhai Hospital (No. CHEC2020-021).
Informed consent statement: All participants provided informed consent.
Conflict-of-interest statement: All authors declare no conflict of interest in publishing the manuscript.
Data sharing statement: The data used in this study may be obtained upon reasonable request from the corresponding authors.
Received: December 18, 2025 Revised: January 31, 2026 Accepted: March 19, 2026 Published online: June 15, 2026 Processing time: 173 Days and 21.1 Hours
Abstract
BACKGROUND
Accurate assessment of lymph node metastasis (LNM) is important in patients with esophageal cancer (EC).
AIM
To construct machine learning (ML) models that use routine clinical data to predict LNM in patients with EC, and to explore whether integrating circulating tumor DNA (ctDNA) features improves their predictive capacity.
METHODS
In this retrospective study, we collected demographic, risk-factor, protein-biomarker, computed tomography (CT), endoscopic, and pathological data from 206 patients with EC. The ctDNA data were available for 57 patients. A total of 79 (38.3%) patients had pathologically confirmed LNM. A total of 81 models were developed by combining different feature-selection techniques with different ML algorithms.
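The model grid described above can be sketched as a cross-product of feature-selection techniques and ML algorithms. This is only an illustration: the abstract does not say how the 81 models break down, so the 9 x 9 split and every name below are assumptions, not the authors' actual setup. The same sketch also re-derives the reported LNM prevalence from the stated counts.

```python
from itertools import product

# Hypothetical placeholders: the abstract does not list the techniques used,
# and the 9 x 9 factorization of 81 is an assumption for illustration only.
FEATURE_SELECTORS = [f"selector_{i}" for i in range(1, 10)]
ALGORITHMS = [f"algorithm_{j}" for j in range(1, 10)]


def build_model_grid(selectors, algorithms):
    """Return one (selector, algorithm) pair per candidate model."""
    return list(product(selectors, algorithms))


grid = build_model_grid(FEATURE_SELECTORS, ALGORITHMS)

# Counts taken from the abstract: 79 of 206 patients had confirmed LNM.
n_patients, n_lnm = 206, 79
prevalence = n_lnm / n_patients

print(len(grid), f"{prevalence:.1%}")
```

Each pair in `grid` would then be trained and scored in the same way, which is what makes a median AUC and F1 score across models meaningful.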
RESULTS
The ML models demonstrated good predictive performance overall, with a median area under the curve (AUC) of 0.767 (interquartile range: 0.679, 0.828) and a median F1 score of 0.715 (interquartile range: 0.672, 0.772). The best model used variables selected through univariate and multivariate logistic regression analyses and was constructed with the random forest algorithm. It incorporated tumor length, tumor location, CT results, depth of tumor invasion, and number of aberrant protein biomarkers. It achieved an AUC of 0.79 (95%CI: 0.65-0.93) and an accuracy of 82.26% (95%CI: 70.47%-90.80%), both superior to CT alone. Incorporating ctDNA features yielded modest improvements in AUC (9.0%) and F1 score (14.3%); however, these gains were not statistically significant.
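For readers less familiar with the metrics reported above, the following minimal sketch shows how AUC, F1 score, and an exact binomial confidence interval for accuracy can be computed. The labels and scores are toy values, not study data. The counts 51 of 62 are an assumption chosen only because they reproduce 82.26%; the abstract does not report the test-set size, nor which interval method the authors used (Clopper-Pearson is assumed here).

```python
from math import comb


def auc_from_scores(labels, scores):
    """AUC as the chance a random positive outscores a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def f1_score(labels, preds):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided interval for k successes in n trials, found by bisection."""

    def solve(f, target):
        # f is decreasing in p, so move the lower end up while f is above target.
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if f(mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    lower = 0.0 if k == 0 else solve(lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    upper = 1.0 if k == n else solve(lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper


# Toy data (not from the study): 1 = LNM present, 0 = LNM absent.
labels = [1, 0, 1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.7, 0.6, 0.4, 0.3, 0.1, 0.8]
preds = [1 if s >= 0.5 else 0 for s in scores]

auc = auc_from_scores(labels, scores)
f1 = f1_score(labels, preds)

# Assumed counts: 51 correct predictions out of 62 test patients (51/62 = 82.26%).
lower, upper = clopper_pearson(51, 62)
print(f"AUC={auc:.4f} F1={f1:.2f} accuracy CI=({lower:.4f}, {upper:.4f})")
```

The printed interval can be compared with the reported 70.47%-90.80%. Its width at this assumed sample size shows why a modest gain from ctDNA in a 57-patient subset may fail to reach statistical significance.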
CONCLUSION
Combining ctDNA features with preoperative clinical factors and CT results may enhance the predictive ability of LNM models in patients with EC.
Core Tip: This retrospective study developed machine learning models to predict lymph node metastasis in 206 patients with esophageal cancer. The optimal random forest model, using clinical, computed tomography, and pathological features, achieved an area under the curve of 0.79 and an accuracy of 82.26%, outperforming computed tomography alone. Integrating circulating tumor DNA features from a 57-patient subset improved the area under the curve and F1 score by 9.0% and 14.3%, respectively, although these gains were not statistically significant, suggesting a potential for enhanced predictive capability.