Published online Sep 28, 2022. doi: 10.3748/wjg.v28.i36.5338
Peer-review started: July 20, 2022
First decision: August 6, 2022
Revised: August 14, 2022
Accepted: September 6, 2022
Article in press: September 6, 2022
Published online: September 28, 2022
Processing time: 64 Days and 19.4 Hours
Gray-level co-occurrence matrix (GLCM)-based feature extraction could serve as a robust and promising tool for improving the prediction of lymph node metastasis (LNM) in individual patients with undifferentiated early gastric cancer (UEGC). Additionally, machine learning (ML) offers more optimized algorithms and clearer feature extraction. Models built with a random forest classifier (RFC) on Entropy, Haralick full angle (Haralick_all), Haralick 30° (Haralick_30), Inverse gap full angle (IG_all), Inverse gap 45° (IG_45), Inverse gap 0° (IG_0), and Inertia value 45° (IV_45) had the highest predictive accuracy. Further research is needed to develop these models for clinical practice.
The evaluation results indicate that selecting radiological and textural features makes the discrimination of LNM in UEGC patients more effective. In addition, an ML-based prediction model developed using RFC can be used to identify LNM and guide treatment choices, which may improve clinical outcomes.
GLCM-based features were significantly correlated with LNM. The top 7 GLCM-based factors included Inertia value 0°, IV_45, IG_0, IG_45, IG_all, Haralick_30, Haralick_all, and Entropy. The areas under the receiver operating characteristic (ROC) curve (AUCs) of the RFC, support vector machine (SVM), eXtreme gradient boosting (XGBoost), artificial neural network (ANN), and decision tree (DT) models ranged from 0.805 [95% confidence interval (CI): 0.258-1.352] to 0.925 (95%CI: 0.378-1.472) in the training set and from 0.794 (95%CI: 0.237-1.351) to 0.912 (95%CI: 0.355-1.469) in the testing set. The RFC model (training set: AUC: 0.925, 95%CI: 0.378-1.472; testing set: AUC: 0.912, 95%CI: 0.355-1.469) incorporating Entropy, Haralick_all, Haralick_30, IG_all, IG_45, IG_0, and IV_45 had the highest predictive accuracy.
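As an illustration of how an AUC and its confidence interval can be obtained, the following is a minimal sketch rather than the procedure used in this study. It assumes a percentile bootstrap, and the function names (`auc`, `bootstrap_ci`) and the toy labels and scores are hypothetical. The AUC is computed as the probability that a randomly chosen LNM-positive case receives a higher score than a randomly chosen LNM-negative case.

```python
import random

def auc(labels, scores):
    """AUC as P(score of a positive > score of a negative); ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (an assumed method, not the study's)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Hypothetical toy data: 1 = LNM present, 0 = LNM absent
labels = [0, 0, 0, 0, 1, 0, 1, 1, 0, 1]
scores = [0.10, 0.20, 0.30, 0.35, 0.40, 0.50, 0.60, 0.70, 0.80, 0.90]

point = auc(labels, scores)
lower, upper = bootstrap_ci(labels, scores)
print(f"AUC = {point:.3f}, 95% CI: {lower:.3f}-{upper:.3f}")
```

Because every bootstrap replicate is itself an AUC, an interval built this way always lies between 0 and 1.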
We retrospectively selected 526 UEGC cases confirmed by pathological examination after radical gastrectomy without endoscopic treatment at four tertiary hospitals between January 2015 and December 2021. GLCM-based features were extracted from grayscale images, and ML was applied to classify candidate predictive variables. To evaluate the robustness and clinical utility of each model, ROC curves, decision curve analysis, and clinical impact curves were used.
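The GLCM feature-extraction step can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's pipeline: the helper names (`glcm`, `texture_features`) and the toy image are hypothetical, and treating "Inertia value" as GLCM contrast and "Inverse gap" as the inverse difference moment is an assumption about the terminology.

```python
import math

def glcm(image, dx, dy, levels):
    """Symmetric, normalized gray-level co-occurrence matrix for one pixel offset."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1.0
                counts[j][i] += 1.0  # count the pair in both directions
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]

def texture_features(p):
    """Entropy, inertia (contrast), and inverse difference moment of a normalized GLCM."""
    entropy = inertia = idm = 0.0
    for i, row in enumerate(p):
        for j, v in enumerate(row):
            if v > 0:
                entropy -= v * math.log2(v)
            inertia += (i - j) ** 2 * v
            idm += v / (1 + (i - j) ** 2)
    return {"entropy": entropy, "inertia": inertia,
            "inverse_difference_moment": idm}

# Hypothetical 4x4 image with 4 gray levels (not data from the study)
toy = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]

p0 = glcm(toy, dx=1, dy=0, levels=4)    # 0 degrees: right-hand neighbor
p45 = glcm(toy, dx=1, dy=-1, levels=4)  # 45 degrees: upper-right neighbor
features_0 = texture_features(p0)
features_45 = texture_features(p45)
print(features_0)
print(features_45)
```

Repeating the computation for several offsets gives direction-specific features, and averaging over all directions gives a "full angle" analogue. The resulting feature vectors could then be passed to a classifier such as a random forest.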
Identifying a potential biomarker that predicts LNM would be very useful in determining treatment.
To develop an ML-based integrated procedure for constructing a gray-level co-occurrence matrix (GLCM)-based prediction model for LNM.
The risk of LNM is the most important consideration in determining treatment strategies for UEGC. Therefore, identifying a potential biomarker that predicts LNM would be very useful in determining treatment.
