Published online Jul 16, 2025. doi: 10.12998/wjcc.v13.i20.104556
Revised: March 3, 2025
Accepted: March 13, 2025
Published online: July 16, 2025
Processing time: 106 Days and 8.2 Hours
Endometriosis is a clinical condition characterized by the presence of endometrial glands outside the uterine cavity. While its incidence remains mostly uncertain, endometriosis impacts around 180 million women worldwide. Despite the presentation of several epidemiological and clinical explanations, the precise mechanism underlying the disease remains ambiguous. In recent years, resear
To identify genetic biomarkers linked to endometriosis by applying machine learning (ML) approaches.
This case-control study used an open-access transcriptomic data set of endometriosis patients and controls. We included data from 22 controls and 16 endometriosis patients. We used AdaBoost, XGBoost, Stochastic Gradient Boosting, and Bagged Classification and Regression Trees (CART) for classification with five-fold cross-validation. We evaluated the performance of the models using accuracy, balanced accuracy, sensiti
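As a rough illustration of what five-fold cross-validation means for a sample of this size, the sketch below splits 22 controls and 16 patients into five folds. It assumes a stratified split (the text does not say whether stratification was used), and the function name `stratified_folds`, the seed, and the label strings are hypothetical choices made only for this example.

```python
import random
from collections import Counter


def stratified_folds(labels, k=5, seed=0):
    """Assign each sample index to one of k folds, balancing classes across folds."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal indices round-robin so per-class counts differ by at most one per fold.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


# Group sizes taken from the text: 22 controls and 16 endometriosis patients.
labels = ["control"] * 22 + ["endometriosis"] * 16
folds = stratified_folds(labels, k=5, seed=0)

for i, test_idx in enumerate(folds):
    held_out = set(test_idx)
    train_idx = [j for j in range(len(labels)) if j not in held_out]
    counts = Counter(labels[j] for j in test_idx)
    print(f"fold {i}: train={len(train_idx)} test={len(test_idx)} {dict(counts)}")
```

With only 38 samples, each held-out fold contains 7 to 9 samples, so a single misclassification shifts a per-fold metric by more than ten percentage points.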
Bagged CART gave the best classification metrics. The metrics obtained from this model are 85.7%, 85.7%, 100%, 75%, 75%, 100% and 85.7% for accuracy, balanced accuracy, sensitivity, specificity, positive predictive value, negative predictive va
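For reference, the named performance measures can all be derived from a binary confusion matrix. The sketch below uses the standard definitions; the function name `classification_metrics` and the example counts are hypothetical and are not taken from the study. Only the six measures named in full in the text are computed.

```python
def classification_metrics(tp, fn, fp, tn):
    """Compute standard binary classification metrics from confusion-matrix counts.

    Assumes every denominator is non-zero (each class and each predicted
    label occurs at least once).
    """
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    ppv = tp / (tp + fp)           # positive predictive value
    npv = tn / (tn + fn)           # negative predictive value
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    balanced_accuracy = (sensitivity + specificity) / 2
    return {
        "accuracy": accuracy,
        "balanced_accuracy": balanced_accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
    }


# Hypothetical counts for illustration only, not the study's results.
metrics = classification_metrics(tp=8, fn=2, fp=1, tn=9)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Balanced accuracy is the mean of sensitivity and specificity, so it equals plain accuracy only when the two classes are the same size or the two rates coincide.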
This study identified candidate genomic biomarkers for endometriosis using transcriptomic data from individuals with and without endometriosis. The applied ML model classified endometriosis with high accuracy, yielding a diagnostic prediction model. Future genomic studies could explain the underlying pathology of endometriosis, and a non-invasive diagnostic method could replace invasive ones.
Core Tip: Genetic research has aimed to discover the gene or genes responsible for the disease through association or linkage studies involving candidate genes or DNA mapping techniques. This study aimed to determine genomic biomarkers associated with endometriosis using machine learning models (AdaBoost, XGBoost, Stochastic Gradient Boosting, and Bagged Classification and Regression Trees). According to variable importance in the modeling, the CUX2, CLMP, CEP131, EHD4, CDH24, ILRUN, LINC01709, HOTAIR, SLC30A2, and NKG7 genes, along with transcripts whose other gene names are inaccessible, can be used as candidate biomarkers for endometriosis.