Comparison between multiple logistic regression and machine learning methods in prediction of abnormal thallium scans in type 2 diabetes

doi:10.12998/wjcc.v11.i33.7951

Journal: World Journal of Clinical Cases

ISSN: 2307-8960

Publisher: Baishideng Publishing Group Inc, 7041 Koll Center Parkway, Suite 160, Pleasanton, CA 94566, USA

Retrospective Cohort Study Open Access

World J Clin Cases. Nov 26, 2023; 11(33): 7951-7964
Published online Nov 26, 2023. doi: 10.12998/wjcc.v11.i33.7951

Chung-Chi Yang, Chung-Hsin Peng, Li-Ying Huang, Fang Yu Chen, Chun-Heng Kuo, Chung-Ze Wu, Te-Lin Hsia, Chung-Yu Lin

Chung-Chi Yang, Division of Cardiovascular Medicine, Taoyuan Armed Forces General Hospital, Taoyuan City 32551, Taiwan

Chung-Chi Yang, Division of Cardiovascular, Tri-service General Hospital, Taipei City 114202, Taiwan

Chung-Hsin Peng, Department of Urology, Cardinal Tien Hospital, New Taipei City 23148, Taiwan

Chung-Hsin Peng, School of Medicine, Fu-Jen Catholic University, New Taipei City 242062, Taiwan

Li-Ying Huang, Department of Internal Medicine, Department of Medical Education, School of Medicine, Fu Jen Catholic University Hospital, New Taipei City 243, Taiwan

Li-Ying Huang, Chun-Heng Kuo, School of Medicine, College of Medicine, Fu Jen Catholic University, New Taipei City 243, Taiwan

Fang Yu Chen, Department of Endocrinology, Fu Jen Catholic University Hospital, New Taipei City 243, Taiwan

Chun-Heng Kuo, Division of Endocrinology and Metabolism, Department of Internal Medicine, Fu Jen Catholic University Hospital, New Taipei City 243, Taiwan

Chung-Ze Wu, Division of Endocrinology, Shuang Ho Hospital, New Taipei City 23561, Taiwan

Chung-Ze Wu, School of Medicine, Taipei Medical University, Taipei City 11031, Taiwan

Te-Lin Hsia, Department of Internal Medicine, Cardinal Tien Hospital, New Taipei City 23148, Taiwan

Chung-Yu Lin, Department of Cardiology, Fu Jen Catholic University Hospital, New Taipei City 24352, Taiwan

Chung-Yu Lin, Graduate Institute of Business Administration, Fu Jen Catholic University, New Taipei City 242062, Taiwan

ORCID number: Chung-Chi Yang (0009-0000-8271-2885); Chung-Hsin Peng (0000-0002-7080-2602); Li-Ying Huang (0000-0002-0593-6428); Fang Yu Chen (0000-0002-5590-2744); Chun-Heng Kuo (0000-0001-7673-3567); Chung-Ze Wu (0000-0001-6118-6070); Te-Lin Hsia (0009-0000-9822-1570); Chung-Yu Lin (0009-0003-1209-3603).

Author contributions: Yang CC and Lin CY designed the research study; Huang LY, Chen FY and Hsia TL performed the research; Kuo CH and Wu CZ contributed new reagents and analytic tools; Yang CC, Peng CH and Lin CY analyzed the data and wrote the manuscript; all authors have read and approved the final manuscript.

Institutional review board statement: The study was reviewed and approved by the Cardinal Tien Hospital Institutional Review Board (Approval No. CTH-102-2-5-024).

Informed consent statement: Because this is a retrospective cohort study and the data were collected from the medical records of the hospital, no informed consent was needed. This was approved by the IRB of the hospital.

Conflict-of-interest statement: There is no conflict of interest in the current study.

Data sharing statement: The datasets generated and/or analyzed during the current study are not publicly available because they include other valuable information which could be used to produce additional papers, but are available from the corresponding author on reasonable request.

STROBE statement: The authors have read the STROBE Statement—checklist of items, and the manuscript was prepared and revised according to the STROBE Statement—checklist of items.

Corresponding author: Chung-Yu Lin, MD, Doctor, Department of Cardiology, Fu Jen Catholic University Hospital, No. 69 Guizi Road, Taishan District, New Taipei City 24352, Taiwan. a02076@mail.fjuh.fju.edu.tw

Received: August 4, 2023
Peer-review started: August 4, 2023
First decision: October 9, 2023
Revised: October 23, 2023
Accepted: November 13, 2023
Article in press: November 13, 2023
Published online: November 26, 2023
Processing time: 102 Days and 7.2 Hours

Abstract

BACKGROUND

The prevalence of type 2 diabetes (T2D) has been increasing dramatically in recent decades, and 47.5% of T2D patients will die of cardiovascular disease. Thallium-201 myocardial perfusion scan (MPS) is a precise and non-invasive method to detect coronary artery disease (CAD). Most previous studies used traditional logistic regression (LGR) to evaluate the risks for abnormal CAD. Rapidly developing machine learning (Mach-L) techniques could potentially outperform LGR in capturing non-linear relationships.

AIM

The aims were to: (1) Compare the accuracy of Mach-L methods and LGR; and (2) Identify the most important factors for abnormal MPS.

METHODS

A total of 556 T2D patients were enrolled in the study (287 men and 269 women). Demographic and biochemistry data were used as independent variables and the summed stress score derived from the MPS was the dependent variable. Subjects with an MPS score ≥ 9 were defined as abnormal. In addition to traditional LGR, classification and regression tree (CART), random forest, Naïve Bayes, and eXtreme gradient boosting were also applied. Sensitivity, specificity, accuracy and area under the receiver operating characteristic curve were used to evaluate the respective accuracy of LGR and Mach-L methods.

RESULTS

Except for CART, the other Mach-L methods outperformed LGR, with gender, body mass index, age, low-density lipoprotein cholesterol, glycated hemoglobin and smoking emerging as the most important factors to predict abnormal MPS.

CONCLUSION

Three of the four Mach-L methods (all except CART) are found to outperform LGR in predicting abnormal MPS in Chinese patients with T2D, with the most important risk factors being gender, body mass index, age, low-density lipoprotein cholesterol, glycated hemoglobin and smoking.

Key Words: Myocardial perfusion scintigraphy; Machine learning; Type 2 diabetes; Thallium-201

Core Tip: This retrospective study used four machine learning methods to evaluate the impact of demographic and biochemistry data on identifying subjects with abnormal myocardial perfusion scans among Chinese patients with type 2 diabetes. Our results showed that gender was the most important factor, followed by body mass index, age, LDL-cholesterol, glycated hemoglobin and smoking, in that order.

Citation: Yang CC, Peng CH, Huang LY, Chen FY, Kuo CH, Wu CZ, Hsia TL, Lin CY. Comparison between multiple logistic regression and machine learning methods in prediction of abnormal thallium scans in type 2 diabetes. World J Clin Cases 2023; 11(33): 7951-7964
URL: https://www.wjgnet.com/2307-8960/full/v11/i33/7951.htm
DOI: https://dx.doi.org/10.12998/wjcc.v11.i33.7951

INTRODUCTION

The International Diabetes Federation reported a global diabetic population of 415 million people in 2018 [of which 91% were type 2 diabetes (T2D)] and this is expected to increase to 642 million by 2040[1]. In Taiwan, the diabetic population rose from 8.5% in 2008 to 12.3% in 2015[2]. As of 2019, T2D treatment accounted for 4.4% of the entire budget for Taiwan’s national health insurance program (equivalent to USD 238 million), making it the second highest treatment category after dialysis[3].

Several micro- and macrovascular diseases are related to T2D, namely stroke, myocardial infarction, diabetic retinopathy, nephropathy and diabetic foot. According to the World Health Organization, 50.3% of T2D patients die from cardiovascular diseases, and the disease typically shortens lifespans by 6 years[4,5]. Avogaro et al[6] reported that subjects with T2D bear a 70% higher risk of acute myocardial infarction and that the risk of a first myocardial infarction is higher than 20% 10 years after the diagnosis of diabetes. Asian patients have an even higher mortality rate than those in Western countries[7]. Thus, early detection of coronary artery disease (CAD) in these patients is of great importance.

Methods for diagnosing CAD include coronary angiography, computed tomography coronary angiography, exercise electrocardiogram and myocardial perfusion scintigraphy (MPS). While coronary angiography provides the most detailed information on artery stenosis, it is an invasive procedure and is unsuited for routine application in subjects with mild symptoms. Computed tomography coronary angiography is expensive and not covered by Taiwan’s National Insurance Program, making it inappropriate for routine screening. Exercise electrocardiogram is less expensive but requires a certain amount of treadmill exercise loading to increase the heart rate, making it inappropriate for patients already suffering from debility[8,9]. Lastly, MPS uses thallium as a tracer to evaluate the perfusion of blood in the myocardium. At the same time, dipyridamole is injected to increase the heart rate, thus allowing comparison between fast and slow heart rates. Giri et al[10] note that MPS is widely used to reliably diagnose significant CAD and to stratify those at higher risk levels. Thus, MPS could also be used as a surrogate for CAD. Scholte et al[7] found that current smoker status, long duration of diabetes and high cholesterol/HDL ratio contribute to abnormal MPS. It should be noted that all the aforementioned studies used traditional multiple logistic regression (MLR) to analyze categorical data (i.e., the dependent variable).

Recently, artificial intelligence based on machine learning (Mach-L) techniques has developed rapidly and is increasingly used in medical research. Mach-L is the use of computer algorithms that learn automatically, without explicit programming, through experience and data application[11]. Mach-L has emerged as a new mainstream modality for data analysis, competitive with traditional MLR[12,13]. Since Mach-L can capture nonlinear relationships in the data and complex interactions among multiple predictors, it has the potential to outperform conventional logistic regression in disease prediction[14].

Our group previously explored the relationships between risk factors and MPS score in a group of Chinese patients with T2D[15]. We applied traditional linear regression and treated the MPS score as the dependent variable. In the present study, we categorize subjects by MPS status (normal or abnormal) as the dependent variable, and compare the performance of traditional MLR against multiple Mach-L methods for the first time, seeking to determine the relative importance of various risk factors.

MATERIALS AND METHODS

Subjects

This study recruited T2D patients, aged between 30 and 95 years old, who had undergone Thallium-201 MPS at Taiwan’s Cardinal Tien Hospital from 1999 to 2008. All study subjects were anonymous, and informed consent was obtained prior to participation. The study proposal was reviewed and approved by the institutional review board of the Cardinal Tien Hospital (Approval No. CTH-102-2-5-024) before the study began. The diagnostic criteria for T2D were based on the 2012 American Diabetes Association criteria[16]. A total of 928 T2D patients were initially recruited. Following exclusions for various causes, the final sample included 556 T2D patients (287 men and 269 women). Figure 1 shows the flowchart for subject selection. Since the patients were randomly selected and were relatively stable at the time of the study, the bias should be minimal.

Figure 1 Flowchart of sample selection from the Cardinal Tien Hospital Diabetes Study Cohort.

The inclusion criteria of the study participants are: Age between 30-70 years old; Body mass index (BMI) between 20-30 kg/m²; Without dialysis at the time of the study; Without major medical diseases such as myocardial infarction, stroke and diabetic foot.

BMI was calculated as body weight (kg)/height (m)². Systolic and diastolic blood pressure (SBP and DBP) were measured on the right arm of seated subjects using a standard mercury sphygmomanometer. Blood samples were drawn from the antecubital vein for biochemical analysis.

MPS

On the day of testing, patients fasted for 4 h and avoided dipyridamole, β-blockers, calcium channel blockers, long-acting nitrates, xanthine-containing medications and caffeinated beverages. Dipyridamole was infused intravenously over 4 min at a dose of 0.56 mg/kg in 20 mL of normal saline (an infusion rate of 0.14 mg/kg/min), followed 3 to 4 min later by Tl-201 administration. The scan was conducted 5 to 8 min after radiopharmaceutical administration (stress scan) and 3 h later (rest scan).

The myocardial region was divided into 17 segments, each of which was evaluated by nuclear medicine experts based on a 5-point scoring system as follows[17]: 0, normal; 1, slight decrease of tracer uptake; 2, moderate decrease of tracer uptake; 3, severe decrease of tracer uptake; 4, absence of tracer uptake. The stress and rest scores of the segments were first grouped into individual vessel scores. The sum of the individual vessel stress scores (after injection of dipyridamole), i.e., the summed stress score (SSS), was taken as representative of the MPS results, since some studies have shown that the SSS provides important information for detecting CAD and its outcome[18-20]. In the present study, an SSS ≥ 9 was considered abnormal[21].
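The scoring rule above can be illustrated with a minimal sketch. It assumes that each of the 17 regions belongs to exactly one vessel territory, so that summing the per-region stress scores directly gives the same total as summing the vessel scores; the function and constant names are hypothetical and not taken from the study.

```python
from typing import Sequence

N_SEGMENTS = 17          # number of myocardial regions scored per scan
ABNORMAL_THRESHOLD = 9   # a summed stress score >= 9 is treated as abnormal


def summed_stress_score(stress_scores: Sequence[int]) -> int:
    """Sum the per-region stress scores (each 0-4) into a single score."""
    if len(stress_scores) != N_SEGMENTS:
        raise ValueError(f"expected {N_SEGMENTS} scores, got {len(stress_scores)}")
    if any(score not in range(5) for score in stress_scores):
        raise ValueError("each score must be an integer from 0 to 4")
    return sum(stress_scores)


def is_abnormal(stress_scores: Sequence[int]) -> bool:
    """Apply the study's cut-off to the summed stress score."""
    return summed_stress_score(stress_scores) >= ABNORMAL_THRESHOLD
```

For instance, a scan with nine regions scored 1 and the rest scored 0 reaches the cut-off exactly, whereas two regions scored 4 (total 8) would not.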

Laboratory evaluation

Following a 10-h overnight fast, blood specimens were collected from each subject for further analysis. Plasma was separated from the whole blood within one hour and stored at -70 °C. A glucose oxidase method (YSI 203 glucose analyzer; Scientific Division, Yellow Springs Instruments, Yellow Springs, OH, United States) was used to determine fasting plasma glucose levels. The dry, multilayer analytical slide method in the Fuji Dri-Chem 3000 analyzer (Fuji Photo Film, Minato-Ku, Tokyo, Japan) was used to determine total cholesterol and triglyceride (TG) levels. An enzymatic cholesterol assay following dextran sulfate precipitation was used to determine serum high-density lipoprotein cholesterol (HDL-C) and low-density lipoprotein cholesterol (LDL-C) levels. The HbA1c level was measured using the Bio-Rad Variant II automatic analyzer (Bio-Rad Diagnostic Group, Los Angeles, CA, United States). Plasma insulin was assayed using a commercial solid phase radioimmunoassay technique (Coat-A-Count insulin kit, Diagnostic Products Corporation, Los Angeles, CA, United States) with intra- and inter-assay coefficients of variation of 3.3% and 2.5%, respectively.

Statistical analysis

The data were tested for normal distribution using the Kolmogorov–Smirnov test and for homogeneity of variances using the Levene’s test. Continuous variables were expressed as mean ± standard deviation.

Tables 1 and 2 list the seventeen clinical variables (independent variables) used in this study: Sex, age, smoking, BMI, duration of diabetes, SBP, DBP, hemoglobin (Hb), TG, glycated hemoglobin (HbA1c), HDL-C, LDL-C, alanine aminotransferase, creatinine, microalbumin creatinine ratio, homeostasis assessment insulin resistance (HOMA-IR) and homeostasis assessment insulin secretion (HOMA-IS). As previously mentioned, the SSS derived from the Tl-201 scan is the dependent variable, while these 17 variables are used as predictors.

Table 1 Participant demographics.

Variables	mean ± SD	N
Age	67.38 ± 9.69	556
Body mass index	26.16 ± 3.9	556
Duration of diabetes	13.69 ± 7.94	556
Systolic blood pressure	131.14 ± 15.42	493
Diastolic blood pressure	73.32 ± 10.15	493
Hemoglobin	12.92 ± 1.68	444
Triglyceride	153.74 ± 45.85	539
Glycated hemoglobin	7.79 ± 1.36	538
High density lipoprotein cholesterol	122.65 ± 74.34	535
Low density lipoprotein cholesterol	49.65 ± 14.75	498
Alanine aminotransferase	23.87 ± 13.94	537
Creatinine	1.16 ± 1	536
Microalbumin creatinine ratio	194.18 ± 733.73	526
Homeostasis assessment-insulin resistance	0.63 ± 0.34	366
Homeostasis assessment-insulin secretion	1.71 ± 0.37	366

Table 2 Participant demographics – sex, smoking and sum stressed score.

	N (%)	N
Sex		556
0	287 (51.62)
1	269 (48.38)
Smoking		310
0	202 (65.16)
1	108 (34.84)
Sum stressed score		556
0	180 (32.37)
1	376 (67.63)

Machine learning methods and proposed scheme

The following methods were designed and published by our group in another recent study[22]. This research proposed a scheme based on four Mach-L methods, namely classification and regression tree (CART), random forest (RF), eXtreme gradient boosting (XGBoost) and naïve Bayes (NB), to construct predictive models for determining abnormal MPS and to identify the importance of the risk factors. These Mach-L methods have been widely applied in various healthcare and/or medical informatics applications and make no prior assumptions about data distributions[23-31]. MLR is used as a benchmark for comparison.

For the first method, CART is a tree-structured method[32] composed of root nodes, branches, and leaf nodes. It is based on the recursive growth of trees from the root node, splitting at each node based on the Gini index to produce branches and leaf nodes. Pruning is applied to overgrown trees to produce the optimal tree size using a cost-complexity criterion, finally generating different decision rules to compose a complete tree structure[33,34].
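As a rough illustration of Gini-based splitting, the sketch below computes the Gini impurity of a set of class labels and searches a single numeric predictor for the threshold that minimizes the size-weighted impurity of the two child nodes. It is a simplified stand-in for one step of tree growth, not the rpart implementation used in the study; all function names are hypothetical.

```python
from collections import Counter
from typing import Optional, Sequence, Tuple


def gini_impurity(labels: Sequence[int]) -> float:
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((count / n) ** 2 for count in counts.values())


def weighted_gini(left: Sequence[int], right: Sequence[int]) -> float:
    """Impurity of a split, weighting each child node by its size."""
    n = len(left) + len(right)
    return (len(left) / n) * gini_impurity(left) + (len(right) / n) * gini_impurity(right)


def best_threshold(values: Sequence[float], labels: Sequence[int]) -> Tuple[Optional[float], float]:
    """Return (threshold, impurity) for the best split 'value <= threshold'.

    The threshold is None when no split improves on the unsplit node.
    """
    best: Tuple[Optional[float], float] = (None, gini_impurity(labels))
    for threshold in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        score = weighted_gini(left, right)
        if score < best[1]:
            best = (threshold, score)
    return best
```

A full CART implementation repeats this search over every predictor at every node and then prunes the grown tree with the cost-complexity criterion.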

RF is an ensemble learning decision tree algorithm that combines bootstrap resampling and bagging[35]. RF randomly generates many different, unpruned CART decision trees, in which the decrease in Gini impurity is used as the splitting criterion, and combines all generated trees into a forest. The outputs of all trees in the forest are then averaged or voted on to generate output probabilities and a robust final model[36].
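The two ingredients named above, bootstrap resampling and vote aggregation, can be sketched in a few lines. This is only an illustration of the idea under simplifying assumptions (tree fitting itself is omitted); the helper names are hypothetical and unrelated to the randomForest package.

```python
import random
from collections import Counter
from typing import List, Sequence


def bootstrap_sample(n_rows: int, rng: random.Random) -> List[int]:
    """Draw n_rows row indices with replacement (one bootstrap resample)."""
    return [rng.randrange(n_rows) for _ in range(n_rows)]


def majority_vote(tree_predictions: Sequence[int]) -> int:
    """Class predicted by the largest number of trees."""
    return Counter(tree_predictions).most_common(1)[0][0]


def vote_fraction(tree_predictions: Sequence[int], positive: int = 1) -> float:
    """Share of trees voting for the positive class, usable as a probability."""
    hits = sum(1 for prediction in tree_predictions if prediction == positive)
    return hits / len(tree_predictions)
```

Each tree would be trained on its own bootstrap resample, and the forest's output for a patient is the vote (or vote fraction) across trees.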

The NB classifier is widely used for classification tasks. This algorithm sorts objects according to specific characteristics and variables based on Bayes' theorem. It calculates the probability of each presumed group given the observed variables[37].
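For reference, the standard form of the naïve Bayes posterior is given below. The notation is an assumption introduced here for illustration: $C_k$ denotes a class (normal or abnormal MPS) and $x_1,\dots,x_n$ the predictor values of one subject. The "naïve" part is the assumption that predictors are conditionally independent given the class.

```latex
P(C_k \mid x_1,\dots,x_n)
  = \frac{P(C_k)\,\prod_{i=1}^{n} P(x_i \mid C_k)}
         {\sum_{j} P(C_j)\,\prod_{i=1}^{n} P(x_i \mid C_j)}
```

The subject is assigned to the class with the largest posterior probability.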

XGBoost is a gradient boosting technique that is an optimized extension of the stochastic gradient boosting method[38]. It sequentially trains many weak models and integrates their outputs using gradient boosting, thus improving prediction performance. A second-order Taylor expansion is used to approximate the objective function, which supports arbitrary differentiable loss functions and accelerates model construction and convergence[39]. XGBoost then applies a regularized boosting technique to penalize model complexity and correct overfitting, thereby increasing model accuracy[38].
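The approximation and the regularization term can be written out as follows, using the notation of the original XGBoost formulation rather than anything defined in this article: $l$ is the loss, $\hat{y}_i^{(t-1)}$ the prediction after $t-1$ rounds, $f_t$ the tree added at round $t$, $g_i$ and $h_i$ the first and second derivatives of $l$ with respect to $\hat{y}_i^{(t-1)}$, $T$ the number of leaves, and $w_j$ the leaf weights.

```latex
\mathcal{L}^{(t)} \simeq \sum_{i=1}^{n}
  \left[\, l\!\left(y_i, \hat{y}_i^{(t-1)}\right)
        + g_i\, f_t(x_i)
        + \tfrac{1}{2}\, h_i\, f_t^{2}(x_i) \right]
  + \Omega(f_t),
\qquad
\Omega(f) = \gamma T + \tfrac{1}{2}\,\lambda \sum_{j=1}^{T} w_j^{2}
```

Here $\gamma$ corresponds to the minimum loss reduction required for a split and $\lambda$ to the L2 penalty on leaf weights.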

Figure 2 presents a flowchart of the proposed scheme combining the four Mach-L methods. The proposed scheme first collects patient data to prepare the dataset for model construction. The dataset is then randomly split into a training dataset for model building (80%) and a testing dataset (20%) for out-of-sample testing. In the training process, each Mach-L method has its own hyperparameters to be tuned to construct a model with relatively good performance. We use a 10-fold cross-validation technique for hyperparameter tuning, in which the training dataset is randomly divided into a training subset, used to build the model with different sets of hyperparameters, and a validation subset. All possible combinations of hyperparameters are investigated by grid search. The model with the highest accuracy, sensitivity, specificity and area under the receiver operating characteristic curve (AUROC) on the validation subset is viewed as the best model for each Mach-L method. The tuned best CART, RF, XGBoost and NB models are then generated and the corresponding variable impact rankings obtained.
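The data-handling part of this scheme (80/20 split, 10-fold partition of the training data, and enumeration of a hyperparameter grid) might be sketched as below. This is a standard-library illustration only; the study itself used the caret R package, and the function names, seed, and grid values in the usage lines are hypothetical.

```python
import itertools
import random
from typing import Dict, Iterator, List, Sequence, Tuple


def train_test_split_indices(n_rows: int, test_fraction: float = 0.2,
                             seed: int = 0) -> Tuple[List[int], List[int]]:
    """Shuffle row indices and split them into (training, testing) lists."""
    rng = random.Random(seed)
    indices = list(range(n_rows))
    rng.shuffle(indices)
    n_test = round(n_rows * test_fraction)
    return indices[n_test:], indices[:n_test]


def k_fold_indices(indices: Sequence[int], k: int = 10) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (training, validation) index lists, one pair per fold."""
    folds = [list(indices[i::k]) for i in range(k)]
    for i in range(k):
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, folds[i]


def hyperparameter_grid(grid: Dict[str, Sequence]) -> Iterator[Dict[str, object]]:
    """Yield every combination of the candidate hyperparameter values."""
    names = list(grid)
    for values in itertools.product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


# Example usage with the study's sample size and an illustrative grid.
train_idx, test_idx = train_test_split_indices(556)
candidates = list(hyperparameter_grid({"max_depth": [3, 6], "eta": [0.1, 0.4]}))
```

Each candidate would then be fitted on the training part of every fold and scored on the corresponding validation part, with the fold scores averaged before the best candidate is chosen.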

Figure 2 Proposed scheme for four machine learning methods. CART: Classification and regression tree; RF: Random forest; XGBoost: eXtreme gradient boosting; NB: Naïve Bayes.

To provide a more robust comparison, the training and testing process described above is randomly repeated 10 times. The average accuracy, sensitivity, specificity and AUROC values of the Mach-L methods are then compared with those of the benchmark MLR model, which uses the same training and testing datasets as the Mach-L methods. A Mach-L model with a higher AUROC than MLR is considered a convincing model.
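The four performance measures can be computed from predictions as in the sketch below. It assumes binary labels coded 1 for abnormal and 0 for normal, and it computes the AUROC with the rank-based (pairwise comparison) definition; the function names are hypothetical and this is not the code used in the study.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (true positives, true negatives, false positives, false negatives)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)


def sensitivity(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    tp, _, _, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn)


def specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    _, tn, fp, _ = confusion_counts(y_true, y_pred)
    return tn / (tn + fp)


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive scores above a random negative (ties count half)."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))
```

Averaging these values over the 10 repeated splits gives the kind of mean and spread reported for each method.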

All the Mach-L methods used can produce an impact ranking of each predictor variable, and these rankings may differ among the various Mach-L methods due to differences in their modeling characteristics. We therefore integrated the variable importance rankings of the convincing Mach-L models to enhance model stability and integrity in terms of the relative risk factor impacts. Below we summarize and discuss our significant findings about convincing Mach-L models and the related impact factors.

According to the proposed scheme, the hyperparameters of the CART, RF, NB, and XGBoost models were tuned and evaluated using 10-fold cross-validation. MLR, the baseline method, was constructed under the same scheme without hyperparameter tuning. The hyperparameter values that generated the best CART, RF, NB, and XGBoost models are listed in Table 3.

Table 3 Summary of the hyperparameter values for the best random forest, classification and regression tree, eXtreme gradient boosting and naïve Bayes models.

Methods	Hyperparameters	Best value	Meaning
RF	Mtry	8	The number of random features used in each tree
RF	Ntree	500	The number of trees in forest
CART	Minsplit	20	The minimum number of observations required to attempt a split in a node
	Minbucket	7	The minimum number of observations in a terminal node
	Maxdepth	10	The maximum depth of any node in the final tree
	Xval	10	Number of cross-validations
	Cp	0.03588	Complexity parameter: The minimum improvement required in the model at each node
XGBoost	Nrounds	100	The number of tree model iterations
	Max_depth	3	The maximum depth of a tree
	Eta	0.4	Shrinkage coefficient of tree
	Gamma	0	The minimum loss reduction
	Subsample	0.75	Subsample ratio of training instances when building each tree
	Colsample_bytree	0.8	Subsample ratio of columns when constructing each tree
	Rate_drop	0.5	Rate of trees dropped
	Skip_drop	0.05	Probability of skipping the dropout procedure during a boosting iteration
	Min_child_weight	1	The minimum sum of instance weight
NB	fL	0	Adjustment of Laplace smoother
	Usekernel	TRUE	Using kernel density estimate for continuous variable versus a Gaussian density estimate
	Adjust	1	Adjust the bandwidth of the kernel density

CART: Classification and regression tree; RF: Random forest; XGBoost: eXtreme gradient boosting; NB: Naïve Bayes.

All methods used R software version 4.0.5 and RStudio version 1.1.453 with the required packages installed (http://www.R-project.org; https://www.rstudio.com/products/rstudio/). The implementations of RF, NB, CART, and XGBoost are respectively “randomForest” R package version 4.6-14[40], “gbm” R package version 2.1.8[41], “rpart” R package version 4.1-15[42], and “XGBoost” R package version 1.5.0.2[43]. To optimize the hyperparameter set for the developed CART, RF, NB, and XGBoost methods, the “caret” R package version 6.0-90 was used[44]. The MLR was implemented using the “stats” R package version 4.0.5 with the default settings for model construction.

RESULTS

Tables 1 and 2 summarize the demographic data of the T2D subjects. Table 4 compares the conventional MLR and four Mach-L methods in terms of accuracy in identifying abnormal MPS. We find that, aside from CART, the other three Mach-L methods outperformed MLR in terms of AUROC performance, suggesting that these three methods are more reliable and accurate than traditional MLR.

Table 4 The average performance of the logistic regression, classification and regression tree, random forest, eXtreme gradient boosting, and naïve Bayes methods.

	Accuracy	Sensitivity	Specificity	AUC
LGR	0.685 ± 0.072	0.687 ± 0.152	0.683 ± 0.114	0.703 ± 0.057
CART	0.541 ± 0.074	0.546 ± 0.078	0.529 ± 0.670	0.540 ± 0.070
RF	0.707 ± 0.047	0.711 ± 0.100	0.678 ± 0.099	0.707 ± 0.037
XGBoost	0.712 ± 0.072	0.727 ± 0.139	0.674 ± 0.088	0.719 ± 0.062
NB	0.692 ± 0.059	0.702 ± 0.116	0.669 ± 0.090	0.704 ± 0.056

AUC: Area under the curve; LGR: Logistic regression; CART: Classification and regression tree; RF: Random forest; XGBoost: eXtreme gradient boosting; NB: Naïve Bayes.

The average ranking of each factor created by Mach-L is shown in Table 5. The different Mach-L methods generated different variable impact rankings for each risk factor. The impact ranking is obtained by averaging the variable impact. Note that a darker blue color indicates greater relative importance of a particular risk factor. To identify the overall predictive power of each parameter from all three Mach-L methods, the mean ranking of each risk factor is obtained by averaging the ranking values of each variable in each method.

Table 5 The variable importance and rank of the importance of the risk factors derived from machine learning methods.

Variables	RF	XGBoost	NB	Average	Rank
Sex	100.0 ± 0	100.0 ± 0	100.0 ± 0	100.0	1.0
Body mass index	54.2 ± 6.6	61.1 ± 14.7	86.2 ± 6.8	67.1	2.0
Age	13.1 ± 7.6	78.3 ± 13.2	67.9 ± 6.5	53.1	3.0
Low density lipoprotein cholesterol	30.4 ± 3.1	8.4 ± 12.8	71.0 ± 7.8	36.6	4.0
Glycated hemoglobin	15.4 ± 5.9	12.8 ± 11.9	48.0 ± 8.3	25.4	5.0
Smoking	12.2 ± 2.7	28.8 ± 9.2	34.5 ± 6.6	25.2	6.0
Creatinine	10.1 ± 2.3	5.3 ± 9.12	53.1 ± 7.3	22.8	7.0
Duration	6.3 ± 4.61	41.5 ± 8.6	10.1 ± 8.9	19.3	8.0
Hemoglobin	8.0 ± 4.16	16.6 ± 8.9	17.0 ± 5.7	13.8	9.0
Blood urea nitrogen	9.0 ± 8.15	6.5 ± 6.79	17.3 ± 9.6	11.0	10.0
Systolic blood pressure	4.2 ± 1.03	21.6 ± 5.1	6.4 ± 2.88	10.7	11.0
Triglyceride	5.4 ± 17.5	15.0 ± 4.4	11.1 ± 12.3	10.5	12.0
Microalbumin	4.3 ± 2.23	3.6 ± 3.83	22.7 ± 6.9	10.2	13.0
Diastolic blood pressure	2.5 ± 5.91	18.9 ± 3.7	5.6 ± 9.33	9.0	14.0
Alanine aminotransferase	3.2 ± 5.96	6.9 ± 3.90	13.0 ± 12.6	7.7	15.0
High density lipoprotein cholesterol	1.3 ± 3.60	9.8 ± 3.29	7.3 ± 8.41	6.1	16.0
HOMA-IR	5.7 ± 2.85	2.2 ± 2.52	10.2 ± 8.1	6.0	17.0
HOMA-B	4.3 ± 2.22	0.0 ± 0.00	7.4 ± 8.831	3.9	18.0

Ranks 1 to 6 mark the six most important factors. RF: Random forest; XGBoost: eXtreme gradient boosting; NB: Naïve Bayes; HOMA-IR: Homeostasis assessment insulin resistance; HOMA-B: Homeostasis model assessment of beta-cell function.
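The integration step behind the Average and Rank columns can be sketched as follows: the importance values from the three methods are averaged per variable, and variables are then ranked by that average. The example uses the mean importance values of the first four rows of Table 5 (standard deviations omitted); the function name and data layout are hypothetical.

```python
from typing import Dict, List, Tuple


def integrate_importance(importance_by_method: Dict[str, Dict[str, float]]) -> List[Tuple[str, float, int]]:
    """Average each variable's importance across methods and rank by the average.

    Returns a list of (variable, average importance, rank), best rank first.
    """
    methods = list(importance_by_method.values())
    variables = list(methods[0])
    averages = {
        variable: sum(method[variable] for method in methods) / len(methods)
        for variable in variables
    }
    ordered = sorted(averages, key=averages.get, reverse=True)
    return [(variable, averages[variable], rank)
            for rank, variable in enumerate(ordered, start=1)]


# Mean importance values taken from the first four rows of Table 5.
table5_subset = {
    "RF": {"Sex": 100.0, "Body mass index": 54.2, "Age": 13.1, "LDL-C": 30.4},
    "XGBoost": {"Sex": 100.0, "Body mass index": 61.1, "Age": 78.3, "LDL-C": 8.4},
    "NB": {"Sex": 100.0, "Body mass index": 86.2, "Age": 67.9, "LDL-C": 71.0},
}
ranking = integrate_importance(table5_subset)
```

Note how age ranks third overall despite a low RF importance, because the XGBoost and NB values pull its average up; this is the stabilizing effect that averaging across methods is meant to provide.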

Figure 3 graphically depicts the order of risk factor importance. The top 6 risk factors in predicting abnormal MPS are sex, BMI, age, LDL-C, glycated hemoglobin and smoking.

Figure 3 Integrated importance ranking of all risk factors.

Finally, Figure 4 shows the AUROC of the LGR and Mach-L. Since CART has poor AUROC results, this method was not included in the analysis.

Figure 4 Receiver operating characteristic curves of logistic regression and the four machine learning methods. LGR: Logistic regression; CART: Classification and regression tree; RF: Random forest; XGBoost: eXtreme gradient boosting; NB: Naïve Bayes.

DISCUSSION

Three out of the four Mach-L methods outperformed multiple logistic regression in identifying abnormal MPS. Also, the most important risk factors for abnormal MPS in Chinese patients with T2D are (in descending order) sex, BMI, age, LDL-C, glycated hemoglobin and smoking.

Our results suggest that gender is the most important factor selected by Mach-L. However, the role of gender in CAD is still under debate. Zafrir et al[45] and Miller et al[46] found that men are at greater risk of abnormal MPS than women. Among Chinese subjects, a similar finding was reported: diabetic men are more prone to CAD than women[47]. However, Wu et al[48] reported the opposite finding, that menopausal women had higher SSS than men of the same age, although their study was based on a relatively small sample of 94 T2D patients. Both Prior et al[49] and Scholte et al[7] found that gender had no impact on MPS score, but these studies were also based on relatively small samples (133 and 120, respectively). This inconsistency may be due to the sample size, or to differences in age and ethnicity of the sample populations. In the present study, sex was identified as the most important variable using three different Mach-L methods, and we believe that gender does play an important role in CAD.

BMI is the second most important factor for MPS score. It is not surprising that T2D patients with higher BMI would have increased susceptibility to CAD[50]. Katzel et al[51] also reported that subjects with higher BMI are more prone to myocardial ischemia detected by MPS and treadmill exercise test (OR=91.7, 95%CI = 1.075-999). However, their study was done on non-diabetic subjects. The impact of BMI on CAD could easily be explained by the derangements directly induced by obesity itself[52]. Obesity is also the underlying pathophysiology for other metabolic alterations such as hyperglycemia, hypertriglyceridemia, low HDL-C, inflammation, hypercoagulation, endothelial dysfunction and oxidative stress. All these factors are well-known contributors to CAD[53]. It is well known that obesity is also an important contributor to insulin resistance. But in the present study, HOMA-IR was not selected as an important contributor. This interesting finding could be explained by the tight relationship between BMI and CAD. In other words, the significance of the relationship between HOMA-IR and MPS is ‘diluted’ by the influence of BMI[53]. This finding suggests BMI plays a crucial role in determining the likelihood of developing CAD.

Age is strongly correlated with the occurrence of CAD. Budoff et al[54] used the risk engine proposed by the United Kingdom Prospective Diabetes Study to calculate CAD risk in a 7-year longitudinal study of 1087 T2D patients. Another study in Finland followed a much larger cohort (14786 subjects) for 7 years, also finding that total cholesterol level, BP, BMI and diabetes could explain a 30% increase in coronary heart disease in men and 50% in women. Thus, this correlation is confirmed in both Indians and Caucasians, and the present study confirms it in ethnic Chinese. This finding is unsurprising given that aging causes hypertrophy of the left ventricle, arrhythmia, ischemic tissue, fibrosis of cardiac muscle and the appearance of apoptotic/necrotic cells[55].

According to the results of the United Kingdom Prospective Diabetes Study, LDL-C was one of the many risk factors for CAD[56]. Our results show that LDL-C is the fourth most important factor. As with the previously discussed factors, the role of LDL-C in CAD has long been recognized. As early as 2001, the Third Report of the National Cholesterol Education Program, Expert Panel on Detection, Evaluation, and Treatment of High Blood Cholesterol in Adults published guidelines for the diagnosis and treatment of dyslipidemia[57]. In this version, LDL-C was suggested as the most important factor for future occurrence of coronary heart disease. Colantonio et al[58] studied 370763 subjects in three different cohorts, finding that the hazard ratio and 95% confidence interval for having coronary heart disease in the highest vs lowest LDL-C level (≥ 146 mg/dL vs ≤ 102 mg/dL) ranged from 1.89 (1.42–2.51) to 1.25 (0.81–1.92). Hosokawa et al[59] showed that reducing total cholesterol by 22% in patients with either angina or old myocardial infarction resulted in myocardial perfusion improvement detectable by MPS. However, this study enrolled only 40 participants (15 patients, 25 controls). The mechanisms by which LDL-C increases CAD have been comprehensively explained: The most crucial steps in atherosclerosis are the retention and accumulation of LDL-C and ApoB lipoprotein in the intima of the coronary artery, which leads to the appearance of plaque[60-62]. This is because these two particles are both smaller than 70 nm in diameter[62,63]. Interestingly, plaque formation is also dose-dependent on the LDL-C concentration[64].

To our knowledge, the earliest large cohort study exploring the relationship between glucose control and risk of CAD was UKPDS 33, published in 1998. This study included 3867 T2D patients, followed up for 20 years, and found that despite a 16% decrease in myocardial infarction in the intensive control group, the result did not reach statistical significance[65]. However, when the same cohort was followed up again at the end of 30 years, a significant 15% decrease in myocardial infarction was found. This is the famous ‘legacy effect’ of glucose control. Other large-scale studies followed the UKPDS. For instance, the Intensive Blood Glucose Control and Vascular Outcomes in Patients with Type 2 Diabetes (ADVANCE) study enrolled 11140 T2D patients and verified a significant decrease in combined major macrovascular disease[66]. The results of the present study are consistent with these cornerstone studies. The links between hyperglycemia and CAD are multi-faceted and include increased oxidative stress, advanced glycation end products and protein kinase C signaling. All these derangements lead to endothelial dysfunction of the coronary artery[67].

As early as 1960, the Framingham Heart Study pointed out that smoking increases the risk of CAD[68], and also affects CAD severity and development pattern[69]. Smoking damages the endothelium of the coronary artery through oxidation of LDL-C, nicotine effects, increased sympathetic tone and myocardial necrosis[70-74]. Our findings are similar to those of other major studies but show smoking as the sixth most important factor.

While our study verifies the most important factors for abnormal MPS using Mach-L methods, an approach that is naïve and informative, the present work is still subject to certain limitations. First, this is a cross-sectional study, which is less convincing than a longitudinal one. Second, our sample size is smaller than those of several previous works. However, since MPS is an expensive tool for detecting coronary artery perfusion, increasing the number of participants would be practically difficult. At the same time, using four Mach-L methods could reduce the bias. Longitudinal studies with larger samples are needed to consolidate the present findings.

CONCLUSION

Three machine learning methods were found to outperform traditional logistic regression in predicting abnormal MPS in Chinese subjects with T2D, with the most important risk factors identified (in descending order) as gender, BMI, age, LDL-C, GH and smoking.

ARTICLE HIGHLIGHTS

Research background

Research motivation

To compare the accuracy of LGR and Mach-L, and to rank the importance of risk factors for an abnormal TMPS scan.

Research objectives

The present study enrolled 556 T2D patients and used four different Mach-L methods to analyze risk factors for abnormal MPS. Our goals were: (1) To compare the accuracy of LGR and Mach-L; and (2) To rank the importance of risk factors for an abnormal TMPS scan.
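The two goals above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the labels, predictions, method names (`method_1`, `method_2`), feature names (`x1` to `x3`) and importance scores are all hypothetical placeholders, and averaging per-method ranks is only one common way to combine importance rankings, assumed here for illustration.

```python
from statistics import mean


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def ranks(scores):
    """Map each feature to its rank within one method (1 = most important)."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}


def average_rank(importance_by_method):
    """Average each feature's rank across methods; lowest average comes first."""
    per_method = [ranks(scores) for scores in importance_by_method.values()]
    features = per_method[0].keys()
    averaged = {f: mean(r[f] for r in per_method) for f in features}
    return sorted(averaged.items(), key=lambda item: item[1])


# Hypothetical labels and predictions (1 = abnormal scan), NOT study data.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
pred_lgr = [1, 0, 0, 1, 0, 1, 1, 0]
pred_ml = [1, 0, 1, 1, 0, 1, 1, 0]

# Hypothetical importance scores from two placeholder methods, NOT study data.
importance = {
    "method_1": {"x1": 0.50, "x2": 0.30, "x3": 0.20},
    "method_2": {"x1": 0.35, "x2": 0.20, "x3": 0.45},
}

acc_lgr = accuracy(y_true, pred_lgr)  # goal 1: accuracy of the regression model
acc_ml = accuracy(y_true, pred_ml)    # goal 1: accuracy of the Mach-L model
ranking = average_rank(importance)    # goal 2: features ordered by mean rank
```

With these placeholder inputs, `acc_ml` exceeds `acc_lgr`, and `ranking` lists the features from lowest to highest mean rank. Ranks are averaged rather than raw scores because importance scales usually differ between methods.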

Research methods

Research results

Except for CART, the other Mach-L methods outperformed LGR, with gender, body mass index, age, LDL-cholesterol, glycated hemoglobin and smoking emerging as the most important factors to predict abnormal MPS.

Research conclusions

Three of the four Mach-L methods (all except CART) were found to outperform LGR in predicting abnormal TMPS in Chinese patients with T2D, with the most important risk factors being gender, body mass index, age, LDL-cholesterol, glycated hemoglobin and smoking.

Research perspectives

Mach-L methods outperformed LGR in this kind of study. Gender, body mass index, age, LDL-cholesterol, glycated hemoglobin and smoking were most relevant to abnormal MPS.

References

1.	International Diabetes Federation. IDF Diabetes Atlas. 7th ed. Brussels: International Diabetes Federation; 2015.

2.	Health Promotion Administration, Ministry of Health and Welfare. Statistical Yearbook of Health Promotion; 2015 [accessed 17 February 2020]. Available from: https://www.hpa.gov.tw/Pages/Detail.aspx?nodeid=268&pid=8535

3.	National Health Insurance Administration. Statistics of Medical Care, National Health Insurance; 2020. Available from: https://www.nhi.gov.tw/Content_List.aspx?n=86ACD5A48FA3B1A1&topn=23C660CAACAA159D

4.	International Diabetes Federation. IDF Diabetes Atlas. 7th ed. 2015 [cited 2017]. Available from: www.idf.org/diabetesatlas

5.	Morrish NJ, Wang SL, Stevens LK, Fuller JH, Keen H. Mortality and causes of death in the WHO Multinational Study of Vascular Disease in Diabetes. Diabetologia. 2001;44 Suppl 2:S14-S21.

6.	Avogaro A, Bonora E, Consoli A, Del Prato S, Genovese S, Giorgino F. Glucose-lowering therapy and cardiovascular outcomes in patients with type 2 diabetes mellitus and acute coronary syndrome. Diab Vasc Dis Res. 2019;16:399-414.

7.	Scholte AJ, Schuijf JD, Kharagjitsingh AV, Dibbets-Schneider P, Stokkel MP, van der Wall EE, Bax JJ. Prevalence and predictors of an abnormal stress myocardial perfusion study in asymptomatic patients with type 2 diabetes mellitus. Eur J Nucl Med Mol Imaging. 2009;36:567-575.

8.	Mark DB, Hlatky MA, Harrell FE Jr, Lee KL, Califf RM, Pryor DB. Exercise treadmill score for predicting prognosis in coronary artery disease. Ann Intern Med. 1987;106:793-800.

9.	Mark DB, Shaw L, Harrell FE Jr, Hlatky MA, Lee KL, Bengtson JR, McCants CB, Califf RM, Pryor DB. Prognostic value of a treadmill exercise score in outpatients with suspected coronary artery disease. N Engl J Med. 1991;325:849-853.

10.	Giri S, Shaw LJ, Murthy DR, Travin MI, Miller DD, Hachamovitch R, Borges-Neto S, Berman DS, Waters DD, Heller GV. Impact of diabetes on the risk stratification using stress single-photon emission computed tomography myocardial perfusion imaging in patients with symptoms suggestive of coronary artery disease. Circulation. 2002;105:32-40.

11.	Koopaie M, Ghafourian M, Manifar S, Younespour S, Davoudi M, Kolahdooz S, Shirkhoda M. Evaluation of CSTB and DMBT1 expression in saliva of gastric cancer patients and controls. BMC Cancer. 2022;22:473.

12.	Wang C, Li Y, Tsuboshita Y, Sakurai T, Goto T, Yamaguchi H, Yamashita Y, Sekiguchi A, Tachimori H; Alzheimer’s Disease Neuroimaging Initiative. A high-generalizability machine learning framework for predicting the progression of Alzheimer's disease using limited data. NPJ Digit Med. 2022;5:43.

13.	Xu S, Arnetz JE, Arnetz BB. Applying machine learning to explore the association between biological stress and near misses in emergency medicine residents. PLoS One. 2022;17:e0264957.

14.	Lin Z, Chou WC, Cheng YH, He C, Monteiro-Riviere NA, Riviere JE. Predicting Nanoparticle Delivery to Tumors Using Machine Learning and Artificial Intelligence Approaches. Int J Nanomedicine. 2022;17:1365-1379.

15.	Lin JD, Pei D, Chen FY, Wu CZ, Lu CH, Huang LY, Kuo CH, Kuo SW, Chen YL. Comparison between Machine Learning and Multiple Linear Regression to Identify Abnormal Thallium Myocardial Perfusion Scan in Chinese Type 2 Diabetes. Diagnostics (Basel). 2022;12.

16.	Mitchell TM. Machine Learning. New York: McGraw Hill; 1997.

17.	Saquib N, Saquib J, Ahmed T, Khanam MA, Cullen MR. Cardiovascular diseases and type 2 diabetes in Bangladesh: a systematic review and meta-analysis of studies between 1995 and 2010. BMC Public Health. 2012;12:434.

18.	Ye Y, Xiong Y, Zhou Q, Wu J, Li X, Xiao X. Comparison of Machine Learning Methods and Conventional Logistic Regressions for Predicting Gestational Diabetes Using Routine Clinical Data: A Retrospective Cohort Study. J Diabetes Res. 2020;2020:4168340.

19.	Marateb HR, Mansourian M, Faghihimani E, Amini M, Farina D. A hybrid intelligent system for diagnosing microalbuminuria in type 2 diabetes patients without having to measure urinary albumin. Comput Biol Med. 2014;45:34-42.

20.	Nusinovici S, Tham YC, Chak Yan MY, Wei Ting DS, Li J, Sabanayagam C, Wong TY, Cheng CY. Logistic regression was as good as machine learning for predicting major chronic diseases. J Clin Epidemiol. 2020;122:56-69.

21.	2006 Astellas Pharma US, Inc. ADS10257 9/06.

22.	Wu CZ, Huang LY, Chen FY, Kuo CH, Yeih DF. Using Machine Learning to Predict Abnormal Carotid Intima-Media Thickness in Type 2 Diabetes. Diagnostics (Basel). 2023;13.

23.	Miller DD, Brown EW. Artificial Intelligence in Medical Practice: The Question to the Answer? Am J Med. 2018;131:129-133.

24.	Introduction: The American Diabetes Association's (ADA) evidence-based practice guidelines, standards, and related recommendations and documents for diabetes care. Diabetes Care. 2012;35 Suppl 1:S1-S2.

25.	Hachamovitch R, Berman DS, Shaw LJ, Kiat H, Cohen I, Cabico JA, Friedman J, Diamond GA. Incremental prognostic value of myocardial perfusion single photon emission computed tomography for the prediction of cardiac death: differential stratification for risk of cardiac death and myocardial infarction. Circulation. 1998;97:535-543.

26.	Gimelli A, Rossi G, Landi P, Marzullo P, Iervasi G, L'abbate A, Rovai D. Stress/Rest Myocardial Perfusion Abnormalities by Gated SPECT: Still the Best Predictor of Cardiac Events in Stable Ischemic Heart Disease. J Nucl Med. 2009;50:546-553.

27.	Nakajima K, Yamasaki Y, Kusuoka H, Izumi T, Kashiwagi A, Kawamori R, Shimamoto K, Yamada N, Nishimura T. Cardiovascular events in Japanese asymptomatic patients with type 2 diabetes: a 1-year interim report of a J-ACCESS 2 investigation using myocardial perfusion imaging. Eur J Nucl Med Mol Imaging. 2009;36:2049-2057.

28.	Tseng CJ, Lu CJ, Chang CC, Chen GD, Cheewakriangkrai C. Integration of data mining classification techniques and ensemble learning to identify risk factors and diagnose ovarian cancer recurrence. Artif Intell Med. 2017;78:47-54.

29.	Shih CC, Lu CJ, Chen GD, Chang CC. Risk Prediction for Early Chronic Kidney Disease: Results from an Adult Health Examination Program of 19,270 Individuals. Int J Environ Res Public Health. 2020;17.

30.	Chang CC, Chen SH. Developing a Novel Machine Learning-Based Classification Scheme for Predicting SPCs in Breast Cancer Survivors. Front Genet. 2019;10:848.

31.	Lee TS, Chen IF, Chang TJ, Lu CJ. Forecasting Weekly Influenza Outpatient Visits Using a Two-Dimensional Hierarchical Decision Tree Scheme. Int J Environ Res Public Health. 2020;17.

32.	Chang CC, Yeh JH, Chen YM, Jhou MJ, Lu CJ. Clinical Predictors of Prolonged Hospital Stay in Patients with Myasthenia Gravis: A Study Using Machine Learning Algorithms. J Clin Med. 2021;10.

33.	Chiu YL, Jhou MJ, Lee TS, Lu CJ, Chen MS. Health Data-Driven Machine Learning Algorithms Applied to Risk Indicators Assessment for Chronic Kidney Disease. Risk Manag Healthc Policy. 2021;14:4401-4412.

34.	Wu CW, Shen HL, Lu CJ, Chen SH, Chen HY. Comparison of Different Machine Learning Classifiers for Glaucoma Diagnosis Based on Spectralis OCT. Diagnostics (Basel). 2021;11.

35.	Wu TE, Chen HA, Jhou MJ, Chen YN, Chang TJ, Lu CJ. Evaluating the Effect of Topical Atropine Use for Myopia Control on Intraocular Pressure by Using Machine Learning. J Clin Med. 2020;10.

36.	Chang CC, Yeh JH, Chiu HC, Chen YM, Jhou MJ, Liu TC, Lu CJ. Utilization of Decision Tree Algorithms for Supporting the Prediction of Intensive Care Unit Admission of Myasthenia Gravis: A Machine Learning-Based Approach. J Pers Med. 2022;12.

37.	Huang YC, Cheng YC, Jhou MJ, Chen M, Lu CJ. Important Risk Factors in Patients with Nonvalvular Atrial Fibrillation Taking Dabigatran Using Integrated Machine Learning Scheme-A Post Hoc Analysis. J Pers Med. 2022;12.

38.	Tierney NJ, Harden FA, Harden MJ, Mengersen KL. Using decision trees to understand structure in missing data. BMJ Open. 2015;5:e007450.

39.	Breiman L. Random Forests. Machine Learning. 2001;45:5-32. DOI: 10.1023/A:1010933404324

40.	Calle ML, Urrea V. Letter to the editor: Stability of Random Forest importance measures. Brief Bioinform. 2011;12:86-89.

41.	Friedman JH. Greedy function approximation: A gradient boosting machine. Ann Statist. 2001;29:1189-1232.

42.	Friedman JH. Stochastic gradient boosting. Comput Stat Data Anal. 2002;38:367-378.

43.	Chen T, Guestrin C. XGBoost: A Scalable Tree Boosting System. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM; 2016.

44.	Torlay L, Perrone-Bertolotti M, Thomas E, Baciu M. Machine learning-XGBoost analysis of language networks to classify patients with epilepsy. Brain Inform. 2017;4:159-169.

45.	Zafrir N, Mats I, Solodky A, Ben-Gal T, Battler A. Characteristics and outcome of octogenarian population referred for myocardial perfusion imaging: comparison with non-octogenarian population with reference to gender. Clin Cardiol. 2006;29:117-120.

46.	Miller TD, Roger VL, Hodge DO, Hopfenspirger MR, Bailey KR, Gibbons RJ. Gender differences and temporal trends in clinical characteristics, stress test results and use of invasive procedures in patients undergoing evaluation for coronary artery disease. J Am Coll Cardiol. 2001;38:690-697.

47.	Yao MF, He J, Sun X, Ji XL, Ding Y, Zhao YM, Lou HY, Song XX, Shan LZ, Kang YX, Zhang SZ, Shan PF. Gender Differences in Risks of Coronary Heart Disease and Stroke in Patients with Type 2 Diabetes Mellitus and Their Association with Metabolic Syndrome in China. Int J Endocrinol. 2016;2016:8483405.

48.	Wu YT, Chien CL, Wang SY, Yang WS, Wu YW. Gender differences in myocardial perfusion defect in asymptomatic postmenopausal women and men with and without diabetes mellitus. J Womens Health (Larchmt). 2013;22:439-444.

49.	Prior JO, Monbaron D, Koehli M, Calcagni ML, Ruiz J, Bischof Delaloye A. Prevalence of symptomatic and silent stress-induced perfusion defects in diabetic patients with suspected coronary artery disease referred for myocardial perfusion scintigraphy. Eur J Nucl Med Mol Imaging. 2005;32:60-69.

50.	Dilmanian H, Aronow WS, Kaplan S, Pucillo AL, Weiss MB, Kalapatapu K, Monsen CE. Comparison of age, body mass index, and frequency of systemic hypertension and diabetes mellitus in patients having coronary angioplasty in 1996 versus in 2006. Am J Cardiol. 2007;100:1224-1226.

51.	Katzel LI, Sorkin KD, Colman E, Goldberg AP, Busby-Whitehead MJ, Lakatta LE, Becker LC, Lakatta EG, Fleg JL. Risk factors for exercise-induced silent myocardial ischemia in healthy volunteers. Am J Cardiol. 1994;74:869-874.

52.	Powell-Wiley TM, Poirier P, Burke LE, Després JP, Gordon-Larsen P, Lavie CJ, Lear SA, Ndumele CE, Neeland IJ, Sanders P, St-Onge MP; American Heart Association Council on Lifestyle and Cardiometabolic Health; Council on Cardiovascular and Stroke Nursing; Council on Clinical Cardiology; Council on Epidemiology and Prevention; and Stroke Council. Obesity and Cardiovascular Disease: A Scientific Statement From the American Heart Association. Circulation. 2021;143:e984-e1010.

53.	Poirier P, Giles TD, Bray GA, Hong Y, Stern JS, Pi-Sunyer FX, Eckel RH; American Heart Association; Obesity Committee of the Council on Nutrition, Physical Activity, and Metabolism. Obesity and cardiovascular disease: pathophysiology, evaluation, and effect of weight loss: an update of the 1997 American Heart Association Scientific Statement on Obesity and Heart Disease from the Obesity Committee of the Council on Nutrition, Physical Activity, and Metabolism. Circulation. 2006;113:898-918.

54.	Budoff MJ, Raggi P, Beller GA, Berman DS, Druz RS, Malik S, Rigolin VH, Weigold WG, Soman P; Imaging Council of the American College of Cardiology. Noninvasive Cardiovascular Risk Assessment of the Asymptomatic Diabetic Patient: The Imaging Council of the American College of Cardiology. JACC Cardiovasc Imaging. 2016;9:176-192.

55.	Boorse C. A rebuttal on health. In: Humber JM, Almeder RF, editors. What is Disease? Totowa, New Jersey: Humana Press; 1997: 3–134.

56.	Chew P, Yuen DY, Stefanovic N, Pete J, Coughlan MT, Jandeleit-Dahm KA, Thomas MC, Rosenfeldt F, Cooper ME, de Haan JB. Antiatherosclerotic and renoprotective effects of ebselen in the diabetic apolipoprotein E/GPx1-double knockout mouse. Diabetes. 2010;59:3198-3207.

57.	Expert Panel on Detection, Evaluation, and Treatment of High Blood Cholesterol in Adults. Executive Summary of The Third Report of The National Cholesterol Education Program (NCEP) Expert Panel on Detection, Evaluation, And Treatment of High Blood Cholesterol In Adults (Adult Treatment Panel III). JAMA. 2001;285:2486-2497.

58.	Colantonio LD, Bittner V, Reynolds K, Levitan EB, Rosenson RS, Banach M, Kent ST, Derose SF, Zhou H, Safford MM, Muntner P. Association of Serum Lipids and Coronary Heart Disease in Contemporary Observational Studies. Circulation. 2016;133:256-264.

59.	Hosokawa S, Hiasa Y, Tomokane T, Ogura R, Miyajima H, Ohara Y, Ogata T, Yuba K, Suzuki N, Takahashi T, Kishi K, Ohtani R. The effects of atorvastatin on coronary endothelial function in patients with recent myocardial infarction. Clin Cardiol. 2006;29:357-362.

60.	Schwenke DC, Carew TE. Initiation of atherosclerotic lesions in cholesterol-fed rabbits. II. Selective retention of LDL vs. selective increases in LDL permeability in susceptible sites of arteries. Arteriosclerosis. 1989;9:908-918.

61.	Frank JS, Fogelman AM. Ultrastructure of the intima in WHHL and cholesterol-fed rabbit aortas prepared by ultra-rapid freezing and freeze-etching. J Lipid Res. 1989;30:967-978.

62.	Tabas I, Williams KJ, Borén J. Subendothelial lipoprotein retention as the initiating process in atherosclerosis: update and therapeutic implications. Circulation. 2007;116:1832-1844.

63.	Nordestgaard BG, Zilversmit DB. Large lipoproteins are excluded from the arterial wall in diabetic cholesterol-fed rabbits. J Lipid Res. 1988;29:1491-1500.

64.	Goldstein JL, Brown MS. A century of cholesterol and coronaries: from plaques to genes to statins. Cell. 2015;161:161-172.

65.	Effect of intensive blood-glucose control with metformin on complications in overweight patients with type 2 diabetes (UKPDS 34). UK Prospective Diabetes Study (UKPDS) Group. Lancet. 1998;352:854-865.

66.	ADVANCE Collaborative Group; Patel A, MacMahon S, Chalmers J, Neal B, Billot L, Woodward M, Marre M, Cooper M, Glasziou P, Grobbee D, Hamet P, Harrap S, Heller S, Liu L, Mancia G, Mogensen CE, Pan C, Poulter N, Rodgers A, Williams B, Bompoint S, de Galan BE, Joshi R, Travert F. Intensive blood glucose control and vascular outcomes in patients with type 2 diabetes. N Engl J Med. 2008;358:2560-2572.

67.	Poznyak A, Grechko AV, Poggio P, Myasoedova VA, Alfieri V, Orekhov AN. The Diabetes Mellitus-Atherosclerosis Connection: The Role of Lipid and Glucose Metabolism and Chronic Inflammation. Int J Mol Sci. 2020;21.

68.	Kannel WB, Dawber TR, Friedman GD, Glennon WE, Mcnamara PM. Risk factors in coronary heart disease. An evaluation of several serum lipids as predictors of coronary heart disease; the Framingham study. Ann Intern Med. 1964;61:888-899.

69.	Salehi N, Janjani P, Tadbiri H, Rozbahani M, Jalilian M. Effect of cigarette smoking on coronary arteries and pattern and severity of coronary artery disease: a review. J Int Med Res. 2021;49:3000605211059893.

70.	Barua RS, Ambrose JA, Srivastava S, DeVoe MC, Eales-Reynolds LJ. Reactive oxygen species are involved in smoking-induced dysfunction of nitric oxide biosynthesis and upregulation of endothelial nitric oxide synthase: an in vitro demonstration in human coronary artery endothelial cells. Circulation. 2003;107:2342-2347.

71.	Desideri G, Ferri C. Endothelial activation. Sliding door to atherosclerosis. Curr Pharm Des. 2005;11:2163-2175.

72.	Separham K, H. S. Smoking or high blood pressure, which one is more important in premature coronary artery disease? J Isfahan Med Sch Spring. 2007;25:1-9.

73.	Leone A. Relation between coronary lesions and cigarette smoking of subjects deceased from acute myocardial infarction. A histopathological study. J Cardiobiol. 2014;2:5.

74.	Leone A, Landini L Jr, Biadi O, Balbarini A. Smoking and cardiovascular system: cellular features of the damage. Curr Pharm Des. 2008;14:1771-1777.

Footnotes

Open-Access: This article is an open-access article that was selected by an in-house editor and fully peer-reviewed by external reviewers. It is distributed in accordance with the Creative Commons Attribution NonCommercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: https://creativecommons.org/Licenses/by-nc/4.0/

Provenance and peer review: Unsolicited article; Externally peer reviewed.

Peer-review model: Single blind

Specialty type: Endocrinology & metabolism

Country/Territory of origin: Taiwan

Peer-review report’s scientific quality classification

Grade A (Excellent): 0

Grade B (Very good): 0

Grade C (Good): C, C

Grade D (Fair): 0

Grade E (Poor): 0

P-Reviewer: Zhang W, China; Surani S, United States. S-Editor: Liu JH. L-Editor: A. P-Editor: Zhao S.