Copyright
©The Author(s) 2024.
World J Clin Cases. May 26, 2024; 12(15): 2506-2521
Published online May 26, 2024. doi: 10.12998/wjcc.v12.i15.2506
Table 1 Participant descriptive statistics and risk factors, percentage and mean (± SD)
Characteristics | None | Fatty liver | P value |
Number of participants | 34,335 | 31,200 | |
Age (yr) | 36.75 ± 12.33 | 47.52 ± 12.8 | < 0.001 |
Income | 2.04 ± 1.47 | 1.61 ± 1.57 | < 0.001 |
Body fat (%) | 26.66 ± 5.55 | 35.96 ± 6.83 | < 0.001 |
Systolic blood pressure (mmHg) | 111.83 ± 16.07 | 124.52 ± 19.68 | < 0.001 |
Diastolic blood pressure (mmHg) | 66.81 ± 10.21 | 73.8 ± 11.69 | < 0.001 |
Leukocyte (× 10³/μL) | 5.93 ± 1.73 | 6.45 ± 1.73 | < 0.001 |
Hemoglobin (× 10⁶/μL) | 13.09 ± 1.14 | 13.36 ± 1.18 | < 0.001 |
Platelets (× 10³/μL) | 248.96 ± 57.74 | 264.39 ± 63.31 | < 0.001 |
Fasting plasma glucose (mg/dL) | 92.75 ± 10.64 | 103.7 ± 25.6 | < 0.001 |
Total bilirubin (mg/dL) | 0.77 ± 0.32 | 0.73 ± 0.33 | < 0.001 |
Albumin (g/dL) | 4.5 ± 0.26 | 4.45 ± 0.24 | < 0.001 |
Globulin (g/dL) | 3.08 ± 0.36 | 3.15 ± 0.36 | < 0.001 |
Alkaline Phosphatase (IU/L) | 101.86 ± 49.29 | 110.72 ± 59.94 | < 0.001 |
Serum glutamic oxaloacetic transaminase (IU/L) | 19.88 ± 11.71 | 24.61 ± 17.56 | < 0.001 |
Serum glutamic pyruvic transaminase (IU/L) | 17.68 ± 18.62 | 28.72 ± 27.22 | < 0.001 |
Serum γ-glutamyl transpeptidase (IU/L) | 14.22 ± 13.94 | 25.3 ± 30.09 | < 0.001 |
Lactate dehydrogenase (IU/L) | 241.42 ± 84.03 | 246.65 ± 92.63 | < 0.001 |
Estimated glomerular filtration rate (mL/min/1.73 m²) | 89.6 ± 76.89 | 84.33 ± 72.4 | < 0.001 |
Uric acid (mg/dL) | 4.88 ± 1.09 | 5.67 ± 1.34 | < 0.001 |
Triglyceride (mg/dL) | 78.81 ± 42.95 | 139.02 ± 101.03 | < 0.001 |
High density lipoprotein cholesterol (mg/dL) | 61.73 ± 14.8 | 54.17 ± 13.47 | < 0.001 |
Low density lipoprotein cholesterol (mg/dL) | 107.97 ± 30.48 | 125.76 ± 34.09 | < 0.001 |
Calcium (mg/dL) | 9.2 ± 0.39 | 9.29 ± 0.41 | < 0.001 |
Phosphorus (mg/dL) | 3.72 ± 0.44 | 3.7 ± 0.46 | < 0.001 |
Thyroid stimulating hormone (IU/mL) | 1.75 ± 3.25 | 1.94 ± 3.64 | < 0.001 |
C-reactive protein (mg/dL) | 0.19 ± 0.45 | 0.3 ± 0.51 | < 0.001 |
Forced expiratory volume in one second (L) | 2.2 ± 0.46 | 1.96 ± 0.53 | < 0.001 |
Drink area | 0.97 ± 6.5 | 1.38 ± 8.72 | < 0.001 |
Smoke area | 1.5 ± 7.15 | 1.57 ± 7.89 | < 0.001 |
Betel nut area | 0 ± 0 | 0.02 ± 1.2 | < 0.001 |
Sport area | 3.32 ± 5.99 | 3.96 ± 6.19 | < 0.001 |
Sleep time | 2.91 ± 0.59 | 2.87 ± 0.72 | 0.25 |
Marriage, n (%) | |||
Unmarried | 13 (458) | 8 (438) | < 0.001 |
Married | 19 (939) | 2 (1545) |
Table 2 Comparison of MAPE, SMAPE, RAE, RRSE, and RMSE between multiple linear regression and machine learning methods
NAFLD+ group with age | MAPE | SMAPE | RAE | RRSE | RMSE |
Linear | 0.139 | 0.132 | 0.845 | 0.842 | 13.959 |
SGB | 0.138 | 0.131 | 0.841 | 0.834 | 13.825 |
XGBoost | 0.139 | 0.132 | 0.845 | 0.842 | 13.946 |
Elasticnet | 0.139 | 0.132 | 0.845 | 0.842 | 13.954 |
NAFLD- group with age | |||||
Linear | 0.133 | 0.128 | 0.868 | 0.862 | 14.671 |
SGB | 0.132 | 0.126 | 0.855 | 0.857 | 14.59 |
XGBoost | 0.132 | 0.126 | 0.853 | 0.857 | 14.58 |
Elasticnet | 0.134 | 0.128 | 0.868 | 0.862 | 14.673 |
NAFLD+ group without age | |||||
Linear | 0.154 | 0.14 | 0.872 | 0.897 | 15.606 |
SGB | 0.153 | 0.139 | 0.865 | 0.888 | 15.444 |
XGBoost | 0.153 | 0.14 | 0.869 | 0.891 | 15.49 |
Elasticnet | 0.154 | 0.14 | 0.872 | 0.897 | 15.596 |
NAFLD- group without age | |||||
Linear | 0.134 | 0.13 | 0.905 | 0.906 | 15.149 |
SGB | 0.133 | 0.129 | 0.895 | 0.892 | 14.915 |
XGBoost | 0.133 | 0.129 | 0.895 | 0.893 | 14.916 |
Elasticnet | 0.134 | 0.13 | 0.904 | 0.905 | 15.119 |
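The five error measures in Table 2 can be computed from paired observed and predicted values. The sketch below uses common textbook definitions, which are an assumption here: the table does not state the exact formulas, and SMAPE in particular has several variants. The function name `error_metrics` and the sample numbers are illustrative only and are not taken from the study.

```python
import math


def error_metrics(actual, predicted):
    """Return MAPE, SMAPE, RAE, RRSE and RMSE for paired observations.

    Assumed definitions (not confirmed by the source):
      MAPE  = mean(|y - yhat| / |y|)
      SMAPE = mean(2 * |y - yhat| / (|y| + |yhat|))
      RAE   = sum|y - yhat| / sum|y - mean(y)|
      RRSE  = sqrt(sum (y - yhat)^2 / sum (y - mean(y))^2)
      RMSE  = sqrt(mean (y - yhat)^2)
    Inputs must contain no zero observations and must not be constant,
    otherwise a division by zero occurs.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("actual and predicted must be non-empty and of equal length")

    n = len(actual)
    mean_actual = sum(actual) / n
    abs_err = [abs(a - p) for a, p in zip(actual, predicted)]
    sq_err = [(a - p) ** 2 for a, p in zip(actual, predicted)]

    mape = sum(e / abs(a) for e, a in zip(abs_err, actual)) / n
    smape = sum(
        2 * e / (abs(a) + abs(p)) for e, a, p in zip(abs_err, actual, predicted)
    ) / n
    # RAE and RRSE compare the model against always predicting the mean.
    rae = sum(abs_err) / sum(abs(a - mean_actual) for a in actual)
    rrse = math.sqrt(sum(sq_err) / sum((a - mean_actual) ** 2 for a in actual))
    rmse = math.sqrt(sum(sq_err) / n)

    return {"MAPE": mape, "SMAPE": smape, "RAE": rae, "RRSE": rrse, "RMSE": rmse}


# Illustrative values only, not study data.
metrics = error_metrics([100.0, 80.0, 120.0], [90.0, 88.0, 120.0])
```

Under these definitions, RAE and RRSE values below 1 indicate a model that does better than predicting the sample mean for every participant, and lower values are better for all five measures.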
Table 3 The average importance of risk factors derived from stochastic gradient boosting, extreme gradient boosting and elastic net, in NAFLD+ (Model 1, including age)
Variables | SGB | XGBoost | Elasticnet | Average | Rank |
Age | 100 | 100 | 15.08 | 71.69 | 1 |
Income | 0.14 | 0 | 0 | 0.05 | |
Body fat | 3.94 | 1.9 | 3.27 | 3.04 | |
Systolic blood pressure | 1.37 | 0.67 | 1.01 | 1.02 | |
Diastolic blood pressure | 2.67 | 0.67 | 1.8 | 1.71 | |
Leukocyte | 0.72 | 0.33 | 4.32 | 1.79 | |
Hemoglobin | 1 | 0 | 0 | 0.33 | |
Platelets | 3.64 | 1.62 | 0.31 | 1.86 | |
Fasting plasma glucose | 2.92 | 1.18 | 0.66 | 1.59 | |
Total bilirubin | 6.86 | 2.29 | 40.24 | 16.46 | 6 |
Albumin | 1.84 | 0.54 | 49.85 | 17.41 | 5 |
Globulin | 0.28 | 0.29 | 17.83 | 6.13 | |
Alkaline Phosphatase | 0.99 | 0.15 | 0 | 0.38 | |
Serum glutamic oxaloacetic transaminase | 1.63 | 0.36 | 0 | 0.66 | |
Serum glutamic pyruvic transaminase | 3.83 | 1.94 | 0.82 | 2.20 | |
Serum γ-glutamyl transpeptidase | 1.89 | 1.33 | 0.48 | 1.23 | |
Lactate dehydrogenase | 23.21 | 23.25 | 0.86 | 15.77 | |
Uric acid | 27.03 | 24.05 | 60.42 | 37.17 | 2 |
Triglyceride | 0.84 | 0 | 0.02 | 0.29 | |
High density lipoprotein cholesterol | 1.99 | 0.8 | 1.16 | 1.32 | |
Low density lipoprotein cholesterol | 1.78 | 0.14 | 0.1 | 0.67 | |
Calcium | 3.97 | 2.87 | 65 | 23.95 | 4 |
Phosphorus | 0.79 | 0.36 | 0 | 0.38 | |
Thyroid stimulating hormone | 6.92 | 3.99 | 4.91 | 5.27 | |
C-reactive protein | 0.61 | 0.22 | 6.99 | 2.61 | |
Forced expiratory volume in one second | 6.63 | 3.41 | 100 | 36.68 | 3 |
Drink area | 0.11 | 0 | 0.07 | 0.06 | |
Smoke area | 0.25 | 0 | 0 | 0.08 | |
Betel nut area | 0 | 0 | 0 | 0.00 | |
Sport area | 0.45 | 0.2 | 0.28 | 0.31 | |
Sleep time | 0.14 | 0 | 0 | 0.05 | |
Marriage | 0 | 0 | 1.99 | 0.66 |
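The Average column in Table 3 appears to be the arithmetic mean of the three per-model importance scores, with Rank assigned by descending average. That reading is an assumption, although it reproduces the tabulated values. A minimal sketch using four rows from Table 3 follows; the function name `average_importance` is hypothetical.

```python
def average_importance(importances):
    """Average each variable's importance across models and rank by that average.

    `importances` maps a variable name to its scores in the order
    [SGB, XGBoost, Elasticnet]. Returns (rank, name, average) tuples,
    highest average first, with the average rounded to two decimals.
    """
    averaged = {
        name: sum(scores) / len(scores) for name, scores in importances.items()
    }
    ordered = sorted(averaged.items(), key=lambda item: item[1], reverse=True)
    return [
        (rank, name, round(avg, 2))
        for rank, (name, avg) in enumerate(ordered, start=1)
    ]


# Four rows copied from Table 3 (NAFLD+, Model 1, including age).
table3_subset = {
    "Age": [100, 100, 15.08],
    "Uric acid": [27.03, 24.05, 60.42],
    "Forced expiratory volume in one second": [6.63, 3.41, 100],
    "Calcium": [3.97, 2.87, 65],
}
ranking = average_importance(table3_subset)
```

This sketch assigns sequential ranks and does not give tied averages a shared rank, whereas Table 5 lists two variables at rank 3 with the same average.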
Table 4 The average importance of risk factors derived from stochastic gradient boosting, extreme gradient boosting and elastic net, in NAFLD- (Model 1, including age)
Variables | SGB | XGBoost | Elasticnet | Average | Rank |
Age | 100 | 100 | 15.19 | 71.73 | 1 |
Income | 0 | 0 | 1.41 | 0.47 | |
Body fat | 3.69 | 1.05 | 4.15 | 2.96 | |
Systolic blood pressure | 0.46 | 0.09 | 0.75 | 0.43 | |
Diastolic blood pressure | 4.19 | 3.09 | 2.96 | 3.41 | |
Leukocyte | 1.21 | 0.34 | 4.52 | 2.02 | |
Hemoglobin | 4.81 | 1 | 11.57 | 5.79 | |
Platelets | 2.61 | 1.06 | 0.2 | 1.29 | |
Fasting plasma glucose | 0.42 | 0.21 | 0.17 | 0.27 | |
Total bilirubin | 3.11 | 1.84 | 19.24 | 8.06 | |
Albumin | 2.28 | 1.34 | 69.53 | 24.38 | 4 |
Globulin | 0.42 | 0.12 | 3.03 | 1.19 | |
Alkaline Phosphatase | 1.84 | 0.22 | 0.04 | 0.70 | |
Serum glutamic oxaloacetic transaminase | 0.52 | 0 | 0 | 0.17 | |
Serum glutamic pyruvic transaminase | 3.79 | 1.94 | 1.12 | 2.28 | |
Serum γ-glutamyl transpeptidase | 1.16 | 0 | 0.38 | 0.51 | |
Lactate dehydrogenase | 21.99 | 18.24 | 0.97 | 13.73 | 5 |
Uric acid | 26.62 | 22.99 | 76.35 | 41.99 | 2 |
Triglyceride | 1.59 | 0.31 | 0 | 0.63 | |
High density lipoprotein cholesterol | 1.4 | 0.25 | 0.65 | 0.77 | |
Low density lipoprotein cholesterol | 2.37 | 0.26 | 0.18 | 0.94 | |
Calcium | 1.66 | 0.64 | 29.96 | 10.75 | 6 |
Phosphorus | 2.05 | 2.07 | 10.98 | 5.03 | |
Thyroid stimulating hormone | 11.86 | 8.69 | 5.02 | 8.52 | |
C-reactive protein | 0.42 | 0 | 2.73 | 1.05 | |
Forced expiratory volume in one second | 5.06 | 3.01 | 100 | 36.02 | 3 |
Drink area | 0 | 0.25 | 0 | 0.08 | |
Smoke area | 0.37 | 0.17 | 0.79 | 0.44 | |
Betel nut area | 0 | 0 | 0 | 0.00 | |
Sport area | 1.13 | 0.65 | 1.55 | 1.11 | |
Sleep time | 0 | 0 | 0 | 0.00 | |
Marriage | 0.13 | 0 | 5.51 | 1.88 |
Table 5 The average importance of risk factors derived from stochastic gradient boosting, extreme gradient boosting and elastic net, in NAFLD+ (Model 2, excluding age)
Variables | SGB | XGBoost | Elasticnet | Average | Rank |
Income | 12.88 | 15.99 | 8.06 | 12.31 | |
Body fat | 22.25 | 16.83 | 3.85 | 14.31 | 5 |
Systolic blood pressure | 11.29 | 9.24 | 0.41 | 6.98 | |
Diastolic blood pressure | 9.17 | 6.85 | 1.19 | 5.74 | |
Leukocyte | 1.73 | 0.58 | 1.9 | 1.40 | |
Hemoglobin | 2.57 | 0.31 | 4.27 | 2.38 | |
Platelets | 17.83 | 14.87 | 0.32 | 11.01 | |
Fasting plasma glucose | 9.72 | 6.71 | 0.2 | 5.54 | |
Total bilirubin | 9.6 | 3.56 | 28.65 | 13.94 | |
Albumin | 12.05 | 9.81 | 100 | 40.62 | 3 |
Globulin | 1.85 | 1.25 | 10.62 | 4.57 | |
Alkaline Phosphatase | 4.28 | 0.99 | 0.06 | 1.78 | |
Serum glutamic oxaloacetic transaminase | 3.45 | 3.05 | 1.45 | 2.65 | |
Serum glutamic pyruvic transaminase | 16.27 | 11.92 | 1.57 | 9.92 | |
Serum γ-glutamyl transpeptidase | 1.14 | 0.65 | 0.28 | 0.69 | |
Lactate dehydrogenase | 100 | 100 | 0.6 | 66.87 | 1 |
Uric acid | 50.16 | 45.66 | 30.15 | 41.99 | 2 |
Triglyceride | 6.23 | 3.56 | 0.14 | 3.31 | |
High density lipoprotein cholesterol | 0.86 | 0.82 | 0.05 | 0.58 | |
Low density lipoprotein cholesterol | 9.6 | 6.73 | 0.46 | 5.60 | |
Calcium | 12.48 | 9.07 | 53.48 | 25.01 | 4 |
Phosphorus | 0.79 | 1.87 | 1.48 | 1.38 | |
Thyroid stimulating hormone | 16.42 | 11.23 | 2.23 | 9.96 | |
C-reactive protein | 0 | 0 | 2.11 | 0.70 | |
Forced expiratory volume in one second | 39.39 | 44.32 | 38.15 | 40.62 | 3 |
Drink area | 0 | 0 | 0.26 | 0.09 | |
Smoke area | 0 | 0 | 0 | 0.00 | |
Betel nut area | 0 | 0 | 0 | 0.00 | |
Sport area | 3.95 | 3.83 | 1.49 | 3.09 | |
Sleep time | 0.86 | 0.5 | 4.04 | 1.80 | |
Marriage | 0 | 0 | 2.38 | 0.79 |
Table 6 The average importance of risk factors derived from stochastic gradient boosting, extreme gradient boosting and elastic net, in NAFLD- (Model 2, excluding age)
Variables | SGB | XGBoost | Elasticnet | Average | Rank |
Income | 2.49 | 1.95 | 1.61 | 2.02 | |
Body fat | 7.57 | 2.68 | 1.48 | 3.91 | |
Systolic blood pressure | 28.66 | 30.68 | 0.59 | 19.98 | 6 |
Diastolic blood pressure | 18.44 | 21.96 | 1.73 | 14.04 | |
Leukocyte | 9.07 | 5.26 | 6.63 | 6.99 | |
Hemoglobin | 12.51 | 1.95 | 3.14 | 5.87 | |
Platelets | 12.13 | 8.68 | 0.23 | 7.01 | |
Fasting plasma glucose | 6.67 | 4.96 | 0.7 | 4.11 | |
Total bilirubin | 9.07 | 5.16 | 3.37 | 5.87 | |
Albumin | 21.95 | 20.16 | 100 | 47.37 | 4 |
Globulin | 1.32 | 0 | 0 | 0.44 | |
Alkaline Phosphatase | 2.75 | 0 | 0.02 | 0.92 | |
Serum glutamic oxaloacetic transaminase | 4.06 | 3.15 | 1.59 | 2.93 | |
Serum glutamic pyruvic transaminase | 9.09 | 6.48 | 1.74 | 5.77 | |
Serum γ-glutamyl transpeptidase | 1.11 | 0 | 0.11 | 0.41 | |
Lactate dehydrogenase | 100 | 100 | 0.63 | 66.88 | 1 |
Uric acid | 66.92 | 63.24 | 36.68 | 55.61 | 2 |
Triglyceride | 12.39 | 8 | 0.34 | 6.91 | |
High density lipoprotein cholesterol | 2.64 | 0.67 | 0.17 | 1.16 | |
Low density lipoprotein cholesterol | 14.18 | 10.11 | 0.51 | 8.27 | |
Calcium | 3.82 | 2.4 | 21.04 | 9.09 | |
Phosphorus | 5.65 | 6.52 | 0.1 | 4.09 | |
Thyroid stimulating hormone | 34.32 | 24.16 | 2.56 | 20.35 | 5 |
C-reactive protein | 2.68 | 0 | 0 | 0.89 | |
Forced expiratory volume in one second | 61.02 | 64.4 | 26.93 | 50.78 | 3 |
Drink area | 1.01 | 1.21 | 0 | 0.74 | |
Smoke area | 2.24 | 1.22 | 0.85 | 1.44 | |
Betel nut area | 0 | 0 | 0 | 0.00 | |
Sport area | 13.32 | 11.44 | 2.46 | 9.07 | |
Sleep time | 0.79 | 0.47 | 10.69 | 3.98 | |
Marriage | 8.88 | 7.85 | 30.55 | 15.76 |
- Citation: Chen IC, Chou LJ, Huang SC, Chu TW, Lee SS. Machine learning-based comparison of factors influencing estimated glomerular filtration rate in Chinese women with or without non-alcoholic fatty liver. World J Clin Cases 2024; 12(15): 2506-2521
- URL: https://www.wjgnet.com/2307-8960/full/v12/i15/2506.htm
- DOI: https://dx.doi.org/10.12998/wjcc.v12.i15.2506