Letter to the Editor Open Access
Copyright ©The Author(s) 2025. Published by Baishideng Publishing Group Inc. All rights reserved.
World J Gastroenterol. Oct 14, 2025; 31(38): 112166
Published online Oct 14, 2025. doi: 10.3748/wjg.v31.i38.112166
Beyond biomarkers: An integrated traditional Chinese medicine-machine learning approach predicts hepatic steatosis in high metabolic risk populations
Yan-Chun Guo, Department of Ophthalmology, Binhai County People's Hospital, Yancheng 224500, Jiangsu Province, China
Ye Hong, Li Huang, Xiao-Wei Xu, Department of Clinical Nutrition, Binhai County People's Hospital, Yancheng 224500, Jiangsu Province, China
Jing-Qi Sun, Chao-Nian Li, Department of Traditional Chinese Medicine, Binhai County People's Hospital, Yancheng 224500, Jiangsu Province, China
Kang-Kang Ji, Department of Clinical Medical Research, Binhai County People’s Hospital, Yancheng 224500, Jiangsu Province, China
ORCID number: Chao-Nian Li (0009-0001-1319-695X).
Co-corresponding authors: Kang-Kang Ji and Chao-Nian Li.
Author contributions: Li CN and Ji KK conceived and designed the letter; Guo YC and Li CN wrote the manuscript; Hong Y, Huang L, Xu XW, and Sun JQ provided critical opinions about this topic; Li CN and Guo YC contributed to the revised version; all authors have read and approved the final manuscript.
Conflict-of-interest statement: The authors declare that they have no competing interests.
Open Access: This article is an open-access article that was selected by an in-house editor and fully peer-reviewed by external reviewers. It is distributed in accordance with the Creative Commons Attribution NonCommercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: https://creativecommons.org/Licenses/by-nc/4.0/
Corresponding author: Chao-Nian Li, MD, PhD, Assistant Professor, Chief Physician, Department of Traditional Chinese Medicine, Binhai County People's Hospital, No. 299 Haibin Avenue, Yancheng 224500, Jiangsu Province, China. lichaonian2022@126.com
Received: July 21, 2025
Revised: August 22, 2025
Accepted: September 9, 2025
Published online: October 14, 2025
Processing time: 87 Days and 6.8 Hours

Abstract

Tian et al present a timely machine learning (ML) model integrating biochemical and novel traditional Chinese medicine (TCM) indicators (tongue edge redness, greasy coating) to predict hepatic steatosis in high metabolic risk patients. Their prospective cohort design and dual-feature selection (LASSO + RFE), culminating in an interpretable XGBoost model (area under the curve: 0.82), represent a significant methodological advance. The inclusion of TCM diagnostics addresses the multisystem heterogeneity of metabolic dysfunction-associated fatty liver disease (MAFLD), a key strength that bridges holistic medicine with precision analytics and underscores potential cost savings over imaging-dependent screening. However, critical limitations impede clinical translation. First, the model’s single-center validation (n = 711) lacks external generalizability testing across diverse populations, risking bias from local demographics. Second, MAFLD subtyping (e.g., lean MAFLD, diabetic MAFLD) was omitted despite acknowledged disease heterogeneity; this overlooks distinct pathophysiologies and may limit utility in stratified care. Third, while TCM features ranked among the top predictors in SHAP analysis, their clinical interpretability remains unclear without mechanistic links to metabolic dysregulation. To resolve these gaps, we propose: (1) external validation in multiethnic cohorts using the published feature set (e.g., aspartate aminotransferase/alanine aminotransferase, low-density lipoprotein cholesterol, TCM tongue markers) to assess robustness; (2) subtype-specific modeling to capture MAFLD heterogeneity, potentially enhancing accuracy in high-risk subgroups; and (3) probing TCM microbiome/metabolomic correlations to ground tongue phenotypes in biological pathways, elevating model credibility. Despite its shortcomings, this work pioneers a low-cost screening paradigm. Future iterations addressing these issues could revolutionize early MAFLD detection in resource-limited settings.

Key Words: Traditional Chinese medicine-machine learning integration; Hepatic steatosis prediction; Machine learning; External validation; Metabolic dysfunction-associated fatty liver disease

Core Tip: Amid the escalating global burden of metabolic dysfunction-associated fatty liver disease (MAFLD), a leading cause of chronic liver disease with significant economic strain, Tian et al pioneer an integrated traditional Chinese medicine (TCM)-machine learning model (area under the curve: 0.82) using dual-feature selection (LASSO + RFE) to predict hepatic steatosis in high metabolic risk populations. The inclusion of TCM tongue features (edge redness, greasy coating) addresses MAFLD’s heterogeneity and offers cost-saving potential over imaging. However, single-center validation and the lack of mechanistic grounding for the TCM indicators limit clinical translation. Future work must prioritize multiethnic validation, subtype-specific modeling, and TCM-microbiome mechanistic studies to revolutionize early detection in resource-limited settings.



TO THE EDITOR

Metabolic dysfunction-associated fatty liver disease (MAFLD), formerly known as non-alcoholic fatty liver disease (NAFLD), affects approximately one-quarter of the adult population worldwide, imposing a significant health and economic burden[1]. It increases the incidence of cirrhosis, hepatocellular carcinoma, and cardiovascular mortality and generates considerable healthcare expenditure, a burden that is especially heavy in regions with limited screening resources[2]. We read with interest the prospective cohort study by Tian et al[3]. This work pioneers an integrated traditional Chinese medicine (TCM)-machine learning (ML) approach for predicting MAFLD in high-risk populations. While the methodological innovation warrants commendation, we seek to contextualize its contributions, address critical limitations, and propose translational pathways to facilitate its integration into clinical practice.

STUDY OVERVIEW AND DISCUSSION

Tian et al[3] developed an XGBoost model (area under the curve: 0.82) utilizing dual feature selection (LASSO + recursive feature elimination) to identify hepatic steatosis within a cohort of 711 individuals at high metabolic risk. The model incorporated ten predictors, encompassing conventional biomarkers [e.g., aspartate aminotransferase (AST)/alanine aminotransferase (ALT) ratio, low-density lipoprotein cholesterol (LDL-C), triglycerides] alongside novel TCM indicators (tongue edge redness, greasy coating). By prospectively recruiting patients exhibiting metabolic dysregulation (representing 86.2% of the initial cohort, n = 1011) and employing FibroScan-CAP ≥ 238 dB/m as a steatosis confirmation criterion, the authors addressed a significant clinical need: the demand for scalable, non-invasive screening tools suitable for resource-constrained settings. The inclusion of TCM diagnostics aligns well with the multisystem pathophysiology characteristic of MAFLD, thereby offering a holistic perspective complementary to biochemical profiling. SHAP analysis notably positioned TCM tongue features among the leading predictors, underscoring their potential clinical utility. However, the absence of MAFLD subtyping (e.g., lean or diabetic MAFLD) overlooks established pathophysiological heterogeneity. As emphasized by Eslam et al[1], MAFLD is not a monolithic disease entity; distinct subtypes exhibit divergent fibrosis progression trajectories and cardiovascular risk profiles, necessitating stratified diagnostic and management strategies. Early detection is critical, as MAFLD is often asymptomatic in its initial stages yet can progress to severe complications including cirrhosis, liver failure, and hepatocellular carcinoma, while also independently increasing cardiovascular mortality risk[1,4].
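To make the evaluation logic described above concrete, the following is a minimal Python sketch, not the authors' code. It labels steatosis with the CAP ≥ 238 dB/m criterion and computes the area under the ROC curve through its rank-based (Mann-Whitney) formulation. The function names, CAP readings, and risk scores are all invented for illustration; none of the values come from Tian et al.

```python
# Illustrative sketch only: the data below are invented, not from the study.
CAP_THRESHOLD_DB_M = 238  # steatosis criterion cited in the text


def label_steatosis(cap_values):
    """Return 1 where CAP >= threshold (steatosis), else 0."""
    return [1 if cap >= CAP_THRESHOLD_DB_M else 0 for cap in cap_values]


def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical CAP readings (dB/m) and hypothetical model risk scores.
cap_readings = [210, 238, 300, 225, 260, 190]
risk_scores = [0.20, 0.55, 0.90, 0.60, 0.70, 0.10]

labels = label_steatosis(cap_readings)
auc = roc_auc(labels, risk_scores)
print(labels, round(auc, 3))
```

The pairwise formulation is quadratic in cohort size, which is acceptable for a cohort of several hundred patients; a sort-based implementation would be preferable for much larger datasets.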

STRENGTHS AND LIMITATIONS

The study by Tian et al[3] demonstrates considerable methodological rigor, including its prospective cohort design and dual feature selection strategy (LASSO + recursive feature elimination), which collectively mitigate the risk of overfitting prevalent in high-dimensional biomarker research. The innovative integration of TCM diagnostics, specifically tongue edge redness and greasy coating, constitutes a pioneering effort to address the inherent multisystem heterogeneity of MAFLD from a holistic perspective. This approach resonates with findings from network pharmacology studies revealing multicomponent synergistic effects within TCM formulations (e.g., modulation of AKT1/IL-6/TNF-α pathways by XingQiChuShiYin)[5]. The synergistic application of TCM and ML analytics offers a promising low-cost screening alternative to imaging-dependent modalities like magnetic resonance imaging-derived proton density fat fraction (MRI-PDFF), aligning with urgent demands for accessible tools in resource-limited environments[6]. Nevertheless, critical limitations impede immediate clinical translation. Validation confined to a single center (n = 711) lacks generalizability assessment across diverse populations, introducing potential bias derived from local demographic characteristics (e.g., 77% male participants, median age 43–44 years). Furthermore, the omission of MAFLD subtyping disregards established pathophysiological heterogeneity, potentially diminishing the model's applicability within precision medicine frameworks.

A paramount concern is the limited biological anchoring of the TCM tongue features, despite their high ranking in SHAP analysis, a gap paralleled in herbal medicine research where active compound pharmacokinetics often remain unverified[7]. While Lu et al[8] demonstrated associations between yellow tongue coatings and gut microbiome dysbiosis in MAFLD patients, the mechanistic links for greasy coatings and edge redness, the key predictors in this model, remain incompletely defined. The reduction of TCM's holistic framework to two imaging-derived variables (tongue redness/greasiness) may oversimplify its diagnostic richness. It has been suggested that the color of the tongue coating may be associated with insulin resistance (IR), oral microbiota[9], and metabolic disorders[10]. Furthermore, the presence of a thick and greasy tongue coating may be associated with the formation and permeability of vascular endothelial cells, as well as the protein expression of tight junction protein-1 (zonula occludens-1)[11,12]. In the absence of validated mechanistic links to underlying metabolic dysregulation (e.g., gut-liver axis dysfunction or microbial compositional shifts), these indicators risk functioning as "black-box" contributors within the model.

Additionally, while Tian et al[3] quantified TCM tongue features using the Intelligent Constitution Identifier, an imaging system with physician verification, the proprietary algorithms lack transparency. Furthermore, TCM tongue indicators risk capturing confounders unrelated to MAFLD pathophysiology: greasy coatings may reflect transient dietary lipids or dehydration; redness could signal oral microbiome dysbiosis or systemic inflammation from non-hepatic conditions. Without controlling for these variables, their inclusion may introduce spurious correlations. Critically, the relationship of TCM markers with hepatic steatosis in metabolically high-risk populations (e.g., diabetics) remains poorly defined. Though statistically predictive, they likely capture systemic confounders (IR, dyslipidemia, or oral dysbiosis[12,13]) rather than direct steatotic processes. Without validation against histology or liver-specific biomarkers (e.g., MRI-PDFF, CK-18), their clinical utility for precise steatosis assessment remains uncertain.

FUTURE RESEARCH DIRECTIONS

To address these limitations and advance the field, we propose a multifaceted research agenda. First, external validation of the identified feature set (e.g., AST/ALT, LDL-C, TCM tongue markers) within multiethnic cohorts is imperative to evaluate robustness across varying metabolic phenotypes. Collaborative initiatives, such as the MAFLD Consortium[2,14], could expedite this process. Second, subtype-specific predictive models should be prioritized to capture disease heterogeneity. Biomarker signatures are known to diverge across subtypes: lean MAFLD [body mass index (BMI) < 23 kg/m²] shows a predominance of the uric acid (UA)/creatinine (Cr) ratio[15], whereas glycated hemoglobin predominates in diabetic MAFLD[16]. Omitting subtyping therefore limits the model’s precision in high-risk subgroups, and tailored algorithms could enhance accuracy, particularly for lean MAFLD where conventional metabolic markers (e.g., BMI) are less reliable. Third, elucidating the biological foundations of the TCM indicators requires multi-omics correlation studies. Investigating links between specific tongue phenotypes (e.g., edge redness) and corresponding alterations in the gut microbiome and metabolomic profiles, as demonstrated by Lu et al[8] for yellow coatings, followed by validation of their metabolic impact (e.g., regulation of lipid droplet formation via PLIN-2/ATGL pathways, evidenced in NAFLD models[17]), would establish mechanistic grounding for these features, thereby enhancing model credibility. Finally, conducting real-world cost-effectiveness analyses comparing this integrated TCM-ML approach against established screening paradigms (e.g., FibroScan[18] or FIB-4[19]) is essential to evaluate long-term feasibility, particularly within primary care settings where low-cost tools yield the greatest impact. Future studies should stratify high-risk cohorts by age/comorbidity (e.g., type 2 diabetes mellitus, chronic kidney disease) to assess biomarker stability.
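The subtype-stratified evaluation proposed above can be sketched in Python as follows. This is an illustrative outline under stated assumptions, not a validated protocol. The BMI cutoff of 23 kg/m² is taken from the text; the rule that a type 2 diabetes flag takes precedence over lean status, the field names, and all patient records are hypothetical.

```python
from collections import defaultdict

LEAN_BMI_CUTOFF = 23.0  # kg/m^2, the lean MAFLD cutoff quoted in the text


def assign_subtype(bmi, has_t2dm):
    """Hypothetical rule: diabetes takes precedence, then lean BMI, else 'other'."""
    if has_t2dm:
        return "diabetic"
    if bmi < LEAN_BMI_CUTOFF:
        return "lean"
    return "other"


def roc_auc(labels, scores):
    """Rank-based AUC; returns None when a stratum lacks positives or negatives."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_by_subtype(patients):
    """Group patients by subtype and compute AUC within each group."""
    groups = defaultdict(lambda: ([], []))
    for p in patients:
        ys, ss = groups[assign_subtype(p["bmi"], p["t2dm"])]
        ys.append(p["steatosis"])
        ss.append(p["score"])
    return {name: roc_auc(ys, ss) for name, (ys, ss) in groups.items()}


# Invented records: BMI, diabetes flag, model score, reference steatosis label.
patients = [
    {"bmi": 21.5, "t2dm": False, "score": 0.30, "steatosis": 1},
    {"bmi": 22.0, "t2dm": False, "score": 0.40, "steatosis": 0},
    {"bmi": 27.0, "t2dm": True, "score": 0.80, "steatosis": 1},
    {"bmi": 30.0, "t2dm": True, "score": 0.35, "steatosis": 0},
    {"bmi": 26.0, "t2dm": False, "score": 0.70, "steatosis": 1},
    {"bmi": 25.0, "t2dm": False, "score": 0.20, "steatosis": 0},
]

subtype_auc = auc_by_subtype(patients)
print(subtype_auc)
```

In this deliberately small toy dataset the model ranks the lean pair in the wrong order while ranking the other strata correctly, which illustrates how a pooled metric could mask poor discrimination in one subgroup. Real strata would need far larger samples and confidence intervals before any such conclusion could be drawn.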

CONCLUSION

The study by Tian et al[3] represents a significant advancement toward accessible MAFLD screening through the synergistic application of TCM diagnostics and ML analytics. The model’s non-invasive nature and potential for cost reduction hold particular promise for resource-limited regions. However, successful clinical translation requires multi-center validation, refinement that incorporates disease heterogeneity through subtyping, and biological validation of the integrated TCM indicators. Future research addressing these critical gaps could revolutionize the early detection of MAFLD, transforming its often asymptomatic progression into a more readily manageable condition.

Footnotes

Provenance and peer review: Invited article; Externally peer reviewed.

Peer-review model: Single blind

Specialty type: Gastroenterology and hepatology

Country of origin: China

Peer-review report’s classification

Scientific Quality: Grade A, Grade A, Grade B, Grade B

Novelty: Grade B, Grade B, Grade B, Grade B

Creativity or Innovation: Grade B, Grade B, Grade B, Grade B

Scientific Significance: Grade A, Grade A, Grade B, Grade B

P-Reviewer: Gutiérrez-Cuevas J, PhD, Professor, Mexico; Othman AA, MD, PhD, Egypt

S-Editor: Li L

L-Editor: A

P-Editor: Wang WB

References
1. Eslam M, Newsome PN, Sarin SK, Anstee QM, Targher G, Romero-Gomez M, Zelber-Sagi S, Wai-Sun Wong V, Dufour JF, Schattenberg JM, Kawaguchi T, Arrese M, Valenti L, Shiha G, Tiribelli C, Yki-Järvinen H, Fan JG, Grønbæk H, Yilmaz Y, Cortez-Pinto H, Oliveira CP, Bedossa P, Adams LA, Zheng MH, Fouad Y, Chan WK, Mendez-Sanchez N, Ahn SH, Castera L, Bugianesi E, Ratziu V, George J. A new definition for metabolic dysfunction-associated fatty liver disease: An international expert consensus statement. J Hepatol. 2020;73:202-209.
2. Lin H, Zhang X, Li G, Wong GL, Wong VW. Epidemiology and Clinical Outcomes of Metabolic (Dysfunction)-associated Fatty Liver Disease. J Clin Transl Hepatol. 2021;9:972-982.
3. Tian Y, Zhou HY, Liu ML, Ruan Y, Yan ZX, Hu XH, Du J. Machine learning-based identification of biochemical markers to predict hepatic steatosis in patients at high metabolic risk. World J Gastroenterol. 2025;31:108200.
4. Simon TG, Roelstraete B, Khalili H, Hagström H, Ludvigsson JF. Mortality in biopsy-confirmed nonalcoholic fatty liver disease: results from a nationwide cohort. Gut. 2021;70:1375-1382.
5. Ren C, Gao M, Li N, Tang C, Chu G, Yusuf A, Xiao L, Yang Z, Guan T. Identification and mechanism elucidation of medicative diet for food therapy XQCSY in NAFLD prevention: an integrative in silico study. Food Med Homol. 2024;1:9420015.
6. Noureddin M, Jones C, Alkhouri N, Gomez EV, Dieterich DT, Rinella ME; NASHNET. Screening for Nonalcoholic Fatty Liver Disease in Persons with Type 2 Diabetes in the United States Is Cost-effective: A Comprehensive Cost-Utility Analysis. Gastroenterology. 2020;159:1985-1987.e4.
7. Nie WY, Ye Y, Tong HX, Hu JQ. Herbal medicine as a potential treatment for non-alcoholic fatty liver disease. World J Gastroenterol. 2025;31:100273.
8. Lu C, Zhu H, Zhao D, Zhang J, Yang K, Lv Y, Peng M, Xu X, Huang J, Shao Z, Xiao M, Li X. Oral-Gut Microbiome Analysis in Patients With Metabolic-Associated Fatty Liver Disease Having Different Tongue Image Feature. Front Cell Infect Microbiol. 2022;12:787143.
9. Han S, Yang X, Qi Q, Pan Y, Chen Y, Shen J, Liao H, Ji Z. Potential screening and early diagnosis method for cancer: Tongue diagnosis. Int J Oncol. 2016;48:2257-2264.
10. Li Y, Cui J, Liu Y, Chen K, Huang L, Liu Y. Oral, Tongue-Coating Microbiota, and Metabolic Disorders: A Novel Area of Interactive Research. Front Cardiovasc Med. 2021;8:730203.
11. Wang RR, Chen JL, Duan SJ, Lu YX, Chen P, Zhou YC, Yao SK. Noninvasive Diagnostic Technique for Nonalcoholic Fatty Liver Disease Based on Features of Tongue Images. Chin J Integr Med. 2024;30:203-212.
12. Qi WJ, Zhang MM, Wang H, Wen Y, Wang BE, Zhang SW. Research on the relationship between thick greasy tongue fur formation and vascular endothelial cell permeability with the protein expression of zonula occludens-1. Chin J Integr Med. 2011;17:510-516.
13. Dai S, Guo X, Liu S, Tu L, Hu X, Cui J, Ruan Q, Tan X, Lu H, Jiang T, Xu J. Application of intelligent tongue image analysis in Conjunction with microbiomes in the diagnosis of MAFLD. Heliyon. 2024;10:e29269.
14. Rasmussen DGK, Anstee QM, Torstenson R, Golding B, Patterson SD, Brass C, Thakker P, Harrison S, Billin AN, Schuppan D, Dufour JF, Andersson A, Wigley I, Shumbayawonda E, Dennis A, Schoelch C, Ratziu V, Yunis C, Bossuyt P, Karsdal MA. NAFLD and NASH biomarker qualification in the LITMUS consortium - Lessons learned. J Hepatol. 2023;78:852-865.
15. Liu J, Wang C, Wang Y, Yao S. Association of Uric Acid to Creatinine Ratio with Metabolic Dysfunction-Associated Fatty Liver in Non-Obese Individuals Without Type 2 Diabetes Mellitus. Diabetes Metab Syndr Obes. 2024;17:131-142.
16. Kanwal F, Kramer JR, Li L, Dai J, Natarajan Y, Yu X, Asch SM, El-Serag HB. Effect of Metabolic Traits on the Risk of Cirrhosis and Hepatocellular Cancer in Nonalcoholic Fatty Liver Disease. Hepatology. 2020;71:808-819.
17. Cao L, Wu Y, Liu K, Qi N, Zhang J, Tie S, Li X, Tian P, Gu S. Cornus officinalis vinegar alters the gut microbiota, regulating lipid droplet changes in nonalcoholic fatty liver disease model mice. Food Med Homol. 2024;1:9420002.
18. Chan WL, Chandra Kumar CV, Chan WK. Fibroscan-AST (FAST) score and other non-invasive tests for the diagnosis of fibrotic non-alcoholic steatohepatitis. Hepatobiliary Surg Nutr. 2023;12:763-767.
19. Wu Y, Kumar R, Huang J, Wang M, Zhu Y, Lin S. FIB-4 cut-off should be re-evaluated in patients with metabolic associated fatty liver disease (MAFLD). J Hepatol. 2021;74:247-248.