ISSN 1001-5256 (Print)
ISSN 2097-3497 (Online)
CN 22-1108/R
Volume 42 Issue 4
Apr.  2026

Construction and validation of machine learning predictive models for the risk of metabolic associated fatty liver disease

DOI: 10.12449/JCH260413
Research funding:

National Key R&D Program of China (2018YFC2000600);

China Academy of Chinese Medical Sciences Science and Technology Innovation Project (CI2021A03005);

Beijing “Mentorship and Inheritance in Chinese Medicine 3+3 Project” (2023-SZ-A-51)

  • Corresponding author: REN Haiyan, haiyaner123@qq.com (ORCID: 0009-0002-7863-3513); ZHANG Jin, jin_zhang2000@hotmail.com (ORCID: 0009-0001-6557-7180)
  • Received Date: 2025-10-06
  • Accepted Date: 2025-11-18
  • Published Date: 2026-04-25
  • Objective  To evaluate predictive models built with machine learning methods for predicting the risk of metabolic associated fatty liver disease (MAFLD), and to identify its key risk factors.
  • Methods  A retrospective analysis was performed on 50 variables, covering body composition, past medical history, and laboratory tests, from 2 168 individuals who underwent health examination at the Department of Health Assessment, Xiyuan Hospital, China Academy of Chinese Medical Sciences, from January 2021 to December 2024. According to whether they were diagnosed with MAFLD, the individuals were divided into an MAFLD group (265 individuals) and a non-MAFLD group (1 903 individuals). The Mann-Whitney U test was used to compare continuous data between the two groups, and the chi-square test was used to compare categorical data. The data were randomly split into a training set and a validation set at a ratio of 70% to 30%. Predictive factors were screened in the training set using univariate analysis, LASSO regression, and multivariate logistic regression analysis. Predictive models were then constructed using seven machine learning methods: logistic regression, decision tree, random forest (RF), eXtreme gradient boosting, light gradient boosting machine, support vector machine, and artificial neural network. Model performance was evaluated by plotting the receiver operating characteristic curve for the validation set and calculating the area under the curve (AUC), sensitivity, specificity, and Youden index for each model. The SHapley Additive exPlanations (SHAP) method was used to analyze the contribution of each variable in the optimal model.
  • Results  The prevalence of MAFLD among the 2 168 subjects was 12.22% (265/2 168). Smoking, diastolic blood pressure, phase angle, visceral fat area, muscle fat ratio, waist-to-hip ratio, aspartate aminotransferase, non-HDL-C/HDL-C ratio, triglyceride-glucose index, and gallstones were independent risk factors for MAFLD (all P<0.05). In the validation set, the support vector machine, eXtreme gradient boosting, decision tree, light gradient boosting machine, artificial neural network, RF, and logistic regression models had AUCs of 0.738, 0.754, 0.757, 0.786, 0.795, 0.796, and 0.815, respectively, among which the RF model had the best discriminatory ability (AUC=0.796, 95% confidence interval: 0.754-0.839), with a sensitivity of 81.01%, a specificity of 63.16%, and a Youden index of 44.17%. The SHAP analysis showed that visceral fat area, waist-to-hip ratio, and diastolic blood pressure were the three most important predictive factors.
  • Conclusion  The RF model, constructed from body composition and clinical indicators, performs well in predicting the risk of MAFLD, and its interpretability can help identify high-risk individuals at an early stage in clinical practice.
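The Youden index reported for the RF model can be checked from the stated sensitivity and specificity, since it equals sensitivity + specificity - 1. The following is a minimal sketch in plain Python, not the authors' code: the function names are hypothetical, the toy labels and scores are invented for illustration, and using the Youden index to choose a cutoff is an assumption (the abstract only states that the index was calculated for each model).

```python
def youden_index(sensitivity, specificity):
    """Youden's J = sensitivity + specificity - 1 (inputs as fractions)."""
    return sensitivity + specificity - 1.0


def sens_spec(y_true, y_score, threshold):
    """Sensitivity and specificity when scores >= threshold count as positive."""
    pairs = list(zip(y_true, y_score))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def best_threshold(y_true, y_score):
    """Scan the observed scores and return (threshold, J) maximizing Youden's J."""
    best = None
    for t in sorted(set(y_score)):
        se, sp = sens_spec(y_true, y_score, t)
        j = youden_index(se, sp)
        if best is None or j > best[1]:
            best = (t, j)
    return best


# Figures reported in the abstract for the RF model and the cohort.
j_rf = youden_index(0.8101, 0.6316)
prevalence = 265 / 2168

# Toy labels (1 = MAFLD) and predicted scores, invented purely for illustration.
toy_y = [0, 0, 0, 0, 1, 0, 1, 1]
toy_s = [0.10, 0.20, 0.30, 0.35, 0.40, 0.60, 0.70, 0.90]
toy_t, toy_j = best_threshold(toy_y, toy_s)
```

Here `j_rf` comes out at about 0.4417, in agreement with the reported Youden index of 44.17%, and `prevalence` rounds to 12.22%. On real validation data, the scores would be the model's predicted probabilities rather than the toy values above.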

     

    Figures(5)  / Tables(4)
