基于机器学习算法构建心脏死亡器官捐献供肝肝移植后急性肾损伤风险预测模型的应用价值

Application value of risk prediction model for acute kidney injury after donation of cardiac death liver transplantation based on machine learning algorithm

  • 摘要:
    目的 探讨基于机器学习算法构建心脏死亡器官捐献(DCD)供肝肝移植后急性肾损伤风险预测模型的应用价值。
    方法 采用回顾性队列研究方法。收集2015年1月至2023年12月中国肝移植注册中心浙江大学医学院附属第一医院等5家医院收治的1 001对行DCD供肝肝移植供者及受者的临床病理资料;供者男825例,女176例;受者男806例,女195例,年龄为52(18~75)岁。运用过采样技术增加281例受者后获得1 282例受者,通过计算机产生随机数方法以7∶3比例分为训练集897例和验证集385例。基于机器学习算法,构建随机森林、极端梯度提升树、支持向量机、逻辑回归、决策树、K最近邻和梯度提升树7种肝移植后发生急性肾损伤预测模型。观察指标:(1)发生急性肾损伤和无急性肾损伤受者及供者临床病理特征比较。(2)发生急性肾损伤和无急性肾损伤受者随访及生存情况。(3)肝移植后发生急性肾损伤列线图预测模型的构建及评价。(4)肝移植后发生急性肾损伤机器学习预测模型的构建及评价。正态分布的计量资料组间比较采用独立样本t检验。偏态分布的计量资料组间比较采用Mann-Whitney U检验,多组间比较采用Kruskal-Wallis H检验。计数资料组间比较采用χ2检验或校正χ2检验。采用Kaplan-Meier法计算生存率和绘制生存曲线。采用Logistic回归模型进行单因素和多因素分析。绘制受试者工作特征(ROC)曲线,并计算ROC曲线下面积(AUC)及95%可信区间(CI)。采用DeLong检验、准确度、灵敏度、特异度评价模型的预测能力。绘制校准曲线,评估预测模型的预测概率与实际概率效能。采用基于机器学习算法与SHapley Additive exPlanations(SHAP)方法的可解释性分析方法,生成对模型决策的单独解释。
    结果 (1)发生急性肾损伤和无急性肾损伤受者及供者临床病理特征比较。1 001例受者中,肝移植后发生急性肾损伤360例,无急性肾损伤641例。发生急性肾损伤和无急性肾损伤受者体质量指数、肝性脑病、乙型肝炎表面抗原、肝肾综合征,供者糖尿病、尿素氮、丙氨酸转氨酶、天冬氨酸转氨酶、移植物质量,肝移植术中出血量、供肝热缺血时间、手术时间比较,差异均有统计学意义(Z=-4.337,χ2=9.751、9.088,H=11.142,χ2=5.286,Z=-3.360、-2.539、-3.084、-1.730、-3.497、-1.996、-2.644,P<0.05)。(2)发生急性肾损伤和无急性肾损伤受者随访及生存情况。1 001例受者均获得随访,肝移植后发生急性肾损伤受者随访时间为18.6(0~102.3)个月,无急性肾损伤受者随访时间为31.9(0.1~105.5)个月。发生急性肾损伤受者1、3、5年总生存率分别为72.1%、63.5%、59.3%,无急性肾损伤受者1、3、5年总生存率分别为86.7%、76.7%、72.5%,两者总生存情况比较,差异有统计学意义(χ2=26.028,P<0.05)。(3)肝移植后发生急性肾损伤列线图预测模型的构建及评价。多因素分析结果显示:受者体质量指数、肌酐、乙型肝炎表面抗原、肝肾综合征,供者尿素氮、肌酐,肝移植无肝期、术中出血量均是影响受者肝移植后发生急性肾损伤的独立危险因素(比值比=1.113、0.998、0.605、1.580、1.047、0.998、1.006、1.157,95%CI为1.070~1.157、0.996~1.000、0.450~0.812、1.021~2.070、1.021~1.074、0.996~0.999、1.000~1.012、1.045~1.281,P<0.05)。根据多因素分析结果,构建肝移植后发生急性肾损伤的列线图预测模型。ROC曲线结果显示:AUC为0.666(95%CI为0.637~0.696)。(4)肝移植后发生急性肾损伤机器学习预测模型的构建及评价。基于Lasso回归分析结果,构建随机森林、极端梯度提升树、支持向量机、逻辑回归、决策树、K最近邻、梯度提升树7种机器学习算法预测模型,绘制验证集ROC曲线,AUC分别为0.863、0.841、0.721、0.637、0.620、0.708、0.731,准确度分别为0.764、0.782、0.701、0.592、0.605、0.605、0.681,灵敏度分别为0.764、0.789、0.719、0.588、0.694、0.694、0.704,特异度分别为0.763、0.774、0.683、0.597、0.511、0.511、0.656,DeLong检验结果显示:随机森林模型AUC最高,为0.863(95%CI为0.828~0.899)。校准曲线结果显示:随机森林模型的预测概率与实际发生概率最接近,表明该模型具有良好的校准性能。进一步基于随机森林模型对不同临床因素的SHAP值进行排序,结果显示:受者体质量指数、供者尿素氮、肝移植术中出血量、供者年龄对预测模型的输出结果影响较大。
    结论 构建了DCD供肝肝移植后发生急性肾损伤列线图预测模型及7种机器学习算法预测模型,其中基于随机森林机器学习算法的模型预测效能更优。

     

    Abstract:
    Objective To investigate the application value of a machine learning-based risk prediction model for acute kidney injury (AKI) after liver transplantation using grafts from donation after cardiac death (DCD) donors.
    Methods A retrospective cohort study was conducted. The clinicopathological data of 1 001 pairs of DCD liver transplant donors and recipients at five hospitals in the Chinese Liver Transplantation Registry, including The First Affiliated Hospital of Zhejiang University School of Medicine, from January 2015 to December 2023 were collected. Of the donors, 825 were male and 176 were female. Of the recipients, 806 were male and 195 were female, aged 52 (range, 18-75) years. An oversampling technique was used to add 281 recipients, giving 1 282 recipients, who were divided at a ratio of 7∶3 into a training set of 897 recipients and a validation set of 385 recipients using computer-generated random numbers. Seven prediction models for AKI after liver transplantation were constructed based on machine learning algorithms: Random Forest (RF), Extreme Gradient Boosting (XGBoost), Support Vector Machine (SVM), Logistic Regression (LR), Decision Tree (DT), K-Nearest Neighbors (KNN), and Categorical Boosting (CatBoost). Observation indicators: (1) comparison of clinicopathological characteristics between recipients with and without AKI and donors; (2) follow-up and survival of recipients with and without AKI; (3) construction and validation of nomogram prediction model of AKI after liver transplantation; (4) construction and validation of machine learning prediction model of AKI after liver transplantation. Comparison of normally distributed measurement data between groups was conducted using the independent sample t test. Comparison of measurement data with skewed distribution between groups was conducted using the Mann-Whitney U test, and comparison among multiple groups was conducted using the Kruskal-Wallis H test. Comparison of count data between groups was conducted using the chi-square test or corrected chi-square test. The Kaplan-Meier method was used to calculate survival rates and plot survival curves. A Logistic regression model was used for univariate and multivariate analyses. The receiver operating characteristic (ROC) curve was plotted, and the area under the curve (AUC) with its 95% confidence interval (CI) was calculated. The performance of the prediction models was evaluated using the DeLong test, accuracy, sensitivity, and specificity. The calibration curve was plotted to evaluate the agreement between the predicted probability and the actual probability. An interpretability analysis combining machine learning algorithms with the SHapley Additive exPlanations (SHAP) method was used to generate individual explanations of model decisions.
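    The oversampling and 7∶3 split described in the Methods can be sketched as follows. This is a minimal illustration only: the abstract does not name the oversampling technique, so random duplication of minority-class rows is an assumption here, and the function name `oversample_and_split`, the seed, and the label coding (1 = AKI, 0 = no AKI) are hypothetical. Only the class sizes (360 and 641) come from the text.

```python
import random


def oversample_and_split(labels, train_fraction=0.7, seed=0):
    """Balance two classes by random duplication, then split by index.

    Assumption: simple random oversampling; the study may have used a
    different technique (for example a synthetic one).
    """
    rng = random.Random(seed)
    indices = list(range(len(labels)))
    positives = [i for i in indices if labels[i] == 1]
    negatives = [i for i in indices if labels[i] == 0]
    # The shorter list is the minority class to be topped up.
    minority, majority = sorted([positives, negatives], key=len)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    pool = indices + extra
    rng.shuffle(pool)  # stands in for "computer-generated random numbers"
    n_train = int(len(pool) * train_fraction)
    return pool[:n_train], pool[n_train:], len(extra)


# Class sizes reported in the abstract: 360 with AKI, 641 without.
labels = [1] * 360 + [0] * 641
train_idx, valid_idx, n_added = oversample_and_split(labels)
print(n_added, len(train_idx), len(valid_idx))
```

    With the reported class sizes, balancing the two groups adds 281 rows (641 minus 360), and a 7∶3 split of the resulting 1 282 rows gives 897 and 385, which matches the counts stated above.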
    Results (1) Comparison of clinicopathological characteristics between recipients with and without AKI and donors. Of 1 001 recipients, 360 developed AKI after liver transplantation and 641 did not. There were significant differences in body mass index (BMI), hepatic encephalopathy, hepatitis B surface antigen (HBsAg), hepatorenal syndrome (HRS), donor diabetes, donor blood urea nitrogen, donor alanine aminotransferase, donor aspartate aminotransferase, graft mass, volume of blood loss during liver transplantation, warm ischemia time of donor liver, and operation time between recipients with and without AKI (Z=-4.337, χ2=9.751, 9.088, H=11.142, χ2=5.286, Z=-3.360, -2.539, -3.084, -1.730, -3.497, -1.996, -2.644, P<0.05). (2) Follow-up and survival of recipients with and without AKI. All 1 001 recipients were followed up. Recipients with AKI after liver transplantation were followed up for 18.6 (range, 0-102.3) months, and recipients without AKI were followed up for 31.9 (range, 0.1-105.5) months. The 1-, 3-, and 5-year overall survival rates were 72.1%, 63.5%, and 59.3% in recipients with AKI, versus 86.7%, 76.7%, and 72.5% in recipients without AKI, showing a significant difference in overall survival between them (χ2=26.028, P<0.05). (3) Construction and validation of nomogram prediction model of AKI after liver transplantation. Results of multivariate analysis showed that recipient BMI, recipient creatinine, recipient HBsAg, recipient HRS, donor blood urea nitrogen, donor creatinine, anhepatic phase, and volume of blood loss during liver transplantation were independent risk factors for AKI of recipients after liver transplantation (odds ratio=1.113, 0.998, 0.605, 1.580, 1.047, 0.998, 1.006, 1.157, 95%CI: 1.070-1.157, 0.996-1.000, 0.450-0.812, 1.021-2.070, 1.021-1.074, 0.996-0.999, 1.000-1.012, 1.045-1.281, P<0.05). The nomogram prediction model of AKI after liver transplantation was constructed based on the results of multivariate analysis. The ROC curve showed an AUC of 0.666 (95%CI: 0.637-0.696). (4) Construction and validation of machine learning prediction model of AKI after liver transplantation. Based on the results of Lasso regression analysis, seven machine learning prediction models (RF, XGBoost, SVM, LR, DT, KNN, and CatBoost) were constructed, and ROC curves of the validation set were plotted. The AUCs of these models were 0.863, 0.841, 0.721, 0.637, 0.620, 0.708, and 0.731, the accuracies were 0.764, 0.782, 0.701, 0.592, 0.605, 0.605, and 0.681, the sensitivities were 0.764, 0.789, 0.719, 0.588, 0.694, 0.694, and 0.704, and the specificities were 0.763, 0.774, 0.683, 0.597, 0.511, 0.511, and 0.656, respectively. The DeLong test showed that the RF model had the highest AUC, at 0.863 (95%CI: 0.828-0.899). The calibration curve showed that the predicted probability of the RF model was closest to the actual probability, indicating good calibration performance of the model. Ranking the SHAP values of the clinical factors in the RF model showed that recipient BMI, donor blood urea nitrogen, volume of blood loss during liver transplantation, and donor age had large effects on the model output.
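    For readers less familiar with the evaluation metrics reported above, the following sketch shows how accuracy, sensitivity, specificity, and AUC can be computed from true labels and predicted scores. It is an illustration under stated assumptions, not the authors' code: the function names, the 0.5 decision threshold, and the toy data are all hypothetical, and AUC is computed by the pairwise (Mann-Whitney) definition rather than with any particular library.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity from binary labels (1 = AKI)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


def roc_auc(y_true, scores):
    """AUC as the share of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data invented for illustration only; not taken from the study.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

    Accuracy, sensitivity, and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the two kinds of metric are reported side by side.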
    Conclusion A nomogram prediction model and seven machine learning prediction models for AKI after DCD liver transplantation were constructed, among which the RF model shows better predictive performance.

     
