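Application value of machine learning algorithms for preoperative prediction of microvascular invasion in hepatocellular carcinoma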

刘红枝; 林海涛; 林昭旺; 傅俊; 丁宗仁; 郭鹏飞; 刘景丰

doi:10.3760/cma.j.issn.1673-9752.2020.02.008

Application value of machine learning algorithms for preoperative prediction of microvascular invasion in hepatocellular carcinoma

Abstract

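Abstract: Objective: To investigate the application value of machine learning algorithms for preoperative prediction of microvascular invasion (MVI) in hepatocellular carcinoma.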
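Methods: A retrospective descriptive study was conducted. The clinicopathological data of 277 patients with hepatocellular carcinoma who were admitted to Mengchao Hepatobiliary Hospital of Fujian Medical University from May 2015 to December 2018 were collected. There were 235 men and 42 women, aged (56±10) years, with an age range of 33-80 years. All patients underwent magnetic resonance imaging before surgery. Using computer-generated random numbers, the 277 patients were divided at a ratio of 7:3 into a training set of 193 patients and a validation set of 84 patients. A logistic regression nomogram and four machine learning algorithms, support vector machine (SVM), random forest (RF), artificial neural network (ANN), and light gradient boosting machine (LightGBM), were used to build preoperative MVI prediction models. Observation indicators: (1) analysis of the clinicopathological data of patients in the training and validation sets; (2) analysis of risk factors for tumor MVI in the training set; (3) construction of the machine learning prediction models and comparison of their accuracy in predicting tumor MVI preoperatively. Normally distributed measurement data were expressed as Mean±SD, and groups were compared using the paired t test. Count data were expressed as absolute numbers, and groups were compared using the chi-square test. Univariate and multivariate analyses were performed using the logistic regression model.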
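The 7:3 random allocation described in the Methods can be sketched as follows. This is a minimal illustration only: the function name `split_train_validation`, the seed, and the choice to round the training size down are assumptions, since the abstract does not say how the random numbers were applied.

```python
import random


def split_train_validation(patient_ids, train_fraction=0.7, seed=0):
    """Shuffle IDs with a seeded generator and cut at the training fraction."""
    ids = list(patient_ids)
    rng = random.Random(seed)  # the seed is an assumption, used only for reproducibility
    rng.shuffle(ids)
    n_train = int(len(ids) * train_fraction)  # rounds down: 277 * 0.7 gives 193
    return ids[:n_train], ids[n_train:]


# Hypothetical patient identifiers 0..276 stand in for the 277 patients.
train_ids, validation_ids = split_train_validation(range(277))
print(len(train_ids), len(validation_ids))
```

With 277 patients, rounding the training size down reproduces the reported group sizes of 193 and 84.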
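Results: (1) Analysis of the clinicopathological data of patients in the training and validation sets: there were 157 men and 36 women in the training set and 78 men and 6 women in the validation set, a significant difference in sex between the two sets (χ²=6.028, P<0.05). (2) Analysis of risk factors for tumor MVI in the training set: of the 193 patients in the training set, 108 were MVI positive and 85 were MVI negative. Univariate analysis showed that age, number of tumors, tumor diameter, satellite lesions, tumor margin, alpha-fetoprotein (AFP), alkaline phosphatase (ALP), and fibrinogen level were factors related to tumor MVI (odds ratio=0.971, 2.449, 1.368, 4.050, 2.956, 4.083, 2.532, 1.996; 95% confidence interval: 0.943-1.000, 1.169-5.130, 1.180-1.585, 1.316-12.465, 1.310-6.670, 2.214-7.532, 1.016-6.311, 1.323-3.012; P<0.05). Multivariate analysis showed that AFP>20 μg/L, multiple tumors, larger tumor diameter, and non-smooth tumor margin were independent risk factors for tumor MVI (odds ratio=3.680, 3.100, 1.438, 3.628; 95% confidence interval: 1.842-7.351, 1.334-7.203, 1.201-1.721, 1.438-9.150; P<0.05), whereas older age was associated with a lower risk of MVI (odds ratio=0.958, 95% confidence interval: 0.923-0.994, P<0.05).
(3) Construction of the machine learning prediction models and comparison of their accuracy in predicting tumor MVI preoperatively: ① The factors identified by multivariate analysis (age, AFP, number of tumors, tumor diameter, and tumor margin) were used to build the logistic regression nomogram and the SVM, RF, ANN, and LightGBM prediction models. Consistency analysis showed that the logistic regression nomogram prediction model had good stability. The areas under the curve (AUC) of the logistic regression nomogram, SVM, RF, ANN, and LightGBM models were 0.812, 0.794, 0.807, 0.814, and 0.810 in the training set and 0.784, 0.793, 0.783, 0.803, and 0.815 in the validation set. The AUC of each of the SVM, RF, ANN, and LightGBM models did not differ significantly from that of the logistic regression nomogram [training set: 95% confidence interval 0.731-0.849, 0.744-0.860, 0.752-0.867, 0.747-0.862; Z=0.995, 0.245, 0.130, 0.102; P>0.05; validation set: 95% confidence interval 0.690-0.873, 0.679-0.865, 0.702-0.882, 0.715-0.891; Z=0.325, 0.026, 0.744, 0.803; P>0.05].
② The RF and LightGBM algorithms were used to select clinicopathological factors on their own and build prediction models. Factors were ranked by their importance to the prediction model. Factors with importance >0.01, namely age, tumor diameter, AFP, white blood cell count (WBC), platelets, total bilirubin, aspartate aminotransferase, γ-glutamyl transferase, ALP, and fibrinogen, were used to build the RF model; factors with importance >5.0, namely age, tumor diameter, AFP, WBC, ALP, and fibrinogen, were used to build the LightGBM model. Because the ANN and SVM algorithms cannot select factors themselves, the factors identified by univariate analysis (age, number of tumors, tumor diameter, satellite lesions, tumor margin, AFP, ALP, and fibrinogen level) were used to build the SVM and ANN models. The AUC of the SVM, RF, ANN, and LightGBM models was 0.803, 0.838, 0.793, and 0.847 in the training set and 0.810, 0.802, 0.802, and 0.836 in the validation set; none differed significantly from the AUC of the logistic regression nomogram [training set: 95% confidence interval 0.740-0.857, 0.779-0.887, 0.729-0.848, 0.789-0.895; Z=0.421, 0.119, 0.689, 1.517; P>0.05; validation set: 95% confidence interval 0.710-0.888, 0.700-0.881, 0.701-0.881, 0.740-0.908; Z=0.856, 0.458, 0.532, 1.306; P>0.05].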
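The odds ratios reported for the multivariate analysis can be related back to logistic regression coefficients. The sketch below assumes the 95% confidence intervals are Wald intervals, that is, exp(β ± 1.96·SE), which the abstract does not state. Under that assumption the reported odds ratio should be the geometric mean of its interval bounds. The helper name `wald_summary` is hypothetical.

```python
import math

# (factor, reported OR, CI lower bound, CI upper bound) from the multivariate analysis
reported = [
    ("AFP > 20 ug/L", 3.680, 1.842, 7.351),
    ("multiple tumors", 3.100, 1.334, 7.203),
    ("tumor diameter", 1.438, 1.201, 1.721),
    ("non-smooth tumor margin", 3.628, 1.438, 9.150),
    ("age", 0.958, 0.923, 0.994),
]


def wald_summary(lower, upper, z=1.96):
    """Recover the coefficient, its standard error, and the implied OR from a Wald CI."""
    beta = (math.log(lower) + math.log(upper)) / 2      # midpoint on the log-odds scale
    se = (math.log(upper) - math.log(lower)) / (2 * z)  # half-width divided by z
    return beta, se, math.exp(beta)


for factor, odds_ratio, lower, upper in reported:
    beta, se, implied_or = wald_summary(lower, upper)
    print(f"{factor}: beta={beta:.3f}, SE={se:.3f}, "
          f"implied OR={implied_or:.3f}, reported OR={odds_ratio}")
```

For all five factors the implied odds ratio agrees with the reported value to about three decimal places, which is consistent with the Wald assumption. The negative coefficient for age corresponds to its odds ratio below 1.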
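Conclusion: Machine learning algorithms can be used to predict MVI in hepatocellular carcinoma preoperatively, but their application value needs further validation with large-sample, multicenter data.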

Abstract: Objective: To investigate the application value of machine learning algorithms for preoperative prediction of microvascular invasion (MVI) in hepatocellular carcinoma (HCC).
Methods: A retrospective descriptive study was conducted. The clinicopathological data of 277 patients with HCC who were admitted to Mengchao Hepatobiliary Hospital of Fujian Medical University between May 2015 and December 2018 were collected. There were 235 males and 42 females, aged (56±10) years, with a range from 33 to 80 years. All patients underwent preoperative magnetic resonance imaging examination. Using computer-generated random numbers, the 277 HCC patients were divided at a ratio of 7:3 into a training dataset of 193 patients and a validation dataset of 84 patients. A logistic regression nomogram and the machine learning algorithms support vector machine (SVM), random forest (RF), artificial neural network (ANN), and light gradient boosting machine (LightGBM) were used to develop models for preoperative prediction of MVI. Observation indicators: (1) analysis of clinicopathological data of patients in the training dataset and validation dataset; (2) analysis of risk factors for tumor MVI in the training dataset; (3) construction of machine learning algorithm prediction models and comparison of their accuracy in preoperative tumor MVI prediction. Measurement data with normal distribution were represented as Mean±SD, and comparison between groups was analyzed using the paired t test. Count data were described as absolute numbers, and comparison between groups was analyzed using the chi-square test. Univariate and multivariate analyses were performed using the logistic regression model.
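As a worked example of the chi-square test named in the Methods, the sex distribution reported in the Results (157 males and 36 females in the training dataset, 78 males and 6 females in the validation dataset) can be checked by hand. The sketch below assumes a Pearson chi-square test without continuity correction; the abstract does not say whether a correction was applied, and the function name is illustrative.

```python
def pearson_chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Rows: training dataset, validation dataset. Columns: males, females.
chi2 = pearson_chi_square_2x2(157, 36, 78, 6)
print(round(chi2, 3))  # prints 6.028

# With 1 degree of freedom, the 0.05 critical value is 3.841.
print(chi2 > 3.841)  # prints True
```

The uncorrected statistic matches the reported χ²=6.028 and exceeds the critical value, in line with the reported P<0.05.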
Results: (1) Analysis of clinicopathological data of patients in the training dataset and validation dataset: there were 157 males and 36 females in the training dataset and 78 males and 6 females in the validation dataset, a significant difference in sex between the two datasets (χ²=6.028, P<0.05). (2) Analysis of risk factors for tumor MVI in the training dataset: of the 193 patients, 108 were MVI positive and 85 were MVI negative. Univariate analysis showed that age, the number of tumors, tumor diameter, satellite lesions, tumor margin, alpha-fetoprotein (AFP), alkaline phosphatase (ALP), and fibrinogen were related factors for tumor MVI [odds ratio (OR)=0.971, 2.449, 1.368, 4.050, 2.956, 4.083, 2.532, 1.996; 95% confidence interval (CI): 0.943-1.000, 1.169-5.130, 1.180-1.585, 1.316-12.465, 1.310-6.670, 2.214-7.532, 1.016-6.311, 1.323-3.012; P<0.05]. Multivariate analysis showed that AFP>20 μg/L, multiple tumors, larger tumor diameter, and non-smooth tumor margin were independent risk factors for tumor MVI (OR=3.680, 3.100, 1.438, 3.628; 95%CI: 1.842-7.351, 1.334-7.203, 1.201-1.721, 1.438-9.150; P<0.05). Older age was associated with a lower risk of tumor MVI (OR=0.958, 95%CI: 0.923-0.994, P<0.05). (3) Construction of machine learning algorithm prediction models and comparison of their accuracy in preoperative tumor MVI prediction: ① The logistic regression nomogram and the SVM, RF, ANN, and LightGBM prediction models were constructed from the factors identified by multivariate analysis, namely age, AFP, the number of tumors, tumor diameter, and tumor margin. Consistency analysis showed that the logistic regression nomogram prediction model had good stability.
The area under the curve (AUC) of the logistic regression nomogram, SVM, RF, ANN, and LightGBM models was 0.812, 0.794, 0.807, 0.814, and 0.810 in the training dataset and 0.784, 0.793, 0.783, 0.803, and 0.815 in the validation dataset. None of the SVM, RF, ANN, or LightGBM models differed significantly from the logistic regression nomogram model [training dataset: 95%CI 0.731-0.849, 0.744-0.860, 0.752-0.867, 0.747-0.862; Z=0.995, 0.245, 0.130, 0.102; P>0.05; validation dataset: 95%CI 0.690-0.873, 0.679-0.865, 0.702-0.882, 0.715-0.891; Z=0.325, 0.026, 0.744, 0.803; P>0.05]. ② Clinicopathological factors were selected by the RF and LightGBM algorithms themselves to construct the corresponding prediction models. Factors were ranked by their importance to the prediction model. Factors with importance >0.01 were selected to construct the RF model, including age, tumor diameter, AFP, white blood cell count, platelets, total bilirubin, aspartate aminotransferase, γ-glutamyl transpeptidase, ALP, and fibrinogen. Factors with importance >5.0 were selected to construct the LightGBM model, including age, tumor diameter, AFP, white blood cell count, ALP, and fibrinogen. Because the SVM and ANN algorithms cannot select factors themselves, the factors identified by univariate analysis were used to construct the SVM and ANN models, including age, the number of tumors, tumor diameter, satellite lesions, tumor margin, AFP, ALP, and fibrinogen.
The AUC of the SVM, RF, ANN, and LightGBM models was 0.803, 0.838, 0.793, and 0.847 in the training dataset and 0.810, 0.802, 0.802, and 0.836 in the validation dataset. None of these models differed significantly from the logistic regression nomogram model [training dataset: 95%CI 0.740-0.857, 0.779-0.887, 0.729-0.848, 0.789-0.895; Z=0.421, 0.119, 0.689, 1.517; P>0.05; validation dataset: 95%CI 0.710-0.888, 0.700-0.881, 0.701-0.881, 0.740-0.908; Z=0.856, 0.458, 0.532, 1.306; P>0.05].
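The AUC values compared above can be read as the probability that a randomly chosen MVI-positive patient receives a higher predicted score than a randomly chosen MVI-negative patient. The sketch below computes AUC from that pairwise definition. The labels and scores are hypothetical toy values, not study data, and the function name is illustrative.

```python
def auc_from_scores(labels, scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = MVI positive, 0 = MVI negative.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.7, 0.6]
print(auc_from_scores(labels, scores))  # prints 0.875
```

Here 14 of the 16 positive/negative pairs are ranked correctly, giving 0.875. The Z statistics in the abstract come from a separate test for comparing two AUCs, whose exact method the abstract does not specify.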
Conclusion: Machine learning algorithms can predict MVI of HCC preoperatively, but their application value needs to be further verified with large-sample, multicenter data.
