Clinical value of machine learning models constructed from preoperative contrast-enhanced MRI characteristics in predicting positive intratumoral tertiary lymphoid structures of hepatocellular carcinoma

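  • Abstract (translated from the Chinese):
    To investigate the clinical value of different machine learning models constructed from preoperative contrast-enhanced magnetic resonance imaging (MRI) characteristics in predicting positive intratumoral tertiary lymphoid structures (TLSs) of hepatocellular carcinoma (HCC).
    A retrospective cohort study was conducted. The clinical data of 252 patients with HCC admitted to Wenzhou Medical University Lishui Hospital from January 2019 to December 2024 were collected. There were 214 males and 38 females, aged (57±12) years. Patients were divided by a random number table at a ratio of 7:3 into a training set of 176 cases and a testing set of 76 cases. The training set was used to construct the prediction models and the testing set was used for internal validation of model performance. Prediction models using 7 machine learning algorithms were constructed based on the results of multivariate analysis. Observation indicators: (1) analysis of factors influencing positive intratumoral TLSs of HCC patients; (2) construction and performance comparison of machine learning prediction models. Normally distributed measurement data were compared between groups using the independent sample t test. Count data were compared between groups using the chi-square test, and ordinal data were compared between groups using the Mann-Whitney U test. Univariate and multivariate analyses were performed using the Logistic regression model. Receiver operating characteristic (ROC) curves were plotted and the area under the curve (AUC) was calculated to evaluate and compare the predictive performance of the models. The DeLong test was used to compare AUCs between models in the testing set, and the Brier score and Hosmer-Lemeshow test were used to assess model calibration.
    (1) Analysis of factors influencing positive intratumoral TLSs of HCC patients. MRI findings of the 252 HCC patients showed irregular tumor morphology in 203 cases, incomplete capsule in 154 cases, mosaic sign in 163 cases, peritumoral low signal in the hepatobiliary phase in 103 cases, abnormal peritumoral enhancement in the arterial phase in 75 cases, annular high enhancement in the arterial phase in 47 cases, intratumoral necrosis area >20% in 83 cases, intratumoral hemorrhage in 84 cases, venous tumor thrombus in 56 cases, satellite lesions in 37 cases, and intratumoral arteries in 90 cases. Pathological examination confirmed positive TLSs in 98 cases and negative TLSs in 154 cases. Results of multivariate analysis showed that age >55 years, incomplete capsule, intratumoral necrosis area >20%, portal vein tumor thrombus, annular high enhancement in the arterial phase, and mosaic sign were independent factors influencing positive TLSs of HCC patients in the training set (odds ratio=0.35, 6.39, 3.68, 4.32, 4.87, 4.03, 95% confidence interval as 0.15-0.79, 2.19-18.66, 1.54-8.82, 1.66-11.23, 1.79-13.24, 1.63-9.98, P<0.05). (2) Construction and performance comparison of machine learning prediction models. Prediction models using 7 machine learning algorithms were constructed based on the results of multivariate analysis. In the testing set, the AUCs of the 7 models ranged from 0.71 to 0.80, all showing good discriminative ability. The logistic regression model had the highest accuracy and AUC, at 0.76 and 0.80, respectively. Results of the DeLong test showed that the AUC of the logistic regression model was higher than those of the K-nearest neighbor model and the decision tree model (Z=2.38, 1.99, P<0.05), and was not significantly different from those of the support vector machine, random forest, gradient boosting machine, and backpropagation neural network models (Z=0.01, 0.03, 0.16, 0.04, P>0.05). Clinical practicality and calibration: the logistic regression model achieved a good balance between sensitivity and specificity. Results of the Hosmer-Lemeshow test showed that the logistic regression model had good agreement between predicted and observed outcomes in the testing set (χ²=7.82, P>0.05).
    All 7 machine learning prediction models constructed based on the results of multivariate analysis can effectively predict positive TLSs of HCC, among which the logistic regression model has the best overall performance.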

     

    Abstract:
    Objective To investigate the clinical value of machine learning models constructed from preoperative contrast-enhanced magnetic resonance imaging (MRI) characteristics in predicting positive intratumoral tertiary lymphoid structures (TLSs) of hepatocellular carcinoma (HCC).
    Methods A retrospective cohort study was conducted. The clinical data of 252 patients with HCC who were admitted to Wenzhou Medical University Lishui Hospital from January 2019 to December 2024 were collected. There were 214 males and 38 females, aged (57±12) years. All patients were divided, using a random number table at a ratio of 7:3, into a training set of 176 cases and a testing set of 76 cases. The training set was used to construct the prediction models, and the testing set was used for internal validation of model performance. Seven machine learning models were constructed based on the results of multivariate analysis. Observation indicators: (1) analysis of factors influencing positive intratumoral TLSs of HCC patients; (2) construction and performance comparison of machine learning models. Measurement data with normal distribution were compared between groups using the independent sample t test. Ordinal data were compared between groups using the Mann-Whitney U test. Count data were compared between groups using the chi-square test. The Logistic regression model was used for univariate and multivariate analyses. Receiver operating characteristic (ROC) curves were plotted and the area under the curve (AUC) was calculated to evaluate and compare the predictive performance of the models. The DeLong test was used to compare the AUCs of prediction models in the testing set, and the Brier score and Hosmer-Lemeshow test were used to assess the calibration of prediction models.
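The evaluation steps named in the Methods (a 7:3 split, AUC, and Brier score) can be illustrated with a minimal, standard-library-only sketch. It is not the authors' code: the function names, the seed, and the toy labels and probabilities are hypothetical, and a seeded pseudo-random shuffle stands in for the random number table described above.

```python
import random


def split_7_3(case_ids, seed=2024):
    """Shuffle case IDs and split them 7:3 (hypothetical stand-in for a random number table)."""
    rng = random.Random(seed)
    shuffled = list(case_ids)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * 0.7)
    return shuffled[:n_train], shuffled[n_train:]


def roc_auc(y_true, y_score):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


if __name__ == "__main__":
    train_ids, test_ids = split_7_3(range(252))
    print(len(train_ids), len(test_ids))  # 176 76, matching the reported set sizes

    # Toy data, not from the study: 1 = TLS-positive, 0 = TLS-negative.
    labels = [0, 0, 1, 1]
    probs = [0.1, 0.4, 0.35, 0.8]
    print(roc_auc(labels, probs))      # 0.75
    print(brier_score(labels, probs))  # about 0.158
```

A lower Brier score indicates better-calibrated probabilities, while the AUC only measures ranking, which is why the Methods report both.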
    Results (1) Analysis of factors influencing positive intratumoral TLSs of HCC patients. Results of MRI in 252 HCC patients showed that there were 203 cases with irregular tumor morphology, 154 cases with incomplete tumor capsule, 163 cases with mosaic sign, 103 cases with low signal around the tumor in the hepatobiliary phase, 75 cases with abnormal enhancement around the tumor in the arterial phase, 47 cases with annular high enhancement in the arterial phase, 83 cases with intratumoral necrosis >20%, 84 cases with intratumoral bleeding, 56 cases with portal vein tumor thrombus, 37 cases with satellite lesions, and 90 cases with intratumoral arteries. Results of pathological examination showed that there were 98 cases with positive intratumoral TLSs and 154 cases with negative intratumoral TLSs. Results of multivariate analysis showed that age >55 years, incomplete capsule, intratumoral necrosis >20%, portal vein tumor thrombus, annular high enhancement in the arterial phase, and mosaic sign were independent factors influencing positive intratumoral TLSs of HCC patients in the training set (odds ratio=0.35, 6.39, 3.68, 4.32, 4.87, 4.03, 95% confidence interval as 0.15-0.79, 2.19-18.66, 1.54-8.82, 1.66-11.23, 1.79-13.24, 1.63-9.98, P<0.05). (2) Construction and performance comparison of machine learning models. Seven machine learning models were constructed based on the results of multivariate analysis. In the testing set, the AUCs of the 7 machine learning models ranged from 0.71 to 0.80, indicating good discriminative ability. Among the 7 machine learning models in the testing set, the Logistic regression (LR) model had the highest accuracy (0.76) and the highest AUC (0.80). Results of the DeLong test showed that the AUC of the LR model was significantly higher than those of the K-nearest neighbor and decision tree models (Z=2.38, 1.99, P<0.05), but there was no significant difference in the AUC between the LR model and the support vector machine, random forest, gradient boosting machine, and backpropagation neural network models (Z=0.01, 0.03, 0.16, 0.04, P>0.05). Clinical practicality and calibration: the LR model achieved a good balance between sensitivity and specificity. Results of the Hosmer-Lemeshow test showed that the LR model had good calibration in the testing set (χ²=7.82, P>0.05).
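To read the reported odds ratios, it can help to move them back to the log-odds scale on which logistic regression works. The sketch below assumes the 95% confidence intervals are Wald intervals of the form exp(β ± 1.96·SE); the abstract does not state how they were computed, and the function name is hypothetical.

```python
import math


def log_scale_summary(odds_ratio, ci_low, ci_high, z=1.96):
    """Recover coefficient, standard error, and CI center from a reported OR and 95% CI.

    Assumes a Wald interval exp(beta +/- z * SE), which is an assumption about the source.
    """
    beta = math.log(odds_ratio)                              # log-odds coefficient
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)    # half-width on the log scale
    center = math.exp((math.log(ci_low) + math.log(ci_high)) / 2)  # should be close to the OR
    return beta, se, center


if __name__ == "__main__":
    # Values reported in the Results for incomplete capsule and for age >55 years.
    for name, or_, lo, hi in [
        ("incomplete capsule", 6.39, 2.19, 18.66),
        ("age >55 years", 0.35, 0.15, 0.79),
    ]:
        beta, se, center = log_scale_summary(or_, lo, hi)
        direction = "higher" if beta > 0 else "lower"
        print(f"{name}: beta={beta:.2f}, SE={se:.2f}, CI center={center:.2f}, {direction} odds")
```

Under this assumption both intervals are symmetric around the reported odds ratio on the log scale. An odds ratio below 1, as reported for age >55 years, corresponds to a negative coefficient, that is, lower odds of positive TLSs in that group.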
    Conclusion The 7 machine learning prediction models constructed based on the results of multivariate analysis can predict positive intratumoral TLSs of HCC, among which the LR model demonstrates the best overall performance.

     
