Abstract:
Objective To investigate the clinical value of machine learning models constructed from preoperative contrast-enhanced magnetic resonance imaging (MRI) characteristics in predicting positive intratumoral tertiary lymphoid structures (TLSs) of hepatocellular carcinoma (HCC).
Methods A retrospective cohort study was conducted. The clinical data of 252 patients with HCC who were admitted to Wenzhou Medical University Lishui Hospital from January 2019 to December 2024 were collected. There were 214 males and 38 females, aged (57±12) years. All patients were randomly divided at a 7:3 ratio, using a random number table, into a training set of 176 cases and a testing set of 76 cases. The training set was used to construct the prediction models, and the testing set was used for internal validation of their performance. Seven machine learning models were constructed based on the results of multivariate analysis. Observation indicators: (1) analysis of factors influencing positive intratumoral TLSs of HCC patients; (2) construction and performance comparison of machine learning models. Normally distributed measurement data were compared between groups using the independent-samples t test. Ordered categorical data were compared between groups using the Mann-Whitney U test. Count data were compared between groups using the chi-square test. The logistic regression model was used for univariate and multivariate analyses. The receiver operating characteristic (ROC) curve was plotted, and the area under the curve (AUC) was calculated to evaluate the predictive performance of the models. The DeLong test was used to compare the AUCs of the prediction models in the testing set, and the Brier score and Hosmer-Lemeshow test were used to assess the calibration of the prediction models.
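The split and two of the evaluation metrics named in the Methods can be illustrated with a minimal sketch. It assumes binary TLS labels (1 = positive) and predicted probabilities. The function names, the seed, and the use of a pseudo-random shuffle in place of a random number table are illustrative assumptions, not the study's actual code.

```python
import random


def split_7_to_3(case_ids, seed=0, train_fraction=0.7):
    """Randomly split case IDs into training and testing sets (assumed 7:3)."""
    shuffled = list(case_ids)
    # Seeded shuffle stands in for the random number table (assumption).
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


def roc_auc(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def brier_score(labels, probs):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(labels, probs)) / len(labels)
```

With 252 cases and a training fraction of 0.7, this split yields 176 training and 76 testing cases, matching the set sizes reported above. A lower Brier score indicates better agreement between predicted probabilities and observed outcomes.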
Results (1) Analysis of factors influencing positive intratumoral TLSs of HCC patients. MRI findings in the 252 HCC patients showed 203 cases with irregular tumor morphology, 154 cases with incomplete tumor capsule, 163 cases with mosaic sign, 103 cases with low signal around the tumor in the hepatobiliary phase, 75 cases with abnormal enhancement around the tumor in the arterial phase, 47 cases with annular high enhancement in the arterial phase, 83 cases with intratumoral necrosis >20%, 84 cases with intratumoral bleeding, 56 cases with portal vein tumor thrombus, 37 cases with satellite lesions, and 90 cases with intratumoral arteries. Pathological examination showed 98 cases with positive intratumoral TLSs and 154 cases with negative intratumoral TLSs. Results of multivariate analysis showed that age >55 years, incomplete capsule, intratumoral necrosis >20%, portal vein tumor thrombus, annular high enhancement in the arterial phase, and mosaic sign were independent factors influencing positive intratumoral TLSs of HCC patients in the training set (odds ratio=0.35, 6.39, 3.68, 4.32, 4.87, 4.03; 95% confidence interval: 0.15-0.79, 2.19-18.66, 1.54-8.82, 1.66-11.23, 1.79-13.24, 1.63-9.98; P<0.05). (2) Construction and performance comparison of machine learning models. Seven machine learning models were constructed based on the results of multivariate analysis. In the testing set, the AUCs of the 7 machine learning models ranged from 0.71 to 0.80, indicating good discriminative ability. Among the 7 machine learning models in the testing set, the logistic regression (LR) model had an accuracy of 0.76 and an AUC of 0.80, both of which were the highest.
Results of the DeLong test showed that the AUC of the LR model was significantly higher than those of the K-nearest neighbor and decision tree models (Z=2.38, 1.99, P<0.05), but there was no significant difference in AUC between the LR model and the support vector machine, random forest, gradient boosting machine, and backpropagation neural network models (Z=0.01, 0.03, 0.16, 0.04, P>0.05). Clinical practicality and calibration: the LR model achieved a good balance between sensitivity and specificity. Results of the Hosmer-Lemeshow test showed that the LR model had good calibration in the testing set (χ²=7.82, P>0.05).
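For readers unfamiliar with the Hosmer-Lemeshow statistic reported above, the following is a minimal sketch of how such a χ² value is typically computed. It assumes cases are sorted by predicted probability and split into equal-sized groups (10 by default). The function name and grouping scheme are illustrative assumptions and may differ from the software used in the study.

```python
def hosmer_lemeshow_statistic(labels, probs, groups=10):
    """Hosmer-Lemeshow chi-square statistic over risk-ordered groups.

    labels: observed outcomes (1 = positive, 0 = negative)
    probs:  predicted probabilities of a positive outcome
    groups: number of equal-sized risk groups (assumed 10)
    """
    pairs = sorted(zip(probs, labels))  # order cases by predicted risk
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        observed = sum(y for _, y in chunk)  # observed positives in group
        expected = sum(p for p, _ in chunk)  # expected positives in group
        variance = expected * (1 - expected / size)
        if variance > 0:  # skip degenerate groups to avoid division by zero
            stat += (observed - expected) ** 2 / variance
    return stat
```

The statistic is compared with a chi-square distribution (conventionally with groups minus 2 degrees of freedom for the data used to fit the model); a P value above 0.05 means no significant departure from good calibration was detected.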
Conclusion The 7 machine learning prediction models constructed based on multivariate analysis can predict positive intratumoral TLSs of HCC, among which the LR model demonstrates the best overall performance.