Abstract:
Objective To construct a prediction model based on magnetic resonance imaging (MRI) radiomics and machine learning algorithms, and to investigate its application value in predicting the lymphovascular invasion (LVI) status of rectal cancer without lymph node metastasis.
Methods A retrospective cohort study was conducted. The clinicopathological data of 204 patients with rectal cancer without lymph node metastasis who were admitted to Gansu Provincial Hospital from February 2016 to January 2024 were collected. There were 123 males and 81 females, aged (61±7) years. The 204 patients were randomly divided by computer-generated randomization at a ratio of 8∶2 into a training dataset of 163 cases and a testing dataset of 41 cases. The training dataset was used to construct the prediction models, and the testing dataset was used to validate them. A clinical prediction model, a radiomics model and a joint prediction model were constructed based on the selected clinical and/or imaging features. Measurement data with normal distribution were expressed as Mean±SD. Count data were described as absolute numbers, and the chi-square test or Fisher exact probability test was used for comparison between groups. Ordinal data were compared using the nonparametric rank sum test. The intraclass correlation coefficient (ICC) was used to evaluate the consistency of the radiomics features obtained by the two doctors, and ICC > 0.80 was considered good consistency. Univariate analysis was conducted using the corresponding statistical methods. Multivariate analysis was conducted using a stepwise Logistic regression model. The receiver operating characteristic (ROC) curve was drawn, and the area under the curve (AUC), Delong test, decision curve and clinical impact curve were used to evaluate the diagnostic efficiency and clinical utility of the models.
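The 8∶2 random split and the ROC-based metrics described in the Methods can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names (`split_train_test`, `auc`, `confusion_metrics`), the fixed seed and the use of integer patient identifiers are assumptions made for illustration only, and the AUC is computed with the standard Mann-Whitney formulation rather than any specific software named in the study.

```python
import random


def split_train_test(ids, train_fraction=0.8, seed=0):
    """Randomly split identifiers into training and testing lists (hypothetical helper)."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


def auc(scores_pos, scores_neg):
    """AUC as the probability that a positive case scores higher than a negative one.

    Ties count as half a win (Mann-Whitney formulation).
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def confusion_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity from binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Splitting 204 identifiers at 8:2 gives 163 and 41 cases,
# the same dataset sizes as reported in the Methods.
train_ids, test_ids = split_train_test(range(204))
```

Rounding 204 × 0.8 = 163.2 to the nearest integer is one plausible way to arrive at the reported 163/41 split; the abstract does not state how the boundary was chosen.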
Results (1) Analysis of factors affecting LVI status of patients. Of the 204 patients with rectal cancer without lymph node metastasis, 71 were LVI-positive and 133 were LVI-negative. Results of multivariate analysis showed that gender, platelet (PLT) count and carcinoembryonic antigen (CEA) were independent factors affecting LVI status of rectal cancer without lymph node metastasis in the training dataset (odds ratio=2.405, 25.062, 2.528, 95% confidence interval (CI) as 1.093-5.291, 2.748-228.604, 1.181-5.410, P < 0.05). (2) Construction of clinical prediction model. The clinical prediction model was constructed based on the results of multivariate analysis, including gender, PLT count and CEA. Results of ROC curve showed that the AUC, accuracy, sensitivity and specificity of the clinical prediction model were 0.721 (95%CI as 0.637-0.805), 0.675, 0.632 and 0.698 for the training dataset, and 0.795 (95%CI as 0.644-0.946), 0.805, 1.000 and 0.429 for the testing dataset. Results of Delong test showed that there was no significant difference in the AUC of the clinical prediction model between the training dataset and the testing dataset (Z=-0.836, P > 0.05). (3) Construction of radiomics model. A total of 851 radiomics features were extracted from the 204 patients, and seven machine learning algorithms, including logistic regression, support vector machine, Gaussian process, logistic regression-lasso algorithm, linear discriminant analysis, naive Bayes and auto-encoder, were used to construct prediction models. Eight radiomics features were finally selected using the Gaussian process algorithm, which performed best, to construct the radiomics prediction model. Results of ROC curve showed that the AUC, accuracy, sensitivity and specificity of the radiomics prediction model were 0.857 (95%CI as 0.800-0.914), 0.748, 0.947 and 0.642 for the training dataset, and 0.725 (95%CI as 0.571-0.878), 0.634, 1.000 and 0.444 for the testing dataset.
Results of Delong test showed that there was no significant difference in the AUC of the radiomics prediction model between the training dataset and the testing dataset (Z=1.578, P > 0.05). (4) Construction of joint prediction model. The joint prediction model was constructed based on the results of multivariate analysis and the radiomics features. Results of ROC curve showed that the AUC, accuracy, sensitivity and specificity of the joint prediction model were 0.885 (95%CI as 0.832-0.938), 0.791, 0.912 and 0.726 for the training dataset, and 0.857 (95%CI as 0.731-0.984), 0.854, 0.714 and 0.926 for the testing dataset. Results of Delong test showed that there was no significant difference in the AUC of the joint prediction model between the training dataset and the testing dataset (Z=0.395, P > 0.05). (5) Performance comparison of three prediction models. Results of the Hosmer-Lemeshow goodness-of-fit test showed that the clinical prediction model, radiomics prediction model and joint prediction model all had a good fit (χ²=1.464, 12.763, 10.828, P > 0.05). Results of Delong test showed that there was no significant difference in the AUC between the clinical prediction model and either the joint prediction model or the radiomics model (Z=1.146, 0.658, P > 0.05), and there was a significant difference in the AUC between the joint prediction model and the radiomics model (Z=2.001, P < 0.05). Results of calibration curve showed good performance of the joint prediction model. Results of decision curve and clinical impact curve showed that the performance of the joint prediction model in predicting LVI status of rectal cancer without lymph node metastasis was superior to that of the clinical prediction model and the radiomics model.
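The Delong comparisons reported above can be read with the help of a short sketch, under stated assumptions. It implements the DeLong variance estimate for the unpaired case only (two independent samples, as when an AUC from the training dataset is compared with one from the testing dataset); comparing two models on the same patients additionally requires the covariance between their AUCs, which is omitted here. The function names and the toy scores in the comments are hypothetical, and the P values are two-sided normal approximations, which may differ from the software the authors used.

```python
import math
from statistics import mean, variance


def _psi(p, n):
    """Pairwise comparison kernel: 1 if positive outscores negative, 0.5 on ties."""
    return 1.0 if p > n else 0.5 if p == n else 0.0


def auc_and_variance(pos, neg):
    """AUC and its DeLong variance estimate from positive and negative scores.

    Needs at least two positive and two negative scores.
    """
    # Structural components: one value per positive case and per negative case.
    v10 = [mean(_psi(p, n) for n in neg) for p in pos]
    v01 = [mean(_psi(p, n) for p in pos) for n in neg]
    auc = mean(v10)
    var = variance(v10) / len(pos) + variance(v01) / len(neg)
    return auc, var


def two_sided_p(z):
    """Two-sided P value for a standard normal Z statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def delong_unpaired(pos_a, neg_a, pos_b, neg_b):
    """Z and P for the AUC difference between two independent samples.

    Raises ZeroDivisionError if both variance estimates are zero
    (for example, perfect separation in both samples).
    """
    auc_a, var_a = auc_and_variance(pos_a, neg_a)
    auc_b, var_b = auc_and_variance(pos_b, neg_b)
    z = (auc_a - auc_b) / math.sqrt(var_a + var_b)
    return z, two_sided_p(z)


# Reading the reported statistics: |Z| above about 1.96 gives P < 0.05.
p_joint_vs_radiomics = two_sided_p(2.001)   # below 0.05
p_radiomics_train_test = two_sided_p(1.578)  # above 0.05
```

Under this normal approximation the reported Z=2.001 lies just past the 1.96 threshold, which is consistent with the abstract describing that comparison as significant while the others are not.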
Conclusions The clinical prediction model is constructed based on gender, PLT count and CEA. The radiomics prediction model is constructed based on 8 selected radiomics features. The joint prediction model is constructed based on the clinical prediction model and the radiomics prediction model. All three models can predict the LVI status of rectal cancer without lymph node metastasis, and the joint prediction model has superior predictive performance.