基于生物信息学构建胰腺癌患者预后相关miRNA预测模型及其应用价值

Construction and application value of a prognosis-associated miRNA prediction model based on bioinformatics analysis in pancreatic cancer patients

  • 摘要: 目的:探讨基于生物信息学构建胰腺癌患者预后相关微小RNA(miRNA)预测模型及其应用价值。
    方法:采用回顾性队列研究方法。收集肿瘤基因图谱计划(TCGA)数据库(https://cancergenome.nih.gov/)自建库起至2017年9月171例胰腺癌患者的临床病理资料;男93例,女78例;中位年龄为65岁,年龄范围为35~88岁。171例患者中,64例临床病理资料完整。171例患者采用随机抽样法按7∶3比例分为训练集123例和测试集48例。训练集用于构建预测模型,测试集用于验证预测模型效能。从GEO基因公共表达数据库中下载包含9对胰腺癌及对应癌旁组织的miRNA测序数据的数据集GSE41372,从癌组织及癌旁组织差异表达的miRNA中筛选出候选差异表达miRNA,基于训练集患者信息,进行LASSO-COX回归分析,从候选差异表达miRNA中筛选出与生存相关miRNA,将其拟合成一个相对精简的预后相关miRNA模型。分别在训练集和测试集中对构建的预后相关miRNA模型的预测效能进行验证,以受试者工作特征曲线下面积(AUC)评价模型准确性,以一致性指数(C-index值)评价模型效能。观察指标:(1)患者生存情况。(2)差异表达miRNA筛选结果。(3)预后相关miRNA模型的构建。(4)预后相关miRNA模型的验证。(5)胰腺癌患者临床病理因素比较。(6)影响胰腺癌患者预后的相关因素分析。(7)预后相关miRNA模型与第8版TNM分期预测效能的比较。正态分布的计量资料以x̄±s表示,两组间比较采用Student t检验,多组间比较采用方差分析。偏态分布的计量资料以M(范围)表示,组间比较采用Mann-Whitney U检验。计数资料以绝对数或百分比表示,组间比较采用χ²检验。等级资料分析采用秩和检验。采用计数资料相关性分析,挖掘预后相关miRNA模型与患者临床病理参数之间的相关性。采用COX回归进行单因素及多因素分析,并判断相关性,结果以风险比及95%可信区间表示,风险比<1时,说明该因素为保护因素;风险比>1时,说明该因素为危险因素;风险比=1时,说明对生存无明显影响。采用Kaplan-Meier法绘制生存曲线并计算生存率,采用Log-rank检验进行生存分析。
    结果:
    (1)患者生存情况:123例训练集患者随访时间为31~2 141 d,中位随访时间为449 d,3、5年总体生存率分别为16.67%、8.06%。48例测试集患者随访时间为41~2 182 d,中位随访时间为457 d,3、5年总体生存率分别为15.63%、9.68%。两组患者3、5年总体生存率比较,差异均无统计学意义(χ²=0.017,0.068,P>0.05)。(2)差异表达miRNA筛选结果:生物信息学分析结果显示,共筛选出102个候选差异表达miRNA,其中63个miRNA在癌组织中上调,39个miRNA在癌组织中下调。(3)预后相关miRNA模型的构建:在102个候选差异表达miRNA中,筛选出5个与生存相关的miRNA,名称分别为miR-21、miR-125a-5p、miR-744、miR-374b、miR-664,差异表达模式(癌组织对比癌旁组织)分别为升高、升高、降低、升高、降低,差异表达倍数分别为4.00、3.43、3.85、2.62、2.35倍。由5个与生存相关miRNA构建的预后表达方程=0.454×miR-21表达量-0.492×miR-125a-5p表达量-0.49×miR-744表达量-0.419×miR-374b表达量-0.036×miR-664表达量。(4)预后相关miRNA模型的验证:在训练集和测试集中,预后相关miRNA模型C-index值分别为0.643和0.642。(5)胰腺癌患者临床病理因素比较:COX分析结果显示,预后相关miRNA模型与肿瘤病理学T分期和肿瘤位置具有相关性(Z=45.481,χ²=10.176,P<0.05)。(6)影响胰腺癌患者预后的相关因素分析:单因素分析结果显示:肿瘤病理学N分期、接受放射治疗、接受分子靶向治疗、预后相关miRNA模型评分是胰腺癌患者预后的相关因素(风险比=2.471,0.290,0.172,2.001,95%可信区间为1.012~6.032,0.101~0.833,0.082~0.364,1.371~2.922,P<0.05)。多因素分析结果显示:接受分子靶向治疗是胰腺癌患者预后的独立保护因素(风险比=0.261,95%可信区间为0.116~0.588,P<0.05);预后相关miRNA模型评分≥1.16分是胰腺癌患者预后的独立危险因素(风险比=1.608,95%可信区间为1.091~2.369,P<0.05)。(7)预后相关miRNA模型与第8版TNM分期预测效能的比较:在训练集中,预后相关miRNA模型对胰腺癌患者3、5年生存时间预测概率与第8版TNM分期比较,差异有统计学意义(Z=-1.671,-1.867,P<0.05)。预后相关miRNA模型与第8版TNM分期AUC分别为0.797、0.935与0.737、0.703,95%可信区间分别为0.622~0.972、0.828~1.042与0.571~0.904、0.456~0.951;C-index值分别为0.643与0.534。在测试集中,预后相关miRNA模型对胰腺癌患者3、5年生存时间预测概率与第8版TNM分期比较,差异有统计学意义(Z=-1.729,-1.923,P<0.05)。预后相关miRNA模型与第8版TNM分期AUC分别为0.750、0.873与0.721、0.703,95%可信区间分别为0.553~0.948、0.720~1.025与0.553~0.889、0.456~0.950;C-index值分别为0.642与0.544。
    结论:基于5个与胰腺癌生存相关的miRNA,可构建用于胰腺癌患者生存预测的预后相关miRNA模型。该模型可与现有TNM分期系统及其他临床病理参数形成互补,为患者提供个体化、准确的生存时间预测,为临床治疗提供参考。

     

    Abstract: Objective: To construct a prognosis-associated microRNA (miRNA) prediction model based on bioinformatics analysis and to evaluate its application value in pancreatic cancer patients.
    Methods: A retrospective cohort study was conducted. The clinicopathological data of 171 pancreatic cancer patients were collected from The Cancer Genome Atlas (TCGA) database (https://cancergenome.nih.gov/), covering the period from the establishment of the database to September 2017. There were 93 males and 78 females, aged from 35 to 88 years, with a median age of 65 years. Of the 171 patients, 64 had complete clinicopathological data. The patients were allocated by random sampling, at a ratio of 7:3, into a training dataset of 123 patients and a validation dataset of 48 patients. The training dataset was used to construct the prediction model, and the validation dataset was used to evaluate its performance. The dataset GSE41372, containing miRNA sequencing data of 9 pairs of pancreatic cancer and matched adjacent tissues, was downloaded from the Gene Expression Omnibus (GEO) database. Candidate differentially expressed miRNAs were selected from the miRNAs differentially expressed between cancer and adjacent tissues. Based on the patients in the training dataset, LASSO-COX regression analysis was performed to select survival-associated miRNAs from the candidates, and these were fitted into a relatively concise prognosis-associated miRNA model. The predictive performance of the model was validated in the training dataset and the validation dataset; its accuracy was evaluated using the area under the receiver operating characteristic curve (AUC), and its efficiency was evaluated using the concordance index (C-index).
Observation indicators: (1) survival of patients; (2) screening results of differentially expressed miRNAs; (3) construction of the prognosis-associated miRNA model; (4) validation of the prognosis-associated miRNA model; (5) comparison of clinicopathological factors in pancreatic cancer patients; (6) analysis of factors affecting the prognosis of pancreatic cancer patients; (7) comparison of prediction performance between the prognosis-associated miRNA model and the eighth edition TNM staging. Measurement data with normal distribution were represented as Mean±SD; comparison between two groups was analyzed by the Student's t test, and comparison among multiple groups was analyzed by ANOVA. Measurement data with skewed distribution were represented as M (range), and comparison between groups was analyzed using the Mann-Whitney U test. Count data were described as absolute numbers or percentages, and comparison between groups was conducted using the chi-square test. Ordinal data were analyzed using the rank sum test. Correlation analysis for count data was used to explore the correlation between the prognosis-associated miRNA model and clinicopathological factors. COX univariate and multivariate analyses were applied, with the results described as hazard ratio (HR) and 95% confidence interval (CI). HR<1 indicated a protective factor, HR>1 indicated a risk factor, and HR=1 indicated no obvious influence on survival. The Kaplan-Meier method was used to draw survival curves and calculate survival rates, and the Log-rank test was used for survival analysis.
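As an illustration of the C-index used in the Methods to evaluate model efficiency, the following is a minimal sketch in plain Python. The abstract does not state which estimator or software was used; Harrell's definition is assumed here, pairs with tied survival times are skipped for simplicity, and the function name and the small input lists are hypothetical.

```python
from itertools import combinations


def concordance_index(times, events, scores):
    """Harrell's C-index (assumed variant): the fraction of comparable patient
    pairs in which the higher risk score belongs to the shorter survivor."""
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Comparable only if times differ and the shorter time is an observed
        # death (a censored shorter time tells us nothing about the ordering).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if scores[a] > scores[b]:
            concordant += 1.0
        elif scores[a] == scores[b]:
            concordant += 0.5  # tied scores count as half-concordant
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: follow-up in days, 1 = death observed, 0 = censored.
toy_times = [100, 200, 300, 400]
toy_events = [1, 1, 0, 1]
toy_scores = [1.0, 2.0, 0.5, 0.1]
toy_c_index = concordance_index(toy_times, toy_events, toy_scores)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which the reported C-index values are read.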
    Results: (1) Survival of patients: the 123 patients in the training dataset were followed up for 31 to 2,141 days, with a median follow-up time of 449 days. The 3- and 5-year overall survival rates were 16.67% and 8.06%. The 48 patients in the validation dataset were followed up for 41 to 2,182 days, with a median follow-up time of 457 days. The 3- and 5-year overall survival rates were 15.63% and 9.68%. There was no significant difference in the 3- or 5-year overall survival rates between the two groups (χ²=0.017, 0.068, P>0.05). (2) Screening results of differentially expressed miRNAs: results of bioinformatics analysis showed that 102 candidate differentially expressed miRNAs were selected, of which 63 were up-regulated and 39 were down-regulated in tumor tissues. (3) Construction of the prognosis-associated miRNA model: of the 102 candidate differentially expressed miRNAs, 5 survival-associated miRNAs were selected, namely miR-21, miR-125a-5p, miR-744, miR-374b, and miR-664. Their differential expression patterns in pancreatic cancer relative to adjacent tissues were up-regulation, up-regulation, down-regulation, up-regulation, and down-regulation, respectively, with fold changes of 4.00, 3.43, 3.85, 2.62, and 2.35. The prognostic expression equation constructed from the 5 survival-associated miRNAs was: score = 0.454×miR-21 expression level - 0.492×miR-125a-5p expression level - 0.49×miR-744 expression level - 0.419×miR-374b expression level - 0.036×miR-664 expression level. (4) Validation of the prognosis-associated miRNA model: the C-index of the model was 0.643 and 0.642 for the training dataset and the validation dataset, respectively. (5) Comparison of clinicopathological factors in pancreatic cancer patients: results of COX analysis showed that the prognosis-associated miRNA model was correlated with pathological T stage and tumor location (Z=45.481, χ²=10.176, P<0.05).
(6) Analysis of factors affecting the prognosis of pancreatic cancer patients: results of univariate analysis showed that pathological N stage, radiotherapy, molecular targeted therapy, and the score of the prognosis-associated miRNA model were related factors for the prognosis of pancreatic cancer patients (HR=2.471, 0.290, 0.172, 2.001, 95%CI: 1.012-6.032, 0.101-0.833, 0.082-0.364, 1.371-2.922, P<0.05). Results of multivariate analysis showed that molecular targeted therapy was an independent protective factor for the prognosis of pancreatic cancer patients (HR=0.261, 95%CI: 0.116-0.588, P<0.05), and a prognosis-associated miRNA model score ≥1.16 was an independent risk factor (HR=1.608, 95%CI: 1.091-2.369, P<0.05). (7) Comparison of prediction performance between the prognosis-associated miRNA model and the eighth edition TNM staging: in the training dataset, there was a significant difference in the predicted probability of 3- and 5-year survival of pancreatic cancer patients between the prognosis-associated miRNA model and the eighth edition TNM staging (Z=-1.671, -1.867, P<0.05). The AUC for 3- and 5-year survival prediction was 0.797 and 0.935 for the prognosis-associated miRNA model versus 0.737 and 0.703 for the eighth edition TNM staging, with 95%CI of 0.622-0.972 and 0.828-1.042 versus 0.571-0.904 and 0.456-0.951. The C-index was 0.643 versus 0.534. In the validation dataset, there was a significant difference in the predicted probability of 3- and 5-year survival of pancreatic cancer patients between the prognosis-associated miRNA model and the eighth edition TNM staging (Z=-1.729, -1.923, P<0.05). The AUC was 0.750 and 0.873 for the prognosis-associated miRNA model versus 0.721 and 0.703 for the eighth edition TNM staging, with 95%CI of 0.553-0.948 and 0.720-1.025 versus 0.553-0.889 and 0.456-0.950. The C-index was 0.642 versus 0.544.
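The prognostic expression equation and the 1.16 cutoff reported in the Results can be evaluated as in the minimal sketch below. The coefficients and the cutoff are taken from the abstract; the function names, the dictionary layout, and the example expression values are hypothetical, and the abstract does not state how expression levels were normalized, so inputs are assumed to be on the same scale the authors used.

```python
# Coefficients of the 5 survival-associated miRNAs, as reported in the abstract.
COEFFICIENTS = {
    "miR-21": 0.454,
    "miR-125a-5p": -0.492,
    "miR-744": -0.49,
    "miR-374b": -0.419,
    "miR-664": -0.036,
}

# Score threshold reported as an independent risk factor (score >= 1.16).
HIGH_RISK_CUTOFF = 1.16


def prognostic_score(expression):
    """Weighted sum of the 5 miRNA expression levels for one patient."""
    missing = sorted(set(COEFFICIENTS) - set(expression))
    if missing:
        raise KeyError(f"missing expression values for: {missing}")
    return sum(coef * expression[name] for name, coef in COEFFICIENTS.items())


def is_high_risk(expression):
    """True when the score reaches the reported cutoff of 1.16."""
    return prognostic_score(expression) >= HIGH_RISK_CUTOFF


# Hypothetical expression values, for illustration only (not patient data).
example_patient = {
    "miR-21": 10.0,
    "miR-125a-5p": 1.0,
    "miR-744": 1.0,
    "miR-374b": 1.0,
    "miR-664": 1.0,
}
example_score = prognostic_score(example_patient)
example_flag = is_high_risk(example_patient)
```

Because only miR-21 carries a positive coefficient, higher miR-21 expression raises the score, while higher expression of the other four miRNAs lowers it.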
    Conclusions: A prognosis-associated miRNA prediction model for survival prediction in pancreatic cancer patients can be constructed from 5 survival-associated miRNAs. The model complements the current TNM staging system and other clinicopathological parameters, providing individualized and accurate survival prediction as a reference for clinical treatment.

     
