China Oncology (中国癌症杂志) ›› 2020, Vol. 30 ›› Issue (8): 636-640. doi: 10.19401/j.cnki.1007-3639.2020.08.012

• Original Article •

A study on factors associated with recurrence of non-small cell lung cancer based on CT image features

LU Xiaoteng (鲁晓腾), XU Qing (许青)

  1. Department of Radiation Oncology, Fudan University Shanghai Cancer Center; Department of Oncology, Shanghai Medical College, Fudan University, Shanghai 200032, China
  • Published: 2020-08-30 Online: 2020-09-04
  • Contact: XU Qing E-mail: qingxu68@hotmail.com

Abstract: Background and purpose: To explore factors associated with recurrence in patients with non-small cell lung cancer (NSCLC) based on CT image features. Methods: A total of 157 data sets from the NSCLC-Radiogenomics database were used. First, the lung tumors were segmented and their image features were extracted. Next, the independent samples t test was used for univariate analysis of the feature data, followed by a logistic regression model, to identify factors significantly associated with NSCLC recurrence. The data were then standardized with Z-score normalization and balanced with the synthetic minority over-sampling technique (SMOTE). Finally, random forest, K-nearest neighbor (KNN), support vector machine (SVM) and decision tree classifiers were trained and evaluated with leave-one-out cross-validation to test how well the associated factors predict patient recurrence. Results: The independent samples t test showed that Variance, Energy, Relative message, Add-entropy (sum entropy) and Coarseness were associated with NSCLC recurrence (P<0.05). Logistic regression analysis showed that Energy and Add-entropy were significantly associated with NSCLC recurrence (P<0.05). The best classification accuracy was 82.7% and the maximum area under the curve (AUC) was 0.891, indicating that these two features can predict patient recurrence fairly accurately. Conclusion: Energy and Add-entropy are factors significantly associated with NSCLC recurrence.

Key words: Non-small cell lung cancer, Image features, Recurrence, Classifiers
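
The preprocessing and validation steps named in the abstract (Z-score standardization, SMOTE balancing, leave-one-out cross-validation of a KNN classifier) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy two-feature data set, and the choices of `k` are all assumptions made for illustration, and only KNN is shown out of the four classifiers mentioned. It uses the Python standard library only, so the SMOTE step is a simplified hand-written version rather than a library implementation.

```python
import math
import random
import statistics

def zscore(rows):
    """Standardize each feature column to mean 0 and (population) std 1."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    stds = [statistics.pstdev(c) or 1.0 for c in cols]  # guard constant columns
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]

def nearest(point, pool, k):
    """Return the k points of pool closest to point, excluding point itself."""
    others = [p for p in pool if p is not point]
    return sorted(others, key=lambda p: math.dist(point, p))[:k]

def smote(minority, n_new, k=3, rng=None):
    """Simplified SMOTE: interpolate between a minority sample and one of
    its k nearest minority neighbours to create n_new synthetic samples."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbour = rng.choice(nearest(base, minority, k))
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic

def loocv_knn_accuracy(X, y, k=3):
    """Leave-one-out cross-validation: classify each sample by majority
    vote of its k nearest neighbours among all remaining samples."""
    hits = 0
    for i, (xi, yi) in enumerate(zip(X, y)):
        others = [(math.dist(xi, xj), yj)
                  for j, (xj, yj) in enumerate(zip(X, y)) if j != i]
        votes = [label for _, label in sorted(others)[:k]]
        hits += max(set(votes), key=votes.count) == yi
    return hits / len(X)

# Hypothetical toy data: two made-up features per case,
# label 1 = recurrence (minority class), label 0 = no recurrence.
X_raw = [[1.0, 10.0], [1.2, 11.0], [0.9, 9.5], [1.1, 10.5], [1.3, 10.2],
         [0.8, 9.8], [3.0, 20.0], [3.2, 21.0], [2.9, 19.5]]
y = [0] * 6 + [1] * 3

X = zscore(X_raw)                                  # step 1: standardize
minority = [x for x, label in zip(X, y) if label == 1]
new_points = smote(minority, n_new=3, k=2)         # step 2: balance classes
X_bal = X + new_points
y_bal = y + [1] * len(new_points)
accuracy = loocv_knn_accuracy(X_bal, y_bal, k=3)   # step 3: validate
print(f"classes: {y_bal.count(0)} vs {y_bal.count(1)}, LOOCV accuracy: {accuracy:.3f}")
```

The sketch follows the order given in the abstract (standardize, then balance, then cross-validate). Each synthetic sample lies on the line segment between two existing minority samples, which is the core idea of SMOTE.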