China Oncology ›› 2016, Vol. 26 ›› Issue (10): 801-812. doi: 10.19401/j.cnki.1007-3639.2016.10.001

• Invited Expert Articles and Reviews •

Establishment and evaluation of a novel molecular marker for tumor tissue of origin

王奇峰1,徐清华2,陈金影2,钱琛晖2,刘晓健3,杜 祥1   

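  1. Department of Pathology, Fudan University Shanghai Cancer Center; Department of Oncology, Shanghai Medical College, Fudan University, Shanghai 200032;
    2. Canhelp Genomics Co., Ltd., Hangzhou 311188, Zhejiang;
    3. Department of Chemotherapy, Fudan University Shanghai Cancer Center; Department of Oncology, Shanghai Medical College, Fudan University, Shanghai 200032
  • Publication date: 2016-10-30  Release date: 2016-11-17
  • Corresponding author: DU Xiang  E-mail: dx2008cn@163.com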

Identification and validation of a novel gene expression signature for diagnosing tumor tissue origin

WANG Qifeng1, XU Qinghua2, CHEN Jinying2, QIAN Chenhui2, LIU Xiaojian3, DU Xiang1   

  1. Department of Pathology, Fudan University Shanghai Cancer Center, Department of Oncology, Shanghai Medical College, Fudan University, Shanghai 200032, China;
    2. Canhelp Genomics Co., Ltd., Hangzhou 311188, Zhejiang Province, China;
    3. Department of Chemotherapy, Fudan University Shanghai Cancer Center, Department of Oncology, Shanghai Medical College, Fudan University, Shanghai 200032, China
  • Published: 2016-10-30 Online: 2016-11-17
  • Contact: DU Xiang E-mail: dx2008cn@163.com

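Abstract: Background and purpose: Cancer of unknown primary (CUP) is a collective term for metastatic tumors whose primary site cannot be found at diagnosis; it accounts for approximately 5%-10% of all malignant tumors. Determining a tumor's tissue of origin is important for the diagnosis and treatment of patients. Methods: Data from samples of well-defined tumor type in the ArrayExpress and Gene Expression Omnibus databases were integrated to build a gene expression profile database covering 22 common tumor types and 5 800 samples. Tissue-specific genes were selected with the support vector machine-recursive feature elimination algorithm, and a tumor classification model was built. Real-time quantitative polymerase chain reaction (RTQ-PCR) was used to measure gene expression levels in paraffin-embedded tumor tissue, and the gene-based classification results were compared with the pathological diagnoses. Results: From the large-scale tumor gene expression data, 96 tissue-specific genes were selected, including common tumor-related genes such as cadherin 1 (CDH1), kallikrein-related peptidase 3 (KLK3) and epidermal growth factor receptor (EGFR). Among 206 paraffin-embedded tissue samples, the gene-based classification agreed with the pathological diagnosis in 182, an accuracy of 88.4% (95% CI: 83.2%-92.4%). Conclusion: The 96-gene RTQ-PCR assay shows good classification performance across 22 common tumor types and can serve as an auxiliary tool for clinical and pathological diagnosis.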
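The reported agreement rate can be checked from the counts given above (182 concordant results out of 206 samples). The abstract does not say how the 95% confidence interval was computed; the sketch below assumes an exact (Clopper-Pearson) binomial interval, and the function names `binom_cdf` and `clopper_pearson` are illustrative only. The raw proportion 182/206 is about 0.8835, which the abstract reports as 88.4%.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) interval for a binomial proportion.

    Assumption: the source does not name its interval method; this is
    one common choice, not necessarily the one the authors used.
    """

    def solve(f, target):
        # f is decreasing in p on [0, 1]; bisect until f(p) ~= target.
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if f(mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower bound: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else solve(
        lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2
    )
    # Upper bound: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else solve(
        lambda p: binom_cdf(k, n, p), alpha / 2
    )
    return lower, upper


concordant, total = 182, 206  # counts stated in the abstract
accuracy = concordant / total
lower, upper = clopper_pearson(concordant, total)
print(f"accuracy={accuracy:.4f}, 95% CI=({lower:.4f}, {upper:.4f})")
```

If the printed bounds differ slightly from the published interval, the authors may have used a different method (for example a Wilson or normal-approximation interval).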

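Key words: Cancer of unknown primary, Tumor tissue origin, Gene expression profiling, Real-time quantitative polymerase chain reaction, Immunohistochemistry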

Abstract: Background and purpose: Cancer of unknown primary (CUP) represents approximately 5%-10% of malignant neoplasms. For CUP patients, identifying the tumor origin allows for more specific therapeutic regimens and improves outcomes. Methods: By retrieving gene expression data from the ArrayExpress and Gene Expression Omnibus data repositories, we established a comprehensive gene expression database of 5 800 tumor samples encompassing 22 main tumor types. The support vector machine-recursive feature elimination algorithm was used for feature selection and classification modelling. We further optimized the RNA isolation and real-time quantitative polymerase chain reaction (RTQ-PCR) methods for candidate gene expression profiling and applied the RTQ-PCR assays to a set of formalin-fixed, paraffin-embedded tumor samples. Results: Based on the pan-cancer transcriptome database, we identified a list of 96 tumor-specific genes, including common tumor markers such as cadherin 1 (CDH1), kallikrein-related peptidase 3 (KLK3), and epidermal growth factor receptor (EGFR). Furthermore, we successfully translated the microarray-based gene expression signature to RTQ-PCR assays, which achieved an overall accuracy of 88.4% (95% CI: 83.2%-92.4%) in classifying 206 formalin-fixed, paraffin-embedded samples spanning 22 tumor types. Conclusion: The 96-gene RTQ-PCR assay represents a useful tool for accurately identifying tumor origins. The assay uses RTQ-PCR and routine formalin-fixed, paraffin-embedded samples, making it suitable for rapid clinical adoption.
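The feature selection step named in the Methods, support vector machine-recursive feature elimination, can be illustrated with a minimal sketch. It assumes scikit-learn and uses a small synthetic matrix in place of the expression database; the sample count, number of features, number of classes, and the target of 20 selected features are all illustrative choices, not the study's settings (5 800 samples, 22 tumor types, 96 genes), and the authors' actual implementation is not described in the abstract.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for an expression matrix: rows are samples,
# columns are genes. All sizes here are assumptions for illustration.
X, y = make_classification(
    n_samples=600,
    n_features=200,
    n_informative=30,
    n_redundant=0,
    n_classes=4,
    n_clusters_per_class=1,
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# RFE repeatedly fits the linear SVM, ranks features by the magnitude
# of their weights, and drops the 10 weakest until 20 remain.
selector = RFE(SVC(kernel="linear"), n_features_to_select=20, step=10)
selector.fit(X_train, y_train)
selected = selector.get_support(indices=True)

# Refit a classifier on the selected features and score held-out samples.
clf = SVC(kernel="linear").fit(X_train[:, selected], y_train)
accuracy = clf.score(X_test[:, selected], y_test)
print(f"{len(selected)} features kept, held-out accuracy {accuracy:.3f}")
```

A linear kernel is used because recursive feature elimination needs per-feature weights to rank; nonlinear kernels do not expose them directly.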

Key words: Cancer of unknown primary, Tumor tissue origin, Gene expression profiling, Real-time quantitative polymerase chain reaction, Immunohistochemistry