China Oncology ›› 2016, Vol. 26 ›› Issue (10): 801-812. doi: 10.19401/j.cnki.1007-3639.2016.10.001

Identification and validation of a novel gene expression signature for diagnosing tumor tissue origin

WANG Qifeng¹, XU Qinghua², CHEN Jinying², QIAN Chenhui², LIU Xiaojian³, DU Xiang¹

  1. Department of Pathology, Fudan University Shanghai Cancer Center, Department of Oncology, Shanghai Medical College, Fudan University, Shanghai 200032, China; 2. Canhelp Genomics Co., Ltd, Hangzhou 311188, Zhejiang Province, China; 3. Department of Chemotherapy, Fudan University Shanghai Cancer Center, Department of Oncology, Shanghai Medical College, Fudan University, Shanghai 200032, China
  • Online: 2016-10-30; Published: 2016-11-17
  • Contact: DU Xiang E-mail: dx2008cn@163.com

Abstract: Background and purpose: Cancer of unknown primary (CUP) accounts for approximately 5%-10% of malignant neoplasms. For CUP patients, identifying the tumor origin allows for more specific therapeutic regimens and improves outcomes. Methods: Using gene expression data retrieved from the ArrayExpress and Gene Expression Omnibus repositories, we established a comprehensive gene expression database of 5,800 tumor samples encompassing 22 main tumor types. The support vector machine-recursive feature elimination (SVM-RFE) algorithm was used for feature selection and classification modelling. We further optimized the RNA isolation and real-time quantitative polymerase chain reaction (RTQ-PCR) methods for candidate gene expression profiling and applied the RTQ-PCR assays to a set of formalin-fixed, paraffin-embedded (FFPE) tumor samples. Results: Based on the pan-cancer transcriptome database, we identified 96 tumor-specific genes, including common tumor markers such as cadherin 1 (CDH1), kallikrein-related peptidase 3 (KLK3), and epidermal growth factor receptor (EGFR). Furthermore, we successfully translated the microarray-based gene expression signature into RTQ-PCR assays, which achieved an overall success rate of 88.4% (95% CI: 83.2%-92.4%) in classifying 206 FFPE samples spanning 22 tumor types. Conclusion: The 96-gene RTQ-PCR assay is a useful tool for accurately identifying tumor origins. Because the assay uses RTQ-PCR and routine FFPE samples, it is suitable for rapid clinical adoption.
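
The success rate and confidence interval reported in the Results can be roughly checked with a short calculation. The sketch below is an illustration only: the abstract does not say which interval method was used, so the Wilson score interval is an assumed choice, and the count of correctly classified samples (about 182 of 206) is inferred from the reported 88.4% rather than stated in the source. The function name `wilson_interval` is hypothetical.

```python
import math
from statistics import NormalDist


def wilson_interval(successes, total, confidence=0.95):
    """Return (lower, upper) Wilson score bounds for a binomial proportion."""
    if total <= 0 or not 0 <= successes <= total:
        raise ValueError("need 0 <= successes <= total and total > 0")
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p = successes / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half_width = z * math.sqrt(
        p * (1 - p) / total + z * z / (4 * total * total)
    ) / denom
    return center - half_width, center + half_width


N_SAMPLES = 206  # FFPE samples, as stated in the abstract
# Assumption: the number of correct calls is not given in the abstract;
# it is inferred here from the reported 88.4% success rate.
N_CORRECT = round(0.884 * N_SAMPLES)

lower, upper = wilson_interval(N_CORRECT, N_SAMPLES)
print(f"{N_CORRECT}/{N_SAMPLES} correct, 95% CI: {lower:.3f} to {upper:.3f}")
```

Under these assumptions the bounds land close to the reported 83.2%-92.4%. An exact binomial (Clopper-Pearson) interval would be slightly wider, so a small difference at the upper bound is expected.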

Key words: Cancer of unknown primary, Tumor tissue origin, Gene expression profiling, Real-time quantitative polymerase chain reaction, Immunohistochemistry