Long non-coding RNAs (lncRNAs) have emerged as key players in a remarkable variety of biological processes and pathologic conditions, including cancer. Next-generation sequencing technologies and bioinformatics procedures predict the existence of tens of thousands of lncRNAs, yet the functions of only a handful are known, and very little is known about them in cancer types such as head and neck squamous cell carcinomas (HNSCCs).
Here, researchers at CIEMAT, Spain, use RNA-seq expression data from The Cancer Genome Atlas (TCGA) and various statistical and software tools to gain insight into the lncRNome in HNSCC. Based on lncRNA expression across 426 samples, they identify five distinct tumor clusters, which they compare with previously reported clusters based on various genomic/genetic features. The results show significant associations between lncRNA-based clustering and DNA methylation, TP53 mutation, and human papillomavirus infection. Using “guilt-by-association” procedures, the researchers infer the possible biological functions of representative lncRNAs from each cluster. Furthermore, they find that lncRNA clustering correlates with some important clinical and pathologic features, including patient survival after treatment, tumor grade, and sub-anatomical location.
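The summary does not spell out the exact "guilt-by-association" procedure. A common form of the idea is to rank protein-coding genes by how strongly their expression correlates with a given lncRNA across samples, then read the known functions of the top-ranked genes as hints about the lncRNA's role. The sketch below illustrates only that ranking step, assuming Pearson correlation as the association measure. The function names, gene labels, and expression values are hypothetical and are not taken from the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists of expression values."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant profile carries no co-expression signal
    return cov / (sd_x * sd_y)


def top_coexpressed(lnc_expr, gene_expr, k=2):
    """Return the k genes whose expression is most strongly
    (positively or negatively) correlated with the lncRNA."""
    scores = {gene: pearson(lnc_expr, values) for gene, values in gene_expr.items()}
    return sorted(scores, key=lambda gene: abs(scores[gene]), reverse=True)[:k]


# Toy data: one lncRNA and three protein-coding genes in five samples.
# All names and numbers are invented for illustration.
lnc = [1, 2, 3, 4, 5]
genes = {
    "GENE_A": [2, 4, 6, 8, 10],
    "GENE_B": [5, 3, 4, 1, 2],
    "GENE_C": [1, 1, 2, 1, 1],
}
print(top_coexpressed(lnc, genes, k=2))
```

In practice the top-ranked genes would then be passed to a functional enrichment analysis, which this sketch leaves out.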
lncRNA clusters and other molecular aberrations
a LncRNA-based clustering of HNSCC samples is significantly associated with clustering based on diverse molecular features, mainly DNA methylation and expression of PCGs (mRNA). Significance values are plotted after chi-square test computation. b, c Association of HPV infection and HNSCC mutations with lncRNA clusters. Chi-square or odds ratio values are plotted after chi-square test (b) or Fisher’s exact test (c) computation, respectively. Dashed red line: threshold of significance (p value < 0.05). d Distribution of lncRNA clusters and HPV-infected samples or samples with mutations in KMT2D or NSD1. Note the enrichment of HPV infection in c5, of NSD1 mutations in c1, and of KMT2D mutations in c2 (red lines), and the depletion of KMT2D mutations in c4 (green line). p values are calculated with Fisher’s exact test. Vertical black lines in d show HPV+ samples and samples with mutations in the selected genes KMT2D and NSD1, respectively
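To make the enrichment tests in the caption concrete, here is a minimal sketch of a two-sided Fisher's exact test on a 2x2 table (samples inside versus outside one cluster, with versus without a feature such as a mutation). It uses only the Python standard library. The function name and the counts are hypothetical and do not come from the study, and the authors' own software may differ.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    a: in cluster, feature present     b: in cluster, feature absent
    c: other clusters, feature present d: other clusters, feature absent
    Returns (odds_ratio, p_value).
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x feature-positive samples in the cluster
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    # Sum the probabilities of all tables at least as unlikely as the observed one
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)
    )
    odds_ratio = (a * d) / (b * c) if b * c else float("inf")
    return odds_ratio, min(1.0, p_value)


# Invented counts: 12 of 20 samples in one cluster carry the feature,
# compared with 5 of 80 samples in all other clusters.
odds_ratio, p_value = fisher_exact_2x2(12, 8, 5, 75)
print(f"odds ratio = {odds_ratio:.1f}, p = {p_value:.2e}")
```

An odds ratio above 1 with a small p value would correspond to enrichment of the feature in the cluster (the red lines in panel d), and an odds ratio below 1 to depletion (the green line).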