To comprehensively detect diverse and novel RNA species, researchers from Washington University in St. Louis compared deep small RNA sequencing and RNA sequencing (RNA-seq) methods applied to a primary acute myeloid leukemia (AML) sample. Deep sequencing of a library prepared with broader insert size selection allowed them to discover previously unannotated small RNAs. The researchers then analyzed the long non-coding RNA (lncRNA) landscape in AML by comparing deep sequencing data from multiple RNA-seq library construction methods for the same sample and integrating RNA-seq data from 179 AML cases. This analysis identified lncRNAs that are completely novel, differentially expressed, and associated with specific AML subtypes. By combining strand-specific small RNA sequencing and total RNA-seq, the study revealed the complexity of the non-coding RNA transcriptome. This dataset will serve as an invaluable resource for future RNA-based analyses.
Unannotated lncRNAs in AML
(A) Expression levels of unannotated lncRNAs (with multiple exons) in 4 types of MGI AML31 RNA-seq data, in TCGA-AB-2969, and in other TCGA AML datasets. A large proportion (78%) of the expressed unannotated lncRNAs detected in the MGI AML31 data are also detected in the TCGA-AB-2969 RNA-seq data with ≥1 RNA-seq read. The remaining lncRNAs are not detected in the TCGA-AB-2969 RNA-seq data, but 224 of them are detected in other TCGA AML RNA-seq datasets. Genes are sorted according to whether they were detected in TCGA-AB-2969 and then by FPKM value. The heatmap shows that almost all of the unannotated lncRNAs discovered in the MGI AML31 data recur in other datasets with at least one read, whereas the single TCGA-AB-2969 RNA-seq dataset missed a subset (22%) of them entirely (no reads), most of which have very low FPKM values in the TCGA AML cohort. (B) Comparison of the MGI AML31 RNA-seq data with the TCGA-AB-2969 RNA-seq data in discovering unannotated lncRNAs. (C) An example of an unannotated lncRNA that was identified in the MGI AML31 dataset. See also Table S5 for the unannotated lncRNA transcripts.
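
As a rough illustration of the bookkeeping behind panel (A), the sketch below marks a transcript as detected when it has at least one read, computes FPKM with the standard formula, and sorts by detection status and then by FPKM. It is not the authors' pipeline: the transcript names, lengths, read counts, library size, and the descending sort order are all made-up assumptions, and the `LncRNA`, `fpkm`, and `summarize` names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LncRNA:
    name: str       # hypothetical transcript ID
    length_bp: int  # transcript length in bases
    reads: int      # fragments mapped in the comparison sample


def fpkm(fragments: int, length_bp: int, total_mapped: int) -> float:
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragments * 1e9 / (length_bp * total_mapped)


def summarize(lncrnas, total_mapped, min_reads=1):
    """Return (rows, fraction_detected).

    Each row is (name, detected, fpkm). Rows are sorted with detected
    transcripts first, then by descending FPKM (the direction is an
    assumption; the caption only says "by FPKM values").
    """
    rows = [
        (t.name, t.reads >= min_reads, fpkm(t.reads, t.length_bp, total_mapped))
        for t in lncrnas
    ]
    rows.sort(key=lambda r: (not r[1], -r[2]))
    fraction_detected = sum(r[1] for r in rows) / len(rows)
    return rows, fraction_detected


# Toy values for illustration only; none of these come from the study.
toy = [
    LncRNA("lnc_A", 2000, 40),
    LncRNA("lnc_B", 1000, 0),
    LncRNA("lnc_C", 500, 5),
    LncRNA("lnc_D", 4000, 1),
]
rows, fraction_detected = summarize(toy, total_mapped=1_000_000)
for name, detected, value in rows:
    print(f"{name}\tdetected={detected}\tFPKM={value:.2f}")
print(f"fraction detected: {fraction_detected:.2f}")
```

With a threshold of one read, a transcript such as the toy `lnc_D` counts as detected even though its FPKM is very low, which is why the caption reports detection and expression level as separate quantities.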