Recent advances in RNA-sequencing technologies have led to the discovery of thousands of previously unannotated noncoding transcripts, including many long noncoding RNAs (lncRNAs) whose functions remain largely unknown. Discussed here are considerations and best practices in lncRNA identification and annotation, which we hope will foster functional and mechanistic exploration.
Perhaps the biggest surprise of the postgenomic era has been the enormous number and diversity of transcriptional products arising from the previously presumed wastelands of the non-protein-coding genome. These include a plethora of small regulatory RNAs and tens of thousands of polyadenylated and nonpolyadenylated lncRNAs that are antisense, intronic, intergenic and overlapping with respect to protein-coding loci. The functions of these transcripts are largely unknown, although there is increasing in vitro and in vivo evidence that lncRNAs have key roles across diverse biological processes, with an emerging theme of interfacing with epigenetic regulatory pathways. Thus, the sheer number and the increasing pace of discovery of new lncRNAs are accompanied by the growing challenge of their definition and annotation.
The broad term lncRNA refers to a transcript >200 nt in length that does not appear to contain a protein-coding sequence. The size threshold is an arbitrary but convenient biophysical cutoff that excludes most known, although still poorly understood, classes of small infrastructural and regulatory RNAs, such as tRNAs, small nuclear RNAs, small nucleolar RNAs and their derivatives, microRNAs, short interfering RNAs, Piwi-interacting RNAs, transcription-initiation RNAs and small RNAs that regulate splicing. Occasionally other terminology, such as transcripts of unknown function (TUFs) and transcriptionally active regions (TARs), has been suggested, but the consensus has settled on the generic descriptor lncRNA, at least for the time being.
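The working definition above (longer than 200 nt, no apparent protein-coding sequence) can be illustrated with a minimal Python sketch. Only the 200 nt cutoff comes from the text. The open-reading-frame limit `MAX_ORF_CODONS`, the function names, and the choice to scan only the three forward frames are assumptions made for illustration; actual annotation efforts weigh considerably more evidence than a single ORF-length check.

```python
# Illustrative sketch only. The 200 nt cutoff is taken from the text;
# MAX_ORF_CODONS is an assumed placeholder, not a published standard.
MIN_LENGTH_NT = 200
MAX_ORF_CODONS = 100  # assumption: ORFs this long or longer count as "apparently coding"

STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf_codons(seq: str) -> int:
    """Longest ATG-to-stop ORF in the three forward frames.

    The length is counted in codons, including the ATG and excluding the
    stop codon. ORFs that run off the end without a stop are ignored.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA alphabets
    best = 0
    for frame in range(3):
        start = None  # index of the first ATG of the ORF being tracked
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None:
                if codon == "ATG":
                    start = i
            elif codon in STOP_CODONS:
                best = max(best, (i - start) // 3)
                start = None
    return best


def is_candidate_lncrna(seq: str,
                        min_length: int = MIN_LENGTH_NT,
                        max_orf_codons: int = MAX_ORF_CODONS) -> bool:
    """True if the transcript is >min_length nt and has no ORF of max_orf_codons or more."""
    return len(seq) > min_length and longest_orf_codons(seq) < max_orf_codons
```

For example, `is_candidate_lncrna("A" * 250)` passes both checks, whereas a 200 nt sequence fails the length test because the threshold is strict. Both thresholds are exposed as parameters because, as the text notes, the size cutoff is a convenience rather than a biological boundary.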
- Mattick JS, Rinn JL. (2015) Discovery and annotation of long noncoding RNAs. Nat Struct Mol Biol 22(1):5-7.