Long noncoding RNAs (lncRNAs) are emerging as potential key regulators in gene expression networks and exhibit a surprising range of shapes and sizes. Several distinct classes of lncRNAs are transcribed from different DNA elements, including promoters, enhancers, and intergenic regions in eukaryotic genomes. Others are derived from long primary transcripts through noncanonical RNA processing pathways, generating new RNA species with unexpected formats. These lncRNAs can be processed by several mechanisms, including ribonuclease P (RNase P) cleavage to generate mature 3′ ends, capping by small nucleolar RNA (snoRNA)–protein (snoRNP) complexes at their ends, or the formation of circular structures. Here, researchers from ShanghaiTech University review current knowledge on lncRNAs and highlight the most recent discoveries of the underlying mechanisms related to their formation.
- Eukaryotic DNA transcription and RNA processing yield a diverse catalog of long noncoding RNAs (lncRNAs) that are longer than 200 nucleotides and lack significant protein-coding potential.
- lncRNAs transcribed from promoters and enhancers are usually targeted by nuclear exosomes and have short half-lives.
- Although they have a 7-methylguanosine (m7G) cap at the 5′ end and a poly(A) tail at the 3′ end, the mRNA-like long intervening/intergenic ncRNAs (lincRNAs) have patterns of transcription and processing distinct from those of mRNAs.
- New types of linear lncRNA species are stabilized by various mechanisms, including 3′-end processing by endoribonucleases, capping of 5′ ends by small nucleolar RNA–protein (snoRNP) complexes, or protection of both ends by snoRNPs.
- Circular RNAs represent yet another new type of lncRNA that is processed from back-spliced exons or spliced intron lariats of RNA polymerase II-transcribed RNA precursors.
New Long Noncoding RNA (lncRNA) Species Generated from Unusual Processing Pathways
(A) Ribonuclease P (RNase P) processing of the 3′ end of metastasis-associated lung adenocarcinoma transcript 1 (MALAT1). The nascent MALAT1 transcript forms a tRNA-like structure at its 3′ end, which can be recognized and cleaved by RNase P to generate stable MALAT1 with a U-A·U triple-helical structure at the 3′ end. (B) Processing of small nucleolar RNA (snoRNA)-ended lncRNAs (sno-lncRNAs). sno-lncRNAs are formed when one intron contains two snoRNA genes. (C) The diversity of lncRNAs related to snoRNAs (sno-processed lncRNAs). Four types of sno-lncRNA have been found in mammalian genomes: both ends are capped by the same class of snoRNA–protein (snoRNP) complex, either Box C/D or Box H/ACA (blue box), or one end is capped by a Box C/D snoRNP and the other by a Box H/ACA snoRNP (red box). (D) Species- or cell-type-specific expression of sno-lncRNAs. One example illustrates how alternative splicing (AS) places two snoRNAs within one intron and thereby leads to sno-lncRNA formation. (E) Processing of 5′ snoRNA-capped and 3′-polyadenylated lncRNAs (SPAs). SPAs are derived from readthrough transcripts, and their processing is associated with kinetic competition between XRN2 and RNA polymerase II (Pol II) downstream of polyadenylation signals.