UClncR – Ultrafast and comprehensive long non-coding RNA detection from RNA-seq

Long non-coding RNAs (lncRNAs) are a large class of gene transcripts with regulatory functions discovered in recent years. Many more are expected to be revealed as RNA-seq data accumulate from diverse types of normal and diseased tissues. However, discovering novel lncRNAs and accurately quantifying known ones from massive RNA-seq data is not trivial.

Researchers from the Mayo Clinic have developed UClncR, an Ultrafast and Comprehensive lncRNA detection pipeline, to tackle this challenge. UClncR takes a standard RNA-seq alignment file, performs transcript assembly, predicts lncRNA candidates, quantifies and annotates both known and novel lncRNA candidates, and generates a convenient report for downstream analysis. The pipeline accommodates both un-stranded and stranded RNA-seq, so that lncRNAs overlapping with other genes can be predicted and quantified. UClncR is fully parallelized in a cluster environment, yet it also allows users to run samples sequentially without a cluster. The pipeline can process a typical RNA-seq sample in a matter of minutes and complete hundreds of samples in a matter of hours. Analysis of predicted lncRNAs from two test datasets demonstrated UClncR’s accuracy and the relevance of the predicted lncRNAs to the samples’ clinical phenotypes.

UClncR workflow diagram

The workflow starts from an aligned BAM file (with the correct parameters set for stranded or un-stranded RNA-seq), which StringTie uses for transcript assembly. For un-stranded RNA-seq, the workflow handles only lincRNAs: known lincRNAs are simply quantified, and novel lincRNAs are predicted and quantified. For stranded RNA-seq, transcripts that overlap other genes on the opposite strand are also predicted and quantified.
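The stranded versus un-stranded distinction can be illustrated with a minimal Python sketch. This is not UClncR's code: the `Interval` class, the `classify` function, and the category labels are hypothetical names chosen for illustration, and the sketch assumes simple whole-transcript interval overlap against a list of known genes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A genomic interval (hypothetical helper, not part of UClncR)."""
    chrom: str
    start: int   # 0-based, inclusive
    end: int     # 0-based, exclusive
    strand: str  # "+", "-", or "." when unknown


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two intervals share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def classify(transcript: Interval, known_genes: list, stranded: bool) -> str:
    """Assign an assembled transcript to an illustrative category.

    Assumption: without strand information, only intergenic transcripts can be
    handled, because a transcript overlapping a gene cannot be told apart from
    that gene's own transcripts.
    """
    hits = [g for g in known_genes if overlaps(transcript, g)]
    if not hits:
        return "intergenic"
    if not stranded or transcript.strand not in ("+", "-"):
        return "skipped"
    if all(g.strand != transcript.strand for g in hits):
        return "antisense_overlap"
    return "same_strand_overlap"


# Example: one known gene on the plus strand of chr1.
genes = [Interval("chr1", 100, 200, "+")]
candidate = Interval("chr1", 150, 250, "-")
unstranded_label = classify(candidate, genes, stranded=False)
stranded_label = classify(candidate, genes, stranded=True)
```

With strand information, the same overlapping candidate moves from being skipped to being treated as an opposite-strand overlap, which is the extra class of transcripts the stranded mode described above can predict and quantify.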

Availability – UClncR is publicly available at http://bioinformaticstools.mayo.edu/research/UClncR .

Sun Z, Nair A, Chen X, Prodduturi N, Wang J, Kocher JP. (2017) UClncR: Ultrafast and comprehensive long non-coding RNA detection from RNA-seq. Sci Rep 7(1):14196.
