PredcircRNA: computational classification of circular RNA from other long non-coding RNA using hybrid features

Circular RNA (circularRNA) has recently emerged as an increasingly important type of long non-coding RNA (lncRNA) that plays a role in gene regulation, for example by functioning as miRNA sponges. Identifying circularRNA transcripts among de novo assembled transcripts obtained from high-throughput sequencing data, such as RNA-seq, is therefore a promising task.


In this study, researchers from the University of Copenhagen present a machine learning approach, named PredcircRNA, that distinguishes circularRNA from other lncRNAs using multiple kernel learning. First, they extracted discriminative features from several sources, including graph features, conservation information, sequence composition, ALU and tandem repeats, SNP densities and open reading frames (ORFs) of the transcripts. Second, to better integrate these heterogeneous features, they proposed a computational approach based on a multiple kernel learning framework. In preliminary 5-fold cross-validation on their constructed gold-standard dataset, the method classified circularRNA against other types of lncRNAs with an accuracy of 0.778, sensitivity of 0.781, specificity of 0.770, precision of 0.784 and MCC of 0.554. Their Random Forest-based feature importance analysis highlighted some discriminative features, such as conservation features and a GTAG sequence motif.

Availability – PredcircRNA tool is available for download at:

  • Pan X, Xiong K. (2015) PredcircRNA: computational classification of circular RNA from other long non-coding RNA using hybrid features. Mol Biosyst [Epub ahead of print].

Non-Coding RNA and Evolution of Complexity


Non-coding DNA in genomes increases in concert with the increase in developmental complexity in evolution, and is consonant with the important regulatory roles identified for the many classes of non-coding RNAs transcribed from more than 85 % of the DNA regarded as ‘junk’ not so long ago.

Dr Mae-Wan Ho

A vast RNA underworld exposed

It wasn’t so long ago that most people still believed DNA carries the instructions for making an organism, while RNA simply copies out (transcribes) the instructions (by complementary base pairing), which are then translated into protein via a genetic code in which each triplet of bases (codon) specifies one of twenty amino acids or a start or stop signal. The proteins are the real workhorses in this hierarchy, with the DNA akin to the Holy Scripture (the ‘Book of Life’ of the Central Dogma), faithfully copied and transmitted by scribes (RNA), to be interpreted and implemented by the faithful (proteins).

But soon after the human genome sequence was announced, it became clear that RNA plays a much more substantive, central role than previously thought.

What are lncRNAs?



It was traditionally thought that the transcriptome consisted mostly of mRNAs; however, advances in high-throughput RNA sequencing technologies have revealed the complexity of our genome. Non-coding RNA is now known to make up the majority of transcribed RNAs and, in addition to those that carry out well-known housekeeping functions (e.g. tRNA and rRNA), many different types of regulatory RNAs have been and continue to be discovered. Many of these non-coding RNAs are thought to have a wide range of functions in cellular and developmental processes.

Long noncoding RNAs (lncRNAs) are a large and diverse class of transcribed RNA molecules that are more than 200 nucleotides long and do not encode proteins. Their expression is developmentally regulated, and lncRNAs can be tissue- and cell-type specific. A significant proportion of lncRNAs are located exclusively in the nucleus. They comprise many types of transcripts that can structurally resemble mRNAs, and are sometimes transcribed as whole or partial antisense transcripts to coding genes. LncRNAs are thought to carry out important regulatory functions, adding yet another layer of complexity to our understanding of genomic regulation.

A summary of the various functions described for lncRNA.

Functions of lncRNA

LncRNAs may exert their functions…

