The role of lncRNA in maintaining genome stability

Long non-coding RNAs (lncRNAs) are important players in diverse biological processes. Upon DNA damage, cells activate a complex signaling cascade referred to as the DNA damage response (DDR). Using a microarray screen, researchers from the National Cancer Institute identified a novel lncRNA, DDSR1 (DNA damage-sensitive RNA1), which is induced upon DNA damage. DDSR1 induction is triggered in an ATM-NF-κB pathway-dependent manner by several agents that cause DNA double-strand breaks (DSBs). Loss of DDSR1 impairs cell proliferation and DDR signaling and reduces the capacity for DNA repair by homologous recombination (HR). The HR defect in the absence of DDSR1 is marked by aberrant accumulation of BRCA1 and RAP80 at DSB sites. In line with a role in regulating HR, DDSR1 interacts with BRCA1 and hnRNPUL1, an RNA-binding protein involved in DNA end resection.


This study establishes a role for the lncRNA DDSR1 in maintaining genome stability. DDSR1 promotes homologous recombination by regulating the recruitment of DNA repair factors to DSBs after DNA damage.

  • The lncRNA DDSR1 is induced upon DNA damage and interacts with BRCA1 and the RNA-binding repair protein hnRNPUL1.
  • DDSR1 and hnRNPUL1 interact to form a complex which prevents BRCA1 from promiscuous DNA binding and fine-tunes the recruitment of BRCA1 to DSBs upon DNA damage.
  • Absence of DDSR1 or hnRNPUL1 during DNA damage leads to increased recruitment of RAP80 and BRCA1 to DSBs to limit HR.

Sharma V, Khurana S, Kubben N, Abdelmohsen K, Oberdoerffer P, Gorospe M, Misteli T. (2015) A BRCA1-interacting lncRNA regulates homologous recombination. EMBO Rep [Epub ahead of print]. [abstract]

Integrating Large-Scale RNA-Seq and CLIP-Seq Datasets Enables Study of lncRNA


Long non-coding RNAs (lncRNAs) are emerging as important regulatory molecules in developmental, physiological, and pathological processes. However, the precise mechanisms and functions of most lncRNAs remain largely unknown. Recent advances in high-throughput sequencing of immunoprecipitated RNAs after cross-linking (CLIP-Seq) provide powerful ways to identify biologically relevant protein-lncRNA interactions.

In this study, researchers at Sun Yat-sen University analyzed millions of RNA-binding protein (RBP) binding sites from 117 CLIP-Seq datasets generated by 50 independent studies and identified 22,735 RBP-lncRNA regulatory relationships.

The researchers found that a single lncRNA is generally bound and regulated by one or more RBPs, the combination of which may coordinately regulate gene expression. They also revealed the expression correlation of these interaction networks by mining expression profiles of over 6000 normal and tumor samples from 14 cancer types. Their combined analysis of CLIP-Seq data and genome-wide association study data uncovered hundreds of disease-related single nucleotide polymorphisms residing in the RBP binding sites of lncRNAs.
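The last step described above, finding polymorphisms that fall inside RBP binding sites, comes down to an interval lookup. The following is a minimal sketch of that idea, not the study's pipeline: the function name, the SNP and RBP labels, and the coordinates are all hypothetical, and binding sites are assumed to be 0-based, half-open intervals.

```python
from collections import defaultdict


def snps_in_binding_sites(snps, binding_sites):
    """Map each SNP to the RBPs whose binding sites contain it.

    snps: iterable of (snp_id, chrom, pos)
    binding_sites: iterable of (chrom, start, end, rbp), with start <= pos < end
                   meaning "inside" (half-open intervals; an assumption)
    """
    # Group binding sites by chromosome so each SNP is only compared
    # against sites on its own chromosome.
    by_chrom = defaultdict(list)
    for chrom, start, end, rbp in binding_sites:
        by_chrom[chrom].append((start, end, rbp))

    hits = {}
    for snp_id, chrom, pos in snps:
        rbps = sorted(
            {rbp for start, end, rbp in by_chrom.get(chrom, []) if start <= pos < end}
        )
        if rbps:
            hits[snp_id] = rbps
    return hits


# Hypothetical toy data: two overlapping sites on chr1, one site on chr2.
binding_sites = [
    ("chr1", 100, 130, "RBP1"),
    ("chr1", 120, 150, "RBP2"),
    ("chr2", 500, 520, "RBP1"),
]
snps = [
    ("snp_a", "chr1", 125),  # inside both chr1 sites
    ("snp_b", "chr1", 150),  # just past the end of the second site
    ("snp_c", "chr2", 510),  # inside the chr2 site
    ("snp_d", "chr3", 10),   # chromosome with no sites
]
hits = snps_in_binding_sites(snps, binding_sites)
```

A linear scan per chromosome is enough for a toy example; with millions of binding sites, a sorted index or an interval tree would be the more realistic choice.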

Finally, the researchers developed interactive web implementations to provide visualization, analysis, and downloading of the aforementioned large-scale datasets.

Availability – StarBase V2.0 is available at:

  • Li JH, Liu S, Zheng LL, Wu J, Sun WJ, Wang ZL, Zhou H, Qu LH, Yang JH. (2015) Discovery of Protein-lncRNA Interactions by Integrating Large-Scale CLIP-Seq and RNA-Seq Datasets. Front Bioeng Biotechnol 2:88. [article]

C-It-Loci – a knowledge database for tissue-enriched loci

Increasing evidence suggests that most of the genome is transcribed into RNA, but that many of these transcripts are not translated into proteins. RNAs that do not become proteins are called ‘non-coding RNAs (ncRNAs)’, and they outnumber protein-coding genes. Interestingly, ncRNAs have been shown to be expressed in a more tissue-specific manner than protein-coding genes. Because tissue-specific expression of a transcript suggests that it is important in that tissue, researchers are conducting biological experiments to elucidate the function of such ncRNAs. Owing greatly to the advancement of next-generation techniques, especially RNA-seq, the amount of high-throughput data is increasing rapidly. However, because of the complexity and sheer volume of the data, it is not easy to re-analyze published datasets to extract the tissue-specific expression of ncRNAs.

Here, researchers from Goethe University Frankfurt introduce a new knowledge database called ‘C-It-Loci’, which allows a user to screen for tissue-specific transcripts across three organisms: human, mouse and zebrafish. C-It-Loci is intuitive and easy to use to identify not only protein-coding genes but also ncRNAs from various tissues. C-It-Loci defines homology through sequence and positional conservation to allow for the extraction of species-conserved loci. C-It-Loci can be used as a starting point for further biological experiments.


Scheme of C-It-Loci. (a) Flowchart for building C-It-Loci. All the analyzed results were imported as MySQL data tables into C-It-Loci. (b) Definition of CGP. The genomic coordinates from one protein-coding gene (‘Gene A’) to the immediately downstream protein-coding gene (‘Gene B’) are defined as one locus unit. When homologous protein-coding genes are found in another species for both protein-coding genes in the locus, the locus is defined as a ‘conserved locus’, which the authors call ‘C-It-Loci Genomic Positions (CGP)’.
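The CGP definition in panel (b) can be illustrated with a short sketch. This is only one reading of the caption, not the database's implementation: it assumes that "immediately downstream" means the next protein-coding gene by coordinate on the same chromosome (strand is ignored), that a locus unit runs from the start of Gene A to the end of Gene B, and that homology is given as a simple lookup table. All gene names and coordinates are hypothetical.

```python
from collections import defaultdict


def locus_units(genes):
    """Build locus units from consecutive protein-coding genes.

    genes: iterable of (name, chrom, start, end)
    Returns a list of (gene_a, gene_b, chrom, locus_start, locus_end).
    """
    by_chrom = defaultdict(list)
    for name, chrom, start, end in genes:
        by_chrom[chrom].append((start, end, name))

    units = []
    for chrom in sorted(by_chrom):
        ordered = sorted(by_chrom[chrom])  # sort genes by start coordinate
        # Pair each gene with the next gene on the same chromosome.
        for (s_a, e_a, a), (s_b, e_b, b) in zip(ordered, ordered[1:]):
            units.append((a, b, chrom, s_a, max(e_a, e_b)))
    return units


def conserved_loci(units, homologs):
    """Keep locus units whose two flanking genes both have a homolog.

    homologs: dict mapping a gene name to its homolog in another species
              (assumed input; the caption does not say how it is stored).
    """
    return [u for u in units if u[0] in homologs and u[1] in homologs]


# Hypothetical toy annotation: three genes on chr1, one on chr2.
genes = [
    ("geneA", "chr1", 100, 200),
    ("geneB", "chr1", 500, 650),
    ("geneC", "chr1", 900, 1000),
    ("geneD", "chr2", 50, 80),
]
homologs = {"geneA": "geneA_other_species", "geneB": "geneB_other_species"}

units = locus_units(genes)
conserved = conserved_loci(units, homologs)
```

In this toy data, only the geneA-to-geneB locus qualifies as conserved, because geneC has no entry in the homolog table. Any non-coding transcript annotated between the two flanking genes would then be assigned to that locus.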

Availability – C-It-Loci is freely available online without registration at

  • Weirick T, John D, Dimmeler S, Uchida S. (2015) C-It-Loci: a knowledge database for tissue-enriched loci. Bioinformatics [Epub ahead of print]. [abstract]

Penn/Arizona Team to Study Little-understood lncRNA Molecules


There is a theory that RNA, instead of DNA, is the original building block of all life. Yet many RNA molecules remain mysterious, their true nature and function little understood.

Now, with an award of more than $2.5 million from the National Science Foundation’s Plant Genome Research Program, the University of Pennsylvania’s Brian Gregory will join two scientists from the University of Arizona to study the true nature of a class of mysterious RNA molecules known as lncRNA.

The research project is expected to take four years, over the course of which, “we hope to gain a greater understanding of this potentially important class of molecules, their biology and their function in the cell nucleus,” said Gregory, an assistant professor of biology in Penn’s School of Arts & Sciences and a leading expert in RNA regulation of cellular processes.

Co-LncRNA – investigating the lncRNA combinatorial effects in GO annotations and KEGG pathways based on human RNA-Seq data

Long non-coding RNAs (lncRNAs) are emerging as key regulators of diverse biological processes and diseases. However, the combinatorial effects of these molecules in a specific biological function are poorly understood. Identifying co-expressed protein-coding genes of lncRNAs would provide ample insight into lncRNA functions.

To facilitate such an effort, researchers at Harbin Medical University, China have developed Co-LncRNA, a web-based computational tool that allows users to identify GO annotations and KEGG pathways that may be affected by the co-expressed protein-coding genes of one or more lncRNAs. Protein-coding genes co-expressed with lncRNAs were first identified in publicly available human RNA-Seq datasets, comprising 241 datasets across 6560 total individuals and representing 28 tissue types/cell lines. Then, the combinatorial effects of lncRNAs in a given GO annotation or KEGG pathway are taken into account through the simultaneous analysis of multiple lncRNAs in one or more user-selected datasets, which is realized by enrichment analysis.

In addition, this software provides a graphical overview of pathways that are modulated by lncRNAs, as well as a specific tool to display the relevant networks between lncRNAs and their co-expressed protein-coding genes. Co-LncRNA also supports users in uploading their own lncRNA and protein-coding gene expression profiles to investigate the lncRNA combinatorial effects. It will be continuously updated with more human RNA-Seq datasets on an annual basis. Taken together, Co-LncRNA provides a web-based application for investigating lncRNA combinatorial effects, which could shed light on their biological roles and could be a valuable resource for this community.
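The enrichment step described above can be sketched with a one-sided hypergeometric test. The text does not say which test Co-LncRNA uses, so the test itself is an assumption, as is the choice to pool the co-expressed genes of several lncRNAs by taking their union. Gene names, term names, and the background set are hypothetical.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k) for X ~ Hypergeometric(N genes, K in the term, n selected)."""
    total = comb(N, n)
    upper = min(n, K)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def enrich(coexpressed_by_lncrna, gene_sets, background):
    """Test each term for over-representation among pooled co-expressed genes.

    coexpressed_by_lncrna: dict lncRNA -> set of co-expressed protein-coding genes
    gene_sets: dict term -> set of member genes (e.g. a GO term or KEGG pathway)
    background: all protein-coding genes considered
    """
    background = set(background)
    # Assumption: the combinatorial effect is modeled as the union of the
    # co-expressed genes of all selected lncRNAs.
    selected = set().union(*coexpressed_by_lncrna.values()) & background

    pvalues = {}
    for term, members in gene_sets.items():
        members = set(members) & background
        overlap = len(selected & members)
        pvalues[term] = hypergeom_pvalue(
            overlap, len(selected), len(members), len(background)
        )
    return pvalues


# Hypothetical toy data: 20 background genes, two terms, two lncRNAs.
background = [f"g{i}" for i in range(20)]
gene_sets = {
    "term_x": {"g0", "g1", "g2", "g3", "g4"},
    "term_y": {"g10", "g11", "g12", "g13", "g14"},
}
coexpressed_by_lncrna = {
    "lncRNA_1": {"g0", "g1", "g2"},
    "lncRNA_2": {"g2", "g3", "g15"},
}
pvalues = enrich(coexpressed_by_lncrna, gene_sets, background)
```

Here four of the five pooled genes belong to term_x, so it gets a small p-value, while term_y has no overlap and is not enriched. A real analysis would also correct for multiple testing across terms, which this sketch omits.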


Flowchart used in Co-LncRNA for investigating the combinatorial effects of lncRNAs in GO annotations and KEGG pathways.


  • Zhao Z, Bai J, Wu A, Wang Y, Zhang J, Wang Z, Li Y, Xu J, Li X. (2015) Co-LncRNA: investigating the lncRNA combinatorial effects in GO annotations and KEGG pathways based on human RNA-Seq data. Database (Oxford). 2015 Sep 10. [article]