Long noncoding RNAs (lncRNAs) are key regulators of diverse cellular processes. Recent advances in high-throughput sequencing have enabled the discovery of novel lncRNAs at an unprecedented scale. Identifying the functional lncRNAs among thousands of candidates for further validation remains a challenging task.
Here, researchers from the Chinese University of Hong Kong present a novel computational framework, lncFunNet (lncRNA Functional inference through integrated Network), that integrates ChIP-seq, CLIP-seq and RNA-seq data to predict, prioritize and annotate lncRNA functions. In mouse embryonic stem cells (mESCs), lncFunNet not only recovered most of the functional lncRNAs known to maintain mESC pluripotency but also predicted a plethora of novel functional lncRNAs. Similarly, in mouse myoblast C2C12 cells, applying lncFunNet led to the prediction of reservoirs of functional lncRNAs in both proliferating myoblasts (MBs) and differentiating myotubes (MTs). Further analyses demonstrated that these lncRNAs are frequently bound by key transcription factors, interact with miRNAs and constitute key nodes in biological network motifs. Further experiments validated their dynamic expression profiles and functionality during myoblast differentiation. Collectively, these studies demonstrate the use of lncFunNet to annotate and identify functional lncRNAs in a given biological system.
Schematic view of lncFunNet
lncFunNet comprises three consecutive modules: network integration, lncRNA functionality prediction, and lncRNA functional annotation. (A) Inferring TF–lncRNA, TF–miRNA and TF–PCG interactions using ChIP-seq data. (B) Establishing miRNA-mediated interactions among miRNAs, lncRNAs, TFs and PCGs. (C) Using gene expression correlation from RNA-seq to infer interactions among lncRNAs and other network components. (D) Constructing a gene regulatory network by integrating the above three sub-networks. (E) Optimizing the weights for the above three types of nodes by logistic regression and calculating a functional information score (FIS) for each lncRNA based on its network interactions (left panel), then selecting functional lncRNAs by calculating a false discovery rate (FDR) through comparison with randomized networks (right panel). (F) Annotating lncRNA functions using GO terms or KEGG pathways associated with its interacting partners.
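As a rough illustration of step (E), the scoring and randomization logic might be sketched as follows. This is a minimal toy example and not the lncFunNet implementation: the edge list, the weight values, the label-shuffling null model and all function names are assumptions made for illustration. In particular, the weights are fixed by hand here, whereas the framework fits them by logistic regression, and the score is a plain weighted sum of interactions, which may differ from the published FIS definition.

```python
import random
from collections import defaultdict

# Hypothetical toy edges: (lncRNA, interacting partner, evidence type).
EDGES = [
    ("lncA", "TF1", "chip"), ("lncA", "TF2", "chip"),
    ("lncA", "miR1", "clip"), ("lncA", "gene1", "coexpr"),
    ("lncB", "gene2", "coexpr"),
    ("lncC", "TF1", "chip"), ("lncC", "miR2", "clip"),
]

# Assumed weights per evidence type (hand-picked, not fitted).
WEIGHTS = {"chip": 1.0, "clip": 0.8, "coexpr": 0.5}


def functional_scores(edges, weights):
    """Score each lncRNA as the weighted sum of its interactions."""
    scores = defaultdict(float)
    for lnc, _partner, kind in edges:
        scores[lnc] += weights[kind]
    return dict(scores)


def randomize(edges, rng):
    """Build a null network by shuffling lncRNA labels across edges.

    Each lncRNA keeps its number of edges, but the partners and
    evidence types attached to it are reassigned at random.
    """
    labels = [lnc for lnc, _, _ in edges]
    rng.shuffle(labels)
    return [(lnc, partner, kind)
            for lnc, (_, partner, kind) in zip(labels, edges)]


def empirical_fdr(observed, null_score_sets, threshold):
    """Estimate the FDR at a score threshold.

    FDR = (mean number of null scores >= threshold per randomization)
          / (number of observed scores >= threshold), capped at 1.
    """
    n_obs = sum(s >= threshold for s in observed.values())
    if n_obs == 0:
        return 0.0  # nothing is called functional at this threshold
    mean_null = sum(
        sum(s >= threshold for s in null.values())
        for null in null_score_sets
    ) / len(null_score_sets)
    return min(1.0, mean_null / n_obs)


if __name__ == "__main__":
    rng = random.Random(0)
    observed = functional_scores(EDGES, WEIGHTS)
    nulls = [functional_scores(randomize(EDGES, rng), WEIGHTS)
             for _ in range(1000)]
    for lnc, score in sorted(observed.items(), key=lambda kv: -kv[1]):
        fdr = empirical_fdr(observed, nulls, score)
        print(f"{lnc}\tscore={score:.2f}\tFDR={fdr:.3f}")
```

Shuffling labels is only one possible null model; a degree-preserving edge swap over the full integrated network would be a stricter alternative.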