by Dinar Yunusov
In our article, “HIPSTR and thousands of lncRNAs are heterogeneously expressed in human embryos, primordial germ cells and stable cell lines”, we tell the story of how a functional study of a single gene can provide important insight into a general biological phenomenon: the overall low expression levels of long non-coding RNAs (lncRNAs).
Our project started as a characterization study of a novel gene, HIPSTR, which turned out to be a truly “non-mainstream”, “hipster” antisense lncRNA gene. Unlike previously described antisense lncRNAs, the expression levels of HIPSTR and its sense counterpart gene (TFAP2A) neither correlated in tissues and cell lines nor showed consistent co-induction in developmental models. TFAP2A encodes a transcription factor that is involved in development and is also associated with various cancers, yet we could not find any association between HIPSTR expression and tumor or normal phenotype. Interestingly, we demonstrated that the evolutionarily conserved HIPSTR lncRNA is activated during the major wave of human and mouse embryonic genome activation (EGA), independently of, and prior to, other genes of the TFAP2A locus.
Here is where the story of HIPSTR took an unexpected and very intriguing twist. When we used available single-cell RNA-seq data, it became obvious that in human embryos, which are at the 8-cell stage when human EGA occurs, HIPSTR is not induced in all blastomeres, but only in a subset of them. In these cells, HIPSTR expression levels were up to 5-fold higher than its average level in the embryo. We next asked whether such a pattern was an exception, and found that thousands of other transcripts had high cell-to-cell variability of expression in human embryos. The same heterogeneous pattern of expression of numerous human genes was observed in the population of human primordial germ cells and in stable cell lines. Our systematic analyses showed that, although cell-to-cell variability in expression is common to both lncRNA and protein-coding genes, it was significantly higher for lncRNA genes than for expression-matched protein-coding genes. A deeper look into the phenomenon revealed that the degree of expression heterogeneity for a given gene is dynamic and depends on the system under consideration.
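For readers who want to try this kind of comparison on their own single-cell data, here is a minimal sketch in Python. It is not the pipeline from the paper: the choice of the coefficient of variation as the variability measure, the nearest-mean matching of protein-coding genes, all function names, and the toy expression values are assumptions made purely for illustration.

```python
import numpy as np

def cell_to_cell_cv(expr):
    """Coefficient of variation (std / mean) per gene across cells.

    expr: 2-D array, rows = genes, columns = single cells.
    Genes with zero mean expression get NaN.
    """
    expr = np.asarray(expr, dtype=float)
    mean = expr.mean(axis=1)
    std = expr.std(axis=1)
    return np.divide(std, mean, out=np.full_like(mean, np.nan), where=mean > 0)

def max_fold_over_mean(expr):
    """Highest per-cell expression divided by the gene's mean across cells."""
    expr = np.asarray(expr, dtype=float)
    mean = expr.mean(axis=1)
    return np.divide(expr.max(axis=1), mean,
                     out=np.full_like(mean, np.nan), where=mean > 0)

def match_by_mean_expression(lnc_expr, pc_expr):
    """For each lncRNA, index of the protein-coding gene whose mean
    expression is closest (matching with replacement; an assumption)."""
    lnc_mean = np.asarray(lnc_expr, dtype=float).mean(axis=1)
    pc_mean = np.asarray(pc_expr, dtype=float).mean(axis=1)
    return np.abs(lnc_mean[:, None] - pc_mean[None, :]).argmin(axis=1)

# Hypothetical toy data: 8 cells per gene, made-up expression values.
lnc = np.array([[0, 0, 0, 0, 0, 0, 8, 8],    # expressed in 2 of 8 cells
                [0, 4, 0, 4, 0, 4, 0, 4]])   # expressed in 4 of 8 cells
pc = np.array([[2, 2, 2, 2, 2, 2, 2, 2],     # uniform, same mean as the lncRNAs
               [1, 3, 1, 3, 1, 3, 1, 3],
               [10, 10, 10, 10, 10, 10, 10, 10]])

matched = match_by_mean_expression(lnc, pc)
lnc_cv = cell_to_cell_cv(lnc)
pc_cv = cell_to_cell_cv(pc[matched])
lnc_fold = max_fold_over_mean(lnc)

print("lncRNA CV:", lnc_cv)
print("matched protein-coding CV:", pc_cv)
print("lncRNA max fold over mean:", lnc_fold)
```

On real data, the two sets of variability values would then be compared with a statistical test of your choice. Matching on mean expression matters because low-abundance transcripts tend to look noisier in single-cell data regardless of gene class.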
Petropoulos et al. recently reported that individual cells of early human embryos at E3 and E4 (when HIPSTR is expressed) cannot be grouped based on their transcriptomic profiles. We therefore believe that expression heterogeneity in early embryos, where the number of cells is small and finite, may serve as starting material for cell fate decisions later in development.
It is now widely accepted that lncRNAs function as modular scaffolds for regulatory proteins. Our work emphasizes the importance of carefully evaluating RNA immunoprecipitation and endogenous-RNA pulldown assays that aim to identify protein partners of lncRNAs. The high heterogeneity of lncRNA expression represents a serious obstacle in such experiments, and we speculate that reliable and easy-to-use techniques for enriching subpopulations of live cells that express a lncRNA of interest will need to be developed.