An update on LNCipedia – a database for annotated human lncRNA sequences

LNCipedia collects long non-coding RNA (lncRNA) sequences and annotation from different sources. In version 3.0, over 90,000 new transcripts were added to the database. Of these, 6,917 transcripts were obtained from RefSeq by filtering on accession prefix (NR_) and size (200 bp). This filtering strategy, however, is not specific to long non-coding RNAs and also yields transcripts associated with protein-coding genes. For instance, transcripts with incomplete open reading frames that are subject to nonsense-mediated mRNA decay are also annotated with the accession prefix NR_. These transcripts are generally not considered true lncRNAs and typically exhibit a high coding potential score when assessed by PhyloCSF. The authors therefore chose to exclude these transcripts from the database and to confine their analysis to the RefSeq subset with the keyword biomol_ncrna_lncrna, as suggested by RefSeq’s Dr. Kim D. Pruitt. This change is reflected in update 3.1, and this corrigendum serves to clarify the discrepancies in the article caused by this update.
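The difference between the two selection strategies described above can be illustrated with a minimal sketch. This is not the LNCipedia pipeline: the `Transcript` record, the helper names, and the toy accessions are hypothetical. The sketch also assumes that the 200 bp size criterion is an inclusive lower bound and that the keyword restriction is applied on top of the prefix and size filter.

```python
from dataclasses import dataclass, field

MIN_LENGTH = 200                        # assumption: inclusive lower bound, in bp
LNCRNA_KEYWORD = "biomol_ncrna_lncrna"  # RefSeq keyword named in the text


@dataclass
class Transcript:
    """Hypothetical minimal record for one RefSeq transcript."""
    accession: str
    length: int
    keywords: set = field(default_factory=set)


def passes_original_filter(t: Transcript) -> bool:
    """Version 3.0 strategy: NR_ accession prefix plus size threshold."""
    return t.accession.startswith("NR_") and t.length >= MIN_LENGTH


def passes_refined_filter(t: Transcript) -> bool:
    """Version 3.1 strategy (as assumed here): additionally require the lncRNA keyword."""
    return passes_original_filter(t) and LNCRNA_KEYWORD in t.keywords


# Toy records with made-up accessions, for illustration only.
toy_records = [
    Transcript("NR_000001", 1500, {LNCRNA_KEYWORD}),  # annotated lncRNA
    Transcript("NR_000002", 2300),                    # NR_ record without the keyword
    Transcript("NR_000003", 150, {LNCRNA_KEYWORD}),   # below the size threshold
    Transcript("NM_000004", 1800),                    # protein-coding mRNA prefix
]

original = [t.accession for t in toy_records if passes_original_filter(t)]
refined = [t.accession for t in toy_records if passes_refined_filter(t)]

print("prefix + size filter:", original)
print("with keyword filter: ", refined)
```

In this toy set, the prefix and size filter alone also keeps the NR_ record that lacks the lncRNA keyword, which mirrors the kind of transcript that update 3.1 removes.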

