Arabidopsis thaliana Reference Transcript Dataset 3 (AtRTD3)

  • Runxuan Zhang (Creator)
  • Richard I Kuo (Creator)
  • Max Coulter (Creator)
  • Cristiane Calixto (Creator)
  • Juan Entizne (Creator)
  • Wenbin Guo (Creator)

Dataset

Description

We have generated a new, comprehensive Arabidopsis thaliana Reference Transcript Dataset 3 (AtRTD3) based on extensive Iso-seq data (79% of transcripts) from a broad range of plant samples. We developed novel methods to determine splice junctions and transcription start and end sites (TSS and TES) accurately. Mismatch profiles around splice junctions provided a powerful feature for distinguishing false from correct splice junctions, allowing spurious junctions to be removed effectively. Stratified approaches that take expression abundance into account identified high-confidence TSS and TES and removed fragmentary transcripts arising from degradation. AtRTD3 is a major improvement over existing transcriptomes, as demonstrated by analysis of an extensive RNA-seq time-series dataset from Arabidopsis plants exposed to cold. In that analysis, AtRTD3 provided higher resolution of transcript expression profiling and identified cold- and light-induced differential usage of transcription start and polyadenylation sites.
