2passtools: two-pass alignment using machine-learning-filtered splice junctions increases the accuracy of intron detection in long-read RNA sequencing

Matthew T. Parker (Lead / Corresponding author), Katarzyna Knop, Geoffrey J. Barton, Gordon G. Simpson (Lead / Corresponding author)

Research output: Contribution to journal › Article › peer-review

12 Citations (Scopus)
230 Downloads (Pure)

Abstract

Transcription of eukaryotic genomes involves complex alternative processing of RNAs. Sequencing of full-length RNAs using long reads reveals the true complexity of processing. However, the relatively high error rates of long-read sequencing technologies can reduce the accuracy of intron identification. Here we apply alignment metrics and machine-learning-derived sequence information to filter spurious splice junctions from long-read alignments, and use the remaining junctions to guide realignment in a two-pass approach. This method, available in the software package 2passtools (https://github.com/bartongroup/2passtools), improves the accuracy of spliced alignment and transcriptome assembly for species both with and without existing high-quality annotations.
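
As a rough illustration of the filtering step between the two passes, the toy sketch below keeps only junctions that look trustworthy after a first alignment. It is not the 2passtools implementation: the function name `filter_junctions`, the tuple layout, and the two simple rules (a minimum read count and a canonical GT-AG motif) are assumptions standing in for the alignment metrics and machine-learning-derived sequence information described above.

```python
from collections import Counter

# Hypothetical first-pass output: one tuple per spliced alignment segment,
# laid out as (chrom, intron_start, intron_end, strand, motif).
first_pass = [
    ("chr1", 100, 200, "+", "GT-AG"),
    ("chr1", 100, 200, "+", "GT-AG"),
    ("chr1", 100, 203, "+", "AT-GG"),  # likely an error-shifted copy of 100-200
    ("chr1", 500, 900, "+", "GT-AG"),  # canonical motif but seen in one read only
]


def filter_junctions(observed, min_reads=2, canonical=("GT-AG",)):
    """Keep junctions supported by at least min_reads reads and a canonical motif.

    These two rules are illustrative stand-ins, not the published filters.
    """
    counts = Counter(observed)
    return sorted(
        junction
        for junction, n_reads in counts.items()
        if n_reads >= min_reads and junction[4] in canonical
    )


# The surviving junctions would then be supplied to the aligner as guides
# for the second pass.
kept = filter_junctions(first_pass)
```

In a real workflow the retained junctions would be written to a file and passed back to the aligner for the second pass; the exact file format and aligner options are left out here because they depend on the tools used.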

Original language: English
Article number: 72
Number of pages: 24
Journal: Genome Biology
Volume: 22
Issue number: 1
Publication status: Published - 1 Mar 2021

Keywords

  • splicing
  • long read sequencing
  • spliced alignment
  • RNA-seq
  • gene expression
  • transcriptome assembly
  • machine learning
  • nanopore

ASJC Scopus subject areas

  • Genetics
  • Ecology, Evolution, Behavior and Systematics
  • Cell Biology
