riboSeed

leveraging prokaryotic genomic architecture to assemble across ribosomal regions

Nicholas R Waters, Florence Abram, Fiona Brennan, Ashleigh Holmes, Leighton Pritchard

Research output: Contribution to journal › Article

Abstract

The vast majority of bacterial genome sequencing has been performed using Illumina short reads. Because of the inherent difficulty of resolving repeated regions with short reads alone, only ∼10% of sequencing projects have resulted in a closed genome. The most common repeated regions are those coding for ribosomal operons (rDNAs), which occur in a bacterial genome between 1 and 15 times, and are typically used as sequence markers to classify and identify bacteria. Here, we exploit the genomic context in which rDNAs occur across taxa to improve assembly of these regions relative to de novo sequencing by using the conserved nature of rDNAs across taxa and the uniqueness of their flanking regions within a genome. We describe a method to construct targeted pseudocontigs generated by iteratively assembling reads that map to a reference genome's rDNAs. These pseudocontigs are then used to more accurately assemble the newly sequenced chromosome. We show that this method, implemented as riboSeed, correctly bridges across adjacent contigs in bacterial genome assembly and, when used in conjunction with other genome polishing tools, can assist in closure of a genome.
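
The flanking-region idea in the abstract can be illustrated with a toy sketch. This is not the riboSeed implementation: the sequences, the function names (find_all, flanked_regions) and the 8-base flank length are all made up for illustration. It only shows why identical repeat copies become distinguishable once some unique neighbouring sequence is attached to each copy.

```python
def find_all(genome: str, motif: str) -> list[int]:
    """Return the start position of every occurrence of motif (overlaps allowed)."""
    positions = []
    start = genome.find(motif)
    while start != -1:
        positions.append(start)
        start = genome.find(motif, start + 1)
    return positions


def flanked_regions(genome: str, repeat: str, flank: int) -> list[str]:
    """Extract each copy of repeat plus up to `flank` bases on either side."""
    regions = []
    for pos in find_all(genome, repeat):
        left = max(0, pos - flank)
        right = min(len(genome), pos + len(repeat) + flank)
        regions.append(genome[left:right])
    return regions


# Hypothetical toy "genome": three identical copies of a repeat,
# each embedded in a different (unique) sequence context.
REPEAT = "GGTTACCTTGTTACGACTT"
GENOME = (
    "ATGCCGTA" + REPEAT + "TTGACCAA"
    + "CCGGAATT" + REPEAT + "AGCTAGGC"
    + "TTTACGCA" + REPEAT + "GATCCATG"
)

bare = flanked_regions(GENOME, REPEAT, flank=0)     # repeat copies alone
flanked = flanked_regions(GENOME, REPEAT, flank=8)  # copies with their flanks

print(len(set(bare)))     # prints 1: the bare copies are indistinguishable
print(len(set(flanked)))  # prints 3: each flanked copy is unique
```

In this toy case, a short read falling entirely inside the repeat could belong to any of the three copies, whereas a sequence that includes the flanks identifies exactly one locus.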

Original language: English
Pages (from-to): e68
Number of pages: 10
Journal: Nucleic Acids Research
Volume: 46
Issue number: 11
Early online date: 28 Mar 2018
DOI: 10.1093/nar/gky212
Publication status: Published - 20 Jun 2018

Fingerprint

Ribosomal DNA
Bacterial Genomes
Genome
Operon
Chromosomes
Bacteria

Keywords

  • Computational Methods
  • Genomics

Cite this

Waters, Nicholas R; Abram, Florence; Brennan, Fiona; Holmes, Ashleigh; Pritchard, Leighton. / riboSeed: leveraging prokaryotic genomic architecture to assemble across ribosomal regions. In: Nucleic Acids Research. 2018; Vol. 46, No. 11, e68.
@article{4ca10ed61f2a43a3adb45f934d888dc3,
title = "riboSeed: leveraging prokaryotic genomic architecture to assemble across ribosomal regions",
abstract = "The vast majority of bacterial genome sequencing has been performed using Illumina short reads. Because of the inherent difficulty of resolving repeated regions with short reads alone, only ∼10{\%} of sequencing projects have resulted in a closed genome. The most common repeated regions are those coding for ribosomal operons (rDNAs), which occur in a bacterial genome between 1 and 15 times, and are typically used as sequence markers to classify and identify bacteria. Here, we exploit the genomic context in which rDNAs occur across taxa to improve assembly of these regions relative to de novo sequencing by using the conserved nature of rDNAs across taxa and the uniqueness of their flanking regions within a genome. We describe a method to construct targeted pseudocontigs generated by iteratively assembling reads that map to a reference genome's rDNAs. These pseudocontigs are then used to more accurately assemble the newly sequenced chromosome. We show that this method, implemented as riboSeed, correctly bridges across adjacent contigs in bacterial genome assembly and, when used in conjunction with other genome polishing tools, can assist in closure of a genome.",
keywords = "Computational Methods, Genomics",
author = "Waters, Nicholas R and Abram, Florence and Brennan, Fiona and Holmes, Ashleigh and Pritchard, Leighton",
note = "James Hutton Institute, Dundee, Scotland and National University of Ireland, Galway, Ireland Joint Studentship. Funding for open access charge: James Hutton Institute, Dundee, Scotland and National University of Ireland, Galway, Ireland Joint Studentship (to N.W.).",
year = "2018",
month = "6",
day = "20",
doi = "10.1093/nar/gky212",
language = "English",
volume = "46",
pages = "e68",
journal = "Nucleic Acids Research",
issn = "0305-1048",
publisher = "Oxford University Press",
number = "11",
}

TY - JOUR

T1 - riboSeed

T2 - leveraging prokaryotic genomic architecture to assemble across ribosomal regions

AU - Waters, Nicholas R

AU - Abram, Florence

AU - Brennan, Fiona

AU - Holmes, Ashleigh

AU - Pritchard, Leighton

N1 - James Hutton Institute, Dundee, Scotland and National University of Ireland, Galway, Ireland Joint Studentship. Funding for open access charge: James Hutton Institute, Dundee, Scotland and National University of Ireland, Galway, Ireland Joint Studentship (to N.W.).

PY - 2018/6/20

Y1 - 2018/6/20

AB - The vast majority of bacterial genome sequencing has been performed using Illumina short reads. Because of the inherent difficulty of resolving repeated regions with short reads alone, only ∼10% of sequencing projects have resulted in a closed genome. The most common repeated regions are those coding for ribosomal operons (rDNAs), which occur in a bacterial genome between 1 and 15 times, and are typically used as sequence markers to classify and identify bacteria. Here, we exploit the genomic context in which rDNAs occur across taxa to improve assembly of these regions relative to de novo sequencing by using the conserved nature of rDNAs across taxa and the uniqueness of their flanking regions within a genome. We describe a method to construct targeted pseudocontigs generated by iteratively assembling reads that map to a reference genome's rDNAs. These pseudocontigs are then used to more accurately assemble the newly sequenced chromosome. We show that this method, implemented as riboSeed, correctly bridges across adjacent contigs in bacterial genome assembly and, when used in conjunction with other genome polishing tools, can assist in closure of a genome.

KW - Computational Methods

KW - Genomics

DO - 10.1093/nar/gky212

M3 - Article

VL - 46

SP - e68

JO - Nucleic Acids Research

JF - Nucleic Acids Research

SN - 0305-1048

IS - 11

ER -