Abstract
Motivation: RNA-seq experiments are usually carried out in three or fewer replicates. To work well with so few samples, differential gene expression (DGE) tools typically assume the form of the underlying gene expression distribution. In this paper, the statistical properties of gene expression from RNA-seq are investigated in the complex eukaryote Arabidopsis thaliana, extending and generalizing the results of previous work in the simple eukaryote Saccharomyces cerevisiae.
Results: We show that, consistent with the results in S. cerevisiae, more gene expression measurements in A. thaliana are consistent with being drawn from an underlying negative binomial distribution than from either a log-normal distribution or a normal distribution, and that the size and complexity of the A. thaliana transcriptome do not influence the false positive rate performance of the nine widely used DGE tools tested here. We therefore recommend the use of DGE tools that are based on the negative binomial distribution.
Availability and implementation: The raw data for the 17 WT Arabidopsis thaliana datasets are available from the European Nucleotide Archive (E-MTAB-5446). The processed and aligned data can be visualized in context using IGB (Freese et al., 2016), or downloaded directly, using our publicly available IGB quickload server at https://compbio.lifesci.dundee.ac.uk/arabidopsisQuickload/public-quickload/ under 'RNAseq>Froussios2019'. All scripts and commands are available from GitHub at https://github.com/bartongroup/KF-arabidopsis-GRNA.
Supplementary information: Supplementary data are available at Bioinformatics online.
Field | Value |
---|---|
Original language | English |
Pages (from-to) | 3372-3377 |
Number of pages | 6 |
Journal | Bioinformatics |
Volume | 35 |
Issue number | 18 |
Early online date | 6 Feb 2019 |
DOIs | |
Publication status | Published - 15 Sept 2019 |
ASJC Scopus subject areas
- Statistics and Probability
- Biochemistry
- Molecular Biology
- Computer Science Applications
- Computational Theory and Mathematics
- Computational Mathematics
Fingerprint
Dive into the research topics of 'How well do RNA-Seq differential gene expression tools perform in a complex eukaryote? A case study in Arabidopsis thaliana'. Together they form a unique fingerprint.
Projects
- 5 Finished
- Diversifying Transcription Termination Function
  Barton, G. (Investigator) & Simpson, G. (Investigator)
  Biotechnology and Biological Sciences Research Council
  1/06/15 → 31/05/19
  Project: Research
- The Arabidopsis Epitranscriptome (Joint with University of Nottingham)
  Barton, G. (Investigator) & Simpson, G. (Investigator)
  Biotechnology and Biological Sciences Research Council
  1/04/15 → 31/03/19
  Project: Research
- The Non-Coding Arabidopsis Genome
  Barton, G. (Investigator) & Simpson, G. (Investigator)
  1/07/12 → 31/12/15
  Project: Research