Multi-batch TMT reveals false positives, batch effects and missing values

Research output: Contribution to journal › Article

Abstract

Multiplexing strategies for large-scale proteomic analyses have become increasingly prevalent, Tandem Mass Tags (TMT) in particular. Here we used a large iPSC proteomic experiment with twenty-four 10-plex TMT batches to evaluate the effect of integrating multiple TMT batches within a single analysis. We identified a significant inflation rate of protein missing values as multiple batches are integrated and show that this pattern is aggravated at the peptide level. We also show that without normalisation strategies to address the batch effects, the high precision of quantitation within a single multiplexed TMT batch is not reproduced when data from multiple TMT batches are integrated.
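The growth in missing values described above can be illustrated with a minimal sketch. It assumes each batch is reduced to the set of protein IDs it quantified, and that the merged table holds one row per protein seen in any batch. The function name and the toy data are hypothetical and are not taken from the study.

```python
def missing_fraction(batches):
    """Fraction of (protein, batch) cells left empty once batches are merged.

    `batches` is a list of sets; each set holds the protein IDs quantified
    in one TMT batch. The merged table has one row per protein seen in any
    batch and one column per batch.
    """
    all_proteins = set().union(*batches)
    total_cells = len(all_proteins) * len(batches)
    if total_cells == 0:
        return 0.0
    observed_cells = sum(len(batch) for batch in batches)
    return 1 - observed_cells / total_cells


# Hypothetical toy data: three batches that each quantify four proteins,
# with only partial overlap between batches.
toy_batches = [
    {"P1", "P2", "P3", "P4"},
    {"P1", "P2", "P3", "P5"},
    {"P1", "P2", "P4", "P6"},
]

# Missing fraction after integrating the first 1, 2 and 3 batches.
fractions = [
    missing_fraction(toy_batches[:k]) for k in range(1, len(toy_batches) + 1)
]
print(fractions)
```

In this toy case a single batch has no missing cells, while each added batch contributes proteins that the earlier batches never quantified, so the share of empty cells rises as batches are merged.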

Furthermore, the incidence of false positives was studied by using Y chromosome peptides as an internal control. The iPSC lines quantified in this dataset were derived from both male and female donors, hence the peptides mapped to the Y chromosome should be absent from female lines. Nonetheless, these Y chromosome-specific peptides were consistently detected in the female channels of all TMT batches. We then used the same Y chromosome-specific peptides to quantify the level of ion co-isolation as well as the effect of primary and secondary reporter ion interference. These results were used to propose solutions to mitigate the limitations of multi-batch TMT analyses. We confirm that including a common reference line in every batch increases precision by facilitating normalisation across the batches and we propose experimental designs that minimise the effect of cross-population reporter ion interference.
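As a rough illustration of why a common reference line helps, the sketch below expresses every channel as a log2 ratio to the reference channel of its own batch. This is a generic ratio-to-reference approach shown under stated assumptions, not necessarily the normalisation used in the paper; the function name, channel names and intensities are all hypothetical.

```python
import math


def normalise_to_reference(batch, reference_channel):
    """Return log2(intensity / reference intensity) per non-reference channel.

    `batch` maps channel name -> {protein ID -> reporter ion intensity}.
    Proteins that are missing or non-positive in either the channel or the
    reference channel are skipped rather than imputed.
    """
    reference = batch[reference_channel]
    ratios = {}
    for channel, intensities in batch.items():
        if channel == reference_channel:
            continue
        ratios[channel] = {
            protein: math.log2(value / reference[protein])
            for protein, value in intensities.items()
            if value > 0 and reference.get(protein, 0) > 0
        }
    return ratios


# Hypothetical example: the same sample measured in two batches, where
# batch_b has a twofold global intensity shift relative to batch_a.
batch_a = {"ref": {"P1": 100.0, "P2": 50.0}, "s1": {"P1": 200.0, "P2": 50.0}}
batch_b = {"ref": {"P1": 200.0, "P2": 100.0}, "s1": {"P1": 400.0, "P2": 100.0}}

norm_a = normalise_to_reference(batch_a, "ref")
norm_b = normalise_to_reference(batch_b, "ref")
print(norm_a, norm_b)
```

Because the batch-wide shift affects the sample and reference channels alike, it cancels in the ratio, and the two batches yield the same normalised values for the same sample.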
Original language: English
Journal: Molecular and Cellular Proteomics
Early online date: 22 Jul 2019
DOI: 10.1074/mcp.RA119.001472
Publication status: E-pub ahead of print - 22 Jul 2019

Keywords

  • proteomics
  • TMT
  • mass spectrometry
  • isobaric tags
  • bioinformatics
  • data analysis
  • missing values
  • false positives

Cite this

@article{945911ba49884a30bcc2b000f44f5bc4,
title = "Multi-batch TMT reveals false positives, batch effects and missing values",
abstract = "Multiplexing strategies for large-scale proteomic analyses have become increasingly prevalent, Tandem Mass Tags (TMT) in particular. Here we used a large iPSC proteomic experiment with twenty-four 10-plex TMT batches to evaluate the effect of integrating multiple TMT batches within a single analysis. We identified a significant inflation rate of protein missing values as multiple batches are integrated and show that this pattern is aggravated at the peptide level. We also show that without normalisation strategies to address the batch effects, the high precision of quantitation within a single multiplexed TMT batch is not reproduced when data from multiple TMT batches are integrated. Furthermore, the incidence of false positives was studied by using Y chromosome peptides as an internal control. The iPSC lines quantified in this dataset were derived from both male and female donors, hence the peptides mapped to the Y chromosome should be absent from female lines. Nonetheless, these Y chromosome-specific peptides were consistently detected in the female channels of all TMT batches. We then used the same Y chromosome-specific peptides to quantify the level of ion co-isolation as well as the effect of primary and secondary reporter ion interference. These results were used to propose solutions to mitigate the limitations of multi-batch TMT analyses. We confirm that including a common reference line in every batch increases precision by facilitating normalisation across the batches and we propose experimental designs that minimise the effect of cross-population reporter ion interference.",
keywords = "proteomics, TMT, mass spectrometry, isobaric tags, bioinformatics, data analysis, missing values, false positives",
author = "{Brenes Murillo}, Alejandro and Hukelmann, Jens and Bensaddek, Dalila and Lamond, Angus",
note = "This work was funded by the Wellcome Trust / MRC [098503/E/12/Z] and Wellcome Trust grants [073980/Z/03/Z, 105024/Z/14/Z].",
year = "2019",
month = "7",
day = "22",
doi = "10.1074/mcp.RA119.001472",
language = "English",
journal = "Molecular \& Cellular Proteomics",
issn = "1535-9476",
publisher = "American Society for Biochemistry and Molecular Biology",

}

TY - JOUR

T1 - Multi-batch TMT reveals false positives, batch effects and missing values

AU - Brenes Murillo, Alejandro

AU - Hukelmann, Jens

AU - Bensaddek, Dalila

AU - Lamond, Angus

N1 - This work was funded by the Wellcome Trust / MRC [098503/E/12/Z] and Wellcome Trust grants [073980/Z/03/Z, 105024/Z/14/Z].

PY - 2019/7/22

Y1 - 2019/7/22

AB - Multiplexing strategies for large-scale proteomic analyses have become increasingly prevalent, Tandem Mass Tags (TMT) in particular. Here we used a large iPSC proteomic experiment with twenty-four 10-plex TMT batches to evaluate the effect of integrating multiple TMT batches within a single analysis. We identified a significant inflation rate of protein missing values as multiple batches are integrated and show that this pattern is aggravated at the peptide level. We also show that without normalisation strategies to address the batch effects, the high precision of quantitation within a single multiplexed TMT batch is not reproduced when data from multiple TMT batches are integrated. Furthermore, the incidence of false positives was studied by using Y chromosome peptides as an internal control. The iPSC lines quantified in this dataset were derived from both male and female donors, hence the peptides mapped to the Y chromosome should be absent from female lines. Nonetheless, these Y chromosome-specific peptides were consistently detected in the female channels of all TMT batches. We then used the same Y chromosome-specific peptides to quantify the level of ion co-isolation as well as the effect of primary and secondary reporter ion interference. These results were used to propose solutions to mitigate the limitations of multi-batch TMT analyses. We confirm that including a common reference line in every batch increases precision by facilitating normalisation across the batches and we propose experimental designs that minimise the effect of cross-population reporter ion interference.

KW - proteomics

KW - TMT

KW - mass spectrometry

KW - isobaric tags

KW - bioinformatics

KW - data analysis

KW - missing values

KW - false positives

U2 - 10.1074/mcp.RA119.001472

DO - 10.1074/mcp.RA119.001472

M3 - Article

JO - Molecular & Cellular Proteomics

JF - Molecular & Cellular Proteomics

SN - 1535-9476

ER -