Abstract
Current proteomics experiments can generate vast quantities of data very quickly, but data analysis capabilities have not kept pace. Although a number of recent reviews cover various aspects of MS-based peptide and protein identification, comparisons showing which methods are the most appropriate for, or the most effective at, their proposed tasks are not readily available. As the need for high-throughput, automated peptide and protein identification systems increases, the creators of such pipelines need to be able to choose algorithms that will perform well in terms of both accuracy and computational efficiency. This article therefore reviews the currently available core algorithms for peptide mass fingerprinting (PMF), database searching using MS/MS, sequence tag searches and de novo sequencing. We also assess the relative performance of a number of these algorithms. As reporting of such information in the literature is limited, we conclude that there is a need for a system of standardised reporting on the performance of new peptide and protein identification algorithms, based upon freely available datasets. We go on to present our initial suggestions for the format and content of these datasets.
Original language | English |
---|---|
Pages (from-to) | 4082-4095 |
Number of pages | 14 |
Journal | Proteomics |
Volume | 5 |
Issue number | 16 |
Early online date | 27 Oct 2005 |
Publication status | Published - 1 Nov 2005 |
Keywords
- Database searching
- De novo sequencing
- High throughput
- MS/MS
- Peptide identification
ASJC Scopus subject areas
- Biochemistry
- Molecular Biology