Abstract
MOTIVATION: An automatic sequence searching method (ProtEST) is described which constructs multiple protein sequence alignments from protein sequences and translated expressed sequence tags (ESTs). ProtEST is more effective than a simple TBLASTN search of the query against the EST database, as the sequences are automatically clustered, assembled, made non-redundant, checked for sequence errors, translated into protein and then aligned and displayed.
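The "translated into protein" step can be illustrated with a plain six-frame translation. This is a minimal sketch, not the ProtEST implementation: the function names are hypothetical, the standard genetic code is assumed, and the checks for sequence errors (such as frameshifts) mentioned above are not attempted.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: no
# alternative translation tables are needed for the ESTs of interest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons only; unknown codons (e.g. with N) become X."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna: str) -> dict[str, str]:
    """Translate a nucleotide sequence in all six reading frames."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


frames = six_frame_translations("ATGGCCTAA")
print(frames["+1"])  # MA*
```

A naive translation like this shifts frame at every insertion or deletion in the EST, which is why an error-tolerant comparison step matters for real EST data.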
RESULTS: A ProtEST search found a non-redundant, translated, error- and length-corrected EST sequence for > 58% of sequences when single sequences from 1407 Pfam-A seed alignments were used as the probe. The resulting alignments of translated EST sequences contained > 10 sequences on average. In a cross-validated test of protein secondary structure prediction, alignments from the new procedure led to an improvement of 3.4% in average Q3 prediction accuracy over single sequences.
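For readers unfamiliar with the measure, Q3 is the percentage of residues whose predicted secondary structure state matches the observed state, over three states. The sketch below shows the per-sequence calculation only; the one-letter state codes (H for helix, E for strand, C for coil) and the function name are assumptions for illustration, and the cross-validation protocol of the study is not reproduced.

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Percentage of positions where the predicted state equals the observed one.

    Both arguments are strings of per-residue states, assumed here to use
    the letters H (helix), E (strand) and C (coil).
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


observed = "HHHHCCEEEE"
predicted = "HHHCCCEEEC"
print(q3_accuracy(predicted, observed))  # 80.0
```

An average Q3 over a test set is then the mean of these per-sequence values, so a 3.4% improvement corresponds to roughly 3 to 4 more correctly assigned residues per 100.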
AVAILABILITY: The ProtEST method is available as an Internet World Wide Web service (http://barton.ebi.ac.uk/servers/protest.html). The Wise2 package for protein and genomic comparisons and the ProtESTWise script can be found at http://www.sanger.ac.uk/Software/Wise2
CONTACT: [email protected]
Original language | English |
---|---|
Pages (from-to) | 111-116 |
Number of pages | 6 |
Journal | Bioinformatics |
Volume | 16 |
Issue number | 2 |
Publication status | Published - Feb 2000 |
Keywords
- Amino Acid Sequence
- Expressed Sequence Tags
- Molecular Sequence Data
- Protein Biosynthesis
- Proteins
- Sequence Alignment