Protein fold recognition by mapping predicted secondary structures

Robert B. Russell, Richard R. Copley, Geoffrey J. Barton

Research output: Contribution to journal (Article)

114 Citations (Scopus)

Abstract

A strategy is presented for protein fold recognition from secondary structure assignments (alpha-helix and beta-strand). The method can detect similarities between protein folds in the absence of sequence similarity. Secondary structure mapping first identifies all possible matches (maps) between a query string of secondary structures and the secondary structures of protein domains of known three-dimensional structure. The maps are then passed through a series of structural filters to remove those that do not obey simple rules of protein structure. The surviving maps are ranked by scores from the alignment of predicted and experimental accessibilities. Searches made with secondary structure assignments for a test set of 11 fold-families put the correct sequence-dissimilar fold in the first rank 8/11 times. With cross-validated predictions of secondary structure this drops to 4/11, which compares favourably with the widely used THREADER program (1/11). The structural class is correctly predicted 10/11 times by the method, in contrast to 5/11 for THREADER. The new technique obtains comparable accuracy in the alignment of amino acid residues and secondary structure elements. Searches are also performed with published secondary structure predictions for the von Willebrand factor type A domain, the proteasome 20S alpha subunit and the phosphotyrosine interaction domain. These searches demonstrate how the method can find the correct fold for a protein from a carefully constructed secondary structure prediction, multiple sequence alignment and distance restraints. Scans with experimentally determined secondary structures and accessibility recognise the correct fold with high alignment accuracy (86% on secondary structures). This suggests that the accuracy of mapping will improve alongside any improvements in the prediction of secondary structure or accessibility. Application to NMR structure determination is also discussed.
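The mapping and filtering steps described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the one-letter encoding (`H` for helix, `E` for strand), the function names `enumerate_maps` and `filter_maps`, and the gap-based filter are all assumptions made purely for illustration. The paper's actual structural filters and its accessibility-based ranking are not reproduced here.

```python
from itertools import combinations
from typing import Iterator, List, Tuple

Map = Tuple[int, ...]


def enumerate_maps(query: str, template: str) -> Iterator[Map]:
    """Yield every order-preserving map of query elements onto template elements.

    Both arguments are strings of secondary structure elements ('H' or 'E').
    A map is a tuple of strictly increasing template indices, one per query
    element, where each query element is paired with a template element of
    the same type. Unmapped template elements are simply skipped.
    """
    for indices in combinations(range(len(template)), len(query)):
        if all(query[i] == template[j] for i, j in enumerate(indices)):
            yield indices


def max_internal_gap(mapping: Map) -> int:
    """Largest number of template elements skipped between two mapped ones."""
    return max((b - a - 1 for a, b in zip(mapping, mapping[1:])), default=0)


def filter_maps(maps: List[Map], max_gap: int = 1) -> List[Map]:
    """Hypothetical filter: drop maps that skip too many template elements.

    This stands in for the paper's structural filters, which are not
    specified in the abstract.
    """
    return [m for m in maps if max_internal_gap(m) <= max_gap]


if __name__ == "__main__":
    all_maps = list(enumerate_maps("HEE", "HEHEE"))
    print(all_maps)
    print(filter_maps(all_maps, max_gap=1))
```

Exhaustive enumeration with `itertools.combinations` is only practical for short strings. It is used here because it makes the meaning of "all possible matches" explicit, not because it would scale to a database search.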

Original language: English
Pages (from-to): 349-365
Number of pages: 17
Journal: Journal of Molecular Biology
Volume: 259
Issue number: 3
DOI: 10.1006/jmbi.1996.0325
Publication status: Published - 14 Jun 1996

Fingerprint

Proteins
Imino Acids
Phosphotyrosine
Sequence Alignment
von Willebrand Factor
Proteasome Endopeptidase Complex
beta-Strand Protein Conformation
alpha-Helical Protein Conformation
Protein Domains

Keywords

  • Algorithms
  • Amino acid sequence
  • Cysteine endopeptidases
  • Models, Molecular
  • Molecular sequence data
  • Multienzyme complexes
  • Phosphotyrosine
  • Proteasome endopeptidase complex
  • Protein folding
  • Protein structure, Secondary
  • Proteins
  • Sequence alignment
  • Software
  • von Willebrand factor

Cite this

Russell, Robert B. ; Copley, Richard R. ; Barton, Geoffrey J. / Protein fold recognition by mapping predicted secondary structures. In: Journal of Molecular Biology. 1996 ; Vol. 259, No. 3. pp. 349-365.
@article{76e4719ebd9a4660b32e2f82a75897cb,
title = "Protein fold recognition by mapping predicted secondary structures",
abstract = "A strategy is presented for protein fold recognition from secondary structure assignments (alpha-helix and beta-strand). The method can detect similarities between protein folds in the absence of sequence similarity. Secondary structure mapping first identifies all possible matches (maps) between a query string of secondary structures and the secondary structures of protein domains of known three-dimensional structure. The maps are then passed through a series of structural filters to remove those that do not obey simple rules of protein structure. The surviving maps are ranked by scores from the alignment of predicted and experimental accessibilities. Searches made with secondary structure assignments for a test set of 11 fold-families put the correct sequence-dissimilar fold in the first rank 8/11 times. With cross-validated predictions of secondary structure this drops to 4/11, which compares favourably with the widely used THREADER program (1/11). The structural class is correctly predicted 10/11 times by the method, in contrast to 5/11 for THREADER. The new technique obtains comparable accuracy in the alignment of amino acid residues and secondary structure elements. Searches are also performed with published secondary structure predictions for the von Willebrand factor type A domain, the proteasome 20S alpha subunit and the phosphotyrosine interaction domain. These searches demonstrate how the method can find the correct fold for a protein from a carefully constructed secondary structure prediction, multiple sequence alignment and distance restraints. Scans with experimentally determined secondary structures and accessibility recognise the correct fold with high alignment accuracy (86{\%} on secondary structures). This suggests that the accuracy of mapping will improve alongside any improvements in the prediction of secondary structure or accessibility. Application to NMR structure determination is also discussed.",
keywords = "Algorithms, Amino acid sequence, Cysteine endopeptidases, Models, Molecular, Molecular sequence data, Multienzyme complexes, Phosphotyrosine, Proteasome endopeptidase complex, Protein folding, Protein structure, Secondary, Proteins, Sequence alignment, Software, von Willebrand factor",
author = "Russell, {Robert B.} and Copley, {Richard R.} and Barton, {Geoffrey J.}",
year = "1996",
month = "6",
day = "14",
doi = "10.1006/jmbi.1996.0325",
language = "English",
volume = "259",
pages = "349--365",
journal = "Journal of Molecular Biology",
issn = "0022-2836",
publisher = "Elsevier",
number = "3",

}

Protein fold recognition by mapping predicted secondary structures. / Russell, Robert B. ; Copley, Richard R.; Barton, Geoffrey J.

In: Journal of Molecular Biology, Vol. 259, No. 3, 14.06.1996, p. 349-365.

TY - JOUR

T1 - Protein fold recognition by mapping predicted secondary structures

AU - Russell, Robert B.

AU - Copley, Richard R.

AU - Barton, Geoffrey J.

PY - 1996/6/14

Y1 - 1996/6/14

N2 - A strategy is presented for protein fold recognition from secondary structure assignments (alpha-helix and beta-strand). The method can detect similarities between protein folds in the absence of sequence similarity. Secondary structure mapping first identifies all possible matches (maps) between a query string of secondary structures and the secondary structures of protein domains of known three-dimensional structure. The maps are then passed through a series of structural filters to remove those that do not obey simple rules of protein structure. The surviving maps are ranked by scores from the alignment of predicted and experimental accessibilities. Searches made with secondary structure assignments for a test set of 11 fold-families put the correct sequence-dissimilar fold in the first rank 8/11 times. With cross-validated predictions of secondary structure this drops to 4/11, which compares favourably with the widely used THREADER program (1/11). The structural class is correctly predicted 10/11 times by the method, in contrast to 5/11 for THREADER. The new technique obtains comparable accuracy in the alignment of amino acid residues and secondary structure elements. Searches are also performed with published secondary structure predictions for the von Willebrand factor type A domain, the proteasome 20S alpha subunit and the phosphotyrosine interaction domain. These searches demonstrate how the method can find the correct fold for a protein from a carefully constructed secondary structure prediction, multiple sequence alignment and distance restraints. Scans with experimentally determined secondary structures and accessibility recognise the correct fold with high alignment accuracy (86% on secondary structures). This suggests that the accuracy of mapping will improve alongside any improvements in the prediction of secondary structure or accessibility. Application to NMR structure determination is also discussed.

KW - Algorithms

KW - Amino acid sequence

KW - Cysteine endopeptidases

KW - Models, Molecular

KW - Molecular sequence data

KW - Multienzyme complexes

KW - Phosphotyrosine

KW - Proteasome endopeptidase complex

KW - Protein folding

KW - Protein structure, Secondary

KW - Proteins

KW - Sequence alignment

KW - Software

KW - von Willebrand factor

U2 - 10.1006/jmbi.1996.0325

DO - 10.1006/jmbi.1996.0325

M3 - Article

VL - 259

SP - 349

EP - 365

JO - Journal of Molecular Biology

JF - Journal of Molecular Biology

SN - 0022-2836

IS - 3

ER -