A strategy for the rapid multiple alignment of protein sequences: confidence levels from tertiary structure comparisons

Geoffrey J. Barton (Lead / Corresponding author), Michael J. E. Sternberg

Research output: Contribution to journal (Article)

388 Citations (Scopus)

Abstract

An algorithm is presented for the multiple alignment of protein sequences that is both accurate and rapid computationally. The approach is based on the conventional dynamic-programming method of pairwise alignment. Initially, two sequences are aligned, then the third sequence is aligned against the alignment of both sequences one and two. Similarly, the fourth sequence is aligned against one, two and three. This is repeated until all sequences have been aligned. Iteration is then performed to yield a final alignment. The accuracy of sequence alignment is evaluated from alignment of the secondary structures in a family of proteins. For the globins, the multiple alignment was on average 99% accurate compared to 90% for pairwise comparison of sequences. For the alignment of immunoglobulin constant and variable domains, the use of many sequences yielded an alignment of 63% average accuracy compared to 41% average for individual variable/constant alignments. The multiple alignment algorithm yields an assignment of disulphide connectivity in mammalian serotransferrin that is consistent with crystallographic data, whereas pairwise alignments give an alternative assignment.
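The progressive strategy described in the abstract (align two sequences, then align each further sequence against the growing alignment) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The scoring scheme (match +1, mismatch -1, pairs involving a gap 0, linear gap penalty -2), the averaging over column pairs, the input order, and the function names `column_score`, `align_profiles` and `progressive_align` are all assumptions made for illustration, and the final iteration step mentioned in the abstract is omitted.

```python
from typing import List

GAP = "-"

def column_score(col_a: str, col_b: str) -> float:
    """Average pairwise score between two alignment columns.
    Assumed scheme: match +1, mismatch -1, any pair involving a gap 0."""
    total = 0
    for x in col_a:
        for y in col_b:
            if x == GAP or y == GAP:
                continue
            total += 1 if x == y else -1
    return total / (len(col_a) * len(col_b))

def align_profiles(a: List[str], b: List[str], gap: float = -2.0) -> List[str]:
    """Global dynamic-programming alignment of two existing alignments.
    Each argument is a list of equal-length rows; a single sequence is
    passed as a one-row alignment."""
    n, m = len(a[0]), len(b[0])
    cols_a = ["".join(row[i] for row in a) for i in range(n)]
    cols_b = ["".join(row[j] for row in b) for j in range(m)]

    # Fill the score matrix (Needleman-Wunsch style, linear gap penalty).
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = score[i - 1][0] + gap
    for j in range(1, m + 1):
        score[0][j] = score[0][j - 1] + gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + column_score(cols_a[i - 1], cols_b[j - 1]),
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )

    # Trace back from the bottom-right corner, emitting whole columns.
    out_a: List[List[str]] = [[] for _ in a]
    out_b: List[List[str]] = [[] for _ in b]
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and score[i][j]
                == score[i - 1][j - 1] + column_score(cols_a[i - 1], cols_b[j - 1])):
            ca, cb = cols_a[i - 1], cols_b[j - 1]
            i, j = i - 1, j - 1
        elif j == 0 or (i > 0 and score[i][j] == score[i - 1][j] + gap):
            ca, cb = cols_a[i - 1], GAP * len(b)  # gap column in b
            i -= 1
        else:
            ca, cb = GAP * len(a), cols_b[j - 1]  # gap column in a
            j -= 1
        for row, ch in zip(out_a, ca):
            row.append(ch)
        for row, ch in zip(out_b, cb):
            row.append(ch)
    return ["".join(reversed(row)) for row in out_a + out_b]

def progressive_align(seqs: List[str]) -> List[str]:
    """Align sequences one and two, then add each further sequence
    to the growing alignment, in the order given."""
    alignment = [seqs[0]]
    for seq in seqs[1:]:
        alignment = align_profiles(alignment, [seq])
    return alignment

if __name__ == "__main__":
    for row in progressive_align(["HEAGAWGHEE", "PAWHEAE", "HEAGHEE"]):
        print(row)
```

Each call to `align_profiles` costs time proportional to the product of the two alignment lengths, so adding sequences one at a time keeps the total work far below that of a simultaneous multi-dimensional dynamic programme, which is the trade-off the abstract's "rapid" refers to. A realistic version would use an amino acid substitution matrix rather than the toy match/mismatch scores assumed here.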

Original language: English
Pages (from-to): 327-337
Number of pages: 11
Journal: Journal of Molecular Biology
Volume: 198
Issue number: 2
DOI: 10.1016/0022-2836(87)90316-0
Publication status: Published - 20 Nov 1987

Fingerprint

Sequence Alignment
Proteins
Globins
Transferrin
Disulfides
Immunoglobulins

Keywords

  • Algorithms
  • Amino acid sequence
  • Globins
  • Immunoglobulins
  • Molecular sequence data
  • Protein conformation
  • Transferrin

Cite this

@article{2838dd9371eb45bdb75b2feda24f8b82,
title = "A strategy for the rapid multiple alignment of protein sequences: confidence levels from tertiary structure comparisons",
abstract = "An algorithm is presented for the multiple alignment of protein sequences that is both accurate and rapid computationally. The approach is based on the conventional dynamic-programming method of pairwise alignment. Initially, two sequences are aligned, then the third sequence is aligned against the alignment of both sequences one and two. Similarly, the fourth sequence is aligned against one, two and three. This is repeated until all sequences have been aligned. Iteration is then performed to yield a final alignment. The accuracy of sequence alignment is evaluated from alignment of the secondary structures in a family of proteins. For the globins, the multiple alignment was on average 99{\%} accurate compared to 90{\%} for pairwise comparison of sequences. For the alignment of immunoglobulin constant and variable domains, the use of many sequences yielded an alignment of 63{\%} average accuracy compared to 41{\%} average for individual variable/constant alignments. The multiple alignment algorithm yields an assignment of disulphide connectivity in mammalian serotransferrin that is consistent with crystallographic data, whereas pairwise alignments give an alternative assignment.",
keywords = "Algorithms, Amino acid sequence, Globins, Immunoglobulins, Molecular sequence data, Protein conformation, Transferrin",
author = "Barton, {Geoffrey J.} and Sternberg, {Michael J. E.}",
year = "1987",
month = "11",
day = "20",
doi = "10.1016/0022-2836(87)90316-0",
language = "English",
volume = "198",
pages = "327--337",
journal = "Journal of Molecular Biology",
issn = "0022-2836",
publisher = "Elsevier",
number = "2",
}

TY - JOUR

T1 - A strategy for the rapid multiple alignment of protein sequences

T2 - confidence levels from tertiary structure comparisons

AU - Barton, Geoffrey J.

AU - Sternberg, Michael J. E.

PY - 1987/11/20

Y1 - 1987/11/20

AB - An algorithm is presented for the multiple alignment of protein sequences that is both accurate and rapid computationally. The approach is based on the conventional dynamic-programming method of pairwise alignment. Initially, two sequences are aligned, then the third sequence is aligned against the alignment of both sequences one and two. Similarly, the fourth sequence is aligned against one, two and three. This is repeated until all sequences have been aligned. Iteration is then performed to yield a final alignment. The accuracy of sequence alignment is evaluated from alignment of the secondary structures in a family of proteins. For the globins, the multiple alignment was on average 99% accurate compared to 90% for pairwise comparison of sequences. For the alignment of immunoglobulin constant and variable domains, the use of many sequences yielded an alignment of 63% average accuracy compared to 41% average for individual variable/constant alignments. The multiple alignment algorithm yields an assignment of disulphide connectivity in mammalian serotransferrin that is consistent with crystallographic data, whereas pairwise alignments give an alternative assignment.

KW - Algorithms

KW - Amino acid sequence

KW - Globins

KW - Immunoglobulins

KW - Molecular sequence data

KW - Protein conformation

KW - Transferrin

U2 - 10.1016/0022-2836(87)90316-0

DO - 10.1016/0022-2836(87)90316-0

M3 - Article

C2 - 3430611

VL - 198

SP - 327

EP - 337

JO - Journal of Molecular Biology

JF - Journal of Molecular Biology

SN - 0022-2836

IS - 2

ER -