Residue preferences in beta sheets

Introduction

Over recent years, genome projects have led to the rapid accumulation of protein sequences in databanks. Structural information, which is more complex to determine, has accumulated more slowly, resulting in a vast deficit between known protein sequences and known structures.

All information required for folding of a thermodynamically stable 3D structure is contained within the amino acid sequence (Anfinsen, 1973). However, it is not yet known how the sequence determines the fold, and this question has been a major focus of research. Knowledge of the relationship between protein sequence and structure will aid in the prediction of new protein structures and in the rational design of novel proteins and peptides.

Methods are available for the prediction of protein secondary structures (e.g. Rost and Sander, 1993). Predicting the higher order structure of beta sheets requires an understanding of how the component strands interact. This interaction depends on the constituent amino acids, because cross-strand pairing is influenced by the nature of the residue side chains.

Several investigations of the pairing of amino acids in beta sheets have been carried out. For anti-parallel beta strands, Wouters and Curmi (1995) investigated cross-strand pair correlations and established that the hydrogen bonded (HB) and non-hydrogen bonded (nHB) sites (Figure 1) favour particular residue pairs. Hutchinson et al. (1998) determined that the nHB site was favoured by Cys-Cys, aromatic-Pro, Thr-Thr and Val-Val pairs, whilst pairs comprising glycine and/or aromatic residues were preferred at the HB site.

Figure 1: The hydrogen bonding patterns in parallel and anti-parallel data. Residues may be at the hydrogen bonded or non-hydrogen bonded position of the main chain. In parallel data, one residue is hydrogen bonded whilst the pairing residue is non-hydrogen bonded; in anti-parallel data, either both residues of a pair are hydrogen bonded or both are non-hydrogen bonded.

The side chains of residues may be orientated into one of three possible chi rotamer positions: gauche+, gauche- and trans. Chi 1 and chi 2 side-chain torsion angles were analysed for each residue within a pair to establish whether any conformations were preferred.

Figure 2: An example of the possible chi conformations: gauche+, gauche- and trans. The three residues have been superimposed by least squares fitting of the N, Ca, C and O atoms. Each residue has its side chain orientated in a different direction.


Research objectives/Procedure

The objective of this research is to examine and establish trends of residue pair preferences in parallel beta-sheets. The rules to be derived will relate protein sequence and tertiary structure, and can subsequently be applied to the prediction of beta-strand alignment in simple-beta structured motifs.


A set of programs was devised to extract and manipulate the residue pair information from SST files.

Programs were used to generate a list of parallel bridged amino acid pairs. This list was then used to generate 20 x 20 matrices of the observed and expected frequencies of each residue pair in the data set, and the observed frequencies were subsequently normalised against the expected frequencies to produce ratio values (O/E). Expected pair frequencies were calculated using the following equation:

$$E_{ij} = \frac{N_i \times N_j}{N_{TOT}}$$

where $N_i$ is the total frequency of residue i, $N_j$ is the total frequency of residue j, and $N_{TOT}$ is the total number of observations.
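The steps described above can be sketched in Python as follows. This is a minimal illustration, not the original programs: the function names and the toy pair list are hypothetical, and the use of -1.0 for undefined ratios is assumed from the placeholder values that appear in the ratio tables.

```python
from collections import Counter


def expected_frequency(n_i, n_j, n_tot):
    """Expected pair frequency: (N_i x N_j) / N_TOT."""
    return n_i * n_j / n_tot


def pair_matrices(pairs):
    """Build observed, expected and O/E tables from (HB residue, nHB residue) pairs.

    Each table is a dict keyed by (hb_residue, nhb_residue).
    """
    observed = Counter(pairs)
    hb_totals = Counter(hb for hb, _ in pairs)     # row totals (Total1)
    nhb_totals = Counter(nhb for _, nhb in pairs)  # column totals (Total2)
    n_tot = len(pairs)
    residues = sorted(set(hb_totals) | set(nhb_totals))

    expected, ratio = {}, {}
    for i in residues:
        for j in residues:
            e = expected_frequency(hb_totals[i], nhb_totals[j], n_tot)
            expected[i, j] = e
            # Assumption: -1.0 marks cells whose ratio is undefined (E = 0).
            ratio[i, j] = observed[i, j] / e if e > 0 else -1.0
    return observed, expected, ratio


# Hypothetical toy input, for illustration only.
toy_pairs = [("VAL", "ILE"), ("VAL", "VAL"), ("ILE", "VAL"), ("ASP", "LYS")]
toy_observed, toy_expected, toy_ratio = pair_matrices(toy_pairs)

# Worked cell from the parallel tables: ASP (HB, 491) with ASP (nHB, 508),
# N_TOT = 20661, observed count 18.
asp_asp_expected = expected_frequency(491, 508, 20661)
asp_asp_ratio = 18 / asp_asp_expected
```

With the totals from the parallel tables, `asp_asp_expected` is about 12.07 and `asp_asp_ratio` is about 1.49, which agrees with the tabulated values for that cell.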

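Chi-squared test values were subsequently calculated using the following equation:

$$\chi^2_{ij} = \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$

where $O_{ij}$ and $E_{ij}$ are the observed and expected frequencies of the residue pair (i, j).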

Corresponding data were generated for anti-parallel bridged residue pairs.
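As a hedged illustration of the chi-squared step, the sketch below takes each cell's value to be (O - E)^2 / E. That form is an assumption, chosen because it reproduces the tabulated values for the cells checked here. The function name is hypothetical, and the -1.0 return value mirrors the placeholder seen in the tables for cells with no expected observations.

```python
def chi_squared_cell(observed, expected):
    """Per-cell chi-squared value (O - E)^2 / E.

    Returns -1.0 when the expected frequency is zero (assumed placeholder
    for undefined cells).
    """
    if expected == 0:
        return -1.0
    return (observed - expected) ** 2 / expected


N_TOT = 20661  # total number of parallel HB-nHB observations

# ASP at the HB site (total 491) paired with LYS at the nHB site (total 447);
# the observed count for this pair is 42.
asp_lys_expected = 491 * 447 / N_TOT
asp_lys_chi2 = chi_squared_cell(42, asp_lys_expected)

# ASP (HB, 491) paired with ASP (nHB, 508); the observed count is 18.
asp_asp_chi2 = chi_squared_cell(18, 491 * 508 / N_TOT)
```

These evaluate to about 92.68 and 2.91, matching the corresponding entries in the parallel chi-squared table.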


Results

Observed parallel HB-nHB frequencies
Expected parallel HB-nHB frequencies
Parallel HB-nHB ratio values (O/E)
Parallel HB-nHB chi-squared test values
Observed antiparallel HB-HB frequencies
Expected antiparallel HB-HB frequencies
Antiparallel HB-HB ratio values (O/E)
Antiparallel HB-HB chi-squared test values
Observed antiparallel nHB-nHB frequencies
Expected antiparallel nHB-nHB frequencies
Antiparallel nHB-nHB ratio values (O/E)
Antiparallel nHB-nHB chi-squared test values

Specific residue-residue interactions
(To view the PDB images of particular amino acid pairs, you will need to install a viewer such as RasMol)

Observed parallel HB-nHB frequencies

ASP GLU LYS ARG GLY ALA ILE LEU VAL MET PRO PHE TYR TRP HIS GLN ASN THR SER CYS Total1 Total2
ASP 18 15 42 36 30 27 31 40 55 9 12 10 13 4 20 15 22 41 42 9 491 508
GLU 10 13 73 55 18 28 62 67 71 9 8 29 33 5 16 23 15 50 39 5 629 548
LYS 22 54 20 19 29 35 64 52 109 5 8 25 54 5 22 19 22 64 35 12 675 447
ARG 52 32 11 19 22 36 74 63 92 11 5 25 27 10 21 10 26 70 34 9 649 529
GLY 22 33 18 20 45 83 132 148 138 24 24 67 61 23 26 18 19 53 49 18 1021 850
ALA 29 36 20 34 70 109 265 222 335 26 26 89 55 22 14 28 26 77 52 27 1562 1441
ILE 43 51 40 30 99 180 548 449 630 80 40 191 140 39 36 28 46 108 70 47 2895 2958
LEU 46 45 22 51 89 209 470 402 538 66 35 145 90 23 27 32 37 114 77 54 2572 2619
VAL 59 59 32 58 117 257 488 456 726 89 60 251 161 52 50 37 49 136 99 70 3306 3723
MET 14 9 6 16 35 33 83 70 89 15 10 37 23 7 11 6 18 21 20 9 532 498
PRO 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 350
PHE 20 25 19 17 68 89 202 160 227 45 14 113 59 21 13 18 26 63 48 34 1281 1277
TYR 24 25 21 23 53 84 137 120 155 25 18 71 49 15 20 22 21 49 36 17 985 938
TRP 2 9 8 11 15 22 48 30 51 6 5 16 10 4 5 10 8 11 15 8 294 285
HIS 19 12 11 18 24 32 57 47 75 7 9 31 25 6 17 12 8 32 20 5 467 428
GLN 13 15 21 23 22 27 35 47 46 11 11 27 20 8 18 11 20 32 25 4 436 366
ASN 30 16 14 12 17 20 23 34 33 7 9 25 28 10 16 17 45 79 26 3 464 499
THR 40 48 39 53 47 81 123 99 191 27 35 51 45 12 48 29 45 105 69 20 1207 1187
SER 35 40 24 27 34 63 68 75 91 25 17 46 30 16 44 24 38 60 49 18 824 819
CYS 10 11 6 7 16 26 48 38 71 11 4 28 15 3 4 7 8 22 14 22 371 391

Total number of HB-nHB observations = 20661

Observed pairing frequencies for parallel strands from the Sreps CATHlist. Residues on the x-axis indicate those positioned at the non-hydrogen bonded (nHB) site of the main chain, whilst residues on the y-axis indicate those at the hydrogen bonded (HB) site. Total1 refers to the total number of each type of hydrogen bonded residue observed and Total2 refers to the total number of each type of non-hydrogen bonded residue observed.

Expected parallel HB-nHB frequencies

ASP GLU LYS ARG GLY ALA ILE LEU VAL MET PRO PHE TYR TRP HIS GLN ASN THR SER CYS Total1 Total2
ASP 12 13 10 12 20 34 70 62 88 11 8 30 22 6 10 8 11 28 19 9 491 508
GLU 15 16 13 16 25 43 90 79 113 15 10 38 28 8 13 11 15 36 24 11 629 548
LYS 16 17 14 17 27 47 96 85 121 16 11 41 30 9 13 11 16 38 26 12 675 447
ARG 15 17 14 16 26 45 92 82 116 15 10 40 29 8 13 11 15 37 25 12 649 529
GLY 25 27 22 26 42 71 146 129 183 24 17 63 46 14 21 18 24 58 40 19 1021 850
ALA 38 41 33 39 64 108 223 198 281 37 26 96 70 21 32 27 37 89 61 29 1562 1441
ILE 71 76 62 74 119 201 414 366 521 69 49 178 131 39 59 51 69 166 114 54 2895 2958
LEU 63 68 55 65 105 179 368 326 463 61 43 158 116 35 53 45 62 147 101 48 2572 2619
VAL 81 87 71 84 136 230 473 419 595 79 56 204 150 45 68 58 79 189 131 62 3306 3723
MET 13 14 11 13 21 37 76 67 95 12 9 32 24 7 11 9 12 30 21 10 532 498
PRO 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 350
PHE 31 33 27 32 52 89 183 162 230 30 21 79 58 17 26 22 30 73 50 24 1281 1277
TYR 24 26 21 25 40 68 141 124 177 23 16 60 44 13 20 17 23 56 39 18 985 938
TRP 7 7 6 7 12 20 42 37 52 7 4 18 13 4 6 5 7 16 11 5 294 285
HIS 11 12 10 11 19 32 66 59 84 11 7 28 21 6 9 8 11 26 18 8 467 428
GLN 10 11 9 11 17 30 62 55 78 10 7 26 19 6 9 7 10 25 17 8 436 366
ASN 11 12 10 11 19 32 66 58 83 11 7 28 21 6 9 8 11 26 18 8 464 499
THR 29 32 26 30 49 84 172 153 217 29 20 74 54 16 25 21 29 69 47 22 1207 1187
SER 20 21 17 21 33 57 117 104 148 19 13 50 37 11 17 14 19 47 32 15 824 819
CYS 9 9 8 9 15 25 53 47 66 8 6 22 16 5 7 6 8 21 14 7 371 391

Expected pairing frequencies for parallel strands from the Sreps CATHlist, calculated using (Total number of residue A x Total number of residue B)/Total number of all residues. Residues on the x-axis indicate those positioned at the non-hydrogen bonded (nHB) site of the main chain, whilst residues on the y-axis indicate those at the hydrogen bonded (HB) site. Total 1 refers to the total number of each type of hydrogen bonded residue observed and Total 2 refers to total number of each type of non-hydrogen bonded residue observed.

Parallel HB-nHB ratio values (O/E)

ASP GLU LYS ARG GLY ALA ILE LEU VAL MET PRO PHE TYR TRP HIS GLN ASN THR SER CYS
ASP 1.49 1.15 3.95 2.86 1.49 0.79 0.44 0.64 0.62 0.76 1.44 0.33 0.58 0.59 1.97 1.72 1.86 1.45 2.16 0.97
GLU 0.65 0.78 5.36 3.42 0.70 0.64 0.69 0.84 0.63 0.59 0.75 0.75 1.16 0.58 1.23 2.06 0.99 1.38 1.56 0.42
LYS 1.33 3.02 1.37 1.10 1.04 0.74 0.66 0.61 0.90 0.31 0.70 0.60 1.76 0.54 1.57 1.59 1.35 1.65 1.31 0.94
ARG 3.26 1.86 0.78 1.14 0.82 0.80 0.80 0.77 0.79 0.70 0.45 0.62 0.92 1.12 1.56 0.87 1.66 1.88 1.32 0.73
GLY 0.88 1.22 0.81 0.77 1.07 1.17 0.90 1.14 0.75 0.98 1.39 1.06 1.32 1.63 1.23 1.00 0.77 0.90 1.21 0.93
ALA 0.76 0.87 0.59 0.85 1.09 1.00 1.18 1.12 1.19 0.69 0.98 0.92 0.78 1.02 0.43 1.01 0.69 0.86 0.84 0.91
ILE 0.60 0.66 0.64 0.40 0.83 0.89 1.32 1.22 1.21 1.15 0.82 1.07 1.07 0.98 0.60 0.55 0.66 0.65 0.61 0.86
LEU 0.73 0.66 0.40 0.77 0.84 1.17 1.28 1.23 1.16 1.06 0.80 0.91 0.77 0.65 0.51 0.70 0.60 0.77 0.76 1.11
VAL 0.73 0.67 0.45 0.69 0.86 1.11 1.03 1.09 1.22 1.12 1.07 1.23 1.07 1.14 0.73 0.63 0.61 0.72 0.76 1.12
MET 1.07 0.64 0.52 1.17 1.60 0.89 1.09 1.04 0.93 1.17 1.11 1.13 0.95 0.95 1.00 0.64 1.40 0.69 0.95 0.89
PRO -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00
PHE 0.63 0.74 0.69 0.52 1.29 1.00 1.10 0.99 0.98 1.46 0.65 1.43 1.01 1.19 0.49 0.79 0.84 0.86 0.95 1.40
TYR 0.99 0.96 0.99 0.91 1.31 1.22 0.97 0.96 0.87 1.05 1.08 1.17 1.10 1.10 0.98 1.26 0.88 0.87 0.92 0.91
TRP 0.28 1.15 1.26 1.46 1.24 1.07 1.14 0.80 0.96 0.85 1.00 0.88 0.75 0.99 0.82 1.92 1.13 0.65 1.29 1.44
HIS 1.65 0.97 1.09 1.51 1.25 0.98 0.85 0.79 0.89 0.62 1.14 1.07 1.18 0.93 1.76 1.45 0.71 1.19 1.08 0.57
GLN 1.21 1.30 2.23 2.06 1.23 0.89 0.56 0.85 0.59 1.05 1.49 1.00 1.01 1.33 1.99 1.42 1.90 1.28 1.45 0.48
ASN 2.63 1.30 1.39 1.01 0.89 0.62 0.35 0.58 0.39 0.63 1.15 0.87 1.33 1.56 1.66 2.07 4.02 2.96 1.41 0.34
THR 1.35 1.50 1.49 1.72 0.95 0.96 0.71 0.65 0.88 0.93 1.71 0.68 0.82 0.72 1.92 1.36 1.54 1.51 1.44 0.88
SER 1.73 1.83 1.35 1.28 1.00 1.10 0.58 0.72 0.61 1.26 1.22 0.90 0.80 1.41 2.58 1.64 1.91 1.27 1.50 1.15
CYS 1.10 1.12 0.75 0.74 1.05 1.00 0.90 0.81 1.06 1.23 0.64 1.22 0.89 0.59 0.52 1.07 0.89 1.03 0.95 3.13

Normalised pairing frequencies for parallel strands from the Sreps CATHlist. Observed frequencies were normalised against expected frequencies. Residues on the x-axis indicate those positioned at the non-hydrogen bonded (nHB) site of the main chain, whilst residues on the y-axis indicate those at the hydrogen bonded (HB) site. Values greater than 1.0 indicate pairings that are more prevalent than expected (blue); values less than 1.0 indicate pairings that are less prevalent than expected (green). Entries of -1.00 (highlighted red) mark pairings that are not observed, as they would require proline to be positioned at the hydrogen bonded site of the main chain.

Parallel HB-nHB chi-squared test values

ASP GLU LYS ARG GLY ALA ILE LEU VAL MET PRO PHE TYR TRP HIS GLN ASN THR SER CYS
ASP 2.91 0.30 92.68 43.66 4.75 1.53 21.97 7.95 12.67 0.68 1.63 13.64 3.87 1.14 9.50 4.57 8.67 5.80 26.10 0.01
GLU 1.93 0.81 259.20 93.94 2.40 5.74 8.74 2.03 15.82 2.50 0.66 2.51 0.69 1.56 0.68 12.62 0.00 5.32 7.94 4.00
LYS 1.76 72.78 1.99 0.17 0.05 3.10 11.02 13.17 1.31 7.81 1.03 6.70 17.80 2.00 4.60 4.15 1.99 16.40 2.54 0.05
ARG 81.41 12.70 0.66 0.34 0.83 1.90 3.85 4.51 5.32 1.38 3.27 5.69 0.21 0.12 4.25 0.19 6.80 28.70 2.66 0.88
GLY 0.38 1.29 0.76 1.44 0.21 1.95 1.37 2.67 11.49 0.02 2.60 0.24 4.63 5.64 1.11 0.00 1.30 0.55 1.80 0.09
ALA 2.30 0.71 5.63 0.90 0.51 0.00 7.65 2.91 10.18 3.60 0.01 0.59 3.57 0.01 10.41 0.00 3.64 1.81 1.59 0.22
ILE 11.16 8.66 8.18 26.26 3.39 2.38 43.02 18.34 22.50 1.50 1.67 0.81 0.56 0.02 9.58 10.57 8.18 20.45 17.46 1.11
LEU 4.70 7.90 20.34 3.35 2.67 4.89 28.13 17.70 11.99 0.26 1.69 1.23 6.14 4.39 12.96 4.04 10.16 7.72 6.11 0.58
VAL 6.11 9.38 21.84 8.39 2.66 3.03 0.46 3.25 28.49 1.09 0.29 10.66 0.79 0.90 4.99 7.94 11.92 15.32 7.84 0.88
MET 0.06 1.85 2.64 0.42 7.86 0.45 0.61 0.10 0.49 0.37 0.11 0.52 0.05 0.02 0.00 1.24 2.07 2.99 0.06 0.11
PRO -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00
PHE 4.20 2.37 2.74 7.61 4.44 0.00 1.89 0.03 0.06 6.46 2.73 14.45 0.01 0.63 6.90 0.97 0.79 1.53 0.15 3.93
TYR 0.00 0.05 0.00 0.20 3.84 3.41 0.11 0.19 2.85 0.07 0.10 1.68 0.41 0.15 0.01 1.19 0.33 1.02 0.24 0.14
TRP 3.78 0.19 0.42 1.60 0.70 0.11 0.83 1.42 0.07 0.17 0.00 0.26 0.84 0.00 0.20 4.41 0.11 2.05 0.96 1.07
HIS 4.92 0.01 0.08 3.05 1.19 0.01 1.45 2.51 1.00 1.61 0.15 0.16 0.68 0.03 5.55 1.68 0.95 1.00 0.12 1.67
GLN 0.48 1.02 14.18 12.55 0.92 0.38 12.05 1.24 13.50 0.02 1.77 0.00 0.00 0.66 8.90 1.39 8.52 1.93 3.45 2.19
ASN 30.30 1.11 1.56 0.00 0.23 4.72 28.39 10.47 30.63 1.57 0.17 0.47 2.28 2.02 4.25 9.38 101.91 102.78 3.15 3.81
THR 3.59 7.98 6.36 15.80 0.14 0.12 14.35 19.06 3.23 0.15 10.36 7.47 1.75 1.30 21.15 2.71 8.62 18.33 9.35 0.35
SER 10.72 15.06 2.14 1.65 0.00 0.53 21.17 8.30 22.25 1.33 0.66 0.48 1.47 1.89 42.49 6.06 16.46 3.39 8.17 0.37
CYS 0.08 0.14 0.51 0.66 0.04 0.00 0.49 1.73 0.26 0.47 0.83 1.12 0.20 0.88 1.77 0.03 0.10 0.02 0.03 31.96

Chi-squared test values of pairing frequencies for parallel strands from the Sreps CATHlist. Residues on the x-axis indicate those positioned at the non-hydrogen bonded (nHB) site of the main chain, whilst residues on the y-axis indicate those at the hydrogen bonded (HB) site. p ≤ 0.0001 was chosen as the threshold for evaluating the significance of these values, and pairings that were statistically significant at this level are highlighted in red.
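The significance threshold can be related to the tabulated chi-squared values with a short sketch. The source does not state the degrees of freedom used, so this example assumes one degree of freedom per cell. Under that assumption the p-value has a closed form in terms of the complementary error function. The function names are hypothetical.

```python
import math


def p_value_1df(chi2):
    """Upper-tail p-value of a chi-squared statistic with 1 degree of freedom.

    For 1 df, P(X > x) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(chi2 / 2.0))


def is_significant(chi2, alpha=1e-4):
    """True if the cell passes the p <= alpha threshold.

    Negative values (the -1.00 placeholders in the tables) are never
    treated as significant.
    """
    return chi2 >= 0 and p_value_1df(chi2) <= alpha


# Two values taken from the parallel chi-squared table.
asp_lys_significant = is_significant(92.68)  # ASP (HB) with LYS (nHB)
asp_asp_significant = is_significant(2.91)   # ASP (HB) with ASP (nHB)
```

Under the one-degree-of-freedom assumption, p ≤ 0.0001 corresponds to a chi-squared value of roughly 15.14 or more, so the ASP-LYS cell passes the threshold and the ASP-ASP cell does not.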

Observed antiparallel HB-HB frequencies

ASP GLU LYS ARG GLY ALA ILE LEU VAL MET PRO PHE TYR TRP HIS GLN ASN THR SER CYS Total
ASP 72 97 147 133 106 100 95 94 157 29 0 53 57 21 44 60 75 160 106 23 1629
GLU 97 104 340 318 84 135 184 160 291 37 0 77 117 32 75 90 111 205 161 34 2652
LYS 147 340 216 101 73 128 212 184 308 65 0 112 150 43 72 145 64 218 197 47 2822
ARG 133 318 101 124 98 170 191 178 301 60 0 122 153 51 77 147 76 209 166 51 2726
GLY 106 84 73 98 198 192 264 244 402 64 0 263 189 86 71 69 86 200 135 77 2901
ALA 100 135 128 170 192 318 354 382 574 90 0 207 213 64 70 86 63 196 149 67 3558
ILE 95 184 212 191 264 354 628 624 635 127 0 360 343 88 91 116 53 254 186 95 4900
LEU 94 160 184 178 244 382 624 664 736 155 0 349 285 112 104 164 64 247 184 97 5027
VAL 157 291 308 301 402 574 635 736 876 207 0 532 434 130 154 220 121 354 320 125 6877
MET 29 37 65 60 64 90 127 155 207 42 0 83 64 26 29 37 18 60 60 29 1282
PRO 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
PHE 53 77 112 122 263 207 360 349 532 83 0 330 251 82 80 86 46 131 144 77 3385
TYR 57 117 150 153 189 213 343 285 434 64 0 251 248 71 76 99 77 168 128 66 3189
TRP 21 32 43 51 86 64 88 112 130 26 0 82 71 32 37 29 35 37 58 22 1056
HIS 44 75 72 77 71 70 91 104 154 29 0 80 76 37 64 44 41 82 88 28 1327
GLN 60 90 145 147 69 86 116 164 220 37 0 86 99 29 44 72 71 145 118 28 1826
ASN 75 111 64 76 86 63 53 64 121 18 0 46 77 35 41 71 54 147 112 19 1333
THR 160 205 218 209 200 196 254 247 354 60 0 131 168 37 82 145 147 356 262 53 3484
SER 106 161 197 166 135 149 186 184 320 60 0 144 128 58 88 118 112 262 246 50 2870
CYS 23 34 47 51 77 67 95 97 125 29 0 77 66 22 28 28 19 53 50 38 1026

Total number of HB-HB observations = 53870

Observed HB-HB pairing frequencies for anti-parallel strands from the Sreps CATHlist. Total refers to the total number of each type of hydrogen bonded residue observed. Pairings of identical residues lie on the diagonal and are counted twice; this is accounted for in calculating totals.
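One plausible reading of the diagonal duplication is that every anti-parallel pair is counted once from each residue's point of view, which makes the matrix symmetric and puts each same-residue pair on the diagonal twice. The sketch below illustrates that reading only. It is an assumption rather than the documented procedure, and the function names and toy pairs are hypothetical.

```python
from collections import Counter


def symmetric_counts(pairs):
    """Count each unordered pair in both orientations.

    A same-residue pair (a == a) therefore adds 2 to its diagonal cell.
    """
    counts = Counter()
    for a, b in pairs:
        counts[a, b] += 1
        counts[b, a] += 1
    return counts


def row_totals(counts):
    """Sum each row of the symmetric table to get per-residue totals."""
    totals = Counter()
    for (a, _b), n in counts.items():
        totals[a] += n
    return totals


# Hypothetical toy input, for illustration only.
toy_pairs = [("VAL", "ILE"), ("VAL", "VAL"), ("GLU", "LYS")]
toy_counts = symmetric_counts(toy_pairs)
toy_totals = row_totals(toy_counts)
```

Under this reading the row totals sum to twice the number of pairs, and every diagonal entry is even, as is the case in the observed anti-parallel tables.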

Expected antiparallel HB-HB frequencies

ASP GLU LYS ARG GLY ALA ILE LEU VAL MET PRO PHE TYR TRP HIS GLN ASN THR SER CYS Total
ASP 49 80 85 82 87 107 148 152 207 38 0 102 96 31 40 55 40 105 86 31 1629
GLU 80 130 138 134 142 175 241 247 338 63 0 166 156 51 65 89 65 171 141 50 2652
LYS 85 138 147 142 151 186 256 263 360 67 0 177 167 55 69 95 69 182 150 53 2822
ARG 82 134 142 137 146 180 247 254 347 64 0 171 161 53 67 92 67 176 145 51 2726
GLY 87 142 151 146 156 191 263 270 370 69 0 182 171 56 71 98 71 187 154 55 2901
ALA 107 175 186 180 191 234 323 332 454 84 0 223 210 69 87 120 88 230 189 67 3558
ILE 148 241 256 247 263 323 445 457 625 116 0 307 290 96 120 166 121 316 261 93 4900
LEU 152 247 263 254 270 332 457 469 641 119 0 315 297 98 123 170 124 325 267 95 5027
VAL 207 338 360 347 370 454 625 641 877 163 0 432 407 134 169 233 170 444 366 130 6877
MET 38 63 67 64 69 84 116 119 163 30 0 80 75 25 31 43 31 82 68 24 1282
PRO 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
PHE 102 166 177 171 182 223 307 315 432 80 0 212 200 66 83 114 83 218 180 64 3385
TYR 96 156 167 161 171 210 290 297 407 75 0 200 188 62 78 108 78 206 169 60 3189
TRP 31 51 55 53 56 69 96 98 134 25 0 66 62 20 26 35 26 68 56 20 1056
HIS 40 65 69 67 71 87 120 123 169 31 0 83 78 26 32 44 32 85 70 25 1327
GLN 55 89 95 92 98 120 166 170 233 43 0 114 108 35 44 61 45 118 97 34 1826
ASN 40 65 69 67 71 88 121 124 170 31 0 83 78 26 32 45 32 86 71 25 1333
THR 105 171 182 176 187 230 316 325 444 82 0 218 206 68 85 118 86 225 185 66 3484
SER 86 141 150 145 154 189 261 267 366 68 0 180 169 56 70 97 71 185 152 54 2870
CYS 31 50 53 51 55 67 93 95 130 24 0 64 60 20 25 34 25 66 54 19 1026

Expected HB-HB pairing frequencies for anti-parallel strands from the Sreps CATHlist, calculated using (Total number of residue A x Total number of residue B)/Total number of all residues. Total refers to the total number of each type of hydrogen bonded residue observed. Pairings of identical residues lie on the diagonal and are counted twice.

Antiparallel HB-HB ratio values (O/E)

ASP GLU LYS ARG GLY ALA ILE LEU VAL MET PRO PHE TYR TRP HIS GLN ASN THR SER CYS
ASP 1.46 1.21 1.72 1.61 1.21 0.93 0.64 0.62 0.75 0.75 -1.00 0.52 0.59 0.66 1.10 1.09 1.86 1.52 1.22 0.74
GLU 1.21 0.80 2.45 2.37 0.59 0.77 0.76 0.65 0.86 0.59 -1.00 0.46 0.75 0.62 1.15 1.00 1.69 1.20 1.14 0.67
LYS 1.72 2.45 1.46 0.71 0.48 0.69 0.83 0.70 0.85 0.97 -1.00 0.63 0.90 0.78 1.04 1.52 0.92 1.19 1.31 0.87
ARG 1.61 2.37 0.71 0.90 0.67 0.94 0.77 0.70 0.86 0.92 -1.00 0.71 0.95 0.95 1.15 1.59 1.13 1.19 1.14 0.98
GLY 1.21 0.59 0.48 0.67 1.27 1.00 1.00 0.90 1.09 0.93 -1.00 1.44 1.10 1.51 0.99 0.70 1.20 1.07 0.87 1.39
ALA 0.93 0.77 0.69 0.94 1.00 1.35 1.09 1.15 1.26 1.06 -1.00 0.93 1.01 0.92 0.80 0.71 0.72 0.85 0.79 0.99
ILE 0.64 0.76 0.83 0.77 1.00 1.09 1.41 1.36 1.02 1.09 -1.00 1.17 1.18 0.92 0.75 0.70 0.44 0.80 0.71 1.02
LEU 0.62 0.65 0.70 0.70 0.90 1.15 1.36 1.42 1.15 1.30 -1.00 1.10 0.96 1.14 0.84 0.96 0.51 0.76 0.69 1.01
VAL 0.75 0.86 0.85 0.86 1.09 1.26 1.02 1.15 1.00 1.26 -1.00 1.23 1.07 0.96 0.91 0.94 0.71 0.80 0.87 0.95
MET 0.75 0.59 0.97 0.92 0.93 1.06 1.09 1.30 1.26 1.38 -1.00 1.03 0.84 1.03 0.92 0.85 0.57 0.72 0.88 1.19
PRO -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00
PHE 0.52 0.46 0.63 0.71 1.44 0.93 1.17 1.10 1.23 1.03 -1.00 1.55 1.25 1.24 0.96 0.75 0.55 0.60 0.80 1.19
TYR 0.59 0.75 0.90 0.95 1.10 1.01 1.18 0.96 1.07 0.84 -1.00 1.25 1.31 1.14 0.97 0.92 0.98 0.81 0.75 1.09
TRP 0.66 0.62 0.78 0.95 1.51 0.92 0.92 1.14 0.96 1.03 -1.00 1.24 1.14 1.55 1.42 0.81 1.34 0.54 1.03 1.09
HIS 1.10 1.15 1.04 1.15 0.99 0.80 0.75 0.84 0.91 0.92 -1.00 0.96 0.97 1.42 1.96 0.98 1.25 0.96 1.24 1.11
GLN 1.09 1.00 1.52 1.59 0.70 0.71 0.70 0.96 0.94 0.85 -1.00 0.75 0.92 0.81 0.98 1.16 1.57 1.23 1.21 0.81
ASN 1.86 1.69 0.92 1.13 1.20 0.72 0.44 0.51 0.71 0.57 -1.00 0.55 0.98 1.34 1.25 1.57 1.64 1.71 1.58 0.75
THR 1.52 1.20 1.19 1.19 1.07 0.85 0.80 0.76 0.80 0.72 -1.00 0.60 0.81 0.54 0.96 1.23 1.71 1.58 1.41 0.80
SER 1.22 1.14 1.31 1.14 0.87 0.79 0.71 0.69 0.87 0.88 -1.00 0.80 0.75 1.03 1.24 1.21 1.58 1.41 1.61 0.91
CYS 0.74 0.67 0.87 0.98 1.39 0.99 1.02 1.01 0.95 1.19 -1.00 1.19 1.09 1.09 1.11 0.81 0.75 0.80 0.91 1.94

Normalised HB-HB pairing frequencies for anti-parallel strands from the Sreps CATHlist. Observed frequencies were normalised against expected frequencies. Values greater than 1.0 indicate pairings that are more prevalent than expected (blue); values less than 1.0 indicate pairings that are less prevalent than expected (green). Entries of -1.00 (highlighted red) mark pairings that are not observed, as they would require proline to be positioned at the hydrogen bonded site of the main chain.

Antiparallel HB-HB chi-squared test values

ASP GLU LYS ARG GLY ALA ILE LEU VAL MET PRO PHE TYR TRP HIS GLN ASN THR SER CYS
ASP 10.50 3.52 44.56 31.02 3.81 0.54 19.08 22.14 12.49 2.46 -1.00 23.80 16.13 3.74 0.37 0.41 29.86 28.34 4.25 2.08
GLU 3.52 5.40 291.02 251.73 24.22 9.21 13.58 30.92 6.68 10.80 -1.00 48.22 10.19 7.68 1.43 0.00 31.38 6.54 2.75 5.40
LYS 44.56 291.02 31.43 12.24 41.04 18.29 7.78 23.90 7.58 0.07 -1.00 24.06 1.74 2.74 0.09 25.45 0.49 6.90 14.48 0.85
ARG 31.02 251.73 12.24 1.41 16.22 0.56 13.08 22.94 6.35 0.37 -1.00 14.18 0.43 0.11 1.44 32.26 1.08 6.06 2.97 0.02
GLY 3.81 24.22 41.04 16.22 11.17 0.00 0.00 2.64 2.71 0.37 -1.00 35.74 1.74 14.92 0.00 8.75 2.82 0.82 2.47 8.56
ALA 0.54 9.21 18.29 0.56 0.00 29.32 2.85 7.52 31.59 0.34 -1.00 1.23 0.03 0.47 3.55 9.93 7.12 5.06 8.68 0.01
ILE 19.08 13.58 7.78 13.08 0.00 2.85 74.56 60.81 0.14 0.93 -1.00 8.82 9.66 0.68 7.31 15.11 38.42 12.49 21.58 0.03
LEU 22.14 30.92 23.90 22.94 2.64 7.52 60.81 80.97 13.84 10.46 -1.00 3.47 0.53 1.84 3.18 0.24 29.32 18.77 26.23 0.02
VAL 12.49 6.68 7.58 6.35 2.71 31.59 0.14 13.84 0.00 11.48 -1.00 23.08 1.78 0.17 1.40 0.74 14.21 18.52 5.87 0.27
MET 2.46 10.80 0.07 0.37 0.37 0.34 0.93 10.46 11.48 4.33 -1.00 0.07 1.86 0.03 0.21 0.96 5.94 6.33 1.01 0.86
PRO -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00 -1.00
PHE 23.80 48.22 24.06 14.18 35.74 1.23 8.82 3.47 23.08 0.07 -1.00 64.69 12.78 3.69 0.14 7.20 17.02 35.31 7.32 2.44
TYR 16.13 10.19 1.74 0.43 1.74 0.03 9.66 0.53 1.78 1.86 -1.00 12.78 18.58 1.15 0.08 0.77 0.05 7.09 10.33 0.46
TRP 3.74 7.68 2.74 0.11 14.92 0.47 0.68 1.84 0.17 0.03 -1.00 3.69 1.15 6.17 4.64 1.29 3.01 14.34 0.05 0.18
HIS 0.37 1.43 0.09 1.44 0.00 3.55 7.31 3.18 1.40 0.21 -1.00 0.14 0.08 4.64 29.99 0.02 2.03 0.17 4.23 0.29
GLN 0.41 0.00 25.45 32.26 8.75 9.93 15.11 0.24 0.74 0.96 -1.00 7.20 0.77 1.29 0.02 1.65 14.75 6.13 4.41 1.32
ASN 29.86 31.38 0.49 1.08 2.82 7.12 38.42 29.32 14.21 5.94 -1.00 17.02 0.05 3.01 2.03 14.75 13.39 42.86 23.65 1.61
THR 28.34 6.54 6.90 6.06 0.82 5.06 12.49 18.77 18.52 6.33 -1.00 35.31 7.09 14.34 0.17 6.13 42.86 75.78 31.43 2.69
SER 4.25 2.75 14.48 2.97 2.47 8.68 21.58 26.23 5.87 1.01 -1.00 7.32 10.33 0.05 4.23 4.41 23.65 31.43 56.68 0.40
CYS 2.08 5.40 0.85 0.02 8.56 0.01 0.03 0.02 0.27 0.86 -1.00 2.44 0.46 0.18 0.29 1.32 1.61 2.69 0.40 17.44

Chi-squared test values of HB-HB pairing frequencies for anti-parallel strands from the Sreps CATHlist. p ≤ 0.0001 was chosen as the threshold for evaluating the significance of these values, and pairings that were statistically significant at this level are highlighted in red. Pairs highlighted in green are those which are statistically disfavoured at this level.
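The distinction between favoured and disfavoured pairs can be sketched by combining the chi-squared value with the direction of the deviation. This is an illustrative sketch only. The default critical value of 15.14 assumes one degree of freedom at p = 0.0001, which the source does not state, and the function name and labels are hypothetical.

```python
def classify_pair(observed, expected, critical=15.14):
    """Label a residue pair as favoured, disfavoured or not significant.

    critical: chi-squared cut-off; 15.14 is an assumed value corresponding
    to p = 0.0001 with one degree of freedom.
    """
    if expected <= 0:
        return "undefined"
    chi2 = (observed - expected) ** 2 / expected
    if chi2 < critical:
        return "not significant"
    # Significant cells are split by whether the pair is over- or
    # under-represented relative to expectation.
    return "favoured" if observed > expected else "disfavoured"


N_TOT = 53870  # total number of anti-parallel HB-HB observations

# Totals from the observed HB-HB table: ASP 1629, GLU 2652, LYS 2822, PHE 3385.
glu_lys = classify_pair(340, 2652 * 2822 / N_TOT)  # observed 340
glu_phe = classify_pair(77, 2652 * 3385 / N_TOT)   # observed 77
asp_glu = classify_pair(97, 1629 * 2652 / N_TOT)   # observed 97
```

With these inputs, Glu-Lys is labelled favoured, Glu-Phe disfavoured and Asp-Glu not significant, in line with their chi-squared values of 291.02, 48.22 and 3.52 in the table.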

Observed antiparallel nHB-nHB frequencies

ASP GLU LYS ARG GLY ALA ILE LEU VAL MET PRO PHE TYR TRP HIS GLN ASN THR SER CYS Total
ASP 38 50 192 151 68 58 70 79 89 31 38 56 66 22 71 78 54 141 144 12 1508
GLU 50 120 300 265 60 81 180 198 238 59 76 128 167 28 55 101 85 250 173 22 2636
LYS 192 300 88 79 54 92 159 172 228 40 46 120 232 79 50 87 97 299 156 37 2607
ARG 151 265 79 88 62 109 182 179 287 33 67 116 171 69 52 89 90 231 135 35 2490
GLY 68 60 54 62 120 126 191 222 227 41 68 120 105 54 43 75 57 115 116 36 1960
ALA 58 81 92 109 126 252 429 410 474 77 86 255 199 72 70 68 67 206 134 59 3324
ILE 70 180 159 182 191 429 710 594 865 144 98 296 249 79 83 98 84 243 175 94 5023
LEU 79 198 172 179 222 410 594 742 764 144 174 404 313 139 70 117 102 212 182 97 5314
VAL 89 238 228 287 227 474 865 764 1258 163 177 437 316 102 100 135 103 381 232 128 6704
MET 31 59 40 33 41 77 144 144 163 56 53 104 63 26 22 34 32 63 56 31 1272
PRO 38 76 46 67 68 86 98 174 177 53 34 132 162 54 45 48 56 138 73 33 1658
PHE 56 128 120 116 120 255 296 404 437 104 132 244 171 86 53 88 52 132 126 75 3195
TYR 66 167 232 171 105 199 249 313 316 63 162 171 168 91 71 105 76 135 140 54 3054
TRP 22 28 79 69 54 72 79 139 102 26 54 86 91 48 23 60 28 26 59 27 1172
HIS 71 55 50 52 43 70 83 70 100 22 45 53 71 23 10 42 31 125 77 22 1115
GLN 78 101 87 89 75 68 98 117 135 34 48 88 105 60 42 72 75 204 116 25 1717
ASN 54 85 97 90 57 67 84 102 103 32 56 52 76 28 31 75 88 133 137 13 1460
THR 141 250 299 231 115 206 243 212 381 63 138 132 135 26 125 204 133 564 302 47 3947
SER 144 173 156 135 116 134 175 182 232 56 73 126 140 59 77 116 137 302 228 42 2803
CYS 12 22 37 35 36 59 94 97 128 31 33 75 54 27 22 25 13 47 42 192 1081

Total number of nHB-nHB observations = 54040

Observed nHB-nHB pairing frequencies for anti-parallel strands from the Sreps CATHlist. Total refers to the total number of each type of non-hydrogen bonded residue observed. Pairings of identical residues lie on the diagonal and are counted twice; this is accounted for in calculating totals.

Expected antiparallel nHB-nHB frequencies

ASP GLU LYS ARG GLY ALA ILE LEU VAL MET PRO PHE TYR TRP HIS GLN ASN THR SER CYS Total
ASP 42 73 72 69 54 92 140 148 187 35 46 89 85 32 31 47 40 110 78 30 1508
GLU 73 128 127 121 95 162 245 259 327 62 80 155 148 57 54 83 71 192 136 52 2636
LYS 72 127 125 120 94 160 242 256 323 61 79 154 147 56 53 82 70 190 135 52 2607
ARG 69 121 120 114 90 153 231 244 308 58 76 147 140 54 51 79 67 181 129 49 2490
GLY 54 95 94 90 71 120 182 192 243 46 60 115 110 42 40 62 52 143 101 39 1960
ALA 92 162 160 153 120 204 308 326 412 78 101 196 187 72 68 105 89 242 172 66 3324
ILE 140 245 242 231 182 308 466 493 623 118 154 296 283 108 103 159 135 366 260 100 5023
LEU 148 259 256 244 192 326 493 522 659 125 163 314 300 115 109 168 143 388 275 106 5314
VAL 187 327 323 308 243 412 623 659 831 157 205 396 378 145 138 213 181 489 347 134 6704
MET 35 62 61 58 46 78 118 125 157 29 39 75 71 27 26 40 34 92 65 25 1272
PRO 46 80 79 76 60 101 154 163 205 39 50 98 93 35 34 52 44 121 85 33 1658
PHE 89 155 154 147 115 196 296 314 396 75 98 188 180 69 65 101 86 233 165 63 3195
TYR 85 148 147 140 110 187 283 300 378 71 93 180 172 66 63 97 82 223 158 61 3054
TRP 32 57 56 54 42 72 108 115 145 27 35 69 66 25 24 37 31 85 60 23 1172
HIS 31 54 53 51 40 68 103 109 138 26 34 65 63 24 23 35 30 81 57 22 1115
GLN 47 83 82 79 62 105 159 168 213 40 52 101 97 37 35 54 46 125 89 34 1717
ASN 40 71 70 67 52 89 135 143 181 34 44 86 82 31 30 46 39 106 75 29 1460
THR 110 192 190 181 143 242 366 388 489 92 121 233 223 85 81 125 106 288 204 78 3947
SER 78 136 135 129 101 172 260 275 347 65 85 165 158 60 57 89 75 204 145 56 2803
CYS 30 52 52 49 39 66 100 106 134 25 33 63 61 23 22 34 29 78 56 21 1081

Expected nHB-nHB pairing frequencies for anti-parallel strands from the Sreps CATHlist, calculated using (Total number of residue A x Total number of residue B)/Total number of all residues. Total refers to the total number of each type of non-hydrogen bonded residue observed. Pairings of identical residues lie on the diagonal and are counted twice; this is accounted for in calculating totals.

Antiparallel nHB-nHB ratio values (O/E)

ASP GLU LYS ARG GLY ALA ILE LEU VAL MET PRO PHE TYR TRP HIS GLN ASN THR SER CYS
ASP 0.90 0.68 2.64 2.17 1.24 0.63 0.50 0.53 0.48 0.87 0.82 0.63 0.77 0.67 2.28 1.63 1.33 1.28 1.84 0.40
GLU 0.68 0.93 2.36 2.18 0.63 0.50 0.73 0.76 0.73 0.95 0.94 0.82 1.12 0.49 1.01 1.21 1.19 1.30 1.27 0.42
LYS 2.64 2.36 0.70 0.66 0.57 0.57 0.66 0.67 0.70 0.65 0.58 0.78 1.57 1.40 0.93 1.05 1.38 1.57 1.15 0.71
ARG 2.17 2.18 0.66 0.77 0.69 0.71 0.79 0.73 0.93 0.56 0.88 0.79 1.22 1.28 1.01 1.12 1.34 1.27 1.05 0.70
GLY 1.24 0.63 0.57 0.69 1.69 1.05 1.05 1.15 0.93 0.89 1.13 1.04 0.95 1.27 1.06 1.20 1.08 0.80 1.14 0.92
ALA 0.63 0.50 0.57 0.71 1.05 1.23 1.39 1.25 1.15 0.98 0.84 1.30 1.06 1.00 1.02 0.64 0.75 0.85 0.78 0.89
ILE 0.50 0.73 0.66 0.79 1.05 1.39 1.52 1.20 1.39 1.22 0.64 1.00 0.88 0.73 0.80 0.61 0.62 0.66 0.67 0.94
LEU 0.53 0.76 0.67 0.73 1.15 1.25 1.20 1.42 1.16 1.15 1.07 1.29 1.04 1.21 0.64 0.69 0.71 0.55 0.66 0.91
VAL 0.48 0.73 0.70 0.93 0.93 1.15 1.39 1.16 1.51 1.03 0.86 1.10 0.83 0.70 0.72 0.63 0.57 0.78 0.67 0.95
MET 0.87 0.95 0.65 0.56 0.89 0.98 1.22 1.15 1.03 1.87 1.36 1.38 0.88 0.94 0.84 0.84 0.93 0.68 0.85 1.22
PRO 0.82 0.94 0.58 0.88 1.13 0.84 0.64 1.07 0.86 1.36 0.67 1.35 1.73 1.50 1.32 0.91 1.25 1.14 0.85 0.99
PHE 0.63 0.82 0.78 0.79 1.04 1.30 1.00 1.29 1.10 1.38 1.35 1.29 0.95 1.24 0.80 0.87 0.60 0.57 0.76 1.17
TYR 0.77 1.12 1.57 1.22 0.95 1.06 0.88 1.04 0.83 0.88 1.73 0.95 0.97 1.37 1.13 1.08 0.92 0.61 0.88 0.88
TRP 0.67 0.49 1.40 1.28 1.27 1.00 0.73 1.21 0.70 0.94 1.50 1.24 1.37 1.89 0.95 1.61 0.88 0.30 0.97 1.15
HIS 2.28 1.01 0.93 1.01 1.06 1.02 0.80 0.64 0.72 0.84 1.32 0.80 1.13 0.95 0.43 1.19 1.03 1.53 1.33 0.99
GLN 1.63 1.21 1.05 1.12 1.20 0.64 0.61 0.69 0.63 0.84 0.91 0.87 1.08 1.61 1.19 1.32 1.62 1.63 1.30 0.73
ASN 1.33 1.19 1.38 1.34 1.08 0.75 0.62 0.71 0.57 0.93 1.25 0.60 0.92 0.88 1.03 1.62 2.23 1.25 1.81 0.45
THR 1.28 1.30 1.57 1.27 0.80 0.85 0.66 0.55 0.78 0.68 1.14 0.57 0.61 0.30 1.53 1.63 1.25 1.96 1.48 0.60
SER 1.84 1.27 1.15 1.05 1.14 0.78 0.67 0.66 0.67 0.85 0.85 0.76 0.88 0.97 1.33 1.30 1.81 1.48 1.57 0.75
CYS 0.40 0.42 0.71 0.70 0.92 0.89 0.94 0.91 0.95 1.22 0.99 1.17 0.88 1.15 0.99 0.73 0.45 0.60 0.75 8.88

Normalised nHB-nHB pairing frequencies for anti-parallel strands from the Sreps CATHlist. Observed frequencies were normalised against expected frequencies. Values greater than 1.0 indicate pairings that are more prevalent than expected (purple); values less than 1.0 indicate pairings that are less prevalent than expected (green).

Antiparallel nHB-nHB chi-squared test values

ASP GLU LYS ARG GLY ALA ILE LEU VAL MET PRO PHE TYR TRP HIS GLN ASN THR SER CYS
ASP 0.40 7.54 195.48 95.63 3.24 13.02 35.13 32.38 51.42 0.57 1.48 12.33 4.34 3.50 51.13 18.89 4.31 8.65 55.32 10.94
GLU 7.54 0.57 234.90 169.64 13.26 40.61 17.25 14.45 24.23 0.15 0.29 4.98 2.18 14.88 0.01 3.55 2.67 17.16 9.62 17.91
LYS 195.48 234.90 11.34 14.08 17.39 29.14 28.65 27.76 28.15 7.44 14.44 7.56 48.66 8.92 0.27 0.21 10.02 61.93 3.19 4.40
ARG 95.63 169.64 14.08 6.23 8.87 12.73 10.56 17.71 1.55 11.19 1.16 6.62 6.52 4.17 0.01 1.24 7.68 13.27 0.26 4.40
GLY 3.24 13.26 17.39 8.87 33.65 0.25 0.43 4.44 1.07 0.57 1.03 0.15 0.30 3.11 0.16 2.60 0.31 5.54 2.02 0.26
ALA 13.02 40.61 29.14 12.73 0.25 11.05 46.63 21.15 9.21 0.02 2.51 17.40 0.66 0.00 0.03 13.40 5.79 5.57 8.56 0.84
ILE 35.13 17.25 28.65 10.56 0.43 46.63 126.59 20.27 93.88 5.62 20.43 0.00 4.28 8.23 4.11 23.77 19.70 41.82 28.08 0.42
LEU 32.38 14.45 27.76 17.71 4.44 21.15 20.27 92.16 16.65 2.86 0.74 25.68 0.54 4.90 14.33 15.92 12.04 79.92 31.81 0.81
VAL 51.42 24.23 28.15 1.55 1.07 9.21 93.88 16.65 218.54 0.17 4.00 4.17 10.43 12.95 10.62 28.57 33.70 24.11 38.52 0.28
MET 0.57 0.15 7.44 11.19 0.57 0.02 5.62 2.86 0.17 22.68 5.00 11.03 1.10 0.09 0.69 1.02 0.16 9.63 1.51 1.21
PRO 1.48 0.29 14.44 1.16 1.03 2.51 20.43 0.74 4.00 5.00 5.59 11.77 49.79 9.05 3.40 0.42 2.80 2.36 1.96 0.00
PHE 12.33 4.98 7.56 6.62 0.15 17.40 0.00 25.68 4.17 11.03 11.77 16.07 0.51 4.03 2.53 1.80 13.64 44.02 9.52 1.92
TYR 4.34 2.18 48.66 6.52 0.30 0.66 4.28 0.54 10.43 1.10 49.79 0.51 0.12 9.26 1.01 0.65 0.51 34.76 2.14 0.82
TRP 3.50 14.88 8.92 4.17 3.11 0.00 8.23 4.90 12.95 0.09 9.05 4.03 9.26 20.06 0.06 13.91 0.42 41.50 0.05 0.54
HIS 51.13 0.01 0.27 0.01 0.16 0.03 4.11 14.33 10.62 0.69 3.40 2.53 1.01 0.06 7.35 1.22 0.03 23.30 6.35 0.00
GLN 18.89 3.55 0.21 1.24 2.60 13.40 23.77 15.92 28.57 1.02 0.42 1.80 0.65 13.91 1.22 5.58 17.65 49.25 8.15 2.54
ASN 4.31 2.67 10.02 7.68 0.31 5.79 19.70 12.04 33.70 0.16 2.80 13.64 0.51 0.42 0.03 17.65 59.77 6.52 49.57 8.99
THR 8.65 17.16 61.93 13.27 5.54 5.57 41.82 79.92 24.11 9.63 2.36 44.02 34.76 41.50 23.30 49.25 6.52 263.70 46.22 12.93
SER 55.32 9.62 3.19 0.26 2.02 8.56 28.08 31.81 38.52 1.51 1.96 9.52 2.14 0.05 6.35 8.15 49.57 46.22 46.94 3.53
CYS 10.94 17.91 4.40 4.40 0.26 0.84 0.42 0.81 0.28 1.21 0.00 1.92 0.82 0.54 0.00 2.54 8.99 12.93 3.53 1342.40

Chi-squared test values of nHB-nHB pairing frequencies for anti-parallel strands from the Sreps CATHlist. p ≤ 0.0001 was chosen as the threshold for evaluating the significance of these values, and pairings that were statistically significant at this level are highlighted in red. Pairs highlighted in green are those which are statistically disfavoured at this level.



Specific residue-residue interactions
(To view the PDB images of particular amino acid pairs, you will need to install a viewer such as RasMol)