J. Mol. Biol. (1991) 217, 153-176

Crystal Structure of a Kunitz-type Trypsin Inhibitor from Erythrina caffra Seeds

Silvia Onesti, Peter Brick and David M. Blow

Blackett Laboratory, Imperial College, London SW7 2BZ, England, U.K.

(Received 4 July 1990; accepted 4 September 1990)

The trypsin inhibitor DE-3 from Erythrina caffra (ETI) belongs to the Kunitz-type soybean trypsin inhibitor (STI) family and consists of 172 amino acid residues with two disulphide bridges. The amino acid sequence of ETI shows high homology to other trypsin inhibitors from the same family, but ETI has the unique ability to bind and inhibit tissue plasminogen activator. The crystal structure of ETI has been determined using the method of isomorphous replacement and refined using a combination of simulated annealing and conventional restrained least-squares crystallographic refinement. The refined model includes 60 water molecules and 166 amino acid residues, with a root-mean-square deviation in bond lengths from ideal values of 0.016 Å. The crystallographic R-factor is 20.8% for 7770 independent reflections between 10.0 and 2.5 Å. The three-dimensional structure of ETI consists of 12 antiparallel β-strands joined by long loops. Six of the strands form a short antiparallel β-barrel that is closed at one end by a "lid" consisting of the other six strands coupled in pairs. The molecule shows approximate 3-fold symmetry about the axis of the barrel, with the repeating unit consisting of four sequential β-strands and the connecting loops. Although there is no sequence homology, this same fold is present in the structures of interleukin-1α and interleukin-1β. When the structures of ETI and interleukin-1β are superposed, the close agreement between the α-carbon positions for the β-strands is striking. The scissile bond (Arg63-Ser64) is located on an external loop that protrudes from the surface of the molecule and whose architecture is not constrained by secondary structure elements, disulphide bridges or strong electrostatic interactions. The hydrogen bonds made by the side-chain amide group of Asn12 play a key role in maintaining the three-dimensional structure of the loop. This residue is in a position corresponding to that of a conserved asparagine in the Kazal inhibitor family. Although the overall structure of ETI is similar to the partial structure of STI, the scissile bond loop is displaced by about 4 Å. This displacement probably arises from the fact that the structure of STI has been determined in a complex with trypsin, but could possibly be a consequence of the close molecular contact between Arg63 and an adjacent molecule in the crystal lattice.

1. Introduction

Leguminosae seeds are rich sources of inhibitors of trypsin and other serine proteinases, some of which have been isolated and characterized. Although their physiological role has not been clearly explained, it has been proposed that they may have a defensive function against insect infestation by inhibiting insect proteinases. These inhibitors can be divided into those of the Kunitz soybean trypsin inhibitor (STI†) type and those of the Bowman-Birk proteinase inhibitor type. Proteins from these two classes are very different in both size and disulphide bridge pattern: members of the Kunitz family have a molecular weight of about 20,000 and two disulphide bridges, while members of the Bowman-Birk family are smaller (M_r 8000 to 10,000) and rich in cysteine residues. An example of the first type is STI, crystallized and characterized by Kunitz (1947a,b), that contains 181 amino acid residues and has two disulphide bridges.

† Abbreviations used: STI, soybean trypsin inhibitor; t-PA, tissue plasminogen activator; ETI, Erythrina trypsin inhibitor; WTI, winged-bean trypsin inhibitor; PCMBS, p-chloromercuribenzene sulphonate; m.i.r., multiple isomorphous replacement; r.m.s., root-mean-square; PSTI, pancreatic secretory trypsin inhibitor; OMPJPQ3, ovomucoid Japanese quail inhibitor III domain; OMSVP3, ovomucoid silver pheasant inhibitor III domain; OMTKY3, ovomucoid turkey inhibitor III domain; CI-2, chymotrypsin inhibitor-2; PI-1, potato inhibitor-1; PTI, pancreatic trypsin inhibitor; SSI, Streptomyces subtilisin inhibitor.

© 1991 Academic Press Limited


Table 1 Sequence comparison

WASI WA1 WTI STI ETI IL1-β IL1-α

10 20 30 DPPPVHDTDGNELRADANYYVLPANRAHGGGLTMAPGH ADDPVYDAEGNKLVNRGKYTI VSFSDGAGI DVVATGNE - - EPLLDSEGELVRNGGTYYLLPDRWALGGGI EAAATG - - DFVLDNEGNPLENGGTYYI LSD1 TAFGGI RAAPTG - - - VLLDGNGEVVQNGGTYYLLPQVWAQGGGVQLAKTG **es** ******** ----s-----APVRSLNCTLRDSQ- - QKSLVMSG- - FLSNVKYNFMRI I KYEFI LNDAL- - NQSI I RANbbbbbb Al

WASI WA1 WTI STI ETI IL1-β IL1-α

40 60 e 50 GRRCPLFVSQEA- - DGQRDGLPVRI APPEGGAPSDNPEDPLSI VKST- - - - RN1 MYATSI SSEDKTPPQP TETCPLTVVRSP- - NEVSVGEPLRI SSQLRSGNERCPLTVVQS R- - - NELDKGI GTI I S S PYRI REETCPLTVVQSP- - NELSDGKPI RI ESRLRSA****** ********* - - - - PYELKALHLQGQDMEQQVVFSMSFVQGEE- - - - DQYLTAAALH- NLDEAVKFDMGAYKSSKbbbb A3

WASI WA1 WTI STI ETI IL1-β IL1-α

70 80 RLSTDVRI SFRAYTTLENMRLKI NFATDPHPDYSLVRI GFANPPKAEGHPLSLKFDSFAVI PDDDKVRI GFAYAPK********* --SNDKIPVALGLKEK---------NLYLSCVLKDD - - DDAKI TVI L RI S KT- - KI I RN1 - FI - FI - FI

bbbbb Bl

WASI WA1 WTI STI ETI IL1-β IL1-α

bbbb A2

- - - - - -

bbbbb A4

90 - - - CVQSTEWHI DSELV - - - - - KCDVWSVVDFQP - - - CAPSPWWTVVEDQP MLCVGI PTEWSVVEDLP - - - CAPSPWWTVVEDEQ **** - - - - - - QLYVTAQ-

DED

bbbb B2

100 110 120 SGRRHVI TGPVRD- - - PSPSGRENAFRI EKYSGAE DGQQLKLAGRYI’N- - - - - - - QVKGAFTI QKGSNTQQPSVKLSELKST--------KFDYLFKFEKVTS-K EGPAVKI GENKDA- - - - - - - - MDGWFRLERVS DDE EGLSVKLSEDEST- - - - - - - QFDYPFKFEQVSDQ **** ******** - KPTLQLESVDPKNYPKKKMEKRFVFNKI EI N- - - - QPVLLKEMPEI PKTI T- GSETNLLFFWETHG- bbbb bbbbbbb B3 B4

-


Table 1 (continued)

WASI WA1 WTI STI ETI IL1-β IL1-α

130 VHEYKLMACG- - PRTYKLLFCPVGFS SYKLKYCAKRFNNYKLVFCPQQAEDDKCGDI LHSYKLLYCEGKHE********* - NKLEFESAQF- - TKNYFTSVAH- -

140 - DSCQDLGVFRDL - SPCKNI GI STDP - DTCKDI GI YRDQ GI SI DH - KCASI GI NRDQ ***** - - PNWYI STSQAE - - PNLFI ATKQ-

bbbbbbb

bbbbbbbb c2

Cl

WASI WA1 WTI STI ETI IL1-β IL1-α

150 KGGAWFLGATEP-----YHVVVFKKAPPA E- GKKRLVVSYQS- - GYARL’VVTDEDDGTRRLVVS KNK- GYRRLVVTED****** - - - NMPVFLGGTKGGQDI - - - DYWVCLAGG-

160 -

170

DPLVVKFHRHEPE NPLVVI FKKVESS KPLVVQFQKLDKESL YPLTVVLKKDESS ******** TDFTMQFVSSTNPRDS PPSI TDFQI LENEA

bbbbb c3

-

P

bbbbbb c4

ETI, Erythrina caffra trypsin inhibitor (Joubert & Dowdle, 1987); STI, soybean trypsin inhibitor (Kim et al., 1985); WTI, winged-bean trypsin inhibitor (Yamamoto et al., 1983); WA1, winged-bean albumin-1 (Kortt et al., 1989); WASI, wheat α-amylase/subtilisin inhibitor (Jany & Lederer, 1985); IL1-β, human interleukin-1β (March et al., 1985); IL1-α, human interleukin-1α (March et al., 1985). The numbering is for the amino acid sequence of ETI. Residues that belong to β-strands in the crystal structure of ETI are indicated by b. The bullet (●) marks the position of the reactive site arginine in ETI. Stars (*) denote residues used in superposing the crystal structure of IL-1β (Finzel et al., 1989) on to that of ETI. When the 2 molecules are overlapped, these pairs of residues are separated by less than 2.5 Å.

The genus Erythrina belongs to the Leguminosae and comprises plants ranging from shrubs to trees distributed in Africa, Indochina, Malaysia, Australasia and Polynesia, as well as Central and South America. Seeds from various Erythrina species contain high concentrations of proteinase inhibitors of the Kunitz type. They can be divided into three groups according to their relative abilities to inhibit trypsin, chymotrypsin and tissue plasminogen activator (t-PA): group a inhibitors are specific for chymotrypsin, group b inhibitors are relatively specific for trypsin, and group c inhibitors bind strongly to trypsin and, in addition, have the ability to bind and inhibit t-PA (Joubert et al., 1987). An inhibitor from Erythrina caffra belonging to group a has recently been crystallized (Shieh et al., 1990). From the seeds of E. latissima and E. caffra the DE-3 inhibitors (belonging to group c) have been purified and their amino acid sequences determined (Joubert et al., 1987; Joubert & Dowdle, 1987). They each contain 172 amino acid residues (M_r 19,000) including two disulphide bridges.

The sequences of the Erythrina trypsin inhibitor (ETI) DE-3 from E. caffra and E. latissima show a high degree of homology to those of the Kunitz-type trypsin inhibitors from soybean and winged-bean seeds (Kim et al., 1985; Yamamoto et al., 1983; see Table 1), but neither STI nor WTI inhibits t-PA. The Erythrina DE-3 inhibitors are selective, binding t-PA without affecting the homologous urokinase-type plasminogen activator, a specificity that makes it possible to use ETI coupled with agarose in affinity chromatography columns to purify t-PA from melanoma cells cultured in vitro (Heussen et al., 1984). This is particularly useful as t-PA is being used clinically as a thrombolytic agent.

The amino acid sequences of a number of homologous seed proteins have recently been elucidated (Table 1). Bifunctional α-amylase/subtilisin inhibitors from wheat and barley show a similarity of about 30% to both STI and ETI (Jany & Lederer, 1985; Leah & Mundy, 1989). The seed storage protein winged-bean albumin-1 also reveals sequence similarity with the STI (Kunitz) family but lacks the first of the two disulphide bridges conserved in the inhibitors and has no inhibitory activity (Kortt


et al., 1989). The presence of an inactive member of the Kunitz family suggests that the inhibitors could have evolved from a seed storage protein.

Whereas many three-dimensional structures of proteinase inhibitors belonging to different families have been published, the structural information about members of the STI (Kunitz) family is limited to the structure of the complex between STI and porcine trypsin at 2.6 Å (1 Å = 0.1 nm) resolution (Sweet et al., 1974; Blow et al., 1974), in which the inhibitor itself is not complete. Crystals of the inhibitor DE-3 from E. caffra (ETI) have been obtained (Onesti et al., 1989) and the structure determined to 2.5 Å resolution by the isomorphous replacement method. Solving the structure by molecular replacement using the model of the homologous STI was not attempted because of the incomplete STI model and high symmetry of the space group.

2. Methods

(a) Crystallization and space group determination

Purified Erythrina trypsin inhibitor DE-3 from E. caffra was a gift from E. Dowdle, University of Capetown. Crystals were grown as described (Onesti et al., 1989) by vapour diffusion in sitting drops at 4°C. Droplets containing 10 mg protein/ml in 20 mM-acetate buffer at pH 3.5 and 150 mM-NaCl were equilibrated against 25 ml of distilled water. The crystallization was induced by the pH change due to the loss of volatile acetic acid. To avoid infinite dilution of the drops with consequent dissolution of the crystals, after 2 or 3 weeks the solution in the well (originally distilled water) was changed to 20 mM-buffer (pH 4.8), 50 mM-NaCl. Hexagonal rod-shaped crystals could be grown in a couple of weeks with a cross-section of between 100 and 300 µm and length up to 1 mm. The crystals were not stable and tended to degrade after 30 to 40 days. Precession photographs of the principal zones show symmetry and systematic absences consistent with the space group P6₁22 or its enantiomorph P6₅22. The cell parameters are a = b = 73.4 Å and c = 143.0 Å with a unit cell volume of 6.67 × 10⁵ Å³. Assuming 1 molecule in the asymmetric unit, this corresponds to a crystal volume V_m of 2.92 Å³/dalton of protein, implying a solvent content of 57% (Matthews, 1968). The crystals are stable in the X-ray beam and diffract to beyond 2.5 Å.
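The crystal volume per dalton and the solvent content quoted for these crystals can be checked with a short calculation. The Python sketch below is illustrative only and is not part of the original work: the function names are hypothetical, the 12 asymmetric units per cell follow from the symmetry of space group P6₁22 (or P6₅22), and the constant 1.23 in the solvent estimate is the value commonly used with the method of Matthews (1968), which assumes a typical protein partial specific volume.

```python
import math


def hexagonal_cell_volume(a, c):
    """Volume (cubic angstroms) of a hexagonal cell with a = b and gamma = 120 degrees."""
    return a * a * c * math.sin(math.radians(120.0))


def matthews_coefficient(cell_volume, asym_units_per_cell,
                         molecules_per_asym_unit, molecular_weight):
    """Crystal volume per dalton of protein (V_m, cubic angstroms per dalton)."""
    n_molecules = asym_units_per_cell * molecules_per_asym_unit
    return cell_volume / (n_molecules * molecular_weight)


def solvent_fraction(v_m, factor=1.23):
    """Approximate solvent fraction; the factor 1.23 is an assumed standard constant."""
    return 1.0 - factor / v_m


# Values taken from the text: a = b = 73.4 A, c = 143.0 A, M_r about 19,000,
# 1 molecule per asymmetric unit.
cell_volume = hexagonal_cell_volume(73.4, 143.0)
v_m = matthews_coefficient(cell_volume, 12, 1, 19000.0)
solvent = solvent_fraction(v_m)

print(f"cell volume      = {cell_volume:.3g} A^3")
print(f"V_m              = {v_m:.2f} A^3/dalton")
print(f"solvent fraction = {solvent:.2f}")
```

With these inputs the cell volume comes out near 6.67 × 10⁵ Å³ and V_m near 2.9 Å³/dalton, and the estimated solvent fraction is close to the value of about 57% quoted in the text.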

(b) Data collection and processing

All the X-ray data for the native and the heavy-atom derivatives used in determining the phases were collected on a CAD4 Enraf-Nonius diffractometer equipped with a FAST television area detector. Graphite monochromatized CuKα radiation was produced using an Elliot GX21 rotating anode X-ray generator operated at 40 kV and 100 mA. The temperature of the detector was maintained at 10°C by a stream of cold air to reduce fluctuations in the dark current. The crystals were kept at room temperature during data collection. Frames with a width of 0.1° were collected with exposure times varying between 60 and 120 s/frame, depending on the size of the crystal. The lifetime of the crystals in the X-ray beam was up to 48 h, enabling a


complete 2.5 Å resolution dataset to be collected on 1 crystal. The initial screening for heavy atoms was performed by collecting 3.2 Å resolution datasets. At this resolution all the data could be collected at 1 detector position with the detector swung 18° from the camera axis and a detector to crystal distance of 110 mm. When higher resolution data were collected, 2 detector positions were necessary: a low resolution set was collected with a swing angle of 12°, and then data in the range 7 to 2.5 Å were collected using a swing angle of 25°. In all cases 2 crystal orientations were required: the crystal was initially rotated around an axis making an angle of 25° with the c* direction and then around an axis making an angle of 60° to 65° with c*. The data from the FAST area detector were evaluated on-line using the MADNES program system (Messerschmidt & Pflugrath, 1987). The crystal orientation was initially determined by scanning 2 slices (typically 2°) of reciprocal lattice 90° apart and looking for strong (larger than 8σ) reflections to be used for the automatic indexing and refinement of the cell parameters, camera alignment constants, crystal mosaicity, crystal-to-detector distance and crystal orientation. The crystal orientation and camera constants were continually updated during data collection; every 3° a cycle of refinement was performed and a new batch of reflections predicted on the basis of the updated parameters. This overcame any problems arising from the slippage of the crystal during the data collection. The intensities of the reflections were determined using a profile fitting algorithm in which spot profiles were accumulated for 16 different areas of the detector; a best least-squares plane was then fitted to the background pixels (pixels whose intensities exceeded 2.5σ above the plane were eliminated) and this plane background was subtracted from all pixels.
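The background treatment described above (a least-squares plane with rejection of pixels lying too far above it) can be sketched as follows. This is a minimal illustration in Python and not the MADNES implementation: the function names, the use of the r.m.s. residual as the estimate of σ, the iteration limit and the synthetic test pixels are all assumptions made for the example.

```python
import math


def solve3(m, v):
    """Solve the 3x3 linear system m * x = v by Cramer's rule."""
    def det(a):
        return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
                - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
                + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))
    d = det(m)
    solution = []
    for col in range(3):
        replaced = [row[:] for row in m]
        for r in range(3):
            replaced[r][col] = v[r]
        solution.append(det(replaced) / d)
    return solution


def fit_plane(pixels):
    """Least-squares plane z = a*x + b*y + c through (x, y, z) pixels."""
    sxx = sxy = sx = syy = sy = n = sxz = syz = sz = 0.0
    for x, y, z in pixels:
        sxx += x * x
        sxy += x * y
        sx += x
        syy += y * y
        sy += y
        n += 1.0
        sxz += x * z
        syz += y * z
        sz += z
    return solve3([[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]], [sxz, syz, sz])


def fit_background(pixels, n_sigma=2.5, max_cycles=10):
    """Fit a background plane, discarding pixels more than n_sigma above it."""
    kept = list(pixels)
    for _ in range(max_cycles):
        a, b, c = fit_plane(kept)
        residuals = [z - (a * x + b * y + c) for x, y, z in kept]
        # Assumption: sigma is estimated as the r.m.s. residual about the plane.
        sigma = math.sqrt(sum(r * r for r in residuals) / len(residuals))
        survivors = [p for p, r in zip(kept, residuals) if r <= n_sigma * sigma]
        if len(survivors) == len(kept):
            break
        kept = survivors
    return (a, b, c), kept


# Synthetic 7 x 7 background box: a tilted plane, a small alternating ripple,
# and one contaminated pixel standing in for the tail of a neighbouring spot.
pixels = []
for x in range(7):
    for y in range(7):
        ripple = 0.1 if (x + y) % 2 == 0 else -0.1
        z = 0.5 * x - 0.2 * y + 10.0 + ripple
        if (x, y) == (3, 3):
            z += 100.0
        pixels.append((x, y, z))

plane, kept = fit_background(pixels)
print(plane, len(kept))
```

In this synthetic case the contaminated pixel is rejected in the first cycle and the second fit recovers the underlying plane; real detector data would of course need a noise model appropriate to the instrument.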
The position of adjacent spots was monitored so that tails of neighbouring strong reflections could be identified and eliminated from the background. The profiles were used only in the evaluation of weak (less than 5σ) reflections, while strong reflections were simply integrated by summation, after background correction. The correction for the dark current was determined by collecting images with the shutter closed. The Lorentz and polarization corrections were applied to individual reflections during the integration step. The reflections were then reduced to the asymmetric unit and corrected for obliquity. The data were subdivided into batches of 4° rotation angle and a scale factor and temperature factor for each batch calculated and refined by the method of Fox & Holmes (1966). Structure factor amplitudes were obtained from intensity data and modified on the basis of a priori knowledge of the expected Wilson distribution and the non-negativity of the true intensities (French & Wilson, 1978). A summary of data collection and processing of the native dataset is given in Table 2.

Table 2
Native data collection and processing

Resolution (Å)   Observed reflections   Completeness (%)   R_merge in shell (%)   Reflections used for R_merge†
7.88             115                    69.7               5.0                    108
5.58             500                    94.2               5.3                    443
4.56             637                    96.7               5.7                    624
3.95             742                    96.4               5.7                    720
3.53             816                    95.8               6.2                    788
3.23             892                    95.2               7.5                    851
2.99             944                    94.6               7.0                    871
2.79             1015                   93.9               10.0                   907
2.64             1029                   92.3               14.6                   919
2.50             1080                   90.4               20.3                   947
Total            7770                   93.6               6.5                    7178

† Number of reflections with multiple measurements used in the calculation of R_merge.

(c) Heavy-atom derivatives and phase refinement

Potential heavy-atom derivatives were prepared by soaking the crystals in solutions containing the heavy-atom salt dissolved in 20 mM-sodium acetate at pH 4.8. The soaking time varied between 1 and 5 days. Particular care was taken to keep the pH between 4.7 and 5.0, since the buffering power of the harvesting solution was rather low at such a low acetate concentration. The chemical sequence showed a lack of cysteine and methionine residues, which are usually useful targets for heavy-atom reagents. Despite this, the crystals appeared to be rather sensitive to heavy-atom compounds. Of the reagents examined very few did not have an effect. Sometimes cracking parallel to the crystallographic c axis followed by reannealing was observed. Attempts to avoid cracking by soaking crystals in a dilute solution and then transferring them to gradually more concentrated solutions failed. However, in most cases, after 2 or 3 days the crystals looked perfect and good X-ray data could be collected. This was the case for the Sm(NO₃)₃, La(CH₃COO)₃ and Pb(CH₃COO)₂ derivatives.

The heavy-atom derivative datasets were scaled to the native using a relative Wilson plot, with the heavy-atom contribution estimated as 5%. Difference Patterson maps with coefficients $(F_{deriv} - F_{nat})^2$ were calculated. Because of their complexity (due to the high symmetry and the number of sites) the difference Patterson maps for all the derivatives failed to yield interpretable heavy-atom sites. Some of the binding sites of the Pb derivative were initially located using the direct methods program MULTAN (Wilson, 1978). The isomorphous differences $\Delta F = (F_{deriv} - F_{nat})$ were normalized to obtain E values, and the 400 highest E values, where $E = \Delta F / \langle (\Delta F)^2 \rangle^{1/2}$, were used as input in MULTAN. Starting phases were assigned to all these reflections using the RANTAN procedure; no specific choices for defining the origin and the enantiomorph were made. Fifty phase sets of reflections were generated; using Sayre's tangent formula, phases were determined for all the reflections in each set and a combined figure of merit was calculated. E maps were computed for the 6 sets with the highest figure of merit and for each map the 6 highest peaks were examined as potential heavy-atom sites. The 2 strongest peaks in the E map corresponding to the 2nd highest combined figure of merit (but the highest ABSFOM, representing the internal consistency of the phase set) were consistent with the distribution of peaks of the difference Patterson map. Attempts at using MULTAN with the other heavy-atom datasets were not successful.

Positions and occupancies of the 2 Pb sites obtained using MULTAN were refined with the phase refinement program PHARE (CCP4 program suite) using only the centric reflections. A residual map calculated with the 2 sites showed 2 additional Pb sites. Single isomorphous replacement phases obtained using the Pb derivative were used to calculate difference Fourier maps for the other potential derivatives. Two sites were found for p-chloromercuribenzene sulphonate (PCMBS), 2 for Hg(CH₃COO)₂, 3 for K₃IrCl₆, K₂Pt(NO₂)₄ and La(CH₃COO)₃, and 4 for Sm(NO₃)₃ and K₂OsCl₆. The Sm, La, Os and Pt datasets were not used in the refinement, since they showed the major peaks at the same sites as Pb (Sm and La) and Ir (Os and Pt). Data at higher resolution were collected only for the Pb and the Ir derivatives, since in the case of the 2 mercury datasets (PCMBS and Hg(CH₃COO)₂) the phasing power decreased with the resolution to such an extent as to indicate a high degree of non-isomorphism. A summary of the final data collection is reported in Table 3. The anomalous contribution from Ir and Pb led to the solution of the ambiguity between the 2 enantiomorphic space groups P6₁22 and P6₅22, with

Table 3
Data collection summary for native and heavy-atom derivatives

                          Native    PB3       PCMBS     HGA1      IR10
Soak concentration (mM)   -         3.0       5.0       1.0       10.0
Soak time (days)          -         3         2         2         2
Resolution (Å)            2.5       2.5       3.2       3.2       2.7
Number of reflections     33,533    33,998    14,427    12,392    35,986
Independent reflections   7770      7728      3973      3635      5348
Completeness (%)          93.6      93.1      94.2      86.5      79.5
R_merge (%)               6.5       5.8       6.3       6.3       7.2

PB3, 3 mM-Pb(CH₃COO)₂; PCMBS, 5 mM-p-chloromercuribenzene sulphonate; HGA1, 1 mM-Hg(CH₃COO)₂; IR10, 10 mM-K₃IrCl₆.
$R_{merge} = \sum_h \sum_i |I_{h,i} - \langle I_h \rangle| / \sum_h \sum_i I_{h,i}$.
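The merging statistic defined in the footnote to Table 3 can be written out in a few lines. The sketch below is an illustration under stated assumptions rather than the program actually used: the function name, the dictionary layout keyed by Miller indices and the sample intensities are hypothetical, and singly measured reflections are skipped because the footnote to Table 2 indicates that only reflections with multiple measurements enter the calculation.

```python
def r_merge(observations):
    """R_merge = sum_h sum_i |I_hi - <I_h>| / sum_h sum_i I_hi.

    `observations` maps a reflection index (h, k, l) to the list of
    intensities measured for that reflection and its symmetry mates.
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in observations.values():
        if len(intensities) < 2:
            continue  # a single measurement carries no agreement information
        mean_intensity = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_intensity) for i in intensities)
        denominator += sum(intensities)
    return numerator / denominator


# Hypothetical measurements for three reflections.
example = {
    (1, 0, 0): [100.0, 110.0],
    (0, 0, 6): [50.0, 50.0, 56.0],
    (2, 1, 3): [30.0],  # measured once, ignored
}
print(f"R_merge = {100.0 * r_merge(example):.1f}%")
```

Applied shell by shell, the same function would give the kind of resolution-dependent values listed in Table 2.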


Table 4 Heavy-atom refinement

Resolution (Å)

7.47

5.97

4.97

4.25

3.72

3.30

2.97

2.70

Total

FOM Phasing power PB3 PCMBS HGA1 IR10

084

0%

0.84

0.77

072

068

0.62

0.53

067

2.02 1.32

2.45 1.44

2.25 1.13

1.36

1.09

205 087 071

2.30 o&i 066

2.63 ~

2.99

1.20 1.14

2.35 0.93 0.79

1.53

1.30

2.38 0.75 @68 I.05

1.06

1.00

1.05

1.13

1.13

FOM, mean figure of merit; phasing power, r.m.s. heavy-atom structure factor/r.m.s. lack of closure. For other abbreviations, see Table 3.
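The phasing power defined in the footnote above is a simple ratio of two r.m.s. quantities. The following Python sketch shows one way to compute it; it is not the PHARE implementation. The function names and the sample reflections are hypothetical, and the lack of closure is taken, as an assumption, to be the difference between the observed derivative amplitude and the amplitude of the vector sum of the native and calculated heavy-atom structure factors.

```python
import math


def rms(values):
    """Root-mean-square of a sequence of numbers."""
    values = list(values)
    return math.sqrt(sum(v * v for v in values) / len(values))


def lack_of_closure(f_p, f_h_calc, f_ph_obs):
    """|F_PH(obs)| - |F_P + F_H(calc)|, with F_P and F_H given as complex numbers."""
    return f_ph_obs - abs(f_p + f_h_calc)


def phasing_power(f_h_calc_values, closure_errors):
    """r.m.s. heavy-atom structure factor divided by r.m.s. lack of closure."""
    return rms(abs(f) for f in f_h_calc_values) / rms(closure_errors)


# Hypothetical reflections: (phased native F_P, calculated F_H, observed |F_PH|).
reflections = [
    (complex(100.0, 0.0), complex(30.0, 0.0), 120.0),
    (complex(0.0, 80.0), complex(0.0, -40.0), 50.0),
]
closure = [lack_of_closure(fp, fh, fph) for fp, fh, fph in reflections]
power = phasing_power([fh for _, fh, _ in reflections], closure)
print(f"phasing power = {power:.2f}")
```

A value well above 1, as for the Pb derivative in Table 4, indicates that the heavy-atom signal dominates the closure error; values below 1 indicate a weaker derivative.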

the assignment of the space group P6₅22. Alternate cycles of heavy-atom refinement and phase calculation were performed: a summary of heavy-atom refinement is given in Tables 4 and 5.

(d) Interpretation of the electron density map

A m.i.r. electron density map at 2.7 Å resolution was calculated using phase information from the 4 derivatives Pb(CH₃COO)₂, K₃IrCl₆, PCMBS and Hg(CH₃COO)₂. The anomalous contributions for the iridium and lead derivatives were included to give an overall figure of merit of 0.67. A molecular envelope was determined (Wang, 1985) using a reciprocal space method (Leslie, 1987) assuming a solvent content of 50%. Four cycles of solvent flattening and phase combination were carried out, which led to a small improvement in the quality of the map. The r.m.s. phase difference between the solvent flattened phases and those calculated from the final refined model is 71.7° (average difference 48.4°), while when the m.i.r. phases are compared with the model phases the r.m.s. difference is 79.5° (average difference 56.7°). The m.i.r.-solvent flattened map, displayed on an Evans & Sutherland PS330 using the interactive graphics program FRODO (Jones, 1978), showed clearly the overall fold of the molecule. An α-carbon tracing of the partial model of STI was superposed on to a skeletonized representation of the electron density (Greer, 1985) and an atomic model built by fitting fragments from a database of highly refined structures (Jones & Thirup, 1986). Although the m.i.r. map was easily interpretable in the β-sheets, there were several main-chain discontinuities in the loop regions. The initial model consisted of 138 amino acid residues (approximately 80% of the total number of residues in the molecule) of which 120 residues were built with side-chains, while the remaining 18 residues were built as polyalanine because of ambiguity in the alignment of the amino acid sequence to the electron density. The initial crystallographic R-factor was 44.7% using all the data to 2.7 Å resolution.

(e) Crystallographic refinement

The m.i.r. model was refined with the program XPLOR (Brünger et al., 1987) using a combination of simulated annealing and conventional restrained crystallographic refinement. The protocol used for the refinement comprised 3 stages. During the 1st stage, conjugate gradient minimization was employed to relieve any bad contacts in the structure that could cause problems when starting the molecular dynamics calculations. To avoid large movements due to particularly bad interactions the Cα atoms were harmonically restrained to their initial positions. The 2nd stage performed the actual simulated annealing step: the initial velocities were assigned from a Maxwellian distribution at 4000 K but no equilibration at high temperature was performed. Instead a "slow-cooling" protocol was used in which the system was coupled with a heat-bath whose temperature was initially 4000 K and then slowly cooled to 300 K by reducing the temperature by 25 K every 50 dynamic steps (where each step corresponded to 0.5 fs). The 3rd stage consisted of conjugate gradient minimization that re-optimized the R-factor and the stereochemistry of the system. A summary of the refinement is reported in Table 6. During the first round the simulated annealing procedure was compared with conventional crystallographic refinement, bypassing the heating/cooling stage. In comparison with the least-squares method, the molecular dynamics

Table 5
Heavy-atom refinement summary

Derivative   No. sites   R_merge (%)   R_deriv (%)   R_cullis (%)   Phasing power
PB3          4           5.8           20.1          60.6           2.35
PCMBS        2           6.2           16.8          87.9           0.93
HGA1         2           6.2           21.8          87.1           0.79
IR10         3           7.2           15.0          79.0           1.13

$R_{merge} = \sum_h \sum_i |I_{h,i} - \langle I_h \rangle| / \sum_h \sum_i I_{h,i}$. $R_{deriv} = \sum_h |F_{deriv,h} - F_{nat,h}| / \sum_h F_{nat,h}$. $R_{cullis} = \sum |F_{H,calc} - |F_{PH,obs} - F_P|| / \sum |F_{PH,obs} - F_P|$ for centric terms only. Phasing power, r.m.s. heavy-atom structure factor/r.m.s. lack of closure. For other abbreviations, see Table 3.


refinement lowered the R-factor by 2.4% and improved the stereochemistry, decreasing the bond length deviation from ideality. The peptide carbonyl group of Gly28 automatically flipped by 180°. After the 1st round, subsequent refinement was carried out using conjugate gradient minimization, omitting the simulated annealing procedure. Rounds of automated refinement were alternated with rounds of manual intervention: $(3F_{obs} - 2F_{calc})$ and $(F_{obs} - F_{calc})$ maps with model phases were calculated and examined with interactive computer graphics. During the 2nd round isotropic temperature factors were refined with the temperature factor of each atom restrained to those of the atoms involved in the same bond or bond angle. In the 3rd round the resolution limit was extended from 2.7 Å to 2.5 Å and some of the solvent molecules included. At the 5th round an additional check was done by comparing the simulated annealing procedure with conventional crystallographic refinement: the R-factor dropped only from 22.8% to 22.5%. In the early stages no electron density was observed for some loops on the surface of the structure but during the course of the refinement more density appeared in the difference maps, allowing additional residues to be included in the model. It was found that the density for the very mobile loops was much clearer when lower resolution reflections were included (15 to 2.5 Å rather than 10 to 2.5 Å) in the map calculations.

Table 6
Crystallographic refinement

Round           Resolution (Å)   SA    Bref   R-factor (%)   Δbonds (Å)
Initial model   2.7              -     -      44.7           -
1               2.7              No    No     32.1           0.026
1A              2.7              Yes   No     29.7           0.023
2               2.7              No    Yes    27.1           0.025
3               2.5              No    Yes    26.2           0.022
4               2.5              No    Yes    23.9           0.023
5               2.5              No    Yes    22.8           0.025
5A              2.5              Yes   Yes    22.5           0.022
6               2.5              No    Yes    21.9           0.021
7               2.5              No    Yes    21.4           0.018
8               2.5              No    Yes    20.8           0.016

SA, simulated annealing; Bref, temperature factors refinement; Δbonds, bond length r.m.s. deviation from ideality.

3. Results

(a) Quality of the model

The crystallographic R-factor after eight rounds of refinement is 20.8% for the 7770 independent reflections between 10.0 and 2.5 Å resolution (Table 7). There is a large variation in the quality of the electron density map in different parts of the structure. A map calculated with coefficients $(3F_{obs} - 2F_{calc})$, using data between 10 and 2.5 Å, shows well-defined density for the hydrophobic core of the protein and the β-sheets. However, there are small regions in the loops connecting β-strands where the poor definition of the electron density has caused difficulties in the interpretation of the main-chain conformational angles. These regions correspond to sequences rich in hydrophilic charged residues (more often negatively charged glutamate and aspartate residues), in which a high degree of flexibility is to be expected; it is therefore not surprising that these external loops appear to be disordered. For two of these loops (residues 104 to 110 and 133 to 138) a map calculated using lower resolution terms (up to 15 Å) allowed the main chain to be traced, even if the reliability of the dihedral angles is rather low. This uncertainty in the main-chain conformation is also reflected in the unusually high temperature factors for both loops. To demonstrate that the electron density shown for those loops was not an artifact due to the bias of the model in the calculation of the phases, two segments corresponding to residues 104 to 110 and 133 to 138 were deleted. The resulting model was refined for a few cycles and the $(3F_{obs} - 2F_{calc})$ map calculated with all reflections between 15 and 2.5 Å. This map is shown in Figure 1 for the two regions corresponding to the missing loops. Despite small interruptions, the electron density follows rather closely the original atomic model. The region between residues 94 and 97 (loop B2-B3) is still uninterpretable. There is also no density for the last two amino acid residues (171 and 172) at the C terminus: repeated attempts to fit these two residues to small peaks appearing in $(F_{obs} - F_{calc})$ Fourier maps were unsuccessful. In contrast, the N terminus could be built

Table 7
Crystallographic R-factor versus resolution

Resolution (Å)   R-factor (%)
10.00-7.88       24.5
7.88-5.58        25.2
5.58-4.56        17.8
4.56-3.95        15.7
3.95-3.53        16.5
3.53-3.23        19.7
3.23-2.99        20.9
2.99-2.79        23.9
2.79-2.64        27.2
2.64-2.50        29.7
10.00-2.50       20.8
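A breakdown of the R-factor by resolution shell, as in Table 7, can be produced along the following lines. The text does not spell out the formula, so the sketch assumes the conventional definition $R = \sum ||F_{obs}| - |F_{calc}|| / \sum |F_{obs}|$; the function names, the tuple layout and the sample reflections are hypothetical, and only the shell boundaries are taken from Table 7.

```python
# Shell boundaries (angstroms) as listed in Table 7.
SHELL_EDGES = [10.00, 7.88, 5.58, 4.56, 3.95, 3.53, 3.23, 2.99, 2.79, 2.64, 2.50]


def r_factor(f_obs, f_calc):
    """Conventional crystallographic R-factor for paired amplitudes."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    return numerator / sum(abs(o) for o in f_obs)


def r_factor_by_shell(reflections, edges=SHELL_EDGES):
    """R-factor per resolution shell.

    `reflections` is a list of (d_spacing, f_obs, f_calc) tuples. A reflection
    is assigned to the shell with d_min <= d < d_max; empty shells are skipped.
    """
    results = []
    for d_max, d_min in zip(edges, edges[1:]):
        shell = [(fo, fc) for d, fo, fc in reflections if d_min <= d < d_max]
        if shell:
            f_obs, f_calc = zip(*shell)
            results.append(((d_max, d_min), len(shell), r_factor(f_obs, f_calc)))
    return results


# Hypothetical reflections: (d-spacing, F_obs, F_calc).
sample = [(8.5, 100.0, 80.0), (8.0, 50.0, 60.0), (3.0, 40.0, 44.0), (2.6, 20.0, 15.0)]
shells = r_factor_by_shell(sample)
overall = r_factor([fo for _, fo, _ in sample], [fc for _, _, fc in sample])
for (d_max, d_min), count, r in shells:
    print(f"{d_max:5.2f}-{d_min:4.2f}  n={count}  R={100.0 * r:.1f}%")
print(f"overall R = {100.0 * overall:.1f}%")
```

Note that the overall value is weighted by the amplitudes in each shell, so it is not a simple average of the per-shell R-factors.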


Figure 1. A stereoscopic electron density map calculated with coefficients $(3F_{obs} - 2F_{calc})$ (using all reflections between 15 and 2.5 Å) and phases from a model in which the residues 104 to 110 and 133 to 138 have been omitted for a few cycles of refinement. The map is contoured at the 1σ level. (a) The loop B3-B4 (residues 104 to 111). (b) The loop C1-C2 (including the disulphide bridge between Cys132 and Cys139).

unambiguously starting from the first residue (Val1). The side-chains of some hydrophilic residues on the surface of the protein are disordered and have been omitted from the model. The present model consists of 166 amino acid residues out of 172 and a total of 60 ordered water molecules. The stereochemistry of the model obtained is close to ideal geometry, with r.m.s. deviation of 0.016 Å in bond lengths and 3.27° in bond angles. The main-chain dihedral angles are shown in Figure 2 (Ramakrishnan & Ramachandran, 1965). All the non-glycine residues have dihedral (φ,ψ) angles that lie in (or close to) allowed regions of the Ramachandran plot.

(b) Overall architecture

The three-dimensional structure of Erythrina trypsin inhibitor shows the same fold that has been found in the crystal structure of STI (Blow et al., 1974; Sweet et al., 1974) and whose architecture has been described by McLachlan (1979). ETI is an all-β protein consisting of 12 antiparallel β-strands joined by long loops (Fig. 3). Six of the strands form a short antiparallel β-barrel

Structure

of Erythrina

r

Trypsin

161

Inhibitor (c) Secondary

-60 0

0 -o_ n -180 1 ?X, -0, , , El, ,’ 1 a t , , 1 1 , 1 ‘1’ -180 -120 -60 0 60 120 1

4’ Figure 2. Ramachandran plot of the main-chain dihedral angles for the final model. Squares denote glycine residues. The continuous lines enclose areas that are fully allowed conformational regions for T(V) of 110” and the broken lines show the area of acceptable van der Waals’ contacts for Z(P) of 115”.

whose shape is more similar to a cone than a cylinder, being narrower at one end (the “bottom”) and larger at the other end (the “top”). At the bottom of the barrel there are three loops joining adjacent strands which despite their length are well ordered. One of these contains the reactive site (the scissile bond Arg63-Ser64), while another is the N terminus. The C terminus is a short segment lying at the same end of the barrel so that the amino and carboxy termini emanate from two adjacent strands. The top of the barrel is closed by a “lid” consisting of the other six strands coupled in three pairs. These strands form a cap connected to the barrel by six loops extending outside, perpendicular to the axis of the barrel, giving the molecule a roughly spherical shape (dimensions 44 A x 40 A x 40 A). The electron density corresponding to the loops (with the exception of the ones at the bottom of the molecule) is often not well defined and there is a gap between residues 93 and 97, at the top of the molecule. For the ribbon diagrams presented in Figure 3 the main chain has been built for the four omitted residues, in order to give an overview of the topology. approximate 3-fold The molecule shows symmetry about the axis of the barrel, the repeating unit consisting of four sequential P-strands and the connecting loops. To underline this feature, the P-strands have been identified by the labels Al, A2, A3, A4, Bl etc., where A, B, C indicate the three different subdomains and the numbers correspond to topologically equivalent elements (Fig. 4). The same unusual topology has been found in the structure of interleukin-l/l (Priestle et aZ., 1988, 1989; Finzel et al., 1989) and of interleukin-la (Graves et al., 1990).

(c) Secondary structure elements

Residues have been assigned to secondary structure elements on the basis of hydrogen bonds that can be inferred from the structure (Kabsch & Sander, 1983). The main-chain hydrogen bonding pattern is shown in Figure 5 and the assignment of secondary structure elements is presented in Table 8. The principal structural features in ETI are antiparallel β-strands. The distinction between strands belonging to the barrel and strands belonging to the lid is quite clear, but the interactions between them are rather complicated. Although the β-barrel (strands A1, A4, B1, B4, C1, C4) looks rather regular, the interactions between strands from different subdomains (A4-B1, B4-C1, C4-A1) are always much stronger (from 6 to 8 hydrogen bonds) than the intrasubdomain contacts (A1-A4, B1-B4, C1-C4, typically 2 or 3 hydrogen bonds). In the C subdomain the exposed side of the first strand (C1) maintains a regular β-conformation by hydrogen bonding to the beginning of the C2 strand; a weaker interaction between B1 and B2 is also observed. In the lid the basic hydrogen bonding pattern is the antiparallel ladder between the second and the third strands (β2 and β3) of the same subdomain. The β2-strands of the B and C subdomains interact with the barrel as already described, while all the β3-strands (the more internal ones) form a triangular structure, whose edges are joined by hydrogen bonds (Fig. 6). The β-sheets show a strong right-handed twist, particularly for the strands β2 and β3. This is in agreement with the observation (Salemme, 1983) that

Table 8
ETI secondary structure

β-Strands   Residues
A1          15-20
A2          29-32
A3          42-45
A4          56-60
B1          73-77
B2          89-92
B3          100-103
B4          115-121
C1          126-132
C2          139-146
C3          152-156
C4          163-168

Loops       Residues    β-Turns
N-term      1-14        4-7, 11-14
A1-A2       21-28       22-26
A2-A3       33-41
A3-A4       46-55       48-51
A4-B1       61-72       68-71
B1-B2       78-88       81-84
B2-B3       93-99
B3-B4       104-114     109-112
B4-C1       122-125
C1-C2       133-138
C2-C3       147-151     147-150
C3-C4       157-162
C-term      168-170

S-S: Cys39, Cys83, Cys132, Cys139

Figure 3. Stereo representations of the overall fold of the molecule. The pictures have been generated using the program RIBBON (Priestle, 1988). (a) "Side" view which emphasizes the distinction between the 6-stranded β-barrel (at the bottom) and the "lid" (at the top). The pseudo 3-fold axis is vertical and corresponds to the axis of the barrel. (b) Viewing along the pseudo 3-fold axis from the top showing the triangular interactions of strands A3, B3 and C3. (c) Viewing along the axis from the bottom showing the interactions between the N terminus and the loop containing the reactive site.


topologically equivalent positions in the other subdomains. Six β-bulges can be found in the structure: the parameters are given in Table 11, using the nomenclature proposed by Richardson et al. (1978). The two β-bulges Gly7,Glu8;Asp4 and Gly150,Tyr151;Asp147 are very similar. The residues at positions X and 1 correspond to residues i and i+3, respectively, of a type I turn. They can also be described as β-bulge loops belonging to class 3 β-hairpins according to Milner-White & Poet (1986). The conformation of the bend is stabilized by the hydrogen bonds made by the aspartate carboxylate with the main-chain nitrogen atoms at positions 1 and 2 in the bulge. Asp4, Gly7 and Asp147 are conserved in all the homologous sequences (Table 1). The β-bulge Gly27,Gly28;Ser46 is similar to a classic bulge although residues at positions 1 and 2, both being glycine, show unusual main-chain dihedral angles for such a structure. Both Ser46 and Gly28 are conserved in four out of five homologous sequences (Table 1). In the bulge Gly13,Gly14;Ile58, Gly13 is at the same time the required glycine in position i+2 of a type II turn, a fairly usual structural feature (Richardson et al., 1978). All three homologous trypsin inhibitors (ETI, STI and WTI) have glycine residues at positions 13 and 14 (Table 1), suggesting that the conservation of the β-bulge could be important in maintaining the correct geometry of the reactive site loop. The region between Val120 and Ser127 shows a very complicated main-chain hydrogen bonding pattern that can be best described as a classic β-bulge (Val120,Ser121;Ser126), but in addition Ser126-O is hydrogen bonded to Val120-N and Ser121-N, while Ser121-O is hydrogen bonded to His125-N and Ser126-N. Residues 122, 123 and 124 are approximately in a helical conformation.
All the β-bulges are strategically located at the end of the β-sheets and their functional role is probably to help to provide the strong twist required by the barrel and the double-stranded β2-β3 sheets. At the bottom of the barrel, at the position where in subdomain A a β-bulge (Gly13,Gly14;Ile58) helps the A1 and A4 strands to diverge, in subdomain B there is a "pseudo"-bulge, where Phe117-N interacts with Asp71-O through the solvent molecule Wat227 and Phe117-O interacts with Asp70-N through Wat238. In spite of the fact that the interactions are mediated by the two water molecules, the dihedral

Figure 4. Nomenclature of the β-strands. The structure is seen from the side with the pseudo 3-fold axis vertical. The view is as in Fig. 3(a). The position of the scissile bond is indicated by an arrow.

double-stranded antiparallel sheets have a high flexibility and can twist in a manner which completely preserves the integrity of interchain hydrogen bonds. The average dihedral angles for the β-strands are reported in Table 9. Topologically equivalent strands show similar average dihedral angles. Strands β1 and β4 have average values in good agreement with those found in other refined structures, while the different values obtained for β2 and β3 result from the high degree of twisting. Several β-turns have been identified on the basis of the hydrogen bonding O(i)-N(i+3) (Crawford et al., 1973; Chou & Fasman, 1977). A complete list is given in Table 10 using the classification adopted by Richardson (1981). The polypeptide chain between residues 22 and 26, connecting the strands A1 and A2, can be described either as a double turn (2 overlapping type III turns) or as two turns of a 3₁₀ helix (Isogai et al., 1980). Unlike single bends, this kind of double bend does not reverse abruptly the direction of the chain. This feature is not present at

Table 9
Average (φ,ψ) for the β-strands

               β1             β2             β3             β4
            (φ)    (ψ)     (φ)    (ψ)     (φ)    (ψ)     (φ)    (ψ)
A          -104    141    -128    170    -101    137    -115    149
B          -118    150    -115    150    -100    144    -118    142
C          -117    168    -126    161     -95    128    -110    145
Average    -113    153    -123    160     -99    136    -114    145
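For readers who wish to reproduce main-chain dihedral angles of the kind averaged in Table 9, the following is a minimal sketch of the calculation from four atomic positions. It is an illustration written for this text, not the program used in the refinement; the function name `dihedral` and the test coordinates are hypothetical. The sign follows the usual IUPAC-IUB convention (0° for cis, 180° for trans, positive for a clockwise rotation when looking along the central bond). With this function, φ of residue i would be obtained from the atoms C(i-1), N(i), Cα(i), C(i) and ψ from N(i), Cα(i), C(i), N(i+1).

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Dihedral angle (degrees, range -180..180) defined by four points."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)          # central bond
    b2 = _sub(p3, p2)
    length = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / length for x in b1)
    # Components of the outer bonds perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))

# Hypothetical coordinates: a cis, a trans and a +90 degree arrangement.
cis = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
trans = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
quarter = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
```

Averaging such angles over a strand by a simple arithmetic mean is reasonable only when the values stay well away from the ±180° wrap-around, which is an assumption to check for ψ values close to 180°.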

Figure 5. A diagram showing the main-chain hydrogen bonding pattern of ETI. Residues belonging to the β-strands have been enclosed in rectangular boxes. Disulphide bridges are shown as wavy lines. The loop 124 to 126 and the regions between residues 143 to 155 and 163 to 169 are represented twice. Broken lines denote flexible regions of the polypeptide chain where the degree of disorder is too high to allow a definitive analysis. Some water molecules have also been included.

angles for Asp70, Asp71 and Phe117 are very close to the theoretical values for a G1 β-bulge and the conformations of the two structures are similar. There is good electron density for three intramolecular salt bridges (Glu37-Lys102, Glu59-Arg74 and Asp147-Arg153). Glu37, Lys102, Asp147 and Arg153 are conserved in most of the homologous sequences while the ion pair Glu59-Arg74 is present only in ETI (Table 1).

(d) Pseudo 3-fold symmetry

An interesting feature of the molecule is the presence of the internal pseudo 3-fold symmetry (McLachlan, 1979), with the 3-fold axis corresponding to the axis of the barrel. The repeating unit is a four-stranded motif consisting of about 60 amino acid residues, structurally organized as Lβ1Lβ2Lβ3Lβ4 (where L denotes a loop connecting

Figure 6. (a) A schematic diagram of the interactions between strands A3, B3, C3 and A2, B2, C2 at the top of the molecule and the role of Wat254 in stabilizing the tertiary structure. (b) Stereo view of the corresponding atomic model.

Figure 7. (a) Ribbon representation of the three subdomains of ETI shown in the same relative orientation. (b) Superposition of the Cα backbones of subdomains A, B, C in the same view (in stereo). Subdomain A is represented by the thinnest line and subdomain B by the thickest line. A star (*) indicates the position of the scissile bond within subdomain B.

Table 10
β-Turn classification

Sequence               φ(i+1)   ψ(i+1)   φ(i+2)   Type
D4-G5-X6-G7             -51      -34      -90     I
Q11-X12-G13-G14         -62      122       91     II
V22-W23-A24-Q25         -49      -52      -55     III
W23-A24-Q25-G26         -55      -23      -63     III
S48-E49-L50-S51         -53      -13      -92     I
P68-D69-D70-D71         -58      145       62     II
P81-S82-C83-A84         -45      -44      -69     I
A109-Q110-F111-D112     -64      -39     -101     I
D147-Q148-A149-G150     -54      -42      -77     I

ψ(i+2) [row alignment not recoverable; first entry illegible]: -; -23 -32 -5 15 -11 32 -13
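The turn types listed in Table 10 follow the classification of Richardson (1981). As a rough illustration of how such an assignment can be automated, the sketch below picks the type whose ideal (φ,ψ) values at positions i+1 and i+2 lie closest to the observed ones. This nearest-ideal rule is an assumption made here for illustration and is not the procedure used in this work. The ideal angles are the commonly tabulated ones for types I, II and III, the mirror-image types I′ and II′ are omitted for brevity, and the angles in the usage lines are illustrative.

```python
# Commonly tabulated ideal angles: (phi(i+1), psi(i+1), phi(i+2), psi(i+2)).
IDEAL_TURNS = {
    "I":   (-60.0, -30.0, -90.0,   0.0),
    "II":  (-60.0, 120.0,  80.0,   0.0),
    "III": (-60.0, -30.0, -60.0, -30.0),
}

def angle_diff(a, b):
    """Smallest signed difference a - b between two angles, in degrees."""
    d = (a - b) % 360.0
    return d - 360.0 if d > 180.0 else d

def classify_turn(phi1, psi1, phi2, psi2):
    """Return the turn type whose ideal angles are nearest (least squares)."""
    observed = (phi1, psi1, phi2, psi2)

    def score(ideal):
        return sum(angle_diff(o, i) ** 2 for o, i in zip(observed, ideal))

    return min(IDEAL_TURNS, key=lambda name: score(IDEAL_TURNS[name]))

# Illustrative angles only.
type_a = classify_turn(-62.0, 122.0, 91.0, 0.0)
type_b = classify_turn(-49.0, -52.0, -55.0, -32.0)
type_c = classify_turn(-58.0, -44.0, -101.0, -13.0)
```

Types I and III differ only at position i+2, so turns with intermediate angles can fall on either side of the boundary; a visual check of the hydrogen bonding remains advisable.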


Table 11
β-Bulge classification

X       1       2       φ(X)   ψ(X)   φ(1)   ψ(1)   φ(2)   ψ(2)   Type
D4      G7      E8       -87    175     79     15    -95    167   G1
I58     G13     G14     -118    132     91     -4    -71    160   G1
S46     G27     G28      -79    141   -105   -168    -75   -176   Classic
A32     L41     T42     -152    164   -122     -5    -87    132   Classic
S126    V120    S121    -109    155   -102    -50   -148    135   Classic
D147    G150    Y151     -74    178    108      5    -78    134   G1
F117    D70     D71     -107    142     62     15    -78    160   Pseudo G1

consecutive β-strands). Figure 7(a) shows the three subdomains in the same orientation. Each subdomain starts with a long coiled loop that eventually enters the barrel with strand β1, goes into the lid with the double-stranded antiparallel β-sheet made up by strands β2 and β3, and then comes back to the barrel with strand β4. Thus the six-stranded barrel consists of the β1 and β4 strands (A1, A4, B1, B4, C1, C4); strands β1 and β4 from the same subdomain are adjacent, where strand β1 is hydrogen-bonded to strand β4 of the previous subdomain and strand β4 is hydrogen-bonded to strand β1 of the following one. The N terminus, the loop between A4 and B1, and the shorter loop between B4 and C1 lie at the bottom of the barrel. The six loops between β1 and β2 and between β3 and β4 form an

equatorial "belt" around the hydrophobic core of the protein and the other three loops at the top connect the pairs of strands β2 and β3. There is high structural similarity between the three subdomains (Fig. 7(b)). A superposition between topologically equivalent residues has been made for each pair of subdomains and the difference in the Cα position is presented in Table 12. From a qualitative point of view the A and B subdomains look more similar: the length of the structural elements is more or less the same and the loops meander in a very similar way, while the C subdomain has longer β-sheets and shorter loops. At the bottom of the barrel, residues 1 to 6 of the N terminus appear to substitute topologically for the short loop between B4 and C1 (Fig. 8).

Table 12
Superposition of the three subdomains

A       B       Separation (Å)
G13     D70     1.18
G14     D71     0.91
T15     K72     0.76
Y16     V73     0.89
Y17     R74     0.67
L18     I75     0.35
L19     G76     1.85
P20     F77     1.67

G28     W88     1.01
V29     W89     0.79
Q30     T90     0.83
L31     V91     1.80

L41     L99     0.70
T42     S100    1.22
V43     V101    1.01
V44     K102    0.70
Q45     L103    1.56

I56     F115    0.62
R57     K116    0.91
I58     F117    0.64
E59     E118    0.97

21 Cα's, r.m.s. = 1.08

B       C       Separation (Å)
V73     Y127    0.98
R74     K128    0.26
I75     L129    0.59
G76     L130    0.38
F77     Y131    1.34

P87     A140    1.03
W88     S141    0.46
W89     I142    0.31
T90     G143    0.32
V91     I144    0.88
V92     N145    1.36
A93     R146    0.55

G98     Y151    1.19
L99     R152    0.28
S100    R153    0.52
V101    L154    0.49
K102    V155    0.33
L103    V156    0.21
A104    T157    0.54

F115    V164    1.28
K116    V165    1.42
F117    L166    0.49
E118    K167    0.42
A119    K168    0.51
V120    D169    0.68

25 Cα's, r.m.s. = 0.85

A       C       Separation (Å)
T15     S126    0.94
T16     Y127    0.69
Y17     K128    1.08
L18     L129    1.32

G27     A140    1.02
G28     S141    1.11
V29     I142    0.78
Q30     G143    0.30
L31     I144    0.66
A32     N145    0.39

L41     R152    0.56
T42     R153    0.63
V43     L154    0.85
V44     V155    0.52

P55     T163    0.93
I56     V164    1.00
R57     V165    1.03
I58     L166    0.76
E59     K167    0.73
S60     K168    0.36

20 Cα's, r.m.s. = 0.78
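The per-residue separations and the r.m.s. values of Table 12 are related in a simple way, which the sketch below makes explicit. It assumes that the two sets of Cα coordinates have already been brought into a common frame by a least-squares superposition, a step not shown here. The function names and the coordinates are hypothetical and serve only to illustrate the arithmetic.

```python
import math

def ca_separations(coords_a, coords_b):
    """Distance between each pair of equivalent C-alpha atoms (same order)."""
    if len(coords_a) != len(coords_b):
        raise ValueError("the two coordinate lists must have the same length")
    return [math.dist(a, b) for a, b in zip(coords_a, coords_b)]

def rms(values):
    """Root-mean-square of a list of separations."""
    return math.sqrt(sum(v * v for v in values) / len(values))

# Hypothetical, already superposed coordinates (Angstroms).
subdomain_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
subdomain_b = [(0.0, 0.0, 1.0), (3.8, 0.0, 0.0), (7.6, 2.0, 0.0)]

separations = ca_separations(subdomain_a, subdomain_b)
overall = rms(separations)
```

Because the r.m.s. weights large separations more heavily than small ones, a few poorly matching residues at the ends of the strands can dominate the overall figure.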

Figure 8. (a) A schematic diagram of the interactions occurring at the bottom of the barrel between the N terminus and the reactive site loop showing the role of Wat220. (b) Stereo view of the corresponding atomic model.

It has been suggested (McLachlan, 1979) that this fold could be the result of gene triplication. The alignment of topologically equivalent fragments does not show any internal sequence homology: not a single amino acid is conserved in all the three subdomains. In the case of interleukin-1β, Finzel et al. (1989) observe that in the DNA sequence the two introns do not have a topologically equivalent position, and in particular are not located at the boundary of the subdomains.


(e) Solvent structure

Sixty solvent molecules have been assigned to peaks bigger than 3σ in the (Fobs - Fcalc) map when the putative water molecule could make good hydrogen bonds to the protein. Some of the water molecules play an important role in maintaining the overall structure, being strategically located where the pairing of two antiparallel β-strands comes to an end. The water molecule thus helps to mediate the interactions at the end of the ladder, allowing the strands to diverge without disrupting the favourable hydrogen bonding pattern of the β-sheet. In some cases the interactions mediated by the solvent molecule involve three strands: for example, Wat234 interacts with Leu129(C1), Ile142(C2) and Leu162(C4), where strands C1 and C4 diverge and the pairing of C1 and C2 begins. In the same way Wat254 is at the centre of the triangular structure made up by strands A3, B3, C3, being almost located on the 3-fold axis (Fig. 6). On the bottom side of the molecule a similar triangular hydrogen bonding pattern is due to the residues from 2 to 4, 8 to 10, 65 to 67 (Fig. 8). Wat220 at the centre has the same structural role. Similar "structural" water molecules have been observed in the X-ray crystal structures of interleukin-1β.

(f) Atomic mobility

The average isotropic atomic temperature factor for ETI (32.4 Å² for main-chain atoms, 59.7 Å² for all protein atoms) is unusually high, a characteristic that is often associated with crystals containing a high proportion of solvent. When a plot of the average main-chain temperature factor for each residue is compared with the position of secondary structure elements (Fig. 9), a strong correlation between residues belonging to the β-strands and

Figure 9. Average atomic temperature factors (B) for the main-chain atoms of each amino acid residue. The horizontal bars denote the position of the β-strands.

those having low temperature factors is observed. The effect of crystal packing could explain some of the differences observed in the degree of thermal motion or disorder of the loops. Most of the intermolecular contacts involve the A subdomain and the bottom of the molecule: this is clearly reflected in the trend of the B-factors. In addition the N terminus and the loop A4-B1 are involved in the hydrogen bonding network shown in Figure 8, which stabilizes the structure. In contrast, the loops in the B and C subdomains are generally exposed to the solvent and show a high degree of flexibility. The remarkably high values (up to 100 Å²) that occur for some of the residues in loops B3-B4 and C1-C2 need to be discussed briefly. As previously mentioned these regions show an unusual degree of disorder and a model could be built only when low resolution reflections were added. An additional check was done by calculating a map using phases with no bias from the model loops (Fig. 1). Similar observations of unusually large temperature factors have been reported for the structure of

Table 13
Reactive site loop conformation for various proteinase inhibitors

             P4     P3     P2     P1     P1′    P2′    P3′
ETI          Ser    Arg    Leu    Arg    Ser    Ala    Phe
STI          Pro    Ser    Tyr    Arg    Ile    Arg    Phe
PSTI         Gly    Cys    Pro    Lys    Ile    Tyr    Asn
CI-2         Ile    Val    Thr    Met    Glu    Tyr    Arg
Eglin c      Pro    Val    Thr    Leu    Asp    Leu    Arg
PTI          Gly    Pro    Cys    Lys    Ala    Arg    Ile

             P4 φ/ψ      P3 φ/ψ      P2 φ/ψ      P1 φ/ψ      P1′ φ/ψ     P2′ φ/ψ      P3′ φ/ψ
ETI          -86/171     -69/-28     -115/149    -63/136     -170/161    -79/-34      -118/178
STI          -58/127     -40/-39     -65/141     -93/47      -93/123     -165/-178    8149
OMSVP3       -149/163    -133/155    -87/174     -96/9       -58/139     -99/93       -130/69
PSTI/Tg      -162/166    -119/151    -61/147     -95/45      -92/145     -99/101      -139/73
CI-2         -90/117     -103/166    -77/134     -63/27      -91/128     -93/117      -125/115
CI-2/Subt    -93/140     -133/166    -64/147     -103/34     -91/146     - lOSjlO9    -118/113
EG/Cht       -58/145     -142/163    -79/157     -102/50     -104/162    -161/117     -132/128
EG/Subt      -71/139     -141/165    -58/142     -116/47     -98/169     -1lSjlOS     -121/112
PTI          92/178      -87/-S      -82/160     -120/34     -88/179     -132/84      -118/120
PTI/Tryp     78/178      -77/-32     -70/154     -120/49     -94/170     -112/79      -105/120

OMSVP3, Bode et al. (1985); PSTI/Tg, Bolognesi et al. (1982); CI-2, McPhalen & James (1987); CI-2/Subt, McPhalen et al. (1985b); EG/Cht, Bolognesi et al. (personal communication); EG/Subt, Bode et al. (1987); PTI, Deisenhofer & Steigemann (1975).


interleukin-1β and it has been proposed that the flexibility of the external loops may have a functional role (Finzel et al., 1989).

4. Discussion

(a) The conformation of the scissile bond

The scissile bond (Arg63-Ser64) is located on an external loop (A4-B1) that protrudes from the surface of the molecule. There is good electron density for the entire loop and the refined temperature factors are much lower than those of other external loops, although higher than those of the residues in the core of the molecule (Fig. 9). The structure of the loop is not constrained by secondary structure elements or disulphide bridges that could limit its conformational freedom, as often observed in the structures of proteinase inhibitors, but is stabilized by hydrogen bonds. The hydrogen bonding pattern is shown in Figure 10. Particularly important are the hydrogen bonds made by the side-chain amide group of Asn12 with the carbonyl oxygen atom of Leu62 and the nitrogen atom and hydroxyl group of Ser60. The region between residues 65 to 67, adjacent to the scissile bond on the P′ side (according to the nomenclature introduced by Schechter & Berger, 1967) interacts with residues 1 to 4 and 8 to 12; the three segments are at the edge of a triangular face with a water molecule in the centre (Fig. 8). At the corners of such a triangle the three segments are held together by antiparallel β-type hydrogen bonds.

(b) Comparison with soybean trypsin inhibitor

The structure of the complex between STI and porcine trypsin has been determined at 2.6 Å resolution (Sweet et al., 1974; Blow et al., 1974). Since the electron density in some regions was poor, the interpretation of the inhibitor was not complete. A partial model comprising residues 1 to 93 was built and refined with a certain degree of confidence; only a Cα tracing is available for the segments 94 to 106 and 130 to 176, while another stretch of density was tentatively assigned to residues 116 to 122. Despite these difficulties, the topology of the model is clear. ETI shows a similar topology, as expected from the high homology shown by the amino acid sequences of the two inhibitors: 83 Cα positions can be superposed in the two structures with a r.m.s. deviation of 1.34 Å (Fig. 11). Most of the residues with a functional role in the structure, such as those involved in β-turns and β-bulges, are conserved in STI: in particular Gly7 (residue 1 in a G1 bulge) and Gly13 (residue i+2 in a type II turn associated with a G1 bulge). A deletion in STI occurs near position 27 and modifies a double bend into a single one (Table 1). The STI model is not of sufficient quality to allow an analysis of secondary structure elements in terms of hydrogen bonds, but it is obvious from a qualitative analysis that the β-sheets are well conserved while some of the loops are quite different. For example the loops B1-B2 and C1-C2 are much shorter in ETI. In particular the loop between residues 61 and 67 (where the scissile bond is located) is displaced, moving the reactive site of ETI by about 4 Å with respect to STI (Fig. 12). It is difficult to decide if this should be regarded as an intrinsic feature of ETI, which may have a functional role, if it results from the fact that the structure of STI has been determined in the complex with trypsin or if it is due to packing interactions. The differences in conformation at the reactive site do not disappear even if a more local superposition of atoms is performed. When the Cα atoms of residues 60 to 65 (P4 to P2′) are overlapped the r.m.s. difference is 0.64 Å.
It is of possible significance that many of the intermolecular contacts in the crystal lattice involve the scissile bond region (residues 59 to 63) and the loop 20 to 23 of a symmetry-related molecule. The guanidinium group of Arg63 makes strong hydrogen bonds with the carbonyl of Pro20 on an adjacent molecule, while the hydrophobic segment of the arginine side-chain is packed against the aromatic group of Trp23 (Fig. 13). The possibility that the different conformation observed in ETI could be due to the crystal packing has to be taken into account. Results from other structural studies of proteinase inhibitors support the hypothesis that the difference in the conformation of the reactive site loop in STI and ETI depends on the fact that the former structure has been determined in the complexed form. To a first approximation the binding of the inhibitor to a proteinase occurs with a key-lock mechanism, implying a reactive site loop rigidly clamped in a conformation complementary to the enzyme active site. However, comparisons between free and enzyme-bound inhibitor structures as well as structures of inhibitors complexed with different proteinases reveal a conformational flexibility in the reactive site loop. For example, when the structures of members of the pancreatic secretory trypsin inhibitor (PSTI) family (OMJPQ3, Papamokos et al., 1982; OMSVP3, Bode et al., 1985) are overlapped with the homologous complexed inhibitors (PSTI : trypsinogen, Bolognesi et al., 1982; OMTKY3 : α-chymotrypsin, Fujinaga et al., 1987) the major differences are found in the active site loop, whereas the core of the molecule is structurally highly conserved. These differences can be better described as "hinge bending" of the loop with respect to the serine proteinase structure. Despite this, the overlap is very good when only the reactive site residues are considered. Analogous observations have been reported by McPhalen & James (1987) when comparing the free and complexed forms of the chymotrypsin inhibitor-2 (CI-2, which belongs, with eglin c, to the potato inhibitor-1 (PI-1) family).

Figure 10. A stereo view of the conformation of the reactive site loop in ETI showing the hydrogen bonding pattern. The scissile bond is Arg63-Ser64. The conserved residue Asn12 has a key role in maintaining the 3-dimensional structure of the loop.

Figure 11. Stereoscopic superposition of the Cα backbone of ETI (thick line) and STI (thin line) in the same view as Fig. 4. The reactive site loop is at the bottom left of the Figure.

Figure 12. Difference in the relative position of the reactive site loop when the crystal structures of ETI (thick line) and STI (thin line) are overlapped as in Fig. 11 (in stereo). The position of the N terminus, Asn12 and the scissile bond (Arg63-Ser64) are indicated.

    Figure 13. (SF,,,,-Zp