J. Mol. Biol. (1978) 121, 339-356
Structure of Prealbumin: Secondary, Tertiary and Quaternary Interactions Determined by Fourier Refinement at 1.8 Å

C. C. F. BLAKE, M. J. GEISOW†, S. J. OATLEY
Laboratory of Molecular Biophysics, Department of Zoology, South Parks Road, Oxford OX1 3PS, England

B. RÉRAT AND C. RÉRAT
Laboratoire de Cristallographie, CNRS, Laboratoires de Bellevue, 1 Place Aristide Briand, Bellevue, France

(Received 20 December 1977)
The principal elements of the secondary, tertiary and quaternary structure of the tetrameric human plasma prealbumin molecule have been determined by Fourier refinement of X-ray diffraction data at 1.8 Å resolution. The subunit has an extensive β-structure composed of eight strands organised into two four-stranded sheets. There is also one short α-helix. The tertiary structure is largely determined by the association of the two β-sheets. Important contributions to the tertiary structure are made by three tyrosines and one aspartic acid involved in side-chain-main-chain interactions; a buried histidine associated with a group of internal water molecules; and a compact cluster of seven aromatic residues. Quaternary interactions occur at two sets of interfaces closely organised around two of the three molecular 2-fold axes. The exclusive monomer-monomer interface is chiefly concerned with antiparallel hydrogen bond interactions which extend the two four-stranded sheets in the monomers to eight-stranded sheets in the dimer. One of the sheet interactions includes water molecules and tyrosine hydroxyls in the hydrogen bond pattern. The dimers associate through both hydrophilic and hydrophobic interactions at interfaces that involve all four monomers.
1. Introduction

Human prealbumin is a plasma protein that is known to interact with the thyroid hormones (Robbins & Rall, 1960), and with retinol-binding protein (Kanai et al., 1968), the carrier protein for vitamin A. Preliminary X-ray studies of the crystalline protein (Blake et al., 1971a,b) have shown that the prealbumin molecule is a symmetrical tetramer of 55,000 molecular weight, exhibiting 222 symmetry. Extension of the X-ray studies to 2.5 Å resolution (Blake et al., 1974) revealed that the subunits had an extensive β-sheet structure, and were compactly arranged around a well-defined channel that runs through the molecule, in which are located two equivalent hormone-binding sites. A detailed description of these binding sites, and of a further

† Present address: National Institute for Medical Research, The Ridgeway, Mill Hill, London.
The primary structure of human prealbumin has been reported independently in the form of a tentative overlap of a complete set of peptides (Gonzalez, 1972; Gonzalez & Offord, 1973), and as a complete amino acid sequence (Kanda et al., 1974). These data have been incorporated in the X-ray model determined at 2.5 Å resolution, which was then used as the basis for an extensive Fourier refinement against a complete set of X-ray data to 1.8 Å resolution (the practical limit of the diffraction pattern). This refinement has confirmed the sequence of Kanda et al. (1974), and has established all the major details of the molecular structure, which we now describe here.

One feature of the prealbumin molecule that is particularly relevant to the present work is its exceptional resistance to denaturation. The internal organisation of the molecule does not appear to change between pH 3.5 and pH 12, nor is it dissociated into subunits by strong acid, alkali or 0.1% sodium dodecyl sulphate (Branch et al., 1971); it appears to be totally resistant to urea, and at neutral pH guanidinium hydrochloride concentrations greater than 6 M are required to denature the molecule at a significant rate (Branch et al., 1972; Nilsson et al., 1975). It is therefore interesting to see if it is possible to account qualitatively for this extreme stability in terms of the observed tertiary and quaternary interactions in the molecular structure.
2. Experimental

The calculation of the electron density map at 2.5 Å resolution, phased by 3 isomorphous derivatives, has been described earlier (Blake et al., 1974). The map was interpreted by constructing a model from Kendrew model parts (scale 2 cm = 1 Å), produced by Cambridge Repetition Engineers, using an optical comparator (Richards, 1968) modified for a vertical mirror. There are 2 prealbumin monomers in the crystallographic asymmetric unit, related by a local 2-fold axis inclined at about 4.5° to the crystal y-axis. The electron density map was not averaged about this axis, the 2 monomers being constructed independently. Although the map was not of high quality, it allowed the polypeptide chain to be traced unambiguously, as described previously (Blake et al., 1974), except that very little density was seen for the first 9 residues in each monomer. Amino acid sequence information was provided prior to publication by Drs Gonzalez and Offord and Drs Canfield and Goodman. Although the side-chain density was not particularly well-defined, it allowed the sequence data to be incorporated unambiguously into the model, and also enabled clarification of some less securely assigned regions of the sequence. At a later stage we discovered that the relatively poor quality of the map interpreted was partially due to a computing error; however, the correctly calculated map confirmed that no gross errors had occurred in the original interpretation.

Atomic co-ordinates of all atoms except those of the first 9 residues in each monomer were measured from the model. After checking, they were adjusted by the method of Dodson et al. (1976) to give a model in which bond lengths and angles were constrained to agree closely with standard values.
Following this treatment, the co-ordinates were plotted back on to a small scale (1 cm = 2 Å) version of the map in an attempt to eliminate some errors in depth perception of which we had been aware while working with the comparator. We found this small scale to be so useful that all further operations were carried out using such maps.

Although the corrected, constrained co-ordinates afforded a useful overall description of the molecule, they clearly needed considerable improvement if a reasonable understanding of the extreme stability of the molecule, and a detailed description of its biologically important ligand interactions were to be obtained. We therefore decided to
refine the co-ordinates against X-ray data collected to the practical limit of resolution of the diffraction pattern, about 1.8 Å.

A new set of X-ray intensity data was collected between 2.6 Å and 1.8 Å resolution on the 5-counter, 5-circle diffractometer (Banner et al., 1977). Only one prealbumin crystal was used to collect the 33,800 intensities, requiring an X-ray exposure of 380 h. Over this period of time, the decline in intensities due to radiation damage was only 10%, for which the appropriate correction was applied. Each quintuplet of reflections was scanned in 20 steps of 0.05°. The individual step counts were output to magnetic tape for subsequent analysis by a profile fitting method developed by Dr G. S. French in this laboratory (French, 1975). The usual absorption correction (North et al., 1968) and Lorentz-polarisation corrections were then applied. The resulting structure amplitudes were scaled to the existing 2.5 Å resolution data to give a complete data set to 1.8 Å resolution.

The crystallographic refinement (Oatley, 1976) was carried out on the Oxford University ICL 1906A computer using a program system initially developed by E. J. Dodson for the refinement of insulin. Shifts in atomic co-ordinates and temperature factors were estimated automatically from the difference maps, calculated with coefficients $(|F_o| - |F_c|)\exp(i\alpha_c)$, and the resultant co-ordinates regularized by the method of Dodson et al. (1976). From time to time residues or groups of residues which had behaved unsatisfactorily were omitted from the structure factor calculation and the resulting difference maps examined visually. These maps were mostly very clear and where necessary the region in question was rebuilt and the refinement continued. Visual examination of difference maps also enabled the identification of solvent molecules and their incorporation into the refinement.
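As an illustration of the difference-map coefficients used in the refinement, the following is a minimal sketch of how one such coefficient can be formed for a single reflection. The function name, its parameters and the numerical values are hypothetical and are not taken from the program system used in this work; the phase is assumed to be supplied in degrees.

```python
import cmath
import math


def map_coefficient(f_obs, f_calc, alpha_calc_deg, n_obs=1, n_calc=1):
    """Return (n_obs*|Fo| - n_calc*|Fc|) * exp(i*alpha_c) for one reflection.

    f_obs, f_calc  : observed and calculated structure amplitudes
    alpha_calc_deg : calculated phase in degrees (assumed unit)
    n_obs, n_calc  : integer weights; (1, 1) gives an ordinary difference map
    """
    amplitude = n_obs * abs(f_obs) - n_calc * abs(f_calc)
    return amplitude * cmath.exp(1j * math.radians(alpha_calc_deg))


# Hypothetical reflection: |Fo| = 120, |Fc| = 100, calculated phase 90 degrees.
difference = map_coefficient(120.0, 100.0, 90.0)        # (|Fo| - |Fc|) exp(i alpha_c)
weighted = map_coefficient(120.0, 100.0, 90.0, 3, 2)    # (3|Fo| - 2|Fc|) exp(i alpha_c)
```

Summing such coefficients over all reflections in a Fourier synthesis gives the corresponding map; the synthesis itself is not shown here.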
A few preliminary cycles of refinement were carried out at 2.5 Å in order to remove the larger errors before the introduction of the 1.8 Å resolution data. The incorporation of these data improved the reliability of the calculated shifts and was especially valuable in facilitating rebuilding of doubtful regions. The starting model had a conventional R value of 0.532 for the 8500 terms to 2.5 Å resolution. This has now been reduced to 0.289 for all 23,000 terms to 1.8 Å resolution or 0.275 if the 10 Å shell is omitted. This calculation includes 102 solvent molecules and 1766 protein atoms, that is, all protein atoms except those of residues 1 to 9 in each monomer, for which no density has appeared during the course of refinement, and of residues 124 to 127 whose exact conformation remains uncertain. The refinement is being continued and at its conclusion will be described elsewhere. However, we feel that the present stage enables a reliable description of the major details of the molecule to be given. As an indication of the present position, some sections of the map calculated using coefficients $(3|F_o| - 2|F_c|)\exp(i\alpha_c)$ are illustrated in Figure 1. The current co-ordinate set has been deposited with the Protein Data Bank at Brookhaven National Laboratory, Upton, L.I., New York 11973, U.S.A., from whom copies are available.
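The R values quoted above can be understood from the following minimal sketch, which assumes the conventional definition R = Σ||Fo| − |Fc|| / Σ|Fo| summed over all reflections. The function name and the four amplitudes are hypothetical illustrations, not data from this study.

```python
def r_value(f_obs, f_calc):
    """Conventional crystallographic R value for paired amplitude lists.

    Assumes R = sum(| |Fo| - |Fc| |) / sum(|Fo|), with both lists on the
    same scale and in the same reflection order.
    """
    if len(f_obs) != len(f_calc):
        raise ValueError("observed and calculated lists differ in length")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    if denominator == 0:
        raise ValueError("sum of observed amplitudes is zero")
    return numerator / denominator


# Hypothetical amplitudes for four reflections.
observed = [100.0, 50.0, 25.0, 25.0]
calculated = [90.0, 60.0, 20.0, 30.0]
r = r_value(observed, calculated)  # (10 + 10 + 5 + 5) / 200
```

On this definition, a fall from 0.532 to 0.289 means that the summed amplitude discrepancy dropped from about half to under a third of the summed observed amplitudes.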
3. Secondary Structure
The amino acid sequence of prealbumin as determined by Goodman, Canfield and their colleagues (Kanda et al., 1974) is shown in Figure 2. The polypeptide chain conformation of the prealbumin dimer is illustrated as a stereo drawing of the α-carbon positions in Figure 3. We have illustrated the dimer because it represents the crystallographic asymmetric unit, the two monomers of which have been treated independently throughout the interpretation and refinement processes. Also, as described later, this dimer involves a very close and extensive interaction between monomers, suggesting that it, rather than the monomers themselves, is the basic unit of structure in the tetrameric molecule. It can be seen in Figure 3 that the polypeptide chain is mostly in extended conformations. The dominant secondary structural element is the β-pleated sheet: just under half (45%) of the amino acid residues are organized into eight β-strands,
(Sequence diagram of residues 1 to 127, annotated with the disordered N-terminal residues, β-strands A to H, the α-helix and the β-bends.)

FIG. 2. The amino acid sequence of human prealbumin as determined by Kanda et al. (1974), showing the location of the major elements of secondary structure. The extent of the β-strands has been conservatively defined on the basis of appropriate φ, ψ values and hydrogen bonding to neighbouring strands. The β-bends shown in this diagram fulfil only the criterion that the distance Cα(i) to Cα(i+3) is less than 5.7 Å; those that in addition fulfil other criteria are mentioned in the text. The singly underlined residues are located in the core of the subunit; those doubly underlined are located in the central channel of the tetramer.
called A to H, and A′ to H′ in the second monomer, according to the order they occur in the chain (Fig. 2). Seven of the strands, A, B, C, E, F, G and H, are six to nine residues long but strand D is irregular and makes only three β-type hydrogen bonds. The strands are organised into two β-sheets, separated by about 10 Å, one composed of strands DAGH and the other CBEF. The hydrogen bond arrangement within the sheets is shown schematically in Figure 4. Figure 5 shows the arrangement of the sheets in the monomer and dimer, from which it is clear that they form much of the surface of the subunits, making prealbumin a classic "β-barrel" or "all-β" protein. All the strand interactions are antiparallel with the exception of that between strands A and G. This single parallel interaction is unique among the all-β proteins whose structures are currently known (Levitt & Chothia, 1976).

FIG. 1. A few sections of an electron density map of prealbumin at 1.8 Å resolution, showing part of the main-chains of the DAGHH′G′A′D′ sheet. The main-chain carbonyl groups, pointing alternately to either side of the β-strands, can be clearly seen. This map was calculated from the current set of atomic co-ordinates using $(3|F_o| - 2|F_c|)\exp(i\alpha_c)$ as Fourier coefficients, i.e. it corresponds to adding twice the difference map coefficients $(|F_o| - |F_c|)\exp(i\alpha_c)$ to the "calculated" map coefficients $|F_o|\exp(i\alpha_c)$.

FIG. 3. Stereo drawings of 2 prealbumin subunits, (a) related about the molecular y axis, with z across and z down, and (b) related about the molecular z axis with z across and y down. The figures in the circles correspond to the α-carbon positions of the residues numbered in Fig. 2.

The overall shape of the monomer can be approximated by a truncated cone about 45 Å long whose diameter decreases from 30 Å to 20 Å, with the β-strands running roughly parallel to the cone axis. The β-sheets display the twisting usually observed in such structures that reduces the non-bonded interactions between adjacent carbonyl groups and residue side-chains (Chothia, 1973). Viewed normal to the direction of the polypeptide chains, the edge strands of the two sheets (D and H, C and F) are twisted by about 60° with respect to one another, i.e. within each sheet, each strand is twisted by about 20° relative to its adjacent strands. However, the two pairs of strands H and H′, and F and F′, that form part of the monomer-monomer interface (see below) are not
appreciably twisted relative to each other. The twisting is apparent in the mean main-chain torsion angles given in Table 1: the mean φ and ψ angles are −123° and 135°, respectively, compared with the values for a straight strand of −139° and 135° (Arnott et al., 1967) and the mean values for a twisted strand of −120° and 140° found in a survey of protein torsion angles by Pohl (1971).

Residues 75 to 83 form the single α-helical segment in the prealbumin monomer, whose location at the end of β-strand E can be seen in Figures 3 and 4. It is striking that the transition from sheet to helix can be made in a single peptide: peptide 73-74 makes hydrogen bonds of the normal β-type to the two neighbouring strands in the sheet and peptide 75-76 makes a normal hydrogen bond within the helix, leaving only peptide 74-75 unable to hydrogen bond within either structure. As has been observed in a number of other proteins, a hydroxy amino acid (in this case Thr75) forms a side-chain-main-chain hydrogen bond with an NH group in the first turn of the helix. Despite the presence of only nine residues in the helix, it is perfectly regular as shown by the good agreement between its mean φ and ψ values of −61° and −41°, respectively (Table 1), and those of the regular α-helix (φ = −57°; ψ = −47°) (Arnott & Dover, 1967).

TABLE 1
Mean torsion angles in the secondary structural features

(Mean φ and ψ values, with their deviations, tabulated for the β-strands, including sheet DAGH, and for the α-helix.)
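For readers who wish to reproduce main-chain torsion angles of the kind summarised in Table 1 from atomic co-ordinates, the following is a minimal sketch of the calculation of a torsion angle from four atomic positions. It assumes the usual sign convention (a clockwise rotation of the far bond, viewed along the central bond, is positive) and is not the procedure used in this work; the function names and test co-ordinates are hypothetical. For φ the four atoms would be C(i−1), N(i), Cα(i), C(i), and for ψ they would be N(i), Cα(i), C(i), N(i+1).

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def torsion_angle(p1, p2, p3, p4):
    """Torsion angle in degrees (range -180 to 180) about the p2-p3 bond."""
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    b3 = _sub(p4, p3)
    n1 = _cross(b1, b2)          # normal to the plane of p1, p2, p3
    n2 = _cross(b2, b3)          # normal to the plane of p2, p3, p4
    x = _dot(n1, n2)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    return math.degrees(math.atan2(y, x))


# Hypothetical test co-ordinates (arbitrary units).
cis = torsion_angle((1, 0, 0), (0, 0, 0), (0, 1, 0), (1, 1, 0))
trans = torsion_angle((1, 0, 0), (0, 0, 0), (0, 1, 0), (-1, 1, 0))
quarter = torsion_angle((1, 0, 0), (0, 0, 0), (0, 1, 0), (0, 1, 1))
```

Angles close to ±180° should be averaged with care, since a simple arithmetic mean is misleading across that discontinuity; the β-strand and helix values discussed here lie well away from it.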
The remaining residues in the monomer are located in the seven loops which link the eight β-strands together, plus a ten-residue N-terminal "head" and a five-residue C-terminal "tail". The first nine residues were not marked by any significant density in the initial isomorphous electron density map, nor has any density for them appeared in the difference maps calculated at various stages of the refinement, throughout which they have been permanently excluded from the structure factor calculation. We must conclude that this part of the polypeptide chain is disordered in the prealbumin molecule. The last four residues in the chain, although not as disordered as the first nine, are not yet sufficiently well-defined in any of the maps for their conformation to be known with any certainty. In all drawings of the molecule, therefore, only residues 10 to 123 are shown.

The seven loops joining the β-strands vary widely in length and character. The AB loop of eight residues, which makes the primary dimer-dimer contact, contains an extensive hydrogen bonding network and will be described later. Two loops, CD and GH, and possibly a third, BC, contain reverse turns. The CD corner is a type I turn with torsion angles φ1, ψ1; φ2, ψ2 of −49°, −22°; −102°, −12° and −59°, −35°; −80°, −4° for the two monomers and a 4 → 1 hydrogen bond between the NH of
Gly53 and the carbonyl oxygen of Ser50. GH is also a three-peptide 180° turn with torsion angles −50°, −49°; −113°, 20° and −61°, −11°; −115°, 11°, but the presence of Pro113 prevents the formation of a hydrogen bond between the NH of Ser115 and the carbonyl oxygen of Ser112, by forcing the latter to point into the interior of the molecule, roughly parallel to the carbonyl of Pro113, where it forms a hydrogen bond with an internal water molecule (see Fig. 7). The remaining loops are quite long and consist mostly of extended chain: the DE loop is 11 residues long, EF comprises seven residues between the end of the α-helix and strand F, and the FG loop is seven residues long and presumably rather flexible since it is poorly defined in all the electron density maps calculated and its atoms have high temperature factors in the current protein model.

TABLE 2

            β-sheet       α-helix     Other
Total       57 (44.9%)    9 (7.1%)    61 (48.0%)
Table 2 gives the distribution of the different amino acids among the secondary structural features in the molecule. The aliphatic residues Ala and Val comprise 32% of the β-sheet residues while these two plus Leu and Thr increase the total to 49%. The residues Gly, Pro, Thr, Ser and Glu make up 49% of the residues which are involved in neither α- nor β-structures. Pro11 is at the N-terminal end of strand A and turns the N-terminal peptide away from the surface of the molecule, while Pro43 is within strand C, but distorts the main-chain conformation so that its hydrogen bonding pattern is disrupted (see Figs 4 and 5). Of the three glycine residues found in the β-strands, Gly53 and Gly67 are in the N-terminal position, while a fourth, Gly83, is the last residue in the short α-helix, consistent with its role as a helix breaker (Chou & Fasman, 1974).
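The percentage totals of Table 2 can be checked with a short calculation. The sketch below assumes residue totals of 57 (β-sheet), 9 (α-helix) and 61 (other); these counts are inferred from the printed percentages, the 127-residue chain and the nine-residue helix (residues 75 to 83), and the variable names are illustrative only.

```python
# Assumed residue totals per structural class (inferred, see lead-in).
totals = {"beta-sheet": 57, "alpha-helix": 9, "other": 61}

chain_length = sum(totals.values())  # expected to equal the 127-residue chain

# Percentage of the chain in each class, rounded to one decimal place.
percentages = {
    feature: round(100.0 * count / chain_length, 1)
    for feature, count in totals.items()
}
```

With these assumed counts the three classes account for the whole chain, and the β-sheet fraction agrees with the "just under half (45%)" quoted for the β-strands.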
FIG. 4. A schematic diagram of the hydrogen bond arrangement within the 2 β-sheets of the prealbumin monomer. Also shown are the interactions between the equivalent strands F and F′, and H and H′ that extend each sheet to 8 strands in the dimer. Water molecules bridging the β-strands are shown as (●), and the hydroxyls of the pair of tyrosines 116 are shown as (○).
FIG. 5. Stereo diagram showing the β-sheet arrangement in the dimer in a view equivalent to Fig. 3(a). Hydrogen bonds are shown as broken lines. Also shown are the side-chains of the 5 tyrosines, residues 69, 78, 106, 114 and 116 and of Asp18.
4. Tertiary
f