J. Mol. Biol. (1979) 131, 249-258

Genetic Studies of the lac Repressor XI. On Aspects of lac Repressor Structure Suggested by Genetic Experiments

JEFFREY H. MILLER

Département de Biologie Moléculaire, Université de Genève, Geneva, Switzerland (Received 2 January


The preceding studies of amino acid substitutions in the lac repressor of Escherichia coli resulting from missense mutations and suppressed nonsense mutations in the lacI gene are combined and critically evaluated with regard to the advantages, limitations and future applications of similar methods in the study of protein structure and function. These analyses reveal regions of the protein involved in different repressor functions. The pattern of mutational sites in the lacI gene leading to loss of inducer binding of the repressor is striking, for in the carboxyl half of the protein the affected residues cluster in nearly equally spaced regions. Possible similarities between the inducer binding site of repressor and the antigen binding site of immunoglobulin are discussed.

1. Introduction

The accompanying studies present two different genetic approaches towards generating altered lac repressor molecules (Miller et al., 1979; Miller & Schmeissner, 1979). One technique employs suppressed nonsense mutations to produce a family of lac repressor proteins containing single amino acid replacements. At each of 90 positions in a polypeptide chain of 360 residues, three to five amino acid substitutions have been made. Therefore, for every position being tested a hierarchy of changes can be compared. This study is systematic, in that replacements are effected irrespective of the resulting properties of the altered repressor. Severely damaged proteins as well as those with no detectable change in activity are recovered without bias. Many of the replacements resulting from suppression would require two or even three different base changes in order to arise directly from the wild-type codon, since exchanges result from a two-step process and some of the nonsense codons were derived by tandem double base changes (Coulondre & Miller, 1977). Inspection of the array of repressors generated in this manner has led to the discovery of a number of interesting molecules.

The other technique, the examination of amino acid replacements resulting from missense mutations, does not offer an unbiased or systematic [...] of different positions in a protein, since mutagenic specificity and the presence of genetic hotspots can hinder the interpretation of results. It often becomes difficult to consider the location of a few seemingly [...] in a proper perspective. [...] the important advantage of allowing [...] of mutants having predetermined properties. A large number of replacements, which result in different kinds of altered proteins, can be scored. In many cases these substitutions involve positions in the polypeptide chain that cannot be examined by nonsense suppression. The preceding paper (Miller & Schmeissner, 1979) describes a detailed analysis of the missense mutations in the I gene that lead to a variety of phenotypes. These results can now be considered together with those from the nonsense mutation study. The combination of the two complementary approaches forms the basis of this report.

© 1979 Academic Press Inc. (London)

2. Combined Study

An example of how the two preceding studies complement one another is shown in Figure 1. Here I represent one of the more intriguing findings that emerges from the analysis of missense mutations leading to the i^s (altered response to inducers) character. In the latter portion of the gene the I^s sites are grouped into distinct clusters, which are separated by defined spaces. The top portion of Figure 1 depicts these missense sites. From the i^s selection, four clusters are found. A fifth group arises from the i^s,ts screening. The i^s effects of these mutations are weaker than for the other four clusters. (Very weak i^s effects are displayed by other temperature-sensitive mutants, but those will be dealt with in a subsequent section.) There is a striking regularity in the spacing between clusters. Do these intervals of approximately 26 amino acids have any deep significance, or are they due merely to chance? Before we can even address ourselves to this question it is necessary to demonstrate that the clusters and spaces are really on the protein level and do not reflect an artifact of the genetic methodology. Even though precautions have been taken to avoid bias from hotspots and the specificity of mutagens, it is not ruled out that the local structure of the DNA can affect the mutability of certain subregions of the gene. Spaces might simply represent stretches of sequences that are hard to mutate. Thus, the failure to find mutations in certain regions of the gene cannot be taken as a definitive proof that they do not cause a certain character. It is important to demonstrate directly that when amino acid replacements are made in this region of the protein, the repressor remains unchanged. This can be done by examining suppressed nonsense sites in the corresponding region of the gene.
I have therefore rearranged the data from the suppression of nonsense mutations (Miller et al., 1979) in the bottom part of Figure 1, aligning the position of residues altered by suppression with those substituted by missense mutations. Small squares represent amino acid replacements. Here I have considered only the i^s phenotype. Substitutions that create i^s repressors are indicated by a filled-in square, and those that do not are left open. (Several of the open boxes depict replacements that generate i^- or i^ts proteins, but these do not correspond to i^s proteins.) In the region of the protein considered in Figure 1, 118 replacements effected by nonsense suppression are shown. The correlation between the missense and nonsense studies is extraordinary. Clusters appear in precisely the same places in both sets of data, and the spaces are shown to be significant in that close to 100 substitutions have been tested in these regions without detecting i^s proteins. These results clearly show that the pattern of replacements in this part of the protein leading to the i^s character is a true reflection of some aspect of lac repressor structure, and presumably derives from the architecture of the inducer binding site. Further interpretations of these findings are considered in the Discussion.

In the following section I use diagrams such as that shown in Figure 1 to explore the effects on repressor structure of mutations derived from both of the methods referred to above. These give an overall picture of repressor structure that can be used as a framework for compiling additional information from other studies.

FIG. 1. Amino acid replacements causing the i^s phenotype. The missense mutation sites mapping in the second half of the gene that generate i^s repressors are shown in the top of the diagram. The horizontal axis represents the length of the corresponding repressor protein. The cluster of sites on the far right arises from the i^s,ts selection and represents weaker i^s effects than the other clusters. The bottom part of the diagram abstracts results from the study of suppressed nonsense mutations. Each replacement made in this portion of the protein is indicated by a box. (■) i^s proteins; (◧) i^s proteins with somewhat weaker effects. (□) Replacements that do not produce the i^s phenotype. In a few cases these replacements result in i^- proteins, which can be deduced by referring to Figs 2 and 3.

3. Results

The pattern of mutational sites is particularly revealing in this portion of the protein. Here we observe a high density of I^- sites, including those resulting in temperature-sensitive repression. From the suppression data at 18 sites in the first 62 residues, 35 of 66 replacements result in incomplete repressor activity. (Within the first 59 amino acids, 15 positions have been tested, yielding 35 partially or fully i^- proteins from 56 substitutions.) This is significantly above the average for the rest of the protein, where [...]. A pattern [...] first 60 residues form a separate domain from the rest of the protein is the fact that none of the temperature-sensitive i^- proteins displays i^s (altered response to inducers) character at permissive temperatures. Throughout the remainder of the protein virtually all substitutions affecting the i^- character at high temperature also impair induction at low temperature. Mutations resulting in repressors that bind operator and DNA more tightly are rare. These have been found only in the beginning of the gene. Exchanges at residues 3 and 61 have been shown to result in tight-binding repressors, as indicated by the asterisks in line D of Figure 2.

(b) Repressor


Figures 2 and 3 reveal a number of silent regions of the protein after residue 60 where the density of mutations is very low. Very few substitutions leading to the i^- phenotype are detected between residues (approximately) 75 and 115 and none at all in the last 35 amino acids of the protein. Even nonsense mutations affecting this latter part of the protein apparently have some activity (see Discussion). With a single exception, there is no i^s position for almost 100 residues, between a point near residue 97 and approximately residue 193. The region between residues 75 and 100 might be involved in the conformational change that takes place following inducer binding, since some of the i^s replacements in this region result in proteins with altered allosteric effects (Jobe et al., 1972; Myers & Sadler, 1971; Miller & Schmeissner, 1979; see also review by Müller-Hill, 1975). The i^s exchanges in the latter half of the protein cluster in four groups, with a fifth group emerging if weaker i^s sites are considered. Although the precise dimensions of each cluster cannot be determined at this time, at least one exact position in each group is known from nonsense suppression. Thus, one of the groups involves the glutamine residue at position 248, although one substitution in this group affects an amino acid that is two to five positions earlier. The near equal spacing of these clusters is tantalizing. Some possible interpretations of this pattern are considered in the Discussion.
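The suppression counts quoted for the amino-terminal region earlier in this section can be turned into fractions with a few lines of arithmetic. The following is only a minimal sketch: the function name `impaired_fraction` and the dictionary labels are illustrative assumptions, and the only data used are the counts stated in the text (35 of 66 replacements in the first 62 residues; 35 of 56 in the first 59 residues).

```python
from fractions import Fraction


def impaired_fraction(impaired: int, tested: int) -> Fraction:
    """Return the exact fraction of tested replacements that impair repression."""
    if tested <= 0:
        raise ValueError("tested must be a positive count")
    return Fraction(impaired, tested)


# Counts quoted in the text; the labels are illustrative only.
AMINO_TERMINAL_COUNTS = {
    "first 62 residues (18 sites)": (35, 66),
    "first 59 residues (15 positions)": (35, 56),
}


def summarize(counts):
    """Map each label to its impaired fraction as a float."""
    return {
        label: float(impaired_fraction(impaired, tested))
        for label, (impaired, tested) in counts.items()
    }


if __name__ == "__main__":
    for label, value in summarize(AMINO_TERMINAL_COUNTS).items():
        print(f"{label}: {value:.1%}")
```

Under these counts, roughly half or more of the replacements tested in the amino-terminal region impair repression. The sketch makes no comparison with the rest of the protein, because the corresponding average is not recoverable from the text as it stands.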

4. Discussion

The combined data from missense and suppressed nonsense studies offer a powerful tool for analyzing the effects of specific structural alterations on protein function, since the methods are complementary (see Introduction). The results presented in this paper strongly support the notion (Adler et al., 1972; Müller-Hill et al., 1975; Weber et al., 1975; see also reviews by Müller-Hill, 1975; Bourgeois & Pfahl, 1976; Miller, 1979) that the repressor consists of at least two principal structural domains: the amino-terminal 59 residues, which can be cleaved intact by a limited tryptic digest, and the remaining core of the protein. The amino-terminal region contains most or all of the determinants for DNA (Müller-Hill et al., 1976; Geisler & Weber, 1977) and operator binding (Ogata & Gilbert, 1978), while the inducer binding site is in the core of the protein (Platt et al., 1973), which can aggregate normally and form tetramers. As seen in Figure 2, there is a high density of mutations in the beginning of the gene that lead to the i^- phenotype, but no mutation that destroys inducer binding. (N-methyl-N'-nitro-N-nitrosoguanidine-induced I^s mutations also map past the initial portion of the gene; Pfahl et al., 1974.) Moreover, two replacements resulting in proteins that bind DNA more tightly affect positions 3 and 61, respectively. (At least one other mutation of this type has been detected and shown also to affect the amino-terminal region; Betz & Sadler, 1976.) The I^- mutations in this part of the protein are transdominant, as has been found by other authors (Adler et al., 1972; Platt et al., 1972; Pfahl et al., 1974), resulting in proteins that cannot bind DNA but can form mixed aggregates with wild-type and bind IPTG. In the core of the protein, exchanges result in both i^- and i^s proteins, although there are a number of "silent" regions of the protein, where only a few changes lead to the i^- phenotype. One of these includes the carboxyl-terminal end of the protein. Not even nonsense mutations affecting the last 35 residues result in completely defective proteins. Deletions removing the last nine (Miller et al., 1970a,b; W. Gilbert, A. Maxam & D. States, unpublished work) and last four (Brake et al., 1978) residues, respectively, have been characterized and shown to result in dimer formation (Miller et al., 1970a; Kania & Brown, 1976), suggesting that this part of the protein plays a role in aggregation. Whether the carboxyl-terminal 20 to 35 residues can be considered a third domain, however, remains an open question.


(a) Inducer binding


The most striking aspect of the data presented in Figures 2 and 3 is the pattern of mutations leading to the i^s (non-inducible) phenotype. In the middle of the protein the respective sites cluster in discrete regions separated by almost regular intervals. Mutants selected for being both i^s at low temperature and i^- at high temperature also tend to group in the same places. The use of suppressed nonsense mutations shows that these clusters and spaces are significant and occur at the protein level (Fig. 1). Because I^s mutations that alter the affinity for inducer are likely to affect residues at or near the inducer binding site†, it is reasonable to assume that the clusters define the residues involved in forming the binding site. But why the apparent regularity in their spacing? Francis Crick has suggested, by analogy with immunoglobulins, that this pattern may reflect the fact that the residues that form a binding site are often found at turns in the protein chain, like those that occur between β-sheet structures. In fact, the comparison of these data for lac repressor with the structure of a Bence-Jones protein suggests that both proteins may have basic features in common.

(b) Comparison



Epp et al. (1974) have reported the crystal and molecular structure of the variable part of a κ-type Bence-Jones protein REI at 2.8 Å. The antigen binding site is a cavity formed by residues in hairpin turns that lie close together in the final structure. The hypervariable regions, where mutational changes affect the antigen binding properties, are in the positions of the turns. It is assumed that changes at these points can change the binding site without greatly affecting the overall structure of the protein, whereas exchanges in the regions involved in the hydrogen-bonded secondary structures will lead more frequently to disruption of the overall structure. Making the direct analogy with the repressor would predict that mutations leading to the i^s character (affecting inducer binding but leaving the rest of the protein intact) would appear at the hairpin turns and tend to cluster, whereas the positions of I^- and I^-d mutations would be less restricted. Figures 2 and 3 indicate that this is the case. In Figure 4(a) I have reproduced a hydrogen-bonding scheme of the Bence-Jones variable region elucidated by Epp et al. (1974). The protein is composed of four layers, consisting of β-sheets with turns at almost regular intervals. (This can be appreciated better by viewing the stereo diagrams in the original paper.) In Figure 4(b) I have transformed this structure into a linear sequence, giving the order of each layer with a Roman numeral. The residues in the hairpin turns on the same side are separated by 27, 23, 17 and 26 amino acids, respectively. The positions of the turns on the other side occur near the midpoints of the respective segments of the chain. Arrows indicate the position of the hypervariable regions. We might speculate how this region of the repressor protein looks by assuming that each of the i^s regions corresponds to a turn.
These need not be turns only between β-sheets, but might instead represent turns occurring between alternating α-helical and β-sheet segments‡. This generates four layers, with turns occurring at the positions corresponding to i^s replacements. A fifth layer would occur if we included the group of i^s sites near residues 297 and 298. Because of the weak effects of the substitutions, the inclusion of this last cluster is tentative. The turns at the top of the diagram are spaced 27, 28, 25 and (about) 24 residues apart. If these regions form part of a pocket, then they must come together on the same side of the protein, as drawn in this scheme. Thus, the turns at the other end should occur near the center of the respective segment. These points are also indicated. A comparison of Figure 4(b) and (c) shows that the structures drawn on a linear scale for lac repressor and the Bence-Jones variable region are analogous, with dimensions that closely match. The schematic diagram presented in Figure 4(c) is of course hypothetical, and the elucidation of the three-dimensional structure is required before we can take full advantage of the data presented in this and the preceding papers. However, if the picture for lac repressor core given here is correct in form, then it would document an important similarity in the functional design of different types of globular proteins. A comparison of the structures of a number of globular proteins has already indicated important structural patterns (Levitt & Chothia, 1976) that would be consistent with the comparisons made here.

† Even though physical-chemical experiments indicate that tryptophan 220 does not make direct contact with inducer (Sommer et al., 1976; O'Gorman & Matthews, 1977), the i^s proteins produced by substitution of this residue argue that tryptophan 220 is involved in maintaining the conformation of the binding site.

‡ Whereas immunoglobulins fall into a category of proteins consisting mainly of β-structure, the lac repressor (a soluble, intracellular, globular protein) is very likely to have an arrangement of alternating helices and β-sheet units (M. Levitt, personal communication).

† One partial i^- effect may result from incomplete suppression in this specific case. An exchange at a neighboring residue does result in an i^- phenotype (Beyreuther, 1979).

FIGS 2 and 3. The combined results from studies of missense mutations (lines A and C) and suppressed nonsense mutations (lines B and D) are shown for the entire gene-protein map. The upper part of the diagram considers the i^- phenotype, and the lower portion the i^s phenotype. For lines A and C, bars pointed upwards indicate temperature-independent mutations, and bars pointed downwards depict temperature-sensitive mutations. Open bars indicate weaker effects; weaker i^- effects in line A, and weaker i^s effects in line C. (The open bars in line A pointing upward represent mutations that affect aggregation. These have the letter A inside the bar.) Parentheses indicate a slight ambiguity in positioning, where, in general, every bar is within 3 residues of the corresponding position in the protein. Although many temperature-sensitive mutations confer partial i^s character at low temperature, only the mutants selected for this character are shown in line C. For lines B and D, each replacement produced by suppression is represented by a box. In line B, (□) indicates replacements which do not produce the i^- phenotype (loss of repression), (■) i^- exchanges, and (◧) partial i^- proteins. A dot in a half-filled box indicates temperature sensitivity. In line D, (□) represents exchanges that do not result in i^s proteins, (■) i^s proteins, and (◧) weaker effects. A dot in a half-filled box represents a temperature-sensitive i^s protein. The asterisks in the boxes at positions 3 and 61 depict proteins that bind operator more tightly than wild-type.

FIG. 4. (a) The hydrogen-bonding scheme of the main-chain atoms of a monomer of a Bence-Jones V_REI protein. Taken from Epp et al. (1974). Large arrows have been added to indicate the positions of the hypervariable regions. (b) The structure shown in (a) has been represented schematically along a linear sequence. The distances between the turns are depicted by small numbers. Roman numerals indicate the order of each β-sheet layer. Residue positions are encircled. (c) A hypothetical scheme for lac repressor, based on the assumption that the clusters of i^s exchanges are part of the inducer binding site and are located at residues in turns that are on the same side of the protein, as represented in 2 dimensions. The turns at the bottom of the diagram have been designated arbitrarily, by assuming that they occur near the midpoint of the respective interval. Large arrows indicate the position of the i^s clusters. (The open arrow indicates that the last cluster is associated with weaker effects.) With the exception of the region near position 220, all of the clusters extend for several residues (3 to 5, depending on the cluster). The position given represents the precise location of at least 1 residue in these groups as determined by nonsense suppression, and may not correspond to the exact midpoint of each group.
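The spacing comparison between Figure 4(b) and 4(c) can be checked with simple arithmetic. The sketch below is illustrative only and rests on stated assumptions: the Bence-Jones spacings (27, 23, 17, 26) and the lac repressor spacings (27, 28, 25 and about 24, taking the third interval as 25 residues) are read from the text, the first lac repressor turn is placed at exactly residue 193 although the text gives that position only approximately, and the helper names `turn_positions` and `midpoints` are hypothetical.

```python
from statistics import mean

# Turn-to-turn spacings in residues, as quoted in the text.
BENCE_JONES_SPACINGS = [27, 23, 17, 26]    # Figure 4(b)
LAC_REPRESSOR_SPACINGS = [27, 28, 25, 24]  # Figure 4(c); the last value is "about 24"

# Assumption: the text places the first i^s cluster near residue 193.
LAC_FIRST_TURN = 193


def turn_positions(first_turn, spacings):
    """Accumulate spacings into residue positions of successive turns."""
    positions = [first_turn]
    for gap in spacings:
        positions.append(positions[-1] + gap)
    return positions


def midpoints(positions):
    """Midpoint of each interval, where the opposite-side turns are assumed to lie."""
    return [(a + b) / 2 for a, b in zip(positions, positions[1:])]


if __name__ == "__main__":
    lac_turns = turn_positions(LAC_FIRST_TURN, LAC_REPRESSOR_SPACINGS)
    print("lac repressor turns:", lac_turns)
    print("assumed opposite-side turns:", midpoints(lac_turns))
    print("mean spacing, lac repressor:", mean(LAC_REPRESSOR_SPACINGS))
    print("mean spacing, Bence-Jones:", mean(BENCE_JONES_SPACINGS))
```

With these inputs the accumulated positions pass through residues 220, 248 and 297, which agrees with the cluster positions named in the text, and the mean lac repressor spacing is the 26 residues mentioned in the Combined Study. This is a consistency check on the numbers only; it says nothing about whether the turns exist in the folded protein.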

5. Summary (a)


The genetic studies of the lac repressor (for references and reviews consult the text) have outlined the basic features of repressor structure, allowing the definition of basic domains involved in DNA and operator binding, inducer binding, and aggregation. Specific residues of the protein important for inducer binding and aggregation have been identified, and interesting variants of repressor have been created by mutation. For instance, repressors with certain substitutions at positions 3 and 61 bind operator and DNA 100 times more tightly than wild-type repressor. Moreover, a doubly altered repressor carrying exchanges at both positions now binds operator and DNA 10,000 times more tightly. Such molecules have been used to facilitate studies of repressor-operator interaction in vitro. These combined studies demonstrate that genetics can be a powerful tool for analyzing and even restructuring proteins in a directed manner. The analysis of mutational sites has also suggested similarities in the general structure of certain substrate binding sites, since the pattern of mutational sites leading to loss of inducer binding appears similar to the arrangement of the hypervariable regions in immunoglobulins. This generates the speculation that the inducer binding site is formed from residues at turns in the secondary structure, and thus allows us to predict the position of these turns. The systematic replacement of specific residues by nonsense suppression reveals that many amino acid substitutions are effectively neutral. The pattern of replacements is in good agreement with the findings and predictions of Perutz and co-workers (Perutz & Lehmann, 1968) for hemoglobin. More importantly, the extensive use of suppressed derivatives has laid the groundwork for a detailed understanding of how lac repressor works, since we can superimpose these results on the three-dimensional structure of the wild-type protein, which will eventually emerge from the X-ray crystallographic analysis.

(b) Limitations

There are a number of limitations to the methods used here. Even with the detailed gene-protein map, the position of missense mutations can be estimated, but not pinpointed precisely. (Fortunately, the positions of suppressed nonsense mutations are known exactly.) Also, variations in the amounts of proteins present in vivo can hinder quantitative arguments. For instance, certain altered repressors are degraded in vivo (Schlotmann & Beyreuther, 1979), which obviously enhances the resulting i^- phenotype. However, since the respective alterations cause either defective folding or structures that are recognized as defective by proteases, the qualitative conclusion that the corresponding residue is important for the structure (or activity) of the protein is still valid. Clearly, more detailed quantitative studies require characterization in vitro of purified proteins. The maps and diagrams presented in this and the accompanying papers are meant to serve as a framework, to be refined as more data from physical experiments become available.



(c) Future applications

The type of analysis reported here, including the controlled generation of variants, is applicable to many proteins, even those from eukaryotic cells which are synthesized and expressed in E. coli (see Discussion in Miller et al., 1979). Understanding how proteins function is an essential part of molecular biology, for all cellular processes are ultimately mediated by specific enzymes. Much of our future knowledge about the complex workings of both prokaryotic cells and eukaryotic cells will depend on identifying the enzymes involved and understanding how they work. The idea of being able to tailor-make proteins, to redesign them to suit our needs, which once seemed like an impossible dream, now lies just within our reach.

I thank Dr Francis Crick for suggesting the analogy between lac repressor and immunoglobulin, and Drs S. Brenner, W. Gilbert, M. Levitt, L. Orgel and D. Galas for helpful discussions. This work was supported by a grant from the Swiss National Fund (3.179.77).


Adler, K., Beyreuther, K., Fanning, E., Geisler, N., Gronenborn, B., Klemm, A., Müller-Hill, B., Pfahl, M. & Schmitz, A. (1972). Nature (London), 237, 322-327.
Betz, J. L. & Sadler, J. R. (1976). J. Mol. Biol. 105, 293-319.
Beyreuther, K. (1979). In The Operon (Miller, J. H. & Reznikoff, W. S., eds), pp. 123-154, Cold Spring Harbor Laboratory, New York.
Bourgeois, S. & Pfahl, M. (1976). Advan. Protein Chem. 30, 1-99.
Bourgeois, S., Jernigan, R. L., Kabat, E. A., Szu, S. C. & Wu, T. T. (1979). Biopolymers, in the press.
Brake, A. J., Fowler, A. V., Zabin, I., Kania, J. & Müller-Hill, B. (1978). Proc. Nat. Acad. Sci., U.S.A. 75, 4824-4827.
Chou, P. Y., Adler, A. J. & Fasman, G. D. (1975). J. Mol. Biol. 96, 29.
Coulondre, C. & Miller, J. H. (1977). J. Mol. Biol. 117, 525-576.
Epp, O., Colman, P., Fehlhammer, H., Bode, W., Schiffer, M., Huber, R. & Palm, W. (1974). Eur. J. Biochem. 45, 513-524.
Geisler, N. & Weber, K. (1977). Biochemistry, 16, 938-943.
Jobe, A., Riggs, A. D. & Bourgeois, S. (1972). J. Mol. Biol. 64, 182-199.
Kania, J. & Brown, D. T. (1976). Proc. Nat. Acad. Sci., U.S.A. 73, 3529-3533.
Levitt, M. & Chothia, C. (1976). Nature (London), 261, 552-558.
Miller, J. H. (1979). In The Operon (Miller, J. H. & Reznikoff, W. S., eds), Cold Spring Harbor Laboratory, New York. In the press.
Miller, J. H. & Schmeissner, U. (1979). J. Mol. Biol. 131, 223-248.
Miller, J. H., Reznikoff, W. S., Silverstone, A. E., Ippen, K., Signer, E. & Beckwith, J.
