Advances in Enzymology and Related Areas of Molecular Biology, Volume 64 Edited by Alton Meister Copyright © 1991 by John Wiley & Sons, Inc.

STRUCTURAL BASIS FOR CATALYSIS BY TRYPTOPHAN SYNTHASE

By EDITH WILSON MILES, Laboratory of Biochemistry and Pharmacology, National Institutes of Health, Bethesda, Maryland

CONTENTS

I. Introduction
II. Studies of Crystals
   A. Crystallization and Purification by Crystallization
      1. Crystallization
      2. Purification by Crystallization
   B. Three-Dimensional Structure of the α₂β₂ Complex
   C. Kinetic and Microspectrophotometric Studies of Crystals
III. Correlation of Crystallographic Results with Other Structural Studies
   A. Amino Acid Sequences and Mutants
      1. α Subunit Sequences and Mutants
      2. β Subunit Sequences and Mutants
   B. Protein Folding and Domains
      1. α Subunit Folding and Domains
      2. β Subunit Folding and Domains
   C. Other Structural Studies
   D. Multifunctional Enzyme from Yeast and Molds
IV. Catalytic Mechanism
   A. α Subunit Reaction Mechanism
   B. β Subunit Reaction Mechanism
      1. β-Replacement Reactions
      2. The Indolenine Intermediate in the Synthesis of L-Tryptophan
      3. β-Elimination Reactions
      4. Other Reactions
      5. Stereochemistry
      6. Active Site Residues
V. Protein-Protein Interaction and Channeling
   A. Conformational Changes upon Subunit Assembly
   B. Site-Site Interactions
   C. Channeling of Indole
VI. Conclusions and Future Directions
Acknowledgments
References

Advances in Enzymology and Related Areas of Molecular Biology, Volume 64, Edited by Alton Meister. ISBN 0-471-50949-3 © 1991 by John Wiley & Sons, Inc.


I. Introduction

Tryptophan synthase (EC 4.2.1.20) from bacteria, yeasts, molds, and plants catalyzes the final two reactions in the biosynthesis of L-tryptophan. This enzyme has been the subject of many important genetic and biochemical studies and has been frequently reviewed (1-5). I emphasize here the important progress made since my previous review in this series in 1979 (2). I describe the three-dimensional structure of the tryptophan synthase α₂β₂ complex from Salmonella typhimurium (6, 7) and correlate this new structural information with previous biochemical and genetic studies. I describe how site-directed mutagenesis is being used to explore the relationship between enzyme structure and enzyme mechanism.

The early history of the studies of tryptophan synthase from Neurospora crassa and from Escherichia coli has been vividly recounted by Yanofsky (8, 9). Studies of mutants that require tryptophan for growth led to the discovery that tryptophan synthase from E. coli is a multifunctional, multicomponent enzyme (9, 10). Enzyme fractionation demonstrated that the enzyme from E. coli is an α₂β₂ complex composed of two nonidentical dissociable subunits, now called the α and β subunits (11). Whereas the isolated α subunit is a monomer, the β subunit is usually a dimer and is often called the β₂ subunit. In this chapter I use the term β subunit to refer to each β polypeptide chain and to the active β₂ dimer.

Figure 1 summarizes the subunit structure of bacterial tryptophan synthase and the reactions involved in the synthesis of L-tryptophan. Although the separate α and β subunits have low activities in the α and β reactions, respectively, the α₂β₂ complex has much higher activities in these reactions. The α₂β₂ complex also has a higher affinity for substrates than the separated α and β subunits. The physiologically important reaction catalyzed by the α₂β₂ complex, termed here the αβ reaction, is the sum of the α and β reactions.

In the overall αβ reaction, indole produced in the active site of the α subunit becomes a substrate for the β subunit, where it is converted to L-tryptophan by a pyridoxal phosphate-dependent β-replacement reaction with L-serine. Although early experiments showed that indole does not appear as a free intermediate in the αβ reaction (12-14), these results could not distinguish whether the sites at which the α and β reactions were catalyzed were juxtaposed or connected by a channel. The presence of a channel or tunnel has recently been established by the crystallographic studies (6) to be described in Section II.B.


[Figure 1 scheme: BACTERIAL TRYPTOPHAN SYNTHASE. Subunits: 2 α (MW = 2 × 29,000 = 58,000) + β₂ (MW = 2 × 44,000 = 88,000) ⇌ α₂β₂ (MW = 146,000). α reaction: indole-3-glycerol-P → indole + D-glyceraldehyde 3-P. β reaction: indole + L-serine → L-tryptophan + H₂O (PLP). αβ reaction: indole-3-glycerol-P + L-serine → L-tryptophan + D-glyceraldehyde 3-P + H₂O (PLP).]
Figure 1. Subunit composition and reactions of bacterial tryptophan synthase.
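The molecular weights quoted in Figure 1 can be checked with simple arithmetic. The short Python sketch below is only an illustration: the dictionary `SUBUNIT_MW` and the helper `complex_mw` are hypothetical names introduced here, and the masses are the rounded values given in the figure (29,000 for α and 44,000 for β).

```python
# Rounded subunit molecular weights as quoted in Figure 1.
SUBUNIT_MW = {"alpha": 29_000, "beta": 44_000}


def complex_mw(stoichiometry):
    """Sum subunit masses for a stoichiometry such as {"alpha": 2, "beta": 2}."""
    return sum(SUBUNIT_MW[name] * count for name, count in stoichiometry.items())


two_alpha = complex_mw({"alpha": 2})                 # two separate alpha monomers
beta_dimer = complex_mw({"beta": 2})                 # the beta2 dimer
alpha2_beta2 = complex_mw({"alpha": 2, "beta": 2})   # the assembled complex

print(two_alpha, beta_dimer, alpha2_beta2)
```

The sum of the two α monomers and the β₂ dimer reproduces the 146,000 quoted for the α₂β₂ complex, as expected for a complex that assembles without loss of mass.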

Although tryptophan synthase from bacteria and plants has an α₂β₂ structure, the enzyme from N. crassa and from Saccharomyces cerevisiae is a single polypeptide chain that contains two regions functionally and structurally equivalent to the α and β subunits (15) (see Section III.D). These two regions are fused through a short connecting region. This type of enzyme is termed a multifunctional enzyme. Comparative studies of the tryptophan biosynthetic pathway in fungi (16) and bacteria (17) have yielded important understanding of the evolution and regulation of the genes, enzymes, and pathway.

II. Studies of Crystals

Bacterial tryptophan synthase is an attractive subject for x-ray crystallography since structural analysis may explain how subunit interaction affects catalysis. A preliminary x-ray diffraction study of the wild-type α subunit from E. coli and of a mutationally altered α subunit (18) did not lead to further structural studies. Although both the β subunit (19) and the α₂β₂ complex (20) of tryptophan synthase from E. coli have been crystallized in my laboratory, our group and several other groups have not been able to grow crystals of these enzymes suitable for x-ray diffraction studies. In the course of our studies of the tryptophan synthase α₂β₂ complex from S. typhimurium (21), we noted that the enzyme crystallized readily. We then found that crystals could be grown which were suitable for a complete structure investigation (22).

A. CRYSTALLIZATION AND PURIFICATION BY CRYSTALLIZATION

1. Crystallization

The tryptophan synthase α₂β₂ complex from S. typhimurium crystallizes during purification in the presence of low concentrations of ammonium sulfate (5). Comparative studies of the effects of ammonium sulfate concentration on the solubility of the tryptophan synthase from S. typhimurium and from E. coli show that the α₂β₂ complexes from the two sources have very different solubility properties (5). In contrast, the separate α and β subunits from the two sources have similar solubilities (5). Whereas the solubility curve for the α₂β₂ complex from E. coli exhibits a single transition, the corresponding curve for the α₂β₂ complex from S. typhimurium has very distinctive features (Fig. 2) (5). This curve exhibits two solubility minima that are most striking at 24°C. The precipitate at the first minimum (at about 26% saturation) is crystalline, whereas the precipitate at the second minimum (at about 35% saturation) is amorphous. We have used crystallization of the enzyme in the presence of a low concentration of ammonium sulfate as a tool for purification, as described later.

In an attempt to obtain large single crystals suitable for analysis by x-ray diffraction methods, we conducted crystallization trials under various experimental conditions (22). Ammonium sulfate induced the formation of long thin needles, which were not suitable for x-ray diffraction. We obtained the best crystals by vapor diffusion in the presence of polyethylene glycol and various additives. Although the largest crystals were obtained from 12% polyethylene glycol 8000 and 10 mM MgCl₂, crystals from 12% polyethylene glycol and 2 mM spermine had the best crystalline form. Both types of crystal

[Figure 2 plot: soluble enzyme activity versus (NH₄)₂SO₄ concentration (% saturation); horizontal axis ticks from 20 to 50.]

Figure 2. Effect of ammonium sulfate concentration on the solubility of the holo α₂β₂ complex of tryptophan synthase from E. coli and from S. typhimurium. Enzymes were incubated at 2-4 mg/mL in 0.05 M sodium N,N-bis(2-hydroxyethyl)glycine buffer, pH 7.8, containing 1.0 mM EDTA, 1.0 mM dithiothreitol, 0.02 mM pyridoxal phosphate, and ammonium sulfate at the indicated percentage saturation for 24 h at 4°C. Solutions were centrifuged for 5 min at room temperature. Aliquots of the supernatants were diluted and assayed for activity in the β reaction. The soluble enzyme activity of each supernatant solution is expressed as a percentage of the activity in the absence of ammonium sulfate. Curves are shown for the E. coli and S. typhimurium complexes, both at 4°C. The precipitates and the remaining solution of S. typhimurium α₂β₂ complex were then mixed and incubated for 1 h at 24°C, centrifuged, and assayed again as above, giving the curve for the S. typhimurium α₂β₂ complex at 24°C. (From ref. 5.)

are monoclinic and in space group C2 with a = 184.5 Å, b = 62.4 Å, c = 67.7 Å, β = 94°40′, and one αβ pair of Mr 71,700 per asymmetric unit (22). Slightly different unit cell parameters are reported in the later study (a = 184.5 Å, b = 61.1 Å, c = 67.7 Å, β = 94.7°) (6). The crystals have been shown to contain the tryptophan synthase α₂β₂ complex by analysis of the isolated, washed crystals. The dissolved crystals and the solution of enzyme from which the crystals were grown exhibited closely similar absorption spectra, activities, and protein bands upon sodium dodecyl sulfate-polyacrylamide gel electrophoresis. The rate of crystallization and the size of the crystals formed are very dependent on the concentration of spermine (22, 23). Small crystals are formed very rapidly at high concentrations of spermine. We have used crystallization in the presence of polyethylene glycol and 2.5 or 5 mM spermine for the production of microcrystals (23) and for enzyme purification (24).
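The unit cell parameters quoted above allow a quick consistency check of the crystal packing. The Python sketch below is an illustration only and not part of the original analysis: it computes the monoclinic cell volume, V = a·b·c·sin β, and the Matthews coefficient, V/(Z·M). The function names are hypothetical. The value Z = 4 assumes one αβ pair on each of the four general positions of space group C2, and the solvent estimate uses the common approximation 1 − 1.23/V_M.

```python
import math


def monoclinic_cell_volume(a, b, c, beta_deg):
    """Volume in cubic angstroms of a monoclinic cell: V = a * b * c * sin(beta)."""
    return a * b * c * math.sin(math.radians(beta_deg))


def matthews_coefficient(volume, z, mass_da):
    """Crystal volume per dalton of protein: V / (Z * M)."""
    return volume / (z * mass_da)


# Cell parameters from ref. 22 as quoted in the text; beta = 94 degrees 40 minutes.
beta_deg = 94 + 40 / 60
volume = monoclinic_cell_volume(184.5, 62.4, 67.7, beta_deg)

# Assumption: C2 has four general positions, each holding one alpha-beta pair.
Z_C2 = 4
v_m = matthews_coefficient(volume, Z_C2, 71_700)

# Common approximation for the solvent fraction of a protein crystal.
solvent_fraction = 1 - 1.23 / v_m

print(f"V = {volume:.0f} A^3, V_M = {v_m:.2f} A^3/Da, solvent ~ {solvent_fraction:.2f}")
```

Under these assumptions the Matthews coefficient comes out at about 2.7 Å³/Da, which lies within the range usually observed for protein crystals and is therefore consistent with one αβ pair per asymmetric unit.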


2. Purification by Crystallization

Several methods have been used for the purification of the tryptophan synthase α₂β₂ complex from E. coli (5, 20, 25). Although these methods can also be used for purification of the enzyme from S. typhimurium (5, 21), we have developed two new procedures for purifying the enzyme from S. typhimurium by crystallization (23, 24). The first method takes advantage of the unusual solubility properties of the enzyme in solutions of ammonium sulfate discussed previously and illustrated in Fig. 2 (5). The partially purified enzyme is crystallized by dialysis against buffer containing a low concentration of ammonium sulfate (23).

In the second method (24), the enzyme is crystallized directly from crude bacterial extracts by addition of polyethylene glycol and spermine. The extracts contain very high levels of the α₂β₂ complex (30-50% of the soluble protein) since they are prepared from an E. coli host (CB149) which contains a multicopy expression plasmid carrying the trpA and trpB genes from S. typhimurium. Addition of spermine and polyethylene glycol to these extracts results in the immediate formation of a bulky precipitate, which is rapidly removed by centrifugation. Microcrystals form in the yellow supernatant solution within a few minutes or a few hours. Although the crystals obtained in this first purification step contain nearly pure α₂β₂ complex, the enzyme is usually recrystallized by dialysis against buffer containing a low concentration of ammonium sulfate, as described previously.

We have also used this method to prepare mutant forms of the α₂β₂ complex encoded by trpA or trpB genes which have been altered by site-directed mutagenesis (see Sections III.A.1 and IV.B.6). Since this purification method involves only centrifugation and dialysis steps, it can be used to purify several different mutant forms of the α₂β₂ complex at the same time. The method has been used on various volumes of bacterial cultures ranging from 1 to 50 liters. A 50 liter culture yielded 2.4 g of the wild-type enzyme. Large crystals suitable for x-ray crystallography have been grown from several mutant forms of the α₂β₂ complex prepared by this method.

B. THREE-DIMENSIONAL STRUCTURE OF THE α₂β₂ COMPLEX

The three-dimensional structure of the tryptophan synthase α₂β₂ multienzyme complex from S. typhimurium has been solved to a resolution of about 2.5 Å using standard x-ray crystallographic methods (6). The coordinates are available from the Protein Data Bank (26) under the designation “1WSY.” The four subunits are arranged in an extended αββα order with an overall length of 150 Å (Fig. 3, color insert). The two α subunits are at opposite ends of the complex on the two sides of the central β subunit dimer. The active centers of the neighboring α and β subunits are 25 Å apart and are connected by a “tunnel” with a diameter equal to or greater than the greatest dimension of indole. The tunnel probably provides a pathway for the internal diffusion of indole between the two active sites and prevents the escape of indole to the solvent (see Section V.C).

The tertiary fold of the α subunit in the α₂β₂ complex is that of an eightfold α/β barrel (Figs. 4B and 4C) (6). Similar structures have been observed in at least 16 enzymes. The canonical α/β barrel structure is formed by eight alternating α helices and β strands as shown schematically in Fig. 4A. The α subunit of tryptophan synthase contains three extra α helices designated 0, 2′, and 8′, which are shown schematically in Figs. 4B and 4C. Two regions appear to be highly mobile (residues 55-58) or disordered (residues 179-192). The latter region contains a site at Arg-188 which has been found to be susceptible to proteolysis in the α₂β₂ complex (see Section III.B.1). This site is indicated in Fig. 4C by an arrow labeled P.

The active center of the α subunit has been located by x-ray crystallographic analysis of a crystal that was soaked in a solution containing indole 3-propanol phosphate (27), an analog of the substrate, indole 3-glycerol phosphate. Since the inhibitor lacks two hydroxyl groups, it cannot be cleaved by the enzyme. A positive-difference electron density map (Fig. 5, see color insert) reveals the

Figure 3. (color insert) View of the S. typhimurium tryptophan synthase α₂β₂ complex looking approximately down the twofold axis of symmetry between αβ subunit pairs. The smaller α subunits (blue) are distant from each other on opposite ends of the β subunit dimer. The β subunit N-terminal residues (1-204) and C-terminal residues (205-397) are shown in yellow and red, respectively. The dot surfaces highlight the positions of bound indole propanol phosphate (red) in the active sites of the α subunits and the coenzyme pyridoxal phosphate (dark blue) in the active sites of the α/β barrels. A tunnel that connects the two active sites (light blue) is shown in one αβ subunit pair. (From ref. 6.)

Figure 5. (color insert) Positive-difference density map showing the presence of the bound substrate analog, indole propanol phosphate, at the active site of the α subunit. The indole, propyl, and phosphate moieties of the inhibitor are clearly indicated by the positive-difference densities shown in orange. Strong features adjacent to the phosphate group suggest that residues 234 and 235 move from left to right by over 1 Å when the substrate binds. Glu-49 and Asp-60, which are thought to serve catalytic roles, are shown in a blue van der Waals dot surface. (Color version of figure from ref. 6 courtesy of C. C. Hyde.)

Figure 4. Schematic representations of a canonical eightfold α/β barrel protein (A) and of the α subunit of tryptophan synthase (B) and (C). (A) The eight alternating β strands and α helices of a canonical eightfold α/β barrel protein are numbered sequentially from the amino terminus (N) to the carboxyl terminus (C). (B) The α subunit, represented as in (A), contains three other helices labeled 0, 2′, and 8′. P indicates a known site of proteolysis at Arg-188 in a disordered loop between strand 6 and helix 6. Cleavage at this site yields an N-terminal fragment (α-1) and a C-terminal fragment (α-2). The active site is represented by a circle around IPP, the bound inhibitor, indole propanol phosphate. Two active site residues (Glu-49 and Gly-211) are also indicated. (C) Schematic view of the overall fold of the α subunit based on the x-ray data. β Strands are shown as flattened arrows with arrowheads at their C termini. α Helices are represented as cylinders and are labeled on their N termini. In addition to the eight strands and helices found in a typical α/β barrel structure, the α subunit contains at least three other helices (labeled 0, 2′, and 8′). N and C mark the polypeptide amino and carboxyl termini. The loops following strand 2 and strand 6 represent two polypeptide segments that are disordered in the crystal and are not currently part of the model. A known site of proteolysis (P) occurs in one of these disordered loops. The active site is centrally located near the C-terminal ends of the eight β strands. Indole propanol phosphate has been observed to bind in the active site as indicated by the ball-and-stick model. (Figure 4C is from ref. 6.)


binding site and conformation of the inhibitor. The strong positive-difference density features in orange clearly outline the indole ring and the propyl and phosphate groups of the inhibitor. Strong positive- and negative-difference density features in the neighboring protein atoms suggest that local conformational adjustments may occur when the ligand is bound. Two residues that are thought to be catalytic residues (Asp-60 and Glu-49) are indicated by blue dot surfaces (see Section IV.A). Indole 3-propanol phosphate binds at the top of the central barrel near the C-terminal ends of the β strands (Fig. 4C). Several other eightfold α/β barrel enzymes bind substrates at similar locations. Interestingly, one of these enzymes is triose phosphate isomerase (28), an enzyme for which D-glyceraldehyde 3-phosphate is also a substrate. A close-up view of the active site of the α subunit (Fig. 6) shows the bound inhibitor and eight amino acid residues that have been identified as the sites of mutations which totally inactivate the α subunit or which have been observed in second-site revertants (1, 8). The roles of these residues are discussed in Section III.A. Inspection of the three-dimensional structure of the β subunit in the α₂β₂ complex (Fig. 3, color insert) shows that each β monomer

Figure 6. Stereoview of the conformation of residues in the active site of the α subunit which are the sites of missense mutations, as determined by x-ray crystallography. The binding site and conformation of the bound competitive inhibitor indole propanol phosphate, determined from a difference electron density map, are shown near the center of the figure. (Courtesy of C. C. Hyde.)


contains two structural domains of nearly equal size (6). One of these two domains is termed the N domain, since it is largely composed of the N-terminal residues 1-204, which are colored in yellow in Fig. 3. The other domain is termed the C domain, since it is largely composed of the C-terminal residues 205-397, which are colored in red in Fig. 3. Residues 53-85 of the N-terminal sequence “cross over” into the C domain. The active site of each β monomer, which contains the bound coenzyme, pyridoxal phosphate, is “sandwiched” between these two domains. Active site residues are described in Sections III.A.2 and IV.B.6. The tunnel, which is thought to facilitate the transfer of indole from the active site of the α subunit to the active site of the β subunit, passes through the interface between the N domain and the C domain (see Section V.C). The tertiary folds of the N domain and the C domain are shown schematically in Fig. 7. An examination of the two folding patterns reveals that the central core regions of both domains have similar folding topologies (6). Each core contains four parallel strands with three helices packed on the interior side of the sheet and a fourth helix packed on its exterior side. The finding that the cores of these two domains possess a high level of structural homology and are nearly superimposable suggests that a gene duplication followed by gene fusion may have occurred during the evolution of the enzyme. Pyridoxal phosphate, which binds at the interface between the two

Figure 7. Folding patterns of the two domains of the β subunit. (A) Schematic view of the β subunit N domain. The “core” of this domain is formed by a four-strand parallel β sheet (strands 6, 3, 4, and 5) packed on one side with three helices (3, 4, and 5) and by one helix on the opposite side (helix 6). The N-terminal helices 1 and 2 wrap around the core. The coenzyme pyridoxal phosphate (ball-and-stick model) binds covalently through a Schiff base linkage to Lys-87. A stretch of residues between helix 2 and helix 3 crosses over to and closely associates with the C domain. Residues 1-8 at the N terminus are disordered in the crystal. (B) Schematic view of the β subunit C domain showing the six-strand β sheet at its center. Strands 1 and 2 are formed from residues 53-85 of the N-terminal half of the chain. The “core” of the C-terminal domain, defined by helices 8, 9, 10, and 12 and strands 10, 7, 8, and 9, is topologically equivalent to the “core” of the N domain. The pyridoxal phosphate:Lys-87 Schiff base complex is shown in a ball-and-stick model with the phosphate group located toward the lower right. P shows a site susceptible to proteolytic cleavage at Lys-272, Arg-275, and Lys-283. The cleavage site is within a region (residues 260-310) which apparently does not have a well-defined secondary structure. Each domain is shown here from a point of view from the opposing domain. (From ref. 6.)

domains, is located near the C-terminal ends of the parallel strands in the core of each domain. The residues of the C domain that are not in the core region contain two other structural elements: a helix at the C terminus and a 50-residue stretch containing residues 260-310 that folds in a complicated way and lacks well-defined secondary structural elements. This region interacts at several points with the α subunit and contains several residues which line the wall of the tunnel. Three residues in this region (Lys-272, Arg-275, and Lys-283) which are susceptible to limited proteolysis by trypsin (29-31) are indicated by the arrow in Fig. 7 (see Section III.B.2).

C. KINETIC AND MICROSPECTROPHOTOMETRIC STUDIES OF CRYSTALS

The kinetic properties of the crystalline tryptophan synthase α₂β₂ complex were compared with the properties of the soluble enzyme (23) before the x-ray structure was complete. These studies were aimed at determining whether the structure of the enzyme which was being determined by x-ray crystallography was that of an active form of the enzyme, whether the crystalline enzyme could bind substrates, and whether the crystalline enzyme could undergo the same ligand-induced conformational changes as the soluble enzyme. In order to carry out these experiments, it was necessary to find conditions under which the rates of reaction were not limited by the rate of diffusion of substrates into the crystals. Diffusional limitation depends on several factors that must be considered for each crystalline enzyme and reaction. These factors include the thickness of the crystal, the diffusion coefficient of the substrate inside the crystal, the substrate concentration, the maximum rates, and the apparent Km values for the substrates in each reaction examined (32).

In order to decrease the thickness of the crystals, we developed a method for preparing microcrystals of the complex in the presence of 12% polyethylene glycol and 2.5 mM spermine (23). Scanning electron microscopic studies demonstrated that these microcrystals had the same crystal habit as the larger crystals that were being used for structural analysis by x-ray crystallography and were of rather uniform size: 33 µm (length) × 9 µm (width) × 3 µm (maximum thickness). We found that the microcrystals did not dissolve when they were suspended in solutions containing 12% polyethylene glycol, 2.5 mM spermine, and the other additions needed for spectrophotometric assays of enzyme activity. This made it possible to compare the reaction rates of suspensions of microcrystals with those of the soluble enzyme by spectrophotometric assays.

Our results show that the maximum catalytic rate of the crystalline enzyme is 0.8 times that of the soluble enzyme in the cleavage of indole 3-glycerol phosphate (α reaction), 0.3 times that of the soluble enzyme in the synthesis of L-tryptophan by the β reaction or the coupled αβ reaction, and 2.7 times that of the soluble enzyme in the serine deaminase reaction. These small differences in rates probably reflect functional differences between the crystalline and soluble enzymes since the reaction rates of the microcrystals were calculated to be virtually free of diffusional limitation under these reaction conditions. The crystalline and soluble enzymes differ markedly in Km and Ki values for some substrates and inhibitors. Since some of these kinetic properties of the soluble enzyme have been attributed to ligand-dependent conformational changes that are transmitted from one subunit to the other (see Section V.B), our results suggest that these conformational changes may be altered by lattice forces in the crystal. Similar conclusions have been reached from microspectrophotometric studies on single crystals of tryptophan synthase (33). The most important conclusion of these studies is that the active sites of both the α subunit and the β subunit in the crystal are functional and accessible to substrates.

Microspectrophotometric studies of single crystals of tryptophan synthase are also being used to compare the catalytic and regulatory properties of the enzyme in the soluble and crystalline states (33, 34). The studies also help to establish conditions for forming individual catalytic intermediates suitable for x-ray crystallographic studies. Polarized absorption spectra of single crystals of the S. typhimurium tryptophan synthase α₂β₂ complex are measured in the presence and in the absence of substrates, substrate analogs, and reaction intermediate analogs.
The ligands used have previously been shown to form chromophoric complexes with pyridoxal phosphate at the active site of the β subunit in the soluble α₂β₂ complex. We find that the soluble and crystalline enzymes usually produce the same chromophoric intermediates. However, in some cases the equilibrium distribution of these intermediates differs in the two states of the enzyme. (Reaction intermediates and steady-state kinetics are described in Sections IV.B.1 and IV.B.2.) Spectrophotometric titrations have been carried out to compare dissociation constants of three tryptophan compounds for tryptophan synthase in the soluble and crystalline states. The dissociation constant for each compound tested is threefold to sevenfold higher for the crystalline enzyme than for the soluble enzyme. The differences between the enzyme in the crystal and in solution may be due to crystal lattice forces that alter the conformation or flexibility of the protein. We also find that ligands that bind to the active site of the α subunit alter the distribution of intermediates formed at the active site of the β subunit in both the crystalline and soluble states. These results confirm that the enzyme in the crystalline form is catalytically competent and subject to the ligand-dependent subunit interactions that have previously been detected in solution (see Section V.B). Thus, x-ray crystallography can be used to investigate both the mechanism of catalysis by the α and β subunits and the structural basis of the intersubunit regulatory signals. The microspectrophotometric studies of single crystals thus set the stage for x-ray crystallographic studies of enzyme-substrate intermediates which promise to reveal the mechanism of the enzyme at the molecular level.

III. Correlation of Crystallographic Results with Other Structural Studies

Considerable information about the structure of tryptophan synthase has been deduced from various types of study during 30 years of investigation before the three-dimensional structure was determined. The information comes from comparisons of homologous amino acid sequences in different species, identification of sites of mutations, studies of protein folding and protein domains, and studies using various other physical and chemical approaches. In this section I correlate some of these results with the three-dimensional structure of the tryptophan synthase α₂β₂ complex from S. typhimurium (6). I also speculate on the structural relationship between the multienzyme complex from bacteria and the multifunctional enzyme from yeast and molds.

A. AMINO ACID SEQUENCES AND MUTANTS

Early studies used protein sequence analysis to determine the amino acid sequence of the α subunit of tryptophan synthase from E. coli (35), S. typhimurium (36), and Aerobacter aerogenes (37) and partial sequences of the β subunit (38, 39). More recently, DNA sequence analysis has been used to deduce the amino acid sequences of the α and β subunits from a wide variety of bacteria and of the corresponding α and β domains from S. cerevisiae and N. crassa (for reviews and comparisons see refs. 17, 40, and 41). The DNA sequence of the gene encoding the tryptophan synthase β subunit of Arabidopsis thaliana, a higher plant, has recently been reported (42). The results indicate that the α and β subunits in plants are encoded by separate genes (42). The availability of this large number of sequences is partly the result of a considerable interest in the evolution and regulation of the tryptophan pathway in different organisms (16, 17).

Comparison of homologous amino acid sequences from widely divergent organisms can give three types of important information: (a) the position and nature of conserved residues important in maintaining either structure or function, (b) the likely positions of surface loops, indicated by polypeptide segments accepting insertions of extra amino acids, and (c) improved accuracy in the prediction of secondary structure (41). Comparisons of the homologous amino acid sequences of the α and β subunits from 10 species of bacteria and of the corresponding N-terminal α domain and C-terminal β domain from S. cerevisiae and from N. crassa (17) show that many more residues are completely invariant in the β subunit (27%) than in the α subunit (9%). Determining the location of sites of missense mutations that result in complete loss of activity serves to identify amino acid residues that may be important for structure or function.

1. α Subunit Sequence and Mutants

Studies starting in the 1950s led to the determination of the amino acid sequence of the wild-type α subunit from E. coli and of several mutant forms that required L-tryptophan for growth (35, 43, 44). Yanofsky’s group used protein sequence analysis to establish the position of each amino acid substitution in the α chain in a series of missense mutants (Fig. 8). Although only eight sites of mutation were located (residues 22, 49, 175, 177, 183, 211, 234, and 235), two or more different amino acid changes were found at positions 49, 211, and 234. These amino acid changes result from different base changes in the nucleotide sequence of the parental codon. An additional site that is changed in some second-site revertants was located at residue


Figure 8. The locations of mutations in the α subunit from E. coli and of amino acid residues which are highly conserved or invariant in homologous sequences. The 269 amino acids of the α subunit from E. coli are represented by the horizontal bar. Marks below the bar identify locations of missense mutations and of a mutation at residue 213 which occurs in a second-site revertant. These sites are identified by the amino acid change and the residue number (40, 41). Marks above the bar identify the locations of amino acid residues which are highly conserved (short dash) or invariant (long dash) in many species of bacteria, Neurospora crassa, and Saccharomyces cerevisiae (17).

213. The crystallographic results show that Gly-213 and seven of the eight sites of missense mutation (Fig. 8) are located close to the bound inhibitor in the active site of the α subunit (Fig. 6). The eighth site (Thr-183) is located in a region that is highly disordered and not in the current model. Two active site residues (Glu-49 and Asp-60) are located in positions suitable to be catalytic residues (Figs. 5, color insert, and 6). The proposed catalytic roles of these residues are discussed in Section IV.A. Although Asp-60 was not one of the originally identified sites of mutation, one class of mutants which was not identified at that time has recently been mapped at codon 60 (45). The finding that these eight sites of missense mutations are located in the active site of the α subunit in the three-dimensional structure of the α2β2 complex (6) (Fig. 6) supports the conclusion that the α subunit contains only a small number of crucial positions at which a single amino acid replacement can completely inactivate the enzyme. Studies of second-site revertants also gave important clues to the relationships between certain amino acids in the folded structure of the α subunit (44, 46-48). In one case it was found that whereas mutant forms of the α subunit with the single amino acid replacement of tyrosine 175 by cysteine or of glycine 211 by glutamic acid were inactive, a doubly altered α subunit containing both of these changes had a low, partial activity. These results led to the prediction that


tyrosine 175 and glycine 211 were close to each other in the folded polypeptide chain (44, 47). This prediction and the additional prediction that residues 177 and 213 are in close spatial proximity are confirmed by the x-ray crystallographic data (6) (Figs. 6 and 9A). We have rationalized the effects of amino acid substitutions at positions 175 and 211 by computer graphics modeling of the substrate binding site of the α subunit using the x-ray coordinates of the wild-type enzyme (49). The steric effect of amino acid substitution can be seen most readily when indole 3-propanol phosphate and the residues at positions 175 and 211 are shown with standard van der Waals dot surfaces (Fig. 9). With the wild-type enzyme (Fig. 9A), there is a snug fit between tyrosine 175, glycine 211, and the inhibitor. Replacement of the tyrosyl side chain by the smaller cysteinyl side chain would leave a space between glycine 211 and the indole group of the inhibitor (Y175C mutant in Fig. 9B). In contrast, replacement of glycine 211 by the larger glutamic acid would result in severe crowding with the side chain of tyrosine 175 and with the inhibitor (G211E mutant in Fig. 9C). In the double mutant (Y175C/G211E in Fig. 9D), the space created by the smaller cysteine 175 can be occupied by the bulky side chain of glutamic acid 211. Thus, the double alteration of residues 175 and 211 in the second-site revertant may restore the proper geometry of the substrate binding site. This correlation of the crystal structure with the studies of mutants of the α subunit leads to the conclusion that the early studies with mutants gave important clues to the relationship between the amino acid sequence and the structure and function of the α subunit.

a. Later Studies with Mutants. The original method used for selecting tryptophan-requiring mutants of the α subunit resulted in the isolation of mutants at a small number of sites which were absolutely essential for structure or function.
However, mutants that are not totally inactive can be very useful for investigations of protein folding and stability, subunit interaction, channeling, substrate binding, and ligand-dependent conformational changes. The newer techniques of random mutagenesis and of site-directed mutagenesis allow isolation of mutants with single amino acid replacements at almost any desired location. A random mutagenesis approach, which was initiated to isolate mutants suitable for folding studies, has resulted in the isolation of 17 mutants of the α subunit from E. coli

Figure 3. S. typhimurium tryptophan synthase α2β2 complex. (From ref. 6.)

Figure 5. Difference density map with bound substrate analog, indole propanol phosphate. (Color version of figure from ref. 6 courtesy of C. C. Hyde.)


with single amino acid replacements (50). Preliminary analyses of these mutant α subunits show that some amino acid alterations have no apparent effects, whereas others have a variety of novel functional effects. Interestingly, the only completely inactive mutants isolated have alterations at positions 22 and 60, which are sites of missense mutations identified in the early studies (43) and in recent studies (45), respectively (Fig. 8). A number of mutant α subunits are now being made by site-directed mutagenesis in several laboratories. Yutani and co-workers have used site-directed mutagenesis of the trpA gene from E. coli to obtain a complete set of 20 variant α subunits substituted at position 49, one of the original sites of missense mutations (Fig. 8) (51). These mutant α subunits have been used for studies of protein stability (52) and for investigations of catalytic mechanism (see Section IV.A) (51). Yutani and co-workers have also used a mutant of the α subunit with histidine 92 substituted by threonine to assign proton NMR resonances (53). We decided to develop a method for engineering mutants in tryptophan synthase from S. typhimurium soon after we initiated studies of the crystal structure of the α2β2 complex from this source (22). We anticipated that knowledge of the three-dimensional structure of the enzyme would provide a rational basis for the selection of key residues for amino acid replacement by site-directed mutagenesis. We also hoped to obtain crystals of mutant forms of the α2β2 complex from S. typhimurium suitable for x-ray crystallographic

Figure 9. Stereo views of the active site of the α subunit of tryptophan synthase: effects of amino acid substitutions at positions 175 and 211 predicted on the basis of computer graphics modeling. (A) Wild-type residues tyrosine 175 and glycine 211 shown with standard van der Waals surfaces (dot surfaces) in their positions in the wild-type form of the enzyme with the position of bound indole propanol phosphate. Labels: N1, indole ring nitrogen atom; C3, C3 carbon atom; and P, phosphorus atom of phosphate group. In the wild-type enzyme the inhibitor maintains a snug fit to the surrounding protein atoms. (B) In the Y175C mutant, the replacement of the tyrosyl side chain by the smaller cysteinyl side chain would leave space between glycine 211 and the indole group of the inhibitor. (C) The presence of a glutamic acid side chain at position 211 would result in severe crowding with the side chain of tyrosine 175 and with the inhibitor. The resulting distortion in the active site would likely prevent substrate binding. (D) In the double mutant, the space created by the smaller cysteine 175 can be occupied by the bulky side chain of glutamic acid 211. The proper geometry of binding of the substrate could likely be maintained. (From ref. 49.)


analysis. A general method for site-directed mutagenesis was developed by subcloning the major part of the trpA and trpB genes from S. typhimurium from plasmid pSTB7 into bacteriophage M13mp18 (Fig. 10) (54). This construct has been used for site-directed mutagenesis of the trpA and trpB genes by the method of Kunkel (55). The mutant genes have been subcloned into an efficient expression vector and expressed in high yield in a S. typhimurium or in an E. coli host that lacks the trp genes (24, 49, 54, 56). Our first target for mutagenesis was selected before the crystal structure was solved. These studies of a mutant α subunit from S. typhimurium in which Arg-179 was replaced by leucine showed that Arg-179 is not obligatory for catalysis, for binding of indole 3-glycerol phosphate, or for subunit interaction (54). However, this amino acid alteration does have striking effects on some of the ligand-dependent spectroscopic and kinetic properties of the α2β2 complex (see Section V.B). Since these properties have been attributed to the reciprocal transmission of substrate-induced conformational

Figure 10. Plasmid used for oligonucleotide-directed mutagenesis. Plasmid pSTB7 is a derivative of pBR322 which contains part of the tryptophan operon from S. typhimurium: the trp promoter (P), truncated trpC (C') gene, the trpA and trpB genes, and the terminator (tt'). The EcoRI-HindIII fragment of this plasmid has been cloned into bacteriophage M13mp18 and used for site-directed mutagenesis. (From ref. 54.)


changes between the α and β subunits, our results suggest that changing Arg-179 to leucine either induces conformational changes in the α subunit or alters the transmission of these changes to the β subunit. It is not possible to correlate the effect of amino acid substitution at Arg-179 with the three-dimensional structure of the α subunit in the α2β2 complex (6) since Arg-179 is in a region of the structure (residues 179-191) which has a very weak electron density and is not in the current model. This region may have a very important function that cannot yet be understood from the x-ray analysis, since it also contains Thr-183, one of the original sites of missense mutation. Our studies of mutants of the α subunit substituted at the active site residues Asp-60, Tyr-175, and Gly-211 (49) are discussed in Section IV.A. Alignment of the homologous amino acid sequences of α subunits from 10 species of bacteria and of the corresponding N-terminal α domain from S. cerevisiae and from N. crassa shows that a rather small number of residues are invariant (9%) or highly conserved (17). An examination of the locations of these highly conserved residues in the amino acid sequence (Fig. 8) reveals that most of them are clustered near positions that have been identified as the sites of missense mutations or of second-site revertants. We find that most of these highly conserved residues are located close to the substrate binding site of the α subunit in the three-dimensional structure of the α2β2 complex (6). There is a very high incidence of conserved amino acids in the sequence between residues 44 and 65. This sequence includes Glu-49 and Asp-60, which are thought to be catalytic residues (see Section IV.A), and part of a region (residues 53-78) which is inserted between strand 2 and helix 2 in the canonical eightfold α/β barrel (Fig. 4). Some of these residues (residues 55-58) are located at the interface between the α and β subunits, have very poor electron density features, and appear highly mobile. The observation that the sequence in this region is highly conserved suggests that this region is very important for function or for interaction between the α and β subunits.

b. Structure Prediction by Evolutionary Comparison. Two independent groups used aligned sequences of the α subunit to facilitate the prediction of the secondary structure before the three-dimensional structure was available. The underlying rationale is that


essential structural and functional features are conserved during divergent evolution while those of lesser significance will vary. The first study (41) predicted an eightfold α/β barrel secondary structure that was similar to the structure found by x-ray crystallography (6). The second study (57) initially predicted a β-sheet/α-helix structure. Reevaluation of the results after the crystallographic results became available showed that they were also consistent with an eightfold α/β barrel structure (57).

2. β Subunit Sequences and Mutants

Alignment of the homologous amino acid sequences of the β subunits from 10 species of bacteria and of the corresponding C-terminal β domain from S. cerevisiae and from N. crassa shows that a large number of residues are invariant (27%) or highly conserved (17). Recent studies show that the amino acid sequence of the β subunit from a plant is highly conserved with respect to corresponding microbial sequences (42). The residues that are highly conserved in many species are widely distributed in the amino acid sequence and in the three-dimensional structure of the β subunit in the α2β2 complex from S. typhimurium (6). Although the locations of many of these residues in the three-dimensional structure have not yet been carefully analyzed, certain of them are in the active site (Fig. 11). The crystal structure clearly shows that Lys-87 forms a Schiff base with pyridoxal phosphate and thus confirms earlier studies using protein chemistry (38, 39). Since the imidazole ring of His-86 is close to the phosphate of pyridoxal phosphate in the structure, the imidazole nitrogen may serve to neutralize the negative charges on the phosphate. Another highly conserved region between residues 106 and 118 includes residues 109, 114, and 115, which may be near the substrate binding site; this site has not been established by crystallographic studies using bound substrates or substrate analogs. Residues 232-237, which are ligated to the phosphate of the coenzyme, are found in another highly conserved region between residues 229 and 237. The highly conserved residues 343-351 include Glu-350, which is located near the pyridine nitrogen of the coenzyme. Some of these active site residues are also discussed in more detail in Section IV.B.6. The unusually high sequence homology in the β subunit may result from the several structural and functional features which must be preserved during evolution. In addition to


Figure 11. Stereo view of the active site of the β subunit based on the x-ray crystallographic results. Pyridoxal phosphate is bound through a Schiff base linkage to the side chain of Lys-87 (see text). (From ref. 7.)

the active site, the β subunit must maintain the long, intramolecular indole tunnel and large interaction sites between the two β monomers, between the α and β subunits, and between the two structural domains.

a. Mutants of the β Subunit. Only three conventional missense mutants of the β subunit from E. coli have been identified: G116D (58), G281R (59), and K382N (60). Gly-116 and Lys-382 are invariant in homologous sequences (17) and are located rather close to pyridoxal phosphate in the three-dimensional structure (6). Gly-116 may be in the substrate binding site. Lys-382 may form an ion pair with the carboxyl of Glu-350, which is close to the pyridine N of the coenzyme. Gly-281 is located in a long random coil that interacts with the α subunit. This residue is invariant in different sequences. We have prepared a number of mutant forms of the β subunit from


S. typhimurium by site-directed mutagenesis (24) to investigate the possible catalytic roles of the altered residues (see Section IV.B.6).

B. PROTEIN FOLDING AND DOMAINS

One of the fundamental unsolved problems in biology is how proteins unfold and refold. Another key question is what factors contribute to the stabilization of the folded form of a protein. A central principle of protein folding is that the final structure is determined by the amino acid sequence. With the recent increase in amino acid sequences derived from DNA sequence analysis, there is much interest in developing ways to derive folding data from sequence data (41, 57). Although the pathway of protein folding is thought to involve nucleation and folding intermediates, these intermediates are often transient and hard to detect. The examination of the three-dimensional structures of many globular proteins has revealed that such proteins often contain distinct “domains” (61). This observation has led to the suggestion that these structural domains might correspond to intermediates in the folding process (62). The α and β subunits of bacterial tryptophan synthase have proved to be very useful for studies of protein folding and stability. The original collection of missense mutants of the α subunit (1, 8) (Fig. 8) provided a convenient source of a protein with several different amino acid replacements at a single site. These mutants have been used for studies of the effects of single amino acid replacement on stability and on the kinetics of folding. The studies led to observations of folding intermediates. The β subunit is an attractive target for the investigation of folding domains since its chromophoric coenzyme provides a good spectrophotometric probe. Finally, formation of the α2β2 complex is a good system for analyzing protein assembly.

1. α Subunit Folding and Domains

Yutani and his colleagues initiated studies of the effects of single amino acid substitution on protein stability by using mutant forms of the α subunit from E. coli in which Glu-49 was substituted by Gln or Met (63, 64). They subsequently obtained a complete set of 20 variants at position 49 by classic genetic techniques and by site-directed mutagenesis (51). The conformational stability of this series


of variant proteins was determined from denaturation curves in studies using guanidine hydrochloride (52). The results demonstrate a strong correlation between the Gibbs energy of unfolding and the hydrophobicity of the residue at position 49. The large effects of amino acid substitution of Glu-49 on stability are probably a consequence of the location of this residue in the hydrophobic interior of the α subunit (6). Matthews’ group has investigated the effects of replacing Gly-211 by Glu or Arg on the melting temperature, enthalpy, and entropy (65). His group has also made extensive use of mutants for studies of the kinetics of folding and unfolding (66-68). There are several lines of evidence that the α subunit contains domains that fold independently upon chemical denaturation. The first evidence for folding domains in the α subunit came from complementation studies by Jackson and Yanofsky (69, 70). The experiments used dimers of the α subunit which were formed in low yield after treatment of the α monomer with a high concentration of urea and removal of the urea by dialysis. When dimers were formed from different mutant α chains, it was found that dimers formed from certain combinations of mutant α chains regained the enzymatic activity that was absent in the mutant monomers. For example, an α subunit with an alteration at residue 49, which is in the N-terminal part of the chain, complemented another α subunit with an alteration at position 211, which is in the C-terminal part of the chain. The results were rationalized by the model shown in Fig. 12A. This model proposes that dimers are formed by exchange of terminal portions of the contributing monomer chains. The reciprocal exchange of mutant α chain termini results in the construction of one functional active site region containing the wild-type residues Glu-49 and Gly-211 and one nonfunctional active site containing the two altered residues. The crystallographic results confirm that Glu-49 and Gly-211 are combined in the active site of the wild-type enzyme. The model in Fig. 12A can be compared with another model in Fig. 12B which is based on additional structural information. Additional evidence for the occurrence of folding domains in the α subunit has come from studies using limited proteolysis. We found that limited tryptic proteolysis of the α2β2 complex results in cleavage of the α subunit at Arg-188 and produces an active “nicked” enzyme (71, 72). The site of cleavage is shown by an arrow labeled P in Fig. 4C. The two fragments of the α subunit produced by this


Figure 12. Models for formation of dimers by the α subunit and for complementation by mutant forms of the α subunit. (A) Model proposed in ref. 70. The open bar represents one polypeptide chain with an N-terminal mutation, such as one at residue 49. The solid bar represents a second polypeptide chain with a C-terminal mutation at position 211. Mutated and unaltered residues at these positions are marked by distinct symbols. The active site regions are indicated by the circled areas. The reciprocal exchange of mutant α-chain termini results formally in the construction of one functional active site region and one doubly altered one. (B) Model based on x-ray structure and limited proteolysis experiments. The two mutant forms of the α subunit with amino acid substitutions at Glu-49 or at Gly-211, which are also shown in (A), are represented on the left by the schematic method described in Figs. 4A and 4B. The arrows point to the flexible loops between strand 6 and helix 6 which contain a site (Arg-188) that is susceptible to limited proteolysis. Cleavage at this site yields two fragments which correspond to folding domains (see text and Fig. 13). The model assumes that, following exposure to 6 M urea and removal of urea by dialysis, there is reciprocal exchange of the C-terminal folding domains as proposed in (A). This exchange may be facilitated by the long, flexible loop between strand 6 and helix 6 and by the ability of each domain to fold independently.


cleavage (termed the α-1 and α-2 fragments) can be separated after denaturation by urea and shown to refold independently after removal of urea. The refolding of the N-terminal residues 1-188 (α-1) appears to be complete, whereas the refolding of the C-terminal residues 189-268 (α-2) is partial. The renatured fragments reassociate to form an active “nicked” enzyme. The results are evidence that these two fragments correspond to independent folding domains. Equilibrium studies of guanidine hydrochloride- (63, 64) and of urea-induced (73) unfolding of the α subunit demonstrate that the unfolding process involves at least one stable intermediate (Fig. 13). Our studies of the guanidine hydrochloride-induced unfolding of the α subunit and of the two proteolytic fragments show that the stepwise unfolding of the α subunit parallels the unfolding of the α-2 fragment at low concentrations of denaturant and the unfolding of the α-1 fragment at the higher concentrations of denaturant (74). We conclude that the principal folding intermediate has a folded N-terminal domain corresponding to the α-1 fragment and an unfolded C-terminal domain corresponding to the α-2 fragment (74) (Fig. 13). This conclusion is supported by subsequent hydrogen exchange experiments (75). The folding intermediate has also been demonstrated in equilibrium studies of denaturant-induced unfolding of α subunits from S. typhimurium and from one or more interspecies hybrids (21, 76). Kinetic studies of the folding and unfolding of homologous α subunits from E. coli, S. typhimurium, and five interspecies hybrids (76) show that all the proteins follow the same folding mechanism, which involves a folding intermediate. The evidence described above for folding domains in the α subunit led many investigators to envisage these domains as separate units of structure connected by a hinge region that was susceptible to proteolysis (see Fig. 13). They were thus surprised by the x-ray crystallographic results (6), which show that the α subunit has a single structural domain (Fig. 4C). The site of proteolysis at Arg-188 is located in this structure in a highly mobile surface loop that connects strand 6 and helix 6 (see arrow labeled P in Fig. 4C). Proteolytic cleavage in this loop results in an N-terminal fragment containing the first five helix/strand structural units and strand 6 (α-1) and a C-terminal fragment containing helix 6 and the last two of these units (α-2). These folding studies and the crystallographic


Figure 13. Stepwise unfolding of the α subunit of tryptophan synthase from E. coli by guanidine hydrochloride. (Bottom) The fractions of native (N), intermediate (I), and denatured (D) states of the α subunit as a function of guanidine hydrochloride concentration (M) at pH 7.0 and 26°C. (Top) Model of the denaturation process where α-1 and α-2 represent the domains corresponding to the α-1 and α-2 fragments obtained by tryptic cleavage at Arg-188. The α-2 domain of the intact α subunit (N) is shown to unfold at 1 M guanidine hydrochloride to yield a partially unfolded α intermediate (I); the α-1 domain is shown to unfold at 3 M guanidine hydrochloride to yield the fully denatured form (D). The model is based on the finding that the guanidine hydrochloride-induced unfolding of the α-2 and α-1 fragments, respectively, parallels these two steps. (From ref. 74.)
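The stepwise unfolding summarized in Fig. 13 can be sketched numerically with a three-state equilibrium model, N ⇌ I ⇌ D, in which the free energy of each step is assumed to depend linearly on denaturant concentration. This linear form, the m values, and the function name are assumptions chosen for illustration; only the approximate transition midpoints (1 M and 3 M guanidine hydrochloride) and the temperature (26°C) are taken from the figure legend, and the analysis in ref. 74 may differ.

```python
import math

R_KCAL = 1.987e-3  # gas constant, kcal/(mol K)


def three_state_fractions(conc, cm1=1.0, cm2=3.0, m1=3.0, m2=3.0,
                          temp_k=299.15):
    """Return (f_N, f_I, f_D) at denaturant concentration conc (M).

    cm1, cm2: midpoints (M) of the N<->I and I<->D steps (from Fig. 13).
    m1, m2:   assumed cooperativity parameters, kcal/(mol M); hypothetical.
    Each step uses dG(conc) = m * (cm - conc), an assumed linear model.
    """
    rt = R_KCAL * temp_k
    k1 = math.exp(-m1 * (cm1 - conc) / rt)  # [I]/[N]
    k2 = math.exp(-m2 * (cm2 - conc) / rt)  # [D]/[I]
    total = 1.0 + k1 + k1 * k2
    return 1.0 / total, k1 / total, k1 * k2 / total


# Populations at a few concentrations spanning both transitions.
profile = {c: three_state_fractions(c) for c in (0.0, 1.0, 2.0, 3.0, 5.0)}
```

With these assumed parameters the intermediate dominates between the two midpoints, which is the qualitative behavior the figure describes; fitting real denaturation curves would require the experimental data and is not attempted here.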

results indicate that the N-terminal part of the α/β barrel can fold independently and that partial α/β barrels are much more stable than would have been expected. We conclude that folding domains may differ from structural domains and that a protein with a single structural domain can have two or more folding domains. Figure 12B presents a model for the complementary dimer formation described above and is based on the additional information obtained from the crystal structure and from the limited proteolysis and unfolding studies. This model shows that dimers are formed by


the exchange of the C-terminal parts of the two monomers which correspond to the α-2 fragments or to the C-terminal folding domains. This exchange requires bending or reorientation of the α chain in the loop between strand 6 and helix 6. The presence of this flexible loop, which is susceptible to limited proteolysis, may facilitate dimer formation. The refolding of the two monomers into the dimeric structure may also result from the ability of the two regions of the α subunit to refold independently after removal of urea. The availability of the x-ray structure now makes it easier to interpret previous studies on the kinetics of folding and of folding intermediates (63, 64, 73-77) and to determine which amino acid residues are involved in the “docking” of the folding domains (77). Matthews is using some of the original missense mutants shown in Fig. 8 to facilitate these investigations (77).

2. β Subunit Folding and Domains

Limited tryptic proteolysis of the β subunit yields a partially functional “nicked” protein consisting of two large polypeptide fragments termed the F1 and F2 fragments (78, 79). By treating the nicked protein with urea or guanidine hydrochloride, it is possible to dissociate and isolate the separate fragments. Upon removal of the denaturing agent, each fragment spontaneously refolds to a conformation similar to that of the corresponding domain in the β subunit or nicked β subunit. These results are evidence that the isolated fragments correspond to independent folding domains. Goldberg and his colleagues have used these fragments in a number of kinetic and immunological studies of intermediates on the pathway of folding of the β subunit (80-87). Their results provide evidence that the fragments correspond to folding intermediates. The folding domains generated by limited proteolysis of the β subunit are not identical to the two structural domains (termed the N domain and the C domain) in the three-dimensional structure (Figs. 3, color insert, and 7). Whereas the two structural domains are largely derived from the N-terminal residues 1-204 and the C-terminal residues 205-397 (see Section II.B), the two proteolytic domains are derived from residues 1-272 and 284-397. Thus, the F2 fragment lacks strands 7 and 8 and helices 8 and 9 of the C domain (Fig. 7). The “hinge” between the two proteolytic domains (residues 273-283 indicated by the arrow labeled P in Fig. 7) is located on a


side of the β subunit opposite the dividing point between the N and C domains (residues 204-205) and near the edge of the interface of the α and β subunits. We had predicted that this hinge region was located in or near the interface between the α and β subunits on the basis of our studies, which showed that the rate of proteolysis of the β subunit is greatly reduced in the α2β2 complex (71). This location is also consistent with the finding that the “nicked” β subunit cannot form a complex with the α subunit (79). Thus, the site of proteolysis in the β subunit, as in the α subunit, appears to be in a flexible loop that is not a hinge between two structural domains. Nevertheless, the proteolytic fragments generated by limited proteolysis appear to correspond to folding domains and give useful information on the folding mechanism. The hinge between the two folding domains is essential for enzyme activity (79), for interaction of the α and β subunits (79), and for conformational changes that occur upon substrate binding (88-90).

C. OTHER STRUCTURAL STUDIES

At the time of my previous review in this series in 1979 (2), the main structural information about tryptophan synthase and its subunits could be summarized simply: the α subunit normally exists as a monomer, the β subunit normally exists as a dimer, and the two subunits can combine to form either an α2β2 complex or an αβ2 complex (see Fig. 1). Since pyridoxal phosphate increases the apparent association constant for formation of the α2β2 complex from about 1 µM to 1 nM (91), it is very difficult to separate the holo α2β2 complex into the α and β subunits. Although pyridoxal phosphate is readily removed from the β subunit as the oxime after treatment with hydroxylamine, the same treatment of the holo α2β2 complex results in formation of a tightly bound pyridoxal phosphate oxime (92). We found that the pyridoxal phosphate oxime can be removed and that the α and apo β subunits can be separated by a method involving addition of a chaotropic agent (1 M KSCN) in the presence of hydroxylamine (72, 92). I now correlate these observations with the three-dimensional structure of the α2β2 complex (6). This structure shows that pyridoxal phosphate is deeply bound in the interaction site between the N domain and the C domain of the β subunit (Fig. 3, color insert). The interface between the α and β subunits has an extensive surface


area of about 1100 Å² and encloses the tunnel that separates these two domains of the β subunit and extends from the active site of the β subunit to the active site of the α subunit. Interaction of the α and β subunits may stabilize the β subunit and the interaction between the two domains of the β subunit. This stabilization may tighten the binding of the pyridoxal phosphate oxime and prevent its removal from its location between the domains. In a reciprocal way, the presence of the bound pyridoxal phosphate “sandwiched” between the two domains of the β subunit may stabilize the interaction between the two domains and between the β subunit and the α subunit. The addition of the chaotropic agent presumably weakens these interactions and permits removal of pyridoxal phosphate as its oxime. Several other approaches led to information on the structure of the separate α and β subunits and on the α2β2 complex before the three-dimensional structure was determined (6). Chemical cross-linking experiments estimated a distance of 18-22 Å between the reactive sulfhydryl in the α subunit and the pyridoxyl lysine (Lys-87) of the β subunit (93). This estimate is consistent with the crystallographic data, which show that the active sites of the α and β subunits are approximately 25 Å apart (6). Small-angle x-ray scattering studies yielded estimates of the shapes, sizes, and radii of gyration of the α and β subunits and of the α2β2 complex (94). The estimated subunit arrangement and the shape and length of the α2β2 complex agree reasonably well with the crystallographic data (6). The estimated maximum length of 135 Å compares with a value of 150 Å from the crystal structure. In the proposed models for the α2β2 complex, the two α subunits are located distant from each other and separated by the main part of the β dimer. The subunit arrangement in the model is less linear than that found by the crystallographic studies.
Conclusions from these experiments about the effects of subunit assembly on molecular shape are discussed in Section V.A.

Fluorescence energy transfer measurements and hydrodynamic studies also produced information on the quaternary structure of tryptophan synthase from E. coli (95). Translational frictional ratios obtained from measurements of sedimentation and diffusion constants and partial specific volumes provided independent information on the shapes of the different particles in solution. The data
were interpreted by model building, which used several constraints and assumptions. The authors assumed on the basis of studies of folding domains of the α and β subunits (see Section III.B) that each subunit was composed of two spherical domains with radii calculated from the molecular weights of the corresponding proteolytic fragments. Thus, the models proposed for the α₂β₂ complex consisted of eight closely packed spheres with some of the domains of the α and β subunits interdigitated. In contrast, the crystal structure (Fig. 3, color insert) shows that the α subunit consists of a single domain and that the α and β subunits have a single, relatively flat interaction site (6). The total length of the complex (140 Å) in the model and the wide separation of the two α subunits agree well with the x-ray scattering data (94) and the x-ray crystallographic data (6). The important conclusion that the active sites of the α and β subunits are separated by a considerable distance is consistent with the crystallographic results and with the occurrence of channeling. One of the most interesting findings, especially in retrospect, is that the α₂β₂ complex has an unusually high partial specific volume “perhaps due to internal cavities arising from the packing of the α and β subunits” (95). This high value of the partial specific volume may result from the presence of the tunnel (an internal cavity) that was discovered in the crystal structure (Fig. 3) (see Section V.C). Results related to protein assembly are mentioned in Section V.A.

The quaternary structure of the α₂β₂ complex has also been evaluated by small-angle neutron scattering studies (96). The work used various deuterium-labeled and unlabeled α and β subunits and “nicked” α subunits in which one or the other of the two proteolytic fragments was deuterium labeled. The findings that the two α subunits are completely separated and are situated on opposite sides of the β dimer are consistent with the later crystallographic data.
The estimated distances between the various subunits agree well with the crystallographic results. The findings that the two domains of the α subunit are “intimately juxtaposed” and that “the distances between two like or unlike domains belonging to opposite α subunits are roughly equal” (96) are consistent with the results of x-ray crystallography (6). For results from this study related to protein assembly see Section V.B.

High hydrostatic pressure causes reversible dissociation of the β dimer (97-101). This process and renaturation after decompression
can be followed by various spectroscopic techniques and activity measurements. The presence of pyridoxal phosphate strongly stabilizes the β dimer to dissociation and affects the rate of dissociation. The two β subunits in the holo β dimer are very tightly associated; the dissociation constant at 1 bar is K_d = 3.7 × 10⁻¹¹ M (101). The crystal structure reveals that the two β monomers interact over a broad, nearly flat surface (about 1440 Å²) through which a dyad axis of symmetry passes (Fig. 3) (6). Part of the N domain of each β subunit interacts with part of the C domain of the complementary β subunit. The tight association between the two β monomers probably results from this large contact surface. The striking effect of pyridoxal phosphate in stabilizing the β dimer is probably related to the way the coenzyme is “sandwiched” between the N and C domains. The coenzyme may stabilize interaction between the two domains as discussed previously. Since this domain interaction site also contains the tunnel that appears to extend to the contact surface between the two β monomers, stabilization of the domain interaction near this contact surface may tighten the association of the two monomers.

D. MULTIFUNCTIONAL ENZYME FROM YEAST AND MOLDS

The separate α and β polypeptides of bacterial and plant tryptophan synthase are represented in fungi by a fusion polypeptide (Fig. 14). The amino acid sequences of the multifunctional enzymes from N. crassa and from S. cerevisiae show strong homology with the amino acid sequences of the α and β subunits from bacteria (17, 40, 102, 103) and of the β subunit from a plant (42). The first third of the fusion polypeptide is homologous to bacterial α chains and most of the rest is homologous to bacterial and plant β chains. This domain order agrees with previous genetic and biochemical data (104). A short nonhomologous “connector” joins the two homologous segments in the fusion polypeptide. The chromosomal order of all bacterial genes that specify the tryptophan synthase α and β chains is trpB-trpA (105). Fusion of these genes in their present arrangement would result in the synthesis of a polypeptide with a segmental order, N-β-α-C, opposite that observed in fungi. If we assume that fungi evolved from bacteria and that the bacterial arrangement of tryptophan synthase coding regions reflects that existing in the ancestor of the fungi, then we must explain why the β-α coding order was reversed in the evolution of the fungal gene (106-108).

[Figure 14 panel labels: gene order, polypeptide(s), amino acids, enzyme.]

Figure 14. Organization of the tryptophan synthase genes and polypeptides in E. coli and N. crassa. The A and B domains of the N. crassa polypeptide are the segments homologous to the α and β subunits in E. coli. N and C represent the amino and carboxy termini of each polypeptide, respectively, while con represents the connector. The enzyme from E. coli is an α₂β₂ multienzyme complex, whereas the enzyme from N. crassa is a multifunctional enzyme.

The fact that the enzymes from bacteria and fungi have strong amino acid sequence homology implies that these enzymes have similar three-dimensional structures. It follows that an examination of the spatial arrangement of the α and β subunits in the bacterial α₂β₂ complex should suggest how the homologous α and β regions of the fungal enzymes are arranged and might explain why the coding order was reversed in the evolution of the fungal gene. In the three-dimensional structure of the α₂β₂ complex from S. typhimurium, the N terminus of the β subunit is about 50 Å from the C terminus of the α subunit (6). A rather long peptide would be required to connect these two termini to produce the fungal polypeptide. However, an even longer peptide would be required to bridge the 70 Å distance that separates the C terminus of the β subunit from the N terminus of the α subunit to yield the hypothetical polypeptide resulting from a fusion of the trpB and trpA genes in the B-A orientation found in bacteria. Thus, the A-B orientation may have been favored in the evolution of the fungal enzyme in order to use a shorter connector. It is also possible that the A-B orientation was favored in order to maintain the free N-terminal helix of the α subunit (108). This helical element, designated “helix 0,” is one of three extra helical elements found in the α subunit but not in the canonical eightfold α/β barrel
(see Fig. 4). These extra elements may have evolved in order to serve important functional roles. Helix 0 caps the bottom of the eightfold α/β barrel in the α subunit and may shield Glu-49, a key catalytic residue, from solvent (6, 108).

Since the peptide that connects the α and β domains in the fungal enzyme is quite long, it is not possible to learn more about its location or role by examining the three-dimensional structure of the bacterial α₂β₂ complex. One possible role of this connecting peptide is to allow conformational flexibility that permits correct polypeptide folding (102, 107). This role is supported by two types of evidence that show that the sequence of the connector is of secondary importance. The first study demonstrates that insertion of unrelated residues into the connector of the yeast enzyme yields partially active enzyme (107). The second investigation shows that the 54-residue connector of the N. crassa polypeptide has less than 25% identity to the 45-residue connector of the yeast polypeptide (103).

The possible explanations for the domain arrangement in fungi have also been probed by creating artificial fusions of the α and β subunits in different orientations and with different connectors (106, 108). Several of these fusion proteins are highly active in crude bacterial extracts (108). It would be of interest to characterize the activity of the purified fused enzymes and to compare various kinetic parameters with those of the purified α₂β₂ complex from E. coli. An examination of the crystal structure of the α₂β₂ complex from S. typhimurium (6) indicates that fusion with a short domain connector must result in an altered or distorted spatial arrangement of the α and β subunits. It is possible that enzymatic activity results from interaction of the β domain of one fused molecule with the α domain of a second fused molecule. This hypothesis is supported by the observation that the fused proteins tend to form aggregates (106, 108).
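The distance argument for connector length can be made concrete with a rough lower bound. The sketch below assumes (this value is not taken from the chapter) that a fully extended polypeptide spans at most about 3.8 Å per residue, and asks how many residues are needed at minimum to bridge the 50 Å and 70 Å gaps between termini quoted for the crystal structure. The function name and labels are illustrative only.

```python
import math

# Assumption (not from the chapter): a fully extended polypeptide chain
# spans at most about 3.8 angstroms per residue.
MAX_RISE_PER_RESIDUE_A = 3.8


def min_connector_residues(distance_a, rise_a=MAX_RISE_PER_RESIDUE_A):
    """Smallest residue count whose fully extended length covers distance_a."""
    return math.ceil(distance_a / rise_a)


# Distances between termini quoted in the text for the alpha2-beta2 structure.
termini_gaps_a = {
    "A-B order (alpha C terminus to beta N terminus)": 50.0,
    "B-A order (beta C terminus to alpha N terminus)": 70.0,
}

for label, gap in termini_gaps_a.items():
    print(f"{label}: at least {min_connector_residues(gap)} residues")
```

Under this assumption the A-B gap needs at least 14 residues and the B-A gap at least 19. These are lower bounds only: a connector that must travel around the protein surface rather than in a straight line would need more residues, and the estimate says nothing about the actual path of the 45- to 54-residue fungal connectors.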
The finding that fused proteins with the α-β orientation are more active than those with the β-α orientation supports the idea that the free N-terminal helix of the α domain plays an important role (108).

IV. Catalytic Mechanism

As described in the introduction and Fig. 1, tryptophan synthase catalyzes two different types of reaction at distinct and separate
active sites. Although extensive kinetic studies of the reactions catalyzed at the active sites of the α and β subunits have given some information on the mechanisms of these reactions, very little was known about the amino acid residues in each active site before the three-dimensional structure of the tryptophan synthase α₂β₂ complex from S. typhimurium was determined (6). This structure identifies some active site residues in each subunit. The roles of these residues in catalysis and in substrate binding are now open to further investigation by site-directed mutagenesis.

A. α SUBUNIT REACTION MECHANISM

The reaction catalyzed by the α subunit of tryptophan synthase, termed here the “α reaction,” is the reversible cleavage of indole 3-glycerol phosphate to yield indole and D-glyceraldehyde 3-phosphate. Early steady-state and fast reaction kinetic studies of the α reaction were described in my previous review in this series (2). Studies of substrate binding are facilitated by use of the substrate analog, indole 3-propanol phosphate (27). This analog lacks the two hydroxyl groups of the substrate and cannot be cleaved by the enzyme. The crystallographic studies described in Section II.B used this inhibitor to locate the active site of the α subunit (Figs. 4-6).

A plausible mechanism for the α reaction is presented in Fig. 15 (109). The cleavage of the C3-C3′ bond in indole 3-glycerol phosphate [1] is activated by tautomerization of the indole ring to yield an indolenine tautomer [2]. Intermediate [2] has a tetrahedral carbon at C3. The tautomerization is probably facilitated by two catalytic groups, B₁-H and B₂, by “push-pull” general acid-base catalysis. B₁-H protonates the indole ring at C3, while B₂ abstracts the proton on N-1 of the indole ring. Tautomer [2] “would have the requisite electron sink to stabilize the carbanionic transition state arising during aldol cleavage” (110). The actual bond cleavage to indole and glyceraldehyde 3-phosphate [3] is then catalyzed by B₃, which removes a proton from the C3′ hydroxyl group. It is possible that a single residue could serve as B₁ and B₃. Phillips and Cohen suggested that a protein carboxylate may promote indolenine formation since facile intramolecular proton transfer occurs from the propionic acid side chain to the C3 of the indole ring in the hydrolysis of 2-haloindole-3-propionic acids (111).

Figure 15. Mechanism of the α reaction (see text). (From ref. 109.)

We had initiated studies to identify active site residues that catalyze the α reaction (51) before the three-dimensional structure of the α₂β₂ complex was determined. Glu-49 is a good candidate to be a catalytic residue since it is one of the sites of missense mutations which inactivate the α subunit (1, 43, 44) (Fig. 8) and since it is invariant in homologous sequences from many species (17, 40, 41). Studies of the pH dependence of the α reaction indicate that the reaction is catalyzed by one or more bases with pK values of about 7.9 (112). Glu-49 might be one of these bases since it has an unusually high pK of about 7.5 (113). This unusually high pK value probably results from the location of Glu-49 in a hydrophobic environment in the α subunit (6, 113). Our studies of the effects of replacing Glu-49 in the α subunit from E. coli by each of 19 other amino acids provide strong evidence that Glu-49 is essential for activity (51). The
wild-type and the mutant α subunits form α₂β₂ complexes with the β subunit with similar association constants and stimulate the activity of the β subunit in the β reaction. Thus, none of the changes at position 49 alters the conformation of the α subunit in a way that significantly interferes with subunit interaction. However, the 19 mutant α₂β₂ complexes are completely devoid of activity in the α reaction or in the αβ reaction. We later used circular dichroism and difference absorption studies to investigate ligand binding by five of the mutants of the α subunit, substituted with Asp, Lys, Ala, Phe, or Gly at position 49 (109). Our finding that these mutant α subunits all bind the substrate analog, indole 3-propanol phosphate, indicates that amino acid substitution does not prevent substrate binding and supports a catalytic role for Glu-49. The crystallographic results show that Glu-49 is located in the interior of the α subunit near the binding site of indole 3-propanol phosphate (Figs. 5, color insert, and 6) (6, 49). The carboxylate of Glu-49 is located near the scissile bond in a position suitable for a catalytic group and is thus likely to be B₃ shown in Fig. 15.

Asp-60 is a good candidate to be a second catalytic group (B₂ shown in Fig. 15) since the carboxylate of Asp-60 seems to hydrogen bond with the indole NH of the inhibitor bound to the active center of the α subunit (Fig. 6). Asp-60 is invariant in homologous sequences from many species with one exception (16, 40, 41, 114). Glutamic acid is located in the homologous position in the α subunit of Caulobacter crescentus (114). We have evaluated the functional role of Asp-60 in the α subunit from S. typhimurium by site-directed mutagenesis (49). Our finding that replacement of Asp-60 by asparagine, alanine, or tyrosine results in complete loss of activity in the α reaction is evidence that aspartic acid is a second catalytic group.
We conclude that Asp-60 plays a catalytic role, not a substrate binding role, since these mutant forms bind indole 3-propanol phosphate. Glu-60 may serve as an alternative catalytic base since the mutant form with glutamic acid at position 60 has partial activity. This result is consistent with the presence of glutamic acid at the homologous position in the α subunit of Caulobacter crescentus (114). Asp-60 is located in part of the α subunit that has weak electron density in the x-ray crystallographic electron density map and appears highly mobile (6). It is possible that the flexibility of this region allows glutamic acid to substitute
for Asp-60 and serve as a catalytic residue. We conclude that Asp-60 serves as the catalytic base B₂ in Fig. 15.

We still have no evidence for the identity of B₁ in Fig. 15, the group that protonates C3 of indole. Although the same residue might serve as B₁ and B₃, the carboxylate of Glu-49 appears to be too distant from the C3 position of indole in the crystal structure with bound indole 3-propanol phosphate. However, the precise mode of binding of indole 3-glycerol phosphate has not yet been determined and could differ from that of the analog. We also do not know the exact conformation of the substrate and of the active site residues in complexes of the enzyme with reaction intermediates. It is possible that substrate tautomerization during catalysis could trigger a conformational change that would position the carboxylate of Glu-49 near the C3 position of indole 3-glycerol phosphate.

Since Tyr-175 is located in the active site of the α subunit with its phenolic hydroxyl close to the side chain of indole 3-propanol phosphate (Fig. 6), the hydroxyl of Tyr-175 might have a role in catalysis or in substrate binding (6, 49). However, our finding that the α₂β₂ complex containing an α subunit substituted with phenylalanine at position 175 (Y175F) retains significant activity in the α and αβ reactions demonstrates that Tyr-175 is not essential for catalysis (49). The important role of the aromatic ring of Tyr-175 in the binding site is discussed in Section III.A.1. The roles of other residues in the active site of the α subunit remain to be explored. The crystal structure shows the presence of a number of hydrophobic residues in the binding site for indole, including Phe-22, Leu-100, Tyr-102, Leu-127, Ala-129, Ile-153, and Tyr-175 (6). The phosphate of indole 3-propanol phosphate binds between the peptide loops containing residues 211-213 and 234-235.

B. β SUBUNIT REACTION MECHANISM

The β subunit of tryptophan synthase catalyzes a number of pyridoxal phosphate-dependent reactions including β-replacement, β-elimination, transamination, and isomerization reactions (Table 1). Association of the β subunit with the α subunit has different effects on the rate of each type of reaction. The mechanism of these reactions has been investigated by a large number of spectroscopic and kinetic studies. The UV-visible and fluorescence spectral

Table 1

Reaction                                                                              Reaction type
L-serine + indole → L-tryptophan + H₂O                                                β-Replacement
L-serine + β-mercaptoethanol → S-(hydroxyethyl)-L-cysteine + H₂O                      β-Replacement
L-serine → pyruvate + NH₃                                                             β-Elimination
L-tryptophan → pyruvate + indole + NH₃                                                β-Elimination
L-serine + β-mercaptoethanol + PLP → S-pyruvyl mercaptoethanol + PMP + H₂O            Transamination
PMP + indole-3-pyruvate → PLP + L-tryptophan                                          Transamination
2-amino-3-butenoate + H₂O → α-ketobutyrate + NH₃                                      β-Eliminationᶜ
D-tryptophan ⇌ L-tryptophan                                                           Isomerization
2,3-dihydro-5-F-D-tryptophan ⇌ 2,3-dihydro-5-F-L-tryptophan                           Isomerization
L-serine + indazole → β-1-indazole-L-alanine                                          β-Replacementᵈ
L-serine + indoline → dihydroiso-L-tryptophan                                         β-Replacementᵈ
