
Biochimica et Biophysica Acta, 1038 (1990) 338-345

Elsevier BBAPRO 33634

Conservation of functional residues between yeast and E. coli inorganic pyrophosphatases

Reijo Lahti 1, Lee F. Kolakowski, Jr. 2, Jukka Heinonen 1, Mauno Vihinen 1, Katariina Pohjanoksa 1 and Barry S. Cooperman 2

1 Department of Biochemistry, University of Turku, Turku (Finland) and 2 Department of Chemistry, University of Pennsylvania, Philadelphia, PA (U.S.A.)

(Received 24 January 1990)

Key words: Inorganic pyrophosphatase; Active site residue; Amino acid sequence; Alignment; Protein flexibility

The alignments of the amino acid sequences of inorganic pyrophosphatase (PPase) from Saccharomyces cerevisiae (Y1-PPase, 286 amino acids) and Escherichia coli (E-PPase, 175 amino acids) are examined in the light of crystallographic and chemical modification results placing specific amino acid residues at the active site of the yeast enzyme. The major results are: (1) the full E-PPase sequence aligns within residues 28-225 of Y1-PPase, raising the possibility that the N-terminal and C-terminal portions of Y1-PPase may not be essential for activity, and (2) whereas the overall identity between the two sequences is only modest (22-27% depending on the choice of alignment parameters), of some 17 putative active site residues, 14-16 are identical between Y1-PPase and E-PPase. PPase thus appears to be an example of enzymes from widely divergent species that conserve common functional elements within the context of substantial overall sequence variation.

Introduction

Inorganic pyrophosphate (PPi) is a central phosphorus metabolite that can reach quite high concentrations in bacteria (up to 40 mM) and in yeast (up to 70 mM) (Ahmad, N. and Cooperman, B.S., unpublished data) [1-5]. PPi hydrolysis is catalyzed by inorganic pyrophosphatase (pyrophosphate phosphohydrolase, EC 3.6.1.1) (PPase). All known PPases require divalent metal ions, with Mg2+ conferring the highest activity. The best studied PPases are those from Saccharomyces cerevisiae [6] and Escherichia coli [7]. The S. cerevisiae enzyme (Y1-PPase) is a homodimer containing 286 amino acids per monomer. Its amino acid sequence was directly determined some time ago [8] and recently verified, with some small changes, by DNA sequencing of the PPA gene [9]. A three-dimensional X-ray crystallographic structure at 3 Å resolution has been published

Abbreviations: E-PPase, inorganic pyrophosphatase from Escherichia coli; PPi, inorganic pyrophosphate; Y1-PPase, inorganic pyrophosphatase from Saccharomyces cerevisiae; Y2-PPase, inorganic pyrophosphatase from Kluyveromyces lactis.

Correspondence: B.S. Cooperman, Department of Chemistry, University of Pennsylvania, Philadelphia, PA 19104, U.S.A.

[10,11]. The enzyme binds up to four divalent metal ions per subunit, with three required for activity [13]. A divalent metal-ion binding cavity has been identified that contains 12 polar residues that appear to interact with bound metal ions, as well as five basic residues that could plausibly interact with PPi [10,11]. Three of these 17 residues have been implicated as essential by chemical modification studies [13-15]. The E. coli enzyme (E-PPase) is a hexamer containing 175 amino acids per monomer. The full DNA-deduced sequence has been determined only recently [16] and is consistent with the earlier, partial sequence determined directly [17]. A chemical modification study has provided evidence for an active site lysine [18]. A DNA-deduced amino acid sequence of PPase from the yeast Kluyveromyces lactis (Y2-PPase) has also become available recently [19] and shows considerable similarity to that of Y1-PPase. In this paper we examine the alignment of the amino acid sequences of Y1-PPase and E-PPase. The best alignments found, while showing only modest overall identity between E-PPase and Y1-PPase (22-27% depending on the choice of weighting parameters) show remarkable conservation of the 17 putative active site residues mentioned above (14-16 identities, again depending on the choice of parameters). A preliminary account of this work has been presented earlier [20].

0167-4838/90/$03.50 © 1990 Elsevier Science Publishers B.V. (Biomedical Division)

Methods

The algorithms of Smith and Waterman [21] and Needleman and Wunsch [22], as implemented in the BESTFIT and GAP programs, respectively, supplied by the University of Wisconsin Computer Group [23], were used to align the gene-deduced amino acid sequences of Y1-PPase [9] and E-PPase [16]. The two important adjustable parameters in both algorithms are Gap weight and Length weight. An increase in the value of Gap weight increases the penalty incurred per gap introduced. Similarly, an increase in Length weight increases the penalty as a function of gap size. Matched amino acids were scored using the similarity matrix of Schwartz and Dayhoff [24], as rescaled and normalized according to Gribskov and Burgess [25]. In this matrix, a perfect match has a value of 1.5 and the poorest match has a value of -0.8. The following matches, having values of > 0.7, were considered similar: A:G; C:S,Y; D:E,G,N,Q; E:Q; F:I,L,W,Y; H:Q; I:L,V; K:R; L:M,V; R:W; W:Y. The algorithm of Kyte and Doolittle [26], as implemented in the peptide structure algorithm supplied by the University of Wisconsin Computer Group [23], was used to predict the hydropathy profiles. Flexibility indices were calculated according to Vihinen [27] and flexibility profiles according to Karplus and Schulz [28].

Results

Alignment of the Y1-PPase and E-PPase amino acid sequences

In exploring the alignment of Y1-PPase and E-PPase we employed both the Smith and Waterman [21] and Needleman and Wunsch [22] algorithms, and for each examined the results obtained with a large number of sets of values of the Gap weight and Length weight parameters. A large number of such alignments having an overall similarity (identical plus similar residues) of ≥ 35% were analyzed in detail. In evaluating the alignments obtained, we took into consideration not only the measures of quality provided by the programs (quality, ratio, %-similarity) but also the number of extended regions of similarity in a given alignment. Such regions, or 'patches', have been posited by Reichardt and Berg [29] to be a general feature of proteins with a common function in widely divergent species. Furthermore, we discarded all alignments having more than 10 gaps. The rationale for adopting this limitation is Doolittle's observation [30] that in matching sequences of protein pairs that are thought to be related by common ancestry, the number of gaps should not exceed approximately four per 100 residues, or approximately eight for the PPase alignment. On this basis, the two best alignments were obtained with Gap weight/Length weight parameter values of 2.0/0.6 and 4.0/0.3, respectively. For each parameter set, the same alignment was obtained with either algorithm. The 2.0/0.6 alignment is presented in Fig. 1, and specific features of the two alignments are summarized in Table I. Common to both alignments are the matches of E-PPase residues 1-16, 29-36, 51-63, 64-125, and 134-175 with Y1-PPase residues 28-43, 56-63, 89-101, 114-175 and 184-225, respectively. Both alignments place the amino acid sequence of E-PPase well within that of Y1-PPase, such that the N- and C-termini of Y1-PPase (residues 1-27 and 226-286, respectively) do not correspond to any sequences within E-PPase. Both alignments also have a large number of identical aspartates, which constitute 16.7 and 18.4%, respectively, of the total identical residues in the 2.0/0.6 and 4.0/0.3 alignments. These percentages are considerably in excess of the percentages of aspartate residues in Y1-PPase (8.4%) and E-PPase (8.0%), a result that is particularly

[Fig. 1 alignment rows: Y2-PPase 1-100 / Y1-PPase 1-100 / E-PPase 1-62; Y2-PPase 101-200 / Y1-PPase 101-200 / E-PPase 63-150; Y2-PPase 201-286 / Y1-PPase 201-286 / E-PPase 151-175.]

Fig. 1. Alignment of E-PPase, Y1-PPase and Y2-PPase. The E-PPase alignment shown was generated using a Gap weight value of 2.0 and a Length weight value of 0.6. Asterisks indicate matched residues between Y1-PPase and Y2-PPase having a similarity score of < 0.6 (poor matches). Vertical bars indicate matched residues between Y1-PPase and E-PPase having a similarity score of > 0.7. Horizontal lines indicate extended regions of similarity between Y1-PPase and E-PPase.
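The three match categories used in Fig. 1 (identical, similar, dissimilar) can be illustrated with a short sketch. It is not the GCG implementation: the pair list is simply copied from the Methods section (matches with matrix values > 0.7), the numerical matrix itself is not reproduced, and the names `SIMILAR_GROUPS` and `classify` are hypothetical.

```python
# Residue pairs listed in Methods as "similar" (matrix value > 0.7).
# Only the pair list is used here; the underlying scores are not reproduced.
SIMILAR_GROUPS = {
    "A": "G", "C": "SY", "D": "EGNQ", "E": "Q", "F": "ILWY", "H": "Q",
    "I": "LV", "K": "R", "L": "MV", "R": "W", "W": "Y",
}

# Store each pair as an unordered set so that lookups are symmetric.
SIMILAR_PAIRS = {
    frozenset((first, second))
    for first, partners in SIMILAR_GROUPS.items()
    for second in partners
}


def classify(residue_a: str, residue_b: str) -> str:
    """Return 'identical', 'similar' or 'dissimilar' for one matched pair."""
    if residue_a == residue_b:
        return "identical"
    if frozenset((residue_a, residue_b)) in SIMILAR_PAIRS:
        return "similar"
    return "dissimilar"


if __name__ == "__main__":
    for pair in [("K", "K"), ("K", "R"), ("R", "K"), ("E", "G")]:
        print(pair, classify(*pair))
```

Under this list a Lys/Arg match counts as similar in either order, whereas a Glu/Gly match counts as dissimilar, because only Asp (not Glu) is paired with Gly.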

Fig. 2. Matches of Y1-PPase and E-PPase residues 13-56 and 123-136 showing three sections in which the 2.0/0.6 and 4.0/0.3 alignments differ. Regions of α-helix and β-sheet according to Terzyan et al. [11] are indicated.

pertinent with respect to the identity of amino acid residues involved in divalent metal ion binding (see below). By contrast, the two alignments differ in the manner in which E-PPase residues 17-28, 37-50 and 126-133 are matched with Y1-PPase residues 44-55, 64-88 and 176-183, respectively (Fig. 2). In both the first and third of these sections, the presence of two additional single-residue gaps in the 2.0/0.6 as compared with the 4.0/0.3 alignment is accompanied by an increase in the number of identical and similar residues and the formation of an additional extended region of similarity. On the other hand, in all three sections, elements of the secondary structure present in the Terzyan et al. [11] structure of PPase are conserved in the 4.0/0.3 alignment but disrupted in the 2.0/0.6 alignment. Secondary structures generally show a strong tendency to be conserved in evolution [31]. Lastly, in the second section,

TABLE I

General features of selected alignments

Gap weight                   2.0                               4.0
Length weight                0.6                               0.3
Total No. of gaps            10                                4
Gap sizes and (no.) a        1(6), 2(1), 3(1), 6(1), 12(1)     1(2), 11(1), 12(1)
No. of identities            48                                38
No. of similarities          31                                29
Percent similarity b         45.1                              38.3

Extended regions of similarity (score) c

2.0/0.6 alignment:
E: 14-22     Y: 41-50     (0.97)
E: 29-35     Y: 56-62     (1.14)
E: 53-61     Y: 92-99     (0.89)
E: 65-72     Y: 115-122   (1.31)
E: 88-99     Y: 138-149   (0.91)
E: 102-108   Y: 152-158   (1.07)
E: 126-132   Y: 178-184   (1.21)

4.0/0.3 alignment:
E: 29-35     Y: 56-62
E: 51-62     Y: 89-100
E: 65-72     Y: 115-122
E: 88-108    Y: 138-158
E: 118-124   Y: 169-175

Regions lying within E-PPase residues 65-125 belong to the 'common core'.

a No. in parentheses refers to multiplicity of gap of indicated size.
b (No. of identities plus No. of similarities) × 100/175.
c Score is calculated as the average similarity per residue in the extended region (minimum five identical or similar residues) using the modified Dayhoff matrix and including Gap weight and Length weight penalties where appropriate.
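The summary rows of Table I can be re-derived from its own entries, as in the sketch below. The gap counts and percent similarity follow footnotes a and b directly. The total gap penalty is an illustration only: it assumes that each gap costs Gap weight + Length weight × gap size, which is one common reading of the two parameters described in Methods and is not stated explicitly in the paper. All names are hypothetical.

```python
E_PPASE_LENGTH = 175  # residues in E-PPase, the shorter sequence

# Entries copied from Table I; gap_sizes maps gap size -> number of such gaps.
ALIGNMENTS = {
    "2.0/0.6": {"gap_weight": 2.0, "length_weight": 0.6,
                "gap_sizes": {1: 6, 2: 1, 3: 1, 6: 1, 12: 1},
                "identities": 48, "similarities": 31},
    "4.0/0.3": {"gap_weight": 4.0, "length_weight": 0.3,
                "gap_sizes": {1: 2, 11: 1, 12: 1},
                "identities": 38, "similarities": 29},
}


def total_gaps(alignment: dict) -> int:
    """Sum the multiplicities listed under 'Gap sizes and (no.)'."""
    return sum(alignment["gap_sizes"].values())


def percent_similarity(alignment: dict) -> float:
    """Footnote b: (identities + similarities) x 100 / 175."""
    matched = alignment["identities"] + alignment["similarities"]
    return 100.0 * matched / E_PPASE_LENGTH


def total_gap_penalty(alignment: dict) -> float:
    """ASSUMPTION: each gap costs gap_weight + length_weight * gap size."""
    return sum(
        count * (alignment["gap_weight"] + alignment["length_weight"] * size)
        for size, count in alignment["gap_sizes"].items()
    )


if __name__ == "__main__":
    for name, alignment in ALIGNMENTS.items():
        print(name, total_gaps(alignment),
              round(percent_similarity(alignment), 1),
              round(total_gap_penalty(alignment), 1))
```

The gap totals (10 and 4) and percent similarities (45.1 and 38.3) reproduce the table; the penalty totals depend entirely on the assumed penalty form.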

the 11 additional residues in Y1-PPase vs. E-PPase form a single gap in the 4.0/0.3 alignment as compared with three gaps in the 2.0/0.6 alignment. In concluding this section we note that in considering a very large number of alignments, the common feature of all was the match of E-PPase residues 65-125 with Y1-PPase residues 115-176. This 'common core' region includes three extended regions of similarity (Table I).

The active site of PPase

Within their published 3 Å structure of Y1-PPase, Kuranova et al. [10] and Terzyan et al. [11] identify the 17 polar residues listed in Table II either as ligands for one of the four divalent metal ion binding sites per subunit or as potential sites of direct interaction with bound PPi. The overall three-dimensional structure of a Y1-PPase subunit is shown in Fig. 3A. In the view shown, the putative active site cavity is at the lower right. This region is shown in greater detail in Fig. 3B, in which the positions of all 17 of the residues listed in Table II are indicated. In the 2.0/0.6 alignment, 16 of these putative active site residues in Y1-PPase are identical with residues in E-PPase, with 11 of these 16 falling within one of the seven regions of extended similarity noted in Table I. The corresponding numbers for the 4.0/0.3 alignment are 14 identical residues, 10 of which fall in the five regions of extended similarity. The high percentage of putative active site residues that are identical, 94 and 82% for the 2.0/0.6 and 4.0/0.3 alignments, respectively, vastly exceeds the percentages of identical residues for all of the 175 residues of E-PPase (27.4 and 21.7%, respectively). Independently of the X-ray studies, three of these putative active site residues in Y1-PPase (Arg-78 [13], Lys-56 [14], and Glu-150 [15]) have been implicated by chemical modification studies as being essential for enzymatic activity. Of these three residues only one, Lys-56, matches with an identical residue (Lys-29) in both alignments. In this connection it is especially noteworthy that a recent chemical modification study of E-PPase has led to the isolation of a tryptic peptide, containing an apparently essential lysine, having the N-terminal sequence DLPE [18]. This sequence corresponds to residues 10-13 in E-PPase (Fig. 1), leading to the identification of Lys-29 as the essential residue, since it is the first Arg or Lys following residue 13. Arg-78 matches with E-PPase Arg-43 in the 2.0/0.6 alignment but is matched with a gap in the 4.0/0.3 alignment, whereas Glu-150 is matched against Gly-100 in both alignments. In the latter case a locally optimized alignment may be more appropriate than the alignment employed for the entire protein structure. Glu-150 falls within the 'common core' region discussed above and is part of a very acidic hexapeptide, containing two Asp

TABLE II

Putative active site residues

Proposed function c    Y1-PPase residue    Matched E-PPase residue a
                                           2.0/0.6      4.0/0.3

M2+ binding            E-48                E-20         I-21
                       E-58                E-31         E-31
                       Y-89                Y-51         Y-51
                       Y-93                Y-55         Y-55
                       D-115               D-65         D-65
                       D-117               D-67         D-67
                       D-120               D-70         D-70
                       D-147               D-97         D-97
                       E-148               E-98         E-98
                       E-150 b             G-100        G-100
                       D-152               D-102        D-102
                       Y-192               Y-141        Y-141

PPi binding            K-56 b              K-29 b       K-29 b
                       R-78 b              R-43         GAP
                       K-154               K-104        K-104
                       K-193               K-142        K-142
                       K-198               K-148        K-148

No. of identical matches                   16           14

a For alignments obtained using different weighting parameters.
b Also implicated at the active site by chemical modification studies.
c Refs. 10 and 11.
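The identity counts in the last row of Table II, and the 94 and 82% figures quoted in the text, can be checked with a few lines of code. This is only a sketch: the residue list is transcribed from the table (with the 4.0/0.3 match of E-48 read as Ile-21), and the names `ACTIVE_SITE` and `count_identical` are hypothetical.

```python
# (Y1-PPase residue, E-PPase match in 2.0/0.6, E-PPase match in 4.0/0.3),
# transcribed from Table II. "GAP" marks a residue matched with a gap.
ACTIVE_SITE = [
    ("E-48", "E-20", "I-21"), ("E-58", "E-31", "E-31"),
    ("Y-89", "Y-51", "Y-51"), ("Y-93", "Y-55", "Y-55"),
    ("D-115", "D-65", "D-65"), ("D-117", "D-67", "D-67"),
    ("D-120", "D-70", "D-70"), ("D-147", "D-97", "D-97"),
    ("E-148", "E-98", "E-98"), ("E-150", "G-100", "G-100"),
    ("D-152", "D-102", "D-102"), ("Y-192", "Y-141", "Y-141"),
    ("K-56", "K-29", "K-29"), ("R-78", "R-43", "GAP"),
    ("K-154", "K-104", "K-104"), ("K-193", "K-142", "K-142"),
    ("K-198", "K-148", "K-148"),
]


def count_identical(column: int) -> int:
    """Count rows whose E-PPase match has the same one-letter code as Y1-PPase.

    column is 1 for the 2.0/0.6 alignment and 2 for the 4.0/0.3 alignment.
    """
    identical = 0
    for row in ACTIVE_SITE:
        yeast, match = row[0], row[column]
        if match != "GAP" and match[0] == yeast[0]:
            identical += 1
    return identical


if __name__ == "__main__":
    total = len(ACTIVE_SITE)
    for label, column in (("2.0/0.6", 1), ("4.0/0.3", 2)):
        n = count_identical(column)
        print(label, n, f"{100 * n / total:.0f}%")
```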

Fig. 3. Three-dimensional structure of the Y1-PPase subunit. (A) A view of the overall structure. Only the α-carbon backbone is shown, except for the putative active site cavity at the lower right where side-chains are indicated. (B) A more detailed view of the active site cavity, indicating the positions of the side-chains of the residues listed in Table II. The pictures were generated using the INSIGHT program and displayed on an Evans and Sutherland PS390, using the coordinates deposited by the Moscow crystallography group at the Brookhaven Protein Data Bank.

[Fig. 4 panels: tY1-PPase KD and tE-PPase KD; vertical axis, hydrophilicity (-5.0 to 5.0); horizontal axis, residue number (0-150).]

Fig. 4. Hydropathy profile for truncated Y1-PPase and truncated E-PPase according to Kyte and Doolittle [26].
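A profile of the kind shown in Fig. 4 is a sliding-window average of per-residue hydropathy values. The sketch below uses the published Kyte and Doolittle scale; the window length of 9 and the example peptide are assumptions chosen for illustration, and the function name is hypothetical. It does not reproduce the GCG program used in the paper, whose window setting and sign convention (the figure axis is labelled hydrophilicity) are not given in the text.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 9) -> list:
    """Return the mean hydropathy of every full window along the sequence.

    The window length is an assumption; the paper does not state the one used.
    """
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence]
    return [
        sum(scores[start:start + window]) / window
        for start in range(len(sequence) - window + 1)
    ]


if __name__ == "__main__":
    # Arbitrary illustrative peptide, not a sequence taken from the paper.
    peptide = "MKDLPEDIVVIEIPANADPIKYE"
    for position, value in enumerate(hydropathy_profile(peptide), start=1):
        print(position, round(value, 2))
```

Two truncated sequences of equal length, such as tY1-PPase and tE-PPase, yield profiles of equal length that can be compared position by position.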

and two Glu residues. The matching hexapeptide in E-PPase contains the same four acidic residues. For this peptide, the match common to both alignments considered in this paper has three identical residues, all of the acidic residues except for Glu-150. The local realignment shown results in a match with five identical residues, including Glu-150, at a cost of introducing two single-residue gaps in a β-sheet region of Y1-PPase [11].

common alignment     E-PPase     DEAGED
                     Y1-PPase    DEGETD

local realignment    E-PPase     DEAGE.D
                     Y1-PPase    DE.GETD
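The identity counts quoted for the two hexapeptide alignments above can be verified by comparing the strings column by column. This is a minimal sketch; the dot is taken to be the gap symbol, and the function name is hypothetical.

```python
def count_identities(seq_a: str, seq_b: str, gap: str = ".") -> int:
    """Count aligned columns where both sequences carry the same residue."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a == b and a != gap)


# (E-PPase, Y1-PPase) rows of the two alignments of the acidic hexapeptide.
COMMON = ("DEAGED", "DEGETD")
LOCAL = ("DEAGE.D", "DE.GETD")

if __name__ == "__main__":
    print("common alignment:", count_identities(*COMMON), "identities")
    print("local realignment:", count_identities(*LOCAL), "identities,",
          sum(row.count(".") for row in LOCAL), "single-residue gaps")
```

The common alignment gives three identities, and the local realignment gives five at the price of two single-residue gaps, as stated in the text.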

Baltscheffsky et al. [32] have pointed out a similarity in the sequence of Y1-PPase residues 114-122 and a conserved sequence in a number of proton ATPases [33] and suggested that D-120, together with metal ion, is involved in the binding and/or function of a substrate phosphoryl group. The present results are consistent with this suggestion, since the peptide Y1-PPase 115-122 not only has the highest similarity score of any extended region of similarity between Y1-PPase and E-PPase (Table I), but also has the highest density of putative active site residues in PPase (Table II), containing three such residues in a total of eight residues.

Implications of the alignment of the amino acid sequences of Y1-PPase and Y2-PPase

Y1-PPase and Y2-PPase both have 286 amino acids. A straightforward alignment of the two sequences with no gaps (Fig. 1) yields a total of 242 identities (84.6%). Of the remaining 44 residues, 12 align with matches having values of > 0.7, leaving a total of 32 dissimilar residues. The locations of these 32 residues are supportive of the overall alignment of the E-PPase sequence within the Y1-PPase sequence (Fig. 1). First, their frequency of occurrence is quite low in the interior section of Y1-PPase that is matched with E-PPase, and is considerably larger in the N-terminal and especially the C-terminal sections that do not correspond to sequences in E-PPase. Second, all 48 of the residues in Y1-PPase that are identical with residues in E-PPase in one or both of the 2.0/0.6 and 4.0/0.3 alignments are also identical with the matched residues in Y2-PPase. Third, all of the 17 putative active site residues listed in Table II are identical between Y1-PPase and Y2-PPase.
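The residue bookkeeping for the gap-free Y1-PPase/Y2-PPase comparison is simple arithmetic, sketched here with the numbers quoted in this section. The variable names are hypothetical.

```python
SEQUENCE_LENGTH = 286  # residues in both Y1-PPase and Y2-PPase
IDENTITIES = 242       # identical positions in the gap-free alignment
SIMILAR = 12           # non-identical matches with matrix values > 0.7

percent_identity = 100.0 * IDENTITIES / SEQUENCE_LENGTH
non_identical = SEQUENCE_LENGTH - IDENTITIES
dissimilar = non_identical - SIMILAR

if __name__ == "__main__":
    print(f"identity: {percent_identity:.1f}%")
    print(f"non-identical: {non_identical}, dissimilar: {dissimilar}")
```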

Comparison of hydropathy profiles

Deletion of the gaps shown in Fig. 1 (for Y1-PPase: residues 1-27, 44, 65-66, 72-77, 85-87, 102-113, 160, 177 and 226-286; for E-PPase, residues 28, 133 and 143) gives two truncated proteins, tY1-PPase and tE-PPase, which, when aligned, are 27.9% identical and have an additional 18.0% of similar residues. Despite this only modest extent of overall similarity, many of the major features of the hydropathy profiles are conserved between the two truncated PPases (Fig. 4). Comparable results were obtained for truncated proteins generated using the 4.0/0.3 alignment.
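The lengths of the two truncated proteins, and the percentages quoted for them, follow from the deleted residue ranges listed above together with the identity and similarity counts of the 2.0/0.6 alignment in Table I. The sketch below redoes that arithmetic; the helper name `truncated_length` is hypothetical.

```python
# Deleted residue ranges (inclusive) for the 2.0/0.6 alignment, as listed above.
Y1_DELETED = [(1, 27), (44, 44), (65, 66), (72, 77), (85, 87),
              (102, 113), (160, 160), (177, 177), (226, 286)]
E_DELETED = [(28, 28), (133, 133), (143, 143)]

IDENTITIES = 48    # Table I, 2.0/0.6 alignment
SIMILARITIES = 31  # Table I, 2.0/0.6 alignment


def truncated_length(full_length: int, deleted: list) -> int:
    """Subtract the size of every deleted inclusive range from the length."""
    removed = sum(end - start + 1 for start, end in deleted)
    return full_length - removed


if __name__ == "__main__":
    t_y1 = truncated_length(286, Y1_DELETED)
    t_e = truncated_length(175, E_DELETED)
    print("tY1-PPase:", t_y1, "residues; tE-PPase:", t_e, "residues")
    print(f"identical: {100 * IDENTITIES / t_e:.1f}%")
    print(f"similar:   {100 * SIMILARITIES / t_e:.1f}%")
```

Both truncated proteins come out at 172 residues, which is what allows them to be aligned without gaps, and 48/172 and 31/172 reproduce the 27.9% and 18.0% figures.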

Correlation with temperature stability and predicted rigidity

E-PPase is much more heat stable than Y1-PPase [34-36]. Vihinen [27] has observed that thermostable enzymes are, in general, less flexible (as measured by their flexibility indices, F) than their thermolabile counterparts. Consistent with this observation, we calculate F values of 0.9925 and 1.0039 for E- and Y1-PPase, respectively, using the method of Vihinen [27]. The magnitude of the difference in these values is similar to what was previously found for other enzymes [27]. Residues 226 to 245, which fall within the C-terminal portion of Y1-PPase that is both non-identical to Y2-PPase and for which there is no corresponding E-PPase sequence (Fig. 1), make up by far the most flexible portion of Y1-PPase, as estimated by the method of Karplus and Schulz [28]. Terzyan et al. [11] have shown that both the N- and C-terminal regions of Y1-PPase form loosely packed loops on the surface of Y1-PPase, and it is reasonable to speculate that the lower thermostability of Y1-PPase could arise, at least in part, from the higher mobility of the terminal loops causing unfolding of the enzyme. In this connection it is interesting that the 'core' of Y1-PPase that matches with E-PPase, residues 28-225, has a lower F-value, 0.9991, than that of the native Y1-PPase. It will be interesting to test whether Y1-PPase mutant proteins formed by partial deletions within either the N- or C-terminal of Y1-PPase are more thermostable and/or more rigid in structure than native Y1-PPase.

Discussion

The 2.0/0.6 and 4.0/0.3 alignments compared in Tables I and II and Fig. 2 have 27.4 and 21.7% identity, and employ 5.7 and 2.3 gaps per 100 residues compared, respectively. These values are of borderline significance for considering two protein sequences to be evolutionarily related. Thus, Doolittle [30] has put forward the rule of thumb that an alignment having 4 gaps per 100 residues should have a minimum of 25% identity in order to be considered statistically significant. However, the very strong conservation of putative active site residues (Table II), the fact that very few of the dissimilar residues between Y1-PPase and Y2-PPase fall within the major regions of overlap between Y1-PPase and E-PPase (Fig. 1), the similarity of the hydropathy profiles of the truncated proteins, and the large number of identical aspartate residues when Y1-PPase and E-PPase are aligned, taken together, provide a very strong argument in favor of an evolutionary relationship between the yeast and bacterial enzymes. This point is also supported by our observation that of all the protein sequences included in the NBRF protein library, Y1-PPase was the most similar to E-PPase when tested by the algorithm of Lipman and Pearson [37]. By way of comparison, we have performed 2.0/0.6 and 4.0/0.3 alignments for 22 other proteins, generally chosen from among the most abundant proteins, with the majority being enzymes of intermediary metabolism, the sequences of which are known for both yeast and E. coli. The results, summarized in Table III, show the extent of identity we have found with PPase to be on the low side, but not unusually so. However, PPase does appear to be unusual in the extent to which the percentage of identical residues depends on the choice of weighting parameters.
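The identity and gap-density figures that open this section can be recomputed from Table I, and set beside the reference point quoted from Doolittle [30] (about 4 gaps per 100 residues and at least 25% identity). The sketch below only performs that arithmetic; treating the rule of thumb as two independent thresholds is a simplification made here for illustration, not a statistical test, and the names are hypothetical.

```python
RESIDUES_COMPARED = 175  # length of E-PPase

# (number of identities, number of gaps) from Table I.
ALIGNMENT_STATS = {"2.0/0.6": (48, 10), "4.0/0.3": (38, 4)}

# Reference point quoted from Doolittle [30]; used here as simple thresholds,
# which is an illustrative simplification.
RULE_GAPS_PER_100 = 4.0
RULE_MIN_IDENTITY = 25.0


def per_hundred(count: int) -> float:
    """Express a count as a value per 100 residues compared."""
    return 100.0 * count / RESIDUES_COMPARED


if __name__ == "__main__":
    for name, (identities, gaps) in ALIGNMENT_STATS.items():
        identity = per_hundred(identities)
        gap_density = per_hundred(gaps)
        print(f"{name}: {identity:.1f}% identity, "
              f"{gap_density:.1f} gaps per 100 residues, "
              f"identity >= 25%: {identity >= RULE_MIN_IDENTITY}, "
              f"gaps <= 4 per 100: {gap_density <= RULE_GAPS_PER_100}")
```

Each alignment satisfies one threshold but not the other, which is the sense in which the values are described as being of borderline significance.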
A major conclusion of this work is that the catalytic PPase function is present as a 'core' within the Y1-PPase sequence, leading to the prediction that neither the N-terminal nor the C-terminal portions of Y1-PPase are directly involved in catalytic function. From the results presented above the outer limits of such a 'core' extend from Y1-PPase residues 28-225, although all of the putative active site residues are contained between residues 48 and 198. In both of the alignments considered, the percentage of identical putative active site residues is impressively high, and leads to the clear conclusion that E. coli and yeast PPases evolved from their common ancestor with only modest conservation of overall sequence but with strong pressure to conserve essential amino acid residues. It therefore seems likely that the demonstrated requirement of the enzymatic activity of Y1-PPase and E-PPase for three divalent metal ions

TABLE III

Percentage of identical residues in proteins having the same function in E. coli and yeast a

No.  Enzyme                                                    Size b   % Identity
                                                                        2.0/0.6   4.0/0.3
 1   Glyceraldehyde-3-phosphate dehydrogenase                  331      68.6      68.6
 2   Elongation factor Tu                                      437      64.6      64.4
 3   Glucose-6-phosphate isomerase                             549      60.7      59.8
 4   Fumarase                                                  467      58.4      58.0
 5   Enolase                                                   123      58.3      58.3
 6   NADP-dependent glutamate dehydrogenase                    447      53.4      53.4
 7   S-Adenosylmethionine synthetase                           382      51.9      51.3
 8   Galactose-1-phosphate uridylyltransferase                 347      51.6      50.0
 9   Glutamine phosphoribosylpyrophosphate aminotransferase    505      50.5      50.5
10   Adenylate kinase                                          214      48.3      48.3
11   Fructose-1,6-bisphosphatase                               332      42.9      42.9
12   Phosphoglycerate kinase                                   387      42.7      42.1
13   Phosphoribosyl-AMP cyclohydrolase                         203      34.4      34.4
14   ATP phosphoribosyltransferase                             297      33.5      31.6
15   5'-Phosphoribosylglycinamide transformylase               212      32.6      31.1
16   Inorganic pyrophosphatase                                 175      27.4      21.7
17   Dihydrolipoamide acetyltransferase                        482      27.1      25.7
18   Dihydrofolate reductase                                   157      26.1      27.7
19   Aspartokinase                                             414      24.2      20.8
20   Orotidine-5'-phosphate decarboxylase                      245      20.4      19.6
21   Adenylate cyclase                                         848      19.2      13.6
22   DNA primase                                               409      15.6      12.8
23   α-Galactosidase                                           451      12.4      11.0

a Sequence data from Ref. 43.
b Number of amino acids in smaller of the two proteins.

bound per active site [12,38] will be true of other PPases as well. Similar conclusions concerning the conservation of essential amino acid residues have been reached for a number of other enzymes, including Type I dihydrofolate reductase [39], malate dehydrogenase [40], biotin-containing enzymes [41], and amylolytic enzymes [42]. Although determination of the three-dimensional structure of E-PPase (currently underway in the laboratory of A. Goldman, Rutgers University) will be necessary in order to decide which of the two alignments considered in this paper is more correct (determination of PPase amino acid sequences from additional organisms would also be helpful), we note that the matches of putative active site residues (Table II) favor the 2.0/0.6 alignment, whereas the 4.0/0.3 alignment is both statistically more probable and more probable based on the structure of Terzyan et al. [11]. Thus, the insertions and deletions in the 4.0/0.3 alignment all occur outside the secondary structures and tend to appear in the loop structures on the surface of the protein. By contrast, some of the gaps in the 2.0/0.6 alignment would split the secondary structures (Fig. 2).

The conclusions reached in this paper are important for experimental efforts currently underway in both of our laboratories and in the laboratories of others. Site-directed mutagenesis studies of both E-PPase and Y1-PPase should allow direct testing of the catalytic roles of the residues listed in Table II, and may resolve some of the ambiguities of properly aligning E-PPase with Y1-PPase. Following arguments put forth by Reichardt and Berg [29], those residues having the highest probability of being essential should fall within the extended regions of similarity noted in Table I. Further, our conclusions support the notion that oligonucleotide probes based on the Y1-PPase sequence between residues 115-175 offer the best chance for selectively locating PPase genes in other organisms.

Acknowledgements

This work was supported with the help of NIH grant R01 AM13202 and a grant from the city of Turku (Turun kaupungin apuraha). L.F.K. was supported by NIH Training Grant No. 4T32-GM-07229. We thank Drs. Bruce Erickson, Stephen Altschul and Robert Heinrikson for helpful discussions in the early stages of this work, and Dr. David Christianson and Richard Alexander for their help in displaying the PPase structure.

References

1 Ermakova, S.A., Mansurova, S.E., Kalebina, E.S., Lobakova, I., Selyach, O. and Kulaev, I.S. (1981) Arch. Microbiol. 128, 394-397.
2 Kulaev, I.S. and Vagabov, V.M. (1983) Adv. in Microbial Physiol. 24, 83-171.
3 Kukko-Kalske, E. and Heinonen, J. (1985) Int. J. Biochem. 17, 575-580.
4 Ahmad, N. and Cooperman, B.S. (1987) Fed. Proc. 46, 1931.
5 Keltjens, J.T., Van Erp, R., Mooijaart, R.J., Van der Drift, C. and Vogels, G.D. (1988) Eur. J. Biochem. 172, 471-476.
6 Cooperman, B.S. (1982) Methods Enzymol. 87, 526-548.
7 Josse, J. and Wong, S.C.K. (1971) in Enzymes, 3rd Edn., Vol. 4 (Boyer, P.D., ed.), pp. 499-527, Academic Press, New York.
8 Cohen, S.A., Sterner, R., Keim, P.S. and Heinrikson, R.L. (1978) J. Biol. Chem. 253, 889-897.
9 Kolakowski, F.L., Schlösser, M. and Cooperman, B.S. (1988) Nucleic Acids Res. 22, 10441-10452.
10 Kuranova, I.P., Terzyan, S.S., Voronova, A.A., Smirnova, E.A., Vainshtein, B.K., Höhne, W.E. and Hansen, G. (1983) Bioorg. Khim. 9, 1611-1619.
11 Terzyan, S.S., Voronova, A.A., Smirnova, E., Kuranova, I.P., Nekrasov, Y.V., Arutyunyun, E.G., Vainstein, B.K., Höhne, W. and Hensen, G. (1984) Bioorg. Khim. 10, 1469-1482.
12 Welsh, K.M., Jacobyansky, A., Springs, B. and Cooperman, B.S. (1983) Biochemistry 23, 2243-2248.
13 Bond, M.W., Chiu, N.Y. and Cooperman, B.S. (1980) Biochemistry 19, 94-102.
14 Komissarov, A.A., Makarova, I.A., Sklyankina, V.A. and Avaeva, S.M. (1985) Bioorg. Khim. 11, 1504-1509.
15 Gonzalez, M.A. and Cooperman, B.S. (1986) Biochemistry 25, 7179-7185.
16 Lahti, R., Pitkaranta, T., Valve, E., Ilta, I., Kukko-Kalske, E. and Heinonen, J. (1988) J. Bacteriol. 170, 5901-5907.
17 Cohen, S.A. (1978) Ph.D. Thesis, University of Chicago.
18 Komissarov, A.A., Sklyankina, V.A. and Avaeva, S.M. (1987) Bioorg. Khim. 13, 599-605.
19 Stark, M.J.R. and Milner, J.S. (1989) Yeast 5, 35-50.
20 Kolakowski, F.L., Altschul, S.F., Erickson, B., Cohen, S.A., Heinrikson, R.L. and Cooperman, B.S. (1987) Fed. Proc. 2041.
21 Smith, T.F. and Waterman, M.S. (1981) Adv. Appl. Math. 2, 482-489.
22 Needleman, S.B. and Wunsch, C.D. (1970) J. Mol. Biol. 48, 443-453.
23 Devereux, J.P., Haeberli, P. and Smithies, O. (1984) Nucleic Acids Res. 12, 387-395.
24 Schwartz, R.M. and Dayhoff, M.O. (1979) in Atlas of Protein Sequence and Structure (Dayhoff, M.O., ed.), pp. 353-358, National Biomedical Research Foundation, Washington, DC.
25 Gribskov, M. and Burgess, R.R. (1986) Nucleic Acids Res. 14, 6745-6763.
26 Kyte, J. and Doolittle, R.F. (1982) J. Mol. Biol. 157, 105-132.
27 Vihinen, M. (1987) Protein Engineer. 1, 477-480.
28 Karplus, P.A. and Schulz, G.E. (1985) Naturwissenschaften 72, 212-213.
29 Reichardt, J.K.V. and Berg, P. (1988) Nucleic Acids Res. 16, 9017-9026.
30 Doolittle, R.F. (1981) Science 214, 149-159.
31 Bajaj, M. and Blundell, T.L. (1984) Rev. Biophys. Bioeng. 13, 453-492.
32 Baltscheffsky, H., Alauddin, M., Falk, G. and Lundin, M. (1987) Acta Chem. Scand., Ser. B. B41, 106-107.
33 Walker, J.E., Fearnley, I.M., Gay, N.J., Gibson, G.W., Northrop, F.D., Powell, S.J., Runswick, M.J., Saraste, M. and Tybulewicz, V.L.J. (1985) J. Mol. Biol. 184, 677-701.
34 Kunitz, M. (1952) J. Gen. Physiol. 35, 423-450.
35 Josse, J. (1966) J. Biol. Chem. 241, 1938-1947.
36 Ichiba, T., Shibasaki, T., Lizuka, E., Hachimori, A. and Samejima, T. (1987) Biochem. Cell Biol. 66, 25-31.
37 Lipman, D.J. and Pearson, W.R. (1985) Science 227, 1435-1441.
38 Borschik, I.B., Pestova, T.V., Sklyankina, V.A. and Avaeva, S.V. (1985) FEBS Lett. 184, 65-67.
39 Simonsen, C.C., Chen, E.-Y., and Levinson, A.D. (1983) J. Bacteriol. 155, 1001-1008.
40 McAlister-Henn, L. (1988) Trends Biochem. Sci. 13, 178-181.
41 Samols, D., Thornton, C.G., Murtif, V.L., Kumar, G.K., Haase, F.C. and Wood, H.G. (1988) J. Biol. Chem. 263, 6461-6464.
42 Vihinen, M. (1990) Methods Enzymol. 183, 447-456.
43 Bairoch, A., The Swiss-Prot Database (Release 12), Intelligenetics Corp.
