Gene, 109 (1991) 39-45 © 1991 Elsevier Science Publishers B.V. All rights reserved. 0378-1119/91/$03.50


GENE 06195

Sequence and expression of the gene encoding 3-phosphoglycerate kinase from Bacillus stearothermophilus

(Recombinant DNA; nucleotide sequence; amino acid sequence comparisons; overexpression; thermal stability)

G.J. Davies*, J.A. Littlechild*, H.C. Watson and L. Hall

Department of Biochemistry, School of Medical Sciences, University of Bristol, Bristol (U.K.)

Received by R.W. Davies: 24 March 1991
Revised/Accepted: 9 August/16 August 1991
Received at publishers: 23 September 1991

SUMMARY

The structural gene (pgk) encoding 3-phosphoglycerate kinase (PGK) from Bacillus stearothermophilus NCA 1503 has been cloned in Escherichia coli and its complete nucleotide sequence determined. The gene consists of an open reading frame corresponding to a protein of 394 amino acids (aa) (calculated Mr 42 703) and, in common with other prokaryotic pgk genes, is preceded by the structural gene encoding glyceraldehyde-3-phosphate dehydrogenase (GAPDH). Constructs containing the B. stearothermophilus pgk gene and its flanking sequences in the high-copy plasmid, pUC9, co-express both PGK and GAPDH at high levels in transformed E. coli cells, typically producing PGK at levels of up to 30% of the soluble cell protein. The deduced aa sequence of B. stearothermophilus PGK is compared with those of the mesophilic (yeast) and extreme thermophilic (Thermus thermophilus) enzymes, since the crystal structures of these PGKs are known or are in the process of being determined. Changes in the sequences of the three enzymes, as they appear to relate to the enhancement of thermal stability, are discussed.

Correspondence to: Dr. H. Watson, Department of Biochemistry, School of Medical Sciences, University of Bristol, Bristol BS8 1TD (U.K.) Tel. (44-272)303734; Fax (44-272)303497.
* Current addresses: (G.J.D.) Department of Chemistry, University of York, York (U.K.) Tel. (44-904)432596; (J.A.L.) Departments of Chemistry and Biological Science, University of Exeter, Exeter (U.K.) Tel. (44-392)263469.
Abbreviations: aa, amino acid(s); B., Bacillus; bp, base pair(s); gap, gene encoding GAPDH; GAPDH, glyceraldehyde-3-phosphate dehydrogenase; kb, kilobase(s) or 1000 bp; nt, nucleotide(s); oligo, oligodeoxyribonucleotide; ORF, open reading frame; PGK, phosphoglycerate kinase; pgk, gene encoding PGK; r, ribosomal; RBS, ribosome-binding site; T., Thermus; [ ], denotes plasmid-carrier state.

INTRODUCTION

Phosphoglycerate kinase (PGK) is the monomeric glycolytic enzyme responsible for the first substrate-level phosphorylation of ADP during glycolysis. It catalyses the reversible transfer of a phosphoryl group from 1,3-diphosphoglycerate to ADP:

1,3-diphosphoglycerate + Mg·ADP ⇌ 3-phosphoglycerate + Mg·ATP

The structures of the horse and yeast enzymes have been determined (Banks et al., 1979; Watson et al., 1982). The two enzymes are very similar and consist of two distinct domains which are thought to move relative to each other during catalysis. In addition, the enzyme is extremely well characterised at the sequence level, the sequences of over 20 PGKs having been determined (Watson and Littlechild, 1990).

The substantial medical and industrial benefit to be gained from a knowledge of the mechanisms by which proteins attain thermal stability has led a number of workers to analyse the factors contributing to protein thermal stability. Early work by Perutz and Raidt (1975) and later work by Walker et al. (1980) suggested a role for surface ion-pairs and additional hydrophobic interactions in the thermal stabilisation of proteins. Argos et al. (1979) have analysed the changes observed between mesophilic and thermophilic ferredoxins and dehydrogenases. They concluded that thermal stability was attained through a combination of increasing the volume of hydrophobic amino acids involved in internal packing, decreasing the external hydrophobicity and increasing the proportion of helix-stabilisers (Chou and Fasman, 1974) in α-helical regions of the structure.

It is also an established fact that there are many differences in the aa sequences of thermophilic proteins when they are compared to their mesophilic counterparts. Each one of these changes could, in theory, account for all the measured difference in free energy between the two proteins. This leaves those who are interested in enhanced thermal stability with the problem of distinguishing the residues which are directly involved from those that are due to evolutionary drift. In addressing this problem we have embarked on a study of a protein derived from a wide range of thermophilic microorganisms in an effort to define the residues, and their interactions, responsible for the property of enhanced thermal stability (Littlechild et al., 1987; Bowen et al., 1988). Part of this study involves determining the sequences and the tertiary structures of a carefully selected 'set' of PGK molecules. This study also involves using site-directed mutagenesis of mesophilic yeast PGK in an attempt to introduce thermophilic properties into the mesophilic yeast enzyme; an overexpression system, using a modified pMA27 shuttle-vector, has been developed as an aid to that part of the work (Minard et al., 1990).

We report here the cloning and subsequent sequencing of the gene encoding PGK from the moderate thermophile B. stearothermophilus, whose tertiary structure is being determined. The B. stearothermophilus pgk gene has also been overexpressed in E. coli, thus providing a system for engineering site-specific mutations into a thermally stable framework for further studies on the folding properties and catalytic mechanism of PGK.

RESULTS AND DISCUSSION

(a) Cloning and sequencing of the pgk gene

Gas phase aa sequence analysis of the first 30 residues of B. stearothermophilus PGK provided the information necessary to construct a redundant oligo probe (5'-GAY TTY AAY GTN CCN ATG GA-3'). This probe corresponds to residues 21-27 of the aa sequence. Southern blotting of B. stearothermophilus genomic DNA using this oligo probe identified a single band of approx. 3.8 kb in the HindIII digest. A sub-genomic library, using HindIII-cleaved, 3.8-kb size-selected B. stearothermophilus DNA, was subsequently constructed in an attempt to enrich for recombinant clones containing the B. stearothermophilus pgk gene. In situ colony hybridization of 300 transformants with this oligo probe produced three positive clones which, on the basis of preliminary restriction mapping, appeared to contain the same insert. Further analysis showed that the oligo probe (corresponding to the N-terminal aa sequence) hybridized to a site roughly in the centre of the 3.8-kb insert fragment, suggesting that the entire pgk gene would be located on this fragment. The pgk gene has been mapped and sequenced using the strategy depicted in Fig. 1. The results obtained from sequencing both strands of the entire pgk gene, and its immediate flanking regions, are shown in Fig. 2. Southern blotting of digested B. stearothermophilus genomic DNA, using a radio-labelled probe derived from part of the B. stearothermophilus pgk coding region, suggested that only one copy of the gene was present in the genome.

Fig. 1. Partial restriction map and sequencing strategy of the 3.8-kb HindIII fragment containing the B. stearothermophilus pgk gene. Arrows indicate the direction and extent of the individual sequencing runs. Sequencing runs initiated using internal primers are marked with black heavy dots. The direction of transcription of both pgk and gap is towards the right.
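The redundancy of the oligo probe in section (a) can be checked with a short script. The sketch below is in Python and was not part of the original work; the function names are illustrative assumptions. It expands the IUPAC codes Y (C or T) and N (any base), counts the oligos in the mixture, and confirms that every one of them encodes Asp-Phe-Asn-Val-Pro-Met followed by the first two nt of a Glu codon (residues 21-27).

```python
from itertools import product

# IUPAC nucleotide codes used by the probe (Y = pyrimidine, N = any base).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "Y": "CT", "N": "ACGT",
}

# Probe as printed in section (a), with the spaces removed.
PROBE = "GAYTTYAAYGTNCCNATGGA"

# Standard-code codons for the six residues fully covered by the probe.
CODONS = {
    "D": {"GAT", "GAC"},
    "F": {"TTT", "TTC"},
    "N": {"AAT", "AAC"},
    "V": {"GTA", "GTC", "GTG", "GTT"},
    "P": {"CCA", "CCC", "CCG", "CCT"},
    "M": {"ATG"},
}


def degeneracy(probe: str) -> int:
    """Number of distinct oligos represented by the redundant probe."""
    total = 1
    for base in probe:
        total *= len(IUPAC[base])
    return total


def expand(probe: str):
    """Yield every concrete sequence in the redundant mixture."""
    for combo in product(*(IUPAC[base] for base in probe)):
        yield "".join(combo)


def encodes(oligo: str, peptide: str) -> bool:
    """True if successive codons of `oligo` encode `peptide`."""
    return all(
        oligo[3 * i:3 * i + 3] in CODONS[aa] for i, aa in enumerate(peptide)
    )


if __name__ == "__main__":
    print("oligos in mixture:", degeneracy(PROBE))
    ok = all(encodes(o, "DFNVPM") and o.endswith("GA") for o in expand(PROBE))
    print("all oligos encode DFNVPM + GA:", ok)
```

The trailing GA is compatible with both Glu codons (GAA and GAG), which is presumably why the probe stops one nt short of the full codon.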

[Figure: nt sequence numbered 1-1431 with the deduced aa sequence printed beneath the coding regions. The 3' end of gap (nt 1-63, encoding Asn-Glu-Thr-Gly-Tyr-Ser-His-Arg-Val-Val-Asp-Leu-Ala-Ala-Tyr-Ile-Ala-Ser-Lys-Gly-Leu) ends at a TAA stop codon (nt 64-66). The pgk ORF starts at the ATG at nt 188 and ends at the TGA stop codon at nt 1370-1372.]

Deduced PGK aa sequence (one-letter code; residue numbers at the start and end of each row):

  1 MNKKTIRDVDVRGKRVFCRVDF 22
 23 NVPMEQGAITDDTRIRAALPTIRYL 47
 48 IEHGAKVILASHLGRPKGKVVEELR 72
 73 LDAVAKRLGELLERPVAKTNEAVGD 97
 98 EVKAAVDRLNEGDVLLLENVRFYPG 122
123 EEKNDPELAKAFAELADLYVNDAFG 147
148 AAHRAHASTEGIAHYLPAVAGFLME 172
173 KELEVLGKALSNPDRPFTAIIGGAK 197
198 VKDKIGVIDNLLEKVDNLIIGGGLA 222
223 YTFVKALGHDVGKSLLEEDKIELAK 247
248 SFMEKAKEKGVRFYMPVDVVVADRF 272
273 ANDANTKVVPIDAIPADWSALDIGP 297
298 KTRELYRDVIRESKLVVWNGPMGVF 322
323 EMDAFAHGTKAIAEALAEALDTYSV 347
348 IGGGDSAAAVEKFGLADKMDHISTG 372
373 GGASLEFMEGKQLPGVVALEDK 394

Fig. 2. Complete nt sequence of the B. stearothermophilus pgk gene and its flanking regions. Accession No. X58059, EMBL Library. Suitable restriction fragments were subcloned into M13 vectors (Messing and Vieira, 1982) and sequenced using the dideoxy chain termination method (Sanger et al., 1977; 1980). Sequencing gels were run at increased voltage, and hence temperature, to overcome problems associated with band compression, with a 2-mm thick aluminium plate clamped to the gel plate during electrophoresis to dissipate the heat evenly. The region 5' to the pgk gene encodes GAPDH and shows total identity to the published sequence (Branlant et al., 1989). The PGK aa sequence obtained by gas phase sequencing methods is underlined. Stop codons are indicated by asterisks.


(b) Features of the DNA sequences flanking the pgk coding region

In common with other reported prokaryotic pgk sequences (Conway and Ingram, 1988; Bowen et al., 1988) the region upstream from the pgk gene was found to encode the gap gene for B. stearothermophilus glyceraldehyde-3-phosphate dehydrogenase. The deduced GAPDH sequence shows total identity with the published gene sequence (Branlant et al., 1989). A putative RBS, complementary to the 3' end of the B. stearothermophilus 16S rRNA (Douthwaite et al., 1983), can be identified 8 nt upstream from the known ATG start codon of the pgk gene. An equally suitable RBS, and potential initiating methionine, can be found 54 nt upstream from the one utilised in vivo. Indeed, it has been suggested that this is the start codon for pgk translation (Branlant et al., 1989). The N-terminal aa sequence analysis of purified PGK shows, however, that the aa sequence corresponds to a protein initiated at the second of the two possible start codons (as shown in Fig. 2) and not the one proposed by Branlant et al. (1989). Our assignment of the B. stearothermophilus PGK start codon is in agreement with the recently reported, closely related B. megaterium PGK start codon (Schlapter et al., 1991). As found for other Bacillus genes (Hoshino et al., 1985; Branlant et al., 1989) there is no E. coli-like, Rho-independent transcription termination sequence within the 70 nt immediately following the 3' end of the B. stearothermophilus pgk gene. In addition, no reasonable ORFs are found within the sequenced 3' flanking region.

(c) Features of the pgk coding region

The B. stearothermophilus pgk gene consists of an ORF of 1182 nt (394 codons) commencing with an ATG start codon and followed by a TGA stop codon. The codon usage closely resembles that observed for other B. stearothermophilus genes (Barstow et al., 1986). In common with other genes from thermophilic organisms there is a marked preference for G or C residues in the third nt position of the codon. The total G + C content of the Bacillus pgk gene is 56.7%, whereas 69% of the codons have G or C in the third nt position.
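The G + C figures quoted in section (c) come from simple counts over the ORF. The Python sketch below shows one way such counts could be made; it is not the authors' procedure, and the function names are assumptions. It is run on the first 22 codons of the ORF as read from Fig. 2, an illustrative fragment only, so its values will differ from the whole-gene figures of 56.7% and 69%.

```python
# First 22 codons of the pgk ORF as read from Fig. 2 (illustrative fragment;
# whole-gene statistics need the complete sequence, EMBL accession X58059).
FRAGMENT = (
    "ATGAACAAGAAGACGATCCGCGACGTTGATGTGAGG"
    "GGAAAGCGCGTCTTTTGCCGCGTCGATTTC"
)


def gc_fraction(seq: str) -> float:
    """Fraction of bases in `seq` that are G or C."""
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in seq.upper()) / len(seq)


def gc3_fraction(orf: str) -> float:
    """Fraction of codons in an in-frame ORF whose third base is G or C."""
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return gc_fraction(orf[2::3])  # every third base, starting at position 3


if __name__ == "__main__":
    # A 394-codon ORF spans 394 * 3 = 1182 nt, as stated in section (c).
    print("ORF length for 394 codons:", 394 * 3)
    print(f"fragment G+C:  {100 * gc_fraction(FRAGMENT):.1f}%")
    print(f"fragment GC3:  {100 * gc3_fraction(FRAGMENT):.1f}%")
```

Comparing overall G + C with third-position G + C is useful because the third position is largely free of aa-coding constraints, so it reflects nt bias more directly.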

(d) Sequence comparison

The deduced aa sequence corresponds to a protein of 394 aa with a calculated Mr of 42 703, showing many of the features associated with proteins from thermophilic organisms. Fig. 3 shows the B. stearothermophilus PGK sequence aligned with the mesophilic yeast (Hitzeman et al., 1982) and thermophilic T. thermophilus (Bowen et al., 1988) PGK sequences. A small number of gaps have been introduced to maximise the alignments. All these 'adjustments' occur in regions which are known to correspond to surface loops in the yeast enzyme. The overall identity of the Bacillus enzyme with both the yeast and T. thermophilus aa sequences is approx. 50%. The most striking feature of the B. stearothermophilus PGK aa sequence is the 15-residue deletion in the 'nose' region of the yeast PGK structure. This deletion has been observed in all prokaryotic PGKs sequenced so far (for references see Watson and Littlechild, 1990) and in the wheat cytosolic and chloroplast enzymes (Longstaff et al., 1989); a slightly shorter deletion is found in the archaebacterial sequences of Methanobacterium bryantii and Methanothermus fervidus (Fabry et al., 1990).

When compared with the mesophilic PGK sequences, the B. stearothermophilus sequence shows a large reduction in the content of Ser and Thr residues. It also shows a large reduction in the number of Lys residues concomitant with a rise in the number of Arg and Glu residues. These trends follow those observed for other thermophilic proteins (for reviews see Argos et al., 1979; Mozhaev and Martinek, 1984). It has been argued that some of the trends observed between aa sequences from mesophilic and thermophilic organisms may simply be due to the selection of G + C-rich codons in order to maintain DNA stability, and that aa substitution trends (such as replacement of Lys [codon AAR] by Arg [codon CGN]) are merely manifestations of this DNA nt bias. The selection pressure to maintain a G + C-rich DNA does not appear to be that great for B. stearothermophilus PGK, as judged by the fact that Lys is coded for 24 times by AAA but only eight times by AAG. The archaebacteria also show a change from Lys to Arg in association with an increase in thermostability even though the G + C content is low (31.4% for the thermophile, Methanothermus fervidus; Fabry et al., 1990).

A trend in the mesophilic/thermophilic PGK sequence comparison is a decrease in the number of Ile residues. This change occurs mainly as a result of the replacement of Ile by Val or Leu. As both Val and Leu are less hydrophobic than Ile (Nozaki and Tanford, 1971), the likely consequence of this trend is the reduction of the internal hydrophobicity of the PGK as a function of the temperature at which it is intended to operate. This trend is also seen in the thermophilic archaebacteria (Fabry et al., 1990).

With the exception of Lys, the B. stearothermophilus PGK sequence does not show the reductions one might expect in the content of those aa known to participate in degradative reactions at high temperatures. Indeed, the more reactive aa, such as Asp and Met, are actually present at higher levels in the B. stearothermophilus PGK sequence than in yeast and, whilst there are far fewer Gln residues in B. stearothermophilus PGK than in yeast, there are similar levels of Asn, the more easily deamidated aa. This is in marked contrast to the extreme thermophile, where there are far fewer Asn than in yeast (Bowen et al., 1988). It is significant that the thermophilic archaebacteria also have a reduction in Asn and Gln associated with increased thermal stability (Fabry et al., 1990).

[Figure: three-way aa sequence alignment in one-letter code, with yeast PGK on the top row, B. stearothermophilus PGK in the middle and T. thermophilus PGK on the bottom row; vertical bars mark residues identical between adjacent rows and dashes mark gaps.]

Fig. 3. Comparison of the deduced aa sequences of yeast (a mesophile), B. stearothermophilus (a moderate thermophile; see Fig. 2) and T. thermophilus (an extreme thermophile) PGKs. The data for yeast and T. thermophilus PGKs are taken from Hitzeman et al. (1982) and Bowen et al. (1988), respectively. Comparison of these three sequences was chosen since the crystal structures of these enzymes are or will shortly be available for more detailed comparison at the structural level. Last digits of numerals are aligned with the corresponding aa. The residue numbers relate to the B. stearothermophilus PGK sequence.
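The composition trends and identity figures discussed in section (d) rest on residue counts and on column-by-column comparison of aligned sequences. The Python sketch below illustrates both steps under stated assumptions: the sequence is the deduced B. stearothermophilus PGK as transcribed from Fig. 2, the two aligned fragments are a short stretch read from Fig. 3, and the helper names are hypothetical. It does not reproduce the authors' own calculations.

```python
from collections import Counter

# Deduced B. stearothermophilus PGK sequence, transcribed row by row from Fig. 2.
BST_PGK = (
    "MNKKTIRDVDVRGKRVFCRVDF"       # residues 1-22
    "NVPMEQGAITDDTRIRAALPTIRYL"    # 23-47
    "IEHGAKVILASHLGRPKGKVVEELR"    # 48-72
    "LDAVAKRLGELLERPVAKTNEAVGD"    # 73-97
    "EVKAAVDRLNEGDVLLLENVRFYPG"    # 98-122
    "EEKNDPELAKAFAELADLYVNDAFG"    # 123-147
    "AAHRAHASTEGIAHYLPAVAGFLME"    # 148-172
    "KELEVLGKALSNPDRPFTAIIGGAK"    # 173-197
    "VKDKIGVIDNLLEKVDNLIIGGGLA"    # 198-222
    "YTFVKALGHDVGKSLLEEDKIELAK"    # 223-247
    "SFMEKAKEKGVRFYMPVDVVVADRF"    # 248-272
    "ANDANTKVVPIDAIPADWSALDIGP"    # 273-297
    "KTRELYRDVIRESKLVVWNGPMGVF"    # 298-322
    "EMDAFAHGTKAIAEALAEALDTYSV"    # 323-347
    "IGGGDSAAAVEKFGLADKMDHISTG"    # 348-372
    "GGASLEFMEGKQLPGVVALEDK"       # 373-394
)


def composition(seq: str) -> Counter:
    """Count each one-letter aa code in the sequence."""
    return Counter(seq)


def percent_identity(a: str, b: str) -> float:
    """Percent identity over aligned columns in which neither row has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


if __name__ == "__main__":
    counts = composition(BST_PGK)
    for aa in "KRSTIVLNQP":
        print(aa, counts[aa], f"{100 * counts[aa] / len(BST_PGK):.1f}%")
    # Short aligned stretch read from Fig. 3 (Bacillus vs. T. thermophilus).
    print(percent_identity("LLENVRFYPGE", "LLENVRFEPGE"))
```

The Lys count from this transcription is 32, which agrees with the 24 AAA plus eight AAG codons quoted in section (d). Excluding gap columns from the denominator is one common convention for percent identity; other conventions give slightly different values.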

(e) Expression of the Bacillus stearothermophilus pgk in Escherichia coli

The 3.8-kb HindIII fragment of B. stearothermophilus DNA (Fig. 1) was ligated into the HindIII site of the high-copy plasmid pUC9. This gave rise to two possible constructs: (1) that with both the pgk and gap genes in the correct orientation for expression from the adjacent lacZ promoter of the vector, which was found to express both pgk and gap at levels of approx. 20% of the E. coli total soluble cell protein (see Fig. 4), and (2) that in which the pgk and gap genes were in the incorrect orientation for expression from the lacZ promoter. This latter construct was found, somewhat surprisingly, to express PGK alone at levels corresponding to approx. 30% of total soluble cell protein (Fig. 4). This could perhaps be explained by the presence of a cryptic promoter within the gap gene, although this does not appear to cause transcriptional interference when the genes are inserted in the opposite orientation. Branlant et al. (1989) reported that expression of B. stearothermophilus gap in E. coli required a promoter located some 1000 nt upstream from the gap gene. The construct reported here contains only 400 nt upstream from the start codon. It seems likely, therefore, that the expression of gap is under the control of the lacZ promoter associated with the pUC9 vector.
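The expression levels quoted in section (e) were estimated from the optical density of stained gel bands (see the legend to Fig. 4). A minimal Python sketch of that kind of estimate is given below. It assumes that the integrated optical density of each band is proportional to the amount of protein in it, and the band values are hypothetical numbers chosen for illustration, not measurements from this work.

```python
from typing import Sequence


def band_fraction(band_od: float, lane_ods: Sequence[float]) -> float:
    """Fraction of a lane's total integrated optical density in one band."""
    total = sum(lane_ods)
    if total <= 0:
        raise ValueError("lane has no integrated optical density")
    return band_od / total


if __name__ == "__main__":
    # Hypothetical integrated band areas for one lane (arbitrary units);
    # the first entry stands in for the PGK band.
    lane = [3.0, 1.5, 2.5, 2.0, 1.0]
    share = band_fraction(lane[0], lane)
    print(f"PGK band: {100 * share:.0f}% of soluble protein in the lane")
```

Because Coomassie Blue binding varies somewhat between proteins, figures obtained this way are best treated as approximate.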

Fig. 4. A homogeneous 'PHAST' 0.1% SDS, 12.5% polyacrylamide gel of the soluble cell extracts obtained from E. coli. Lane M, molecular weight markers (from the top) consisting of phosphorylase b (94 kDa), bovine serum albumin (67 kDa), ovalbumin (43 kDa), soybean trypsin inhibitor (20 kDa), and α-lactalbumin (14 kDa). Lane 1, crude protein extract from E. coli [pUC9] cells. Lane 2, crude protein extract as lane 1 but with the pUC9 construct containing the B. stearothermophilus gap-pgk 3.8-kb insert. Both pgk and gap genes were in the opposite orientation to the lacZ promoter of the vector. Lane 3, as in lane 2 but with the direction of transcription of both pgk and gap genes in line with the lacZ promoter of the vector. Lane 4, purified B. stearothermophilus PGK. Lane 5, purified B. stearothermophilus GAPDH. The gel was stained with Coomassie Blue and scanned using a Joyce-Loebl Chromoscan 3D gel scanner. Optical density of the band corresponding to either B. stearothermophilus PGK or GAPDH was used to estimate the level of these enzymes. Protein overexpression. The 3.8-kb HindIII fragment of B. stearothermophilus DNA containing the pgk gene was subcloned into pUC9, in both orientations. These constructs were then used to transform competent E. coli HB101 recA- cells (Boyer and Roulland-Dussoix, 1969). Recombinant clones were grown overnight in L-broth (1% tryptone/0.5% yeast extract/0.5% NaCl/0.5% glucose) containing 0.1 mg ampicillin per ml. Cells were harvested by centrifugation and disrupted by sonication. The 0.1% SDS/12.5% polyacrylamide gels (Laemmli and Favre, 1973) were used to estimate the levels of B. stearothermophilus PGK produced by the recombinant E. coli cells.

(f) Conclusions

(1) The major changes observed between the sequences of T. thermophilus and B. stearothermophilus PGK are the additional (to those between yeast and Bacillus) replacement of Lys by Arg, a large increase in the number of Pro and Gly residues, a significant reduction in the number of Asn residues, the replacement of Asp by Glu and a subtle change in the sequence composition of hydrophobic aa.

(2) The replacement of Lys by Arg seems to be a general feature of thermophilic proteins. Arg, in addition to its lower chemical reactivity compared to Lys, is the most hydrophilic of the aa (Wolfenden et al., 1981) and its presence at the surface of the protein is therefore thermodynamically favourable.

(3) The comparison of the sequence data suggests that the role of Pro residues in enhancing thermal stability is likely to be a thermodynamic one. In other words, the additional Pro residues present in the T. thermophilus structure act to lower the stability of the unfolded state by reducing the number of conformations available to it. The increase in the number of Pro, and also Gly, residues would allow the protein to form tighter, and therefore more stable, loops joining elements of secondary structure. When comparing the B. stearothermophilus PGK to the sequence of B. megaterium PGK, four additional Pro residues are found in the thermophilic Bacillus (Schlapter et al., 1991).

(4) When the overall hydrophobic index is calculated (Bull and Breeze, 1973) for B. stearothermophilus PGK there is an increase in hydrophobicity of the thermophilic enzyme compared with that of yeast PGK, as would be expected for a thermophilic protein whose optimum temperature is at or below 70°C. The T. thermophilus PGK has a slightly lower hydrophobic index than the B. stearothermophilus enzyme. This is consistent with the fact that the energetic contribution of hydrophobic interactions to protein stability decreases beyond 65°C (Brandts, 1964).

It will be apparent that the cloning and sequence determination of the B. stearothermophilus pgk gene has yielded additional information relating to the aa sequence of a thermophilic protein which adds to current ideas relating to the property of enhanced protein stability. Clearly this information, when used in conjunction with the crystal structures of the respective thermophilic enzymes, will be of considerable use in investigating nature's solution to the problem of protein thermal stability. The ability to overexpress the B. stearothermophilus enzyme in E. coli will be important when the subtle changes important for stability at elevated temperatures are investigated further using site-directed mutagenesis.

ACKNOWLEDGEMENTS

The authors would like to thank Professor John Fothergill for use of the Aberdeen University gas phase sequencing facility. G.J.D. gratefully acknowledges the award of an SERC studentship. The work described in this paper was supported by the University of Bristol and by the SERC/Industry Protein Engineering Programme.

REFERENCES

Argos, P., Rossmann, M.G., Grau, U.M., Zuber, H., Frank, G. and Tratschin, J.D.: Thermal stability and protein structure. Biochemistry 18 (1979) 5698-5703.
Banks, R.D., Blake, C.C.F., Evans, P.R., Haser, R., Rice, D.W., Hardy, G.W., Merrett, M. and Phillips, A.W.: Sequence, structure and activity of phosphoglycerate kinase: a possible hinge-bending enzyme. Nature 279 (1979) 773-777.
Barstow, D.A., Sharman, A.F., Atkinson, T. and Minton, N.P.: Cloning and complete nucleotide sequence of the Bacillus stearothermophilus tryptophanyl tRNA synthetase gene. Gene 46 (1986) 37-45.
Bowen, D., Littlechild, J.A., Fothergill, J.E., Watson, H.C. and Hall, L.: Nucleotide sequence of the phosphoglycerate kinase gene from the extreme thermophile Thermus thermophilus. Biochem. J. 254 (1988) 509-517.
Boyer, H.W. and Roulland-Dussoix, D.: A complementation analysis of the restriction and modification of DNA in Escherichia coli. J. Mol. Biol. 41 (1969) 459-472.
Brandts, J.F.: The thermodynamics of protein denaturation II. A model of reversible denaturation and interpretation regarding the stability of chymotrypsinogen. J. Am. Chem. Soc. 86 (1964) 4302-4314.
Branlant, C., Oster, T. and Branlant, G.: Nucleotide sequence determination of the DNA region for Bacillus stearothermophilus glyceraldehyde-3-phosphate dehydrogenase and of the flanking DNA regions required for its expression in Escherichia coli. Gene 75 (1989) 145-155.
Bull, H.B. and Breeze, K.: Thermal stability of proteins. Arch. Biochem. Biophys. 158 (1973) 681-686.
Chou, P.Y. and Fasman, G.D.: Conformational parameters for amino acids in helical, sheet, and random coil regions calculated from proteins. Biochemistry 13 (1974) 211-222.
Conway, T. and Ingram, L.O.: Phosphoglycerate kinase gene from Zymomonas mobilis: cloning, sequencing, and localization within the gap operon. J. Bacteriol. 170 (1988) 1926-1933.
Douthwaite, S., Christensen, A. and Garrett, R.A.: Higher order structure in the 3'-minor domain of small subunit ribosomal RNAs from a Gram negative bacterium, a Gram positive bacterium and a eukaryote. J. Mol. Biol. 169 (1983) 249-279.
Fabry, S., Heppner, P., Dietmaier, W. and Hensel, R.: Cloning and sequencing the gene encoding 3-phosphoglycerate kinase from mesophilic Methanobacterium bryantii and thermophilic Methanothermus fervidus. Gene 91 (1990) 19-25.
Hitzeman, R.A., Hagie, F.E., Hayflick, J.S., Chen, C.Y., Seeburg, P.H. and Derynck, R.: The primary structure of the Saccharomyces cerevisiae gene for 3-phosphoglycerate kinase. Nucleic Acids Res. 10 (1982) 7791-7808.
Hoshino, T., Ikeda, T., Tomizuka, N. and Furukawa, K.: Nucleotide sequence of the tetracycline resistance gene of pTHT15, a thermophilic Bacillus plasmid: comparison with staphylococcal TcR controls. Gene 37 (1985) 131-138.
Laemmli, U.K. and Favre, M.: Maturation of the head of bacteriophage T4. J. Mol. Biol. 80 (1973) 575-599.
Littlechild, J.A., Davies, G.J., Gamblin, S.J. and Watson, H.C.: Phosphoglycerate kinase from the extreme thermophile Thermus thermophilus. FEBS Lett. 225 (1987) 123-126.
Longstaff, M., Raines, C.A., McMorrow, E.M., Bradbeer, J.W. and Dyer, T.A.: Wheat phosphoglycerate kinase: evidence for recombination between the genes for the chloroplastic and cytosolic enzymes. Nucleic Acids Res. 17 (1989) 6569-6580.
Messing, J. and Vieira, J.: A new pair of M13 vectors for selecting either DNA strand of double-digest restriction fragments. Gene 19 (1982) 269-276.
Minard, P., Bowen, D.J., Hall, L., Littlechild, J.A. and Watson, H.C.: Site-directed mutagenesis of aspartic acid 372 at the ATP binding site of yeast phosphoglycerate kinase. Protein Engineering 3 (1990) 515-521.
Mozhaev, V.V. and Martinek, K.: Structure-stability relationships in proteins: new approaches to stabilising enzymes. Enzyme Microb. Technol. 6 (1984) 50-59.
Nozaki, Y. and Tanford, C.: The solubility of amino acids and two glycine peptides in aqueous ethanol and dioxane solutions. J. Biol. Chem. 246 (1971) 2211-2217.
Perutz, M.F. and Raidt, H.: Stereochemical basis of heat stability in bacterial ferredoxins and haemoglobin A2. Nature 255 (1975) 256-258.
Sanger, F., Nicklen, S. and Coulson, A.R.: DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74 (1977) 5463-5467.
Sanger, F., Coulson, A.R., Barrell, B.G., Smith, A.J.H. and Roe, B.A.: Cloning in single stranded bacteriophage as an aid to rapid DNA sequencing. J. Mol. Biol. 143 (1980) 161-178.
Schlapter, B.S., Branlant, C., Branlant, G. and Zuber, H.: Nucleotide sequence of the phosphoglycerate kinase gene from Bacillus megaterium. Nucleic Acids Res. 18 (1991) 6423.
Walker, J.E., Wonacott, A.J. and Harris, J.I.: Heat stability of a tetrameric enzyme, D-glyceraldehyde-3-phosphate dehydrogenase. Eur. J. Biochem. 108 (1980) 581-586.
Watson, H.C. and Littlechild, J.A.: Isoenzymes of phosphoglycerate kinase: evolutionary conservation of the structure of this glycolytic enzyme. Biochem. Soc. Trans. 18 (1990) 187-190.
Watson, H.C., Walker, N.P.C., Shaw, P.J., Bryant, T.N., Wendell, P.L., Fothergill, L.A., Perkins, R.E., Conroy, S.C., Dobson, M.J., Tuite, M.F., Kingsman, A.J. and Kingsman, S.M.: Sequence and structure of yeast phosphoglycerate kinase. EMBO J. 1 (1982) 1635-1640.
Wolfenden, R., Andersson, L., Cullis, P.M. and Southgate, C.C.B.: Affinities of amino acid side chains for solvent water. Biochemistry 20 (1981) 849-855.
