Gene, 96 (1990) 67-74 Elsevier


GENE 03768

Cloning, sequencing and characterization of the [NiFe]hydrogenase-encoding structural genes (hoxK and hoxG) from Azotobacter vinelandii

(Recombinant DNA; cosmid; genomic DNA library; protein targeting; sequence homology; hydrophobic domains)

Angeli L. Menon, Larry W. Stults, Robert L. Robson and Leonard E. Mortenson

Department of Biochemistry, University of Georgia, Athens, GA 30602 (U.S.A.)

Received by S.R. Kushner: 27 January 1990
Revised: 18 April 1990
Accepted: 5 June 1990

SUMMARY

The Azotobacter vinelandii [NiFe]hydrogenase-encoding structural genes were isolated from an A. vinelandii genomic cosmid library. Nucleotide (nt) sequence analysis showed that the two genes, hoxK and hoxG, which encode the small and large subunits of the enzyme, respectively, form part of an operon that contains at least one other gene. The hoxK gene encodes a polypeptide of 358 amino acids (aa) (39 209 Da). The deduced aa sequence includes a possible 45-aa N-terminal extension, not present in the purified A. vinelandii hydrogenase small subunit, which could be a cellular targeting sequence. The hoxG gene lies downstream from hoxK, overlaps it by 4 nt, and encodes a 602-aa polypeptide of 66 803 Da. The hoxK and hoxG gene products display homology to aa sequences of hydrogenase small and large subunits, respectively, from other organisms. The hoxG gene lies 16 nt upstream from a third open reading frame which could encode a 27 729-Da (240-aa) hydrophobic polypeptide containing 53% nonpolar and 11% aromatic aa. The significance of this possible third gene is not known at present.

INTRODUCTION

Hydrogenases catalyse the reversible oxidation of dihydrogen and occur in diverse prokaryotes and eukaryotes (Adams et al., 1981; Gogotov, 1986; O'Brian and Maier, 1988). Hydrogen can serve as the sole source of energy for some autotrophs, such as Alcaligenes (Friedrich et al., 1979; Doyle and Arp, 1987), Rhodobacter capsulatus (Madigan and Gest, 1979), Azospirillum species (Sampaio et al., 1981), and Bradyrhizobium japonicum (Hanus et al., 1979), or as a supplementary energy source, as in the H2-dependent mixotrophic growth of A. vinelandii on H2 and mannose (Wong and Maier, 1985). Uptake hydrogenases can recycle hydrogen evolved as a byproduct of nitrogenase activity in aerobic nitrogen-fixing organisms, thereby potentially increasing the efficiency of nitrogen fixation (Evans et al., 1981; Eisenbrenner and Evans, 1983; Aguilar et al., 1985). Uptake hydrogenases purified from the nitrogen-fixing microorganisms A. vinelandii (Seefeldt and Arp, 1986), B. japonicum (Harker et al., 1984), and R. capsulatus (Seefeldt et al., 1987) are membrane-bound NiFe-sulfur proteins, existing as heterodimers composed of large and small subunits of approx. 67 and 31 kDa, respectively. The B. japonicum enzyme may also contain selenium (Boursier et al., 1988).

There have been several reports on the genetics of this family of enzymes. Studies on A. eutrophus hydrogenase clearly indicate that, in addition to the structural genes, a number of other genetic loci are involved in processing, maturation, and regulation of its expression (Kortluke et al., 1987; Eberz et al., 1989; B. Friedrich, personal communication). These hydrogen-oxidizing (hox) structural and regulatory genes are clustered within 80 kb of DNA on a 450-kb megaplasmid, pHG1, and are organized into several transcriptional units. Genes encoding H2-uptake-related functions in Azotobacter chroococcum (Yates and Robson, 1985; Tibelius et al., 1987) and B. japonicum (Haugland et al., 1984) span at least 15 kb of genomic DNA. Similar work in R. capsulatus suggests the involvement of at least five genes in hydrogenase expression and activity (Xu et al., 1989). While the A. vinelandii [NiFe]hydrogenase has been purified and partially characterized biochemically, nothing is known about the genetic requirements for its synthesis. The aim of the present study was to isolate recombinant cosmid clones containing the A. vinelandii hydrogenase-encoding structural genes, and to subclone and determine the nt sequence of the relevant region.

Correspondence to: Dr. R.L. Robson, Department of Biochemistry, University of Georgia, Athens, GA 30602 (U.S.A.) Tel. (404)-542-1191; Fax (404)-542-1738.

Abbreviations: aa, amino acid(s); Ap, ampicillin; bp, base pair(s); Δ, deletion; hoxK and hoxG, genes encoding the small and large subunits of A. vinelandii [NiFe]hydrogenase, respectively; kb, kilobase(s) or 1000 bp; nt, nucleotide(s); oligo, oligodeoxyribonucleotide; ORF, open reading frame; RBS, ribosome-binding site(s); SSC, 0.15 M NaCl/0.015 M Na3·citrate pH 7.6; [ ], denotes plasmid-carrier state.

0378-1119/90/$03.50 © 1990 Elsevier Science Publishers B.V. (Biomedical Division)

RESULTS AND DISCUSSION

(a) Isolation and characterization of hydrogenase gene clones

A 5.8-kb StuI restriction fragment from pCMS1, containing the A. chroococcum hydrogenase-encoding structural genes (K. Tibelius, personal communication), was used to probe restriction digests of A. vinelandii genomic DNA. Strongly hybridizing bands were observed in all the genomic digests under stringent hybridization conditions (Fig. 1). The same probe was then used to screen an A. vinelandii recombinant cosmid library. Twenty-four positively hybridizing recombinant clones were identified and isolated. Eleven were physically mapped with three restriction enzymes. The presence of hydrogenase-specific DNA in the cosmids was confirmed by hybridization to the A. chroococcum 5.8-kb probe. One clone, pALM21, which contains approx. 40 kb of insert DNA, was chosen for further analysis because the hydrogenase-specific region is situated close to the center of the insert (Fig. 2). The exact location and orientation of the hydrogenase-encoding structural genes within pALM21 were determined by detailed restriction-enzyme analysis and by hybridization to DNA probes that contained either both A. chroococcum hydrogenase structural genes (pCMS1, 5.8-kb StuI fragment) or the large subunit gene only (pCMS1, 1.8-kb SphI fragment). The hybridization patterns observed with pALM21 restriction digests were consistent with those obtained with equivalent genomic digests.


Fig. 1. Hybridization of hydrogenase-encoding structural genes from Azotobacter chroococcum to A. vinelandii genomic DNA. Autoradiogram obtained after hybridization at high stringency with the 5.8-kb StuI fragment from pCMS1. A. vinelandii genomic DNA digested with: lanes: 1, Asp718; 2, SalI; 3, XhoI; 4, PstI. Lane 5 shows radioactive end-labeled bacteriophage λ HindIII markers. Southern blots were prepared by electrophoretic transfer to GeneScreen membranes (New England Nuclear) of restriction fragments separated by electrophoresis in agarose (0.8% w/v) in TAE buffer (Maniatis et al., 1982). Radioactive probes for DNA hybridization were prepared from DNA fragments recovered from agarose-gel slices using the freeze-squeeze method (Thuring et al., 1975) and labeled by nick translation (Rigby et al., 1977) using deoxycytidine 5'-[α-32P]triphosphate (3000 Ci/mmol; Amersham International). Hybridizations were carried out in sealed bags or roller bottles for 14-16 h in 45-50% formamide/6 × SSC/10% w/v dextran sulfate/0.5% nonfat dried milk (Carnation Co.) at 42°C (Johnson et al., 1984). Stringency washes were performed in 1 × to 0.2 × SSC at 68°C. Autoradiography was done at -70°C using Kodak X-OMAT AR film.

(b) Subcloning and sequence analysis of the hydrogenase-encoding structural genes

To sequence the structural genes, an 8.9-kb BamHI-XbaI fragment from pALM21, encompassing the entire hydrogenase-specific region, was subcloned into pTZ19R, giving plasmid pALMZ'1 (Fig. 2). The sequence data compiled from one strand were used to design site-specific synthetic primers, spaced approx. 250 bp apart, for sequencing the complementary strand of pALMZ'1. The nt and derived aa sequences for the hydrogenase-specific DNA in A. vinelandii are shown in Fig. 3. Three ORFs, with the distinct possibility of a fourth, were identified on the basis of the presence of potential RBS (Stormo, 1986) and the biased codon usage characteristic of G+C-rich A. vinelandii nif genomic DNA. The table of codon frequencies was established from sequenced nif, vnf, and anf genes from A. vinelandii. ORF1 and ORF2 encode the small and large subunits of the [NiFe]hydrogenase and were identified from the N-terminal aa sequences of the purified polypeptides (L.C. Seefeldt and D.J. Arp, personal communication). We propose to adopt the genotype hox (Eberz et al., 1989) to define the genes involved in hydrogen oxidation, and to use hoxK and hoxG to describe the genes for the small and large subunits, respectively, of the [NiFe]hydrogenase in A. vinelandii.

Fig. 2. Cloning, subcloning, and organization of hox DNA from A. vinelandii. Recombinant cosmid clone, pALM21, containing the hox structural genes was isolated from a gene bank consisting of 4600 clones, each potentially containing A. vinelandii genomic 40- to 50-kb DNA fragments produced by fractionation of Sau3A partial digests of total genomic DNA, cloned into the BamHI site of cosmid pTBE (Grosveld et al., 1982). DNA was packaged into λ particles using a commercial kit (Promega). The packaged recombinant cosmid particles were transfected into E. coli LE392, and plated onto selective media. ApR colonies were stored individually at -70°C in 20% glycerol. Colonies were screened by hybridization using the same probe and conditions described in Fig. 1. Cosmid clones were grown on nitrocellulose filters and DNA liberated, denatured and bound to the filter as described in Maniatis et al. (1982). Conditions for growth of A. vinelandii and extraction of genomic DNA were described by Robson et al. (1984). E. coli strains were grown in Luria-Bertani medium (Maniatis et al., 1982) with Ap added at 100 µg/ml when required. DNA transformations were done using the method of Dagert and Ehrlich (1979). Small- and medium-scale plasmid preparations were made using the alkaline lysis technique of Birnboim and Doly (1979). Large-scale plasmid preparations were made the same way but further purification was achieved by CsCl/ethidium bromide gradient centrifugation (Maniatis et al., 1982). The restriction map for pALM21 is shown for the following enzymes: B, BamHI; Bg, BglII; E, EcoRI; [E] indicates the vector's EcoRI site; Sc, SacI; X, XbaI. The region of hybridization to the 5.8-kb StuI insert in pCMS1 is indicated by a solid bar. An 8.9-kb XbaI-BamHI subfragment was subcloned into pTZ19R to generate pALMZ'1, which was mapped with the following additional enzymes: A, Asp718; S, SalI; Sp, SphI. The relative positions of ORFs identified in the nt sequence are shown below the restriction map of pALMZ'1. The direction of transcription is indicated by an arrow. The ORF4 sequence is incomplete and is thus indicated by dotted lines.

ORF1 (1074 nt), hoxK, is capable of encoding a polypeptide of 358 aa (39.2 kDa). It is preceded by a potential RBS, AAGGG, located 7 nt upstream from the likely start codon. The deduced aa sequence of ORF1 showed considerable homology to hydrogenase small subunits from a number of other sources (Fig. 4A). However, the derived polypeptide was considerably larger than the 31 kDa determined experimentally for the small subunit of the purified A. vinelandii [NiFe]hydrogenase (Seefeldt and Arp, 1986). Also, the N-terminal aa sequence of the purified small subunit, determined by aa sequencing (Met-Glu-Thr-Lys-Pro-Arg-Thr-Pro-Val-Leu), matched aa residues 46-55 of our deduced aa sequence. ORF1, therefore, apparently encodes an additional 45-aa N-terminal segment missing in the mature small subunit.

ORF2 (1806 nt), hoxG, is located immediately downstream from ORF1, overlapping it by 4 nt. It is preceded by the sequence AGGGGGA, located 6 nt upstream from the start codon and within the C-terminal coding region of the small subunit. This purine-rich sequence is not typical of a potential RBS, but is very similar to one identified for the A. chroococcum vnfG gene (GGGGGG), which also overlaps the C-terminal coding region of vnfD (Robson et al., 1989).
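The overlap between hoxK and hoxG described above can be checked on a short stretch of sequence. The following is a minimal sketch, not the analysis used in this work (that was done with the GCG package): the fragment `JUNCTION` was transcribed by hand from the Fig. 3 sequence around the hoxK/hoxG boundary and should be verified against GenBank M33152, and the helper names `translate` and `stop_end` are hypothetical. The codon table is deliberately partial and covers only the codons present in the fragment.

```python
# Junction between hoxK and hoxG, hand-transcribed from the Fig. 3 sequence.
# Assumption: the transcription is exact; verify against GenBank M33152.
JUNCTION = "CAGAACAAGGGGGATCGCCCATGAGCAGCCTGCCGAACGCC"

# Partial codon table: only the codons that occur in this fragment.
CODONS = {
    "CAG": "Q", "AAC": "N", "AAG": "K", "GGG": "G", "GAT": "D",
    "CGC": "R", "CCA": "P", "TGA": "*", "ATG": "M", "AGC": "S",
    "CTG": "L", "CCG": "P", "GCC": "A",
}


def translate(seq, start):
    """Translate seq from a 0-based start until a stop codon or the end."""
    protein = []
    for i in range(start, len(seq) - 2, 3):
        aa = CODONS[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def stop_end(seq, start):
    """Return the 0-based index of the last nt of the first in-frame stop codon."""
    for i in range(start, len(seq) - 2, 3):
        if CODONS[seq[i:i + 3]] == "*":
            return i + 2
    raise ValueError("no in-frame stop codon found")


hoxg_start = JUNCTION.find("ATG")        # first nt of the hoxG start codon
hoxk_stop_end = stop_end(JUNCTION, 0)    # last nt of the hoxK stop codon
overlap = hoxk_stop_end - hoxg_start + 1  # nt shared by the two genes

rbs = "AGGGGGA"                           # purine-rich sequence named in the text
spacer = hoxg_start - (JUNCTION.find(rbs) + len(rbs))

print(translate(JUNCTION, 0))            # C-terminal residues of the hoxK product
print(translate(JUNCTION, hoxg_start))   # N-terminal residues of the hoxG product
print(overlap, spacer)
```

With this fragment the two reading frames share the four nucleotides ATGA, and the purine-rich sequence ends 6 nt before the hoxG start codon, in agreement with the values given in the text.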

TABLE I

Bacterial strains and plasmids

Strain or plasmid   Relevant characteristics                                                  Reference or source

E. coli
  LE392             F- hsdR514 (rk- mk+) supE44 supF58 lacY1 galK2 galT22 metB1 trpR55 λ-     Promega Co. (Madison, WI)
  71/18             Δ(lac-pro) [F' lacIq lacZΔM15 pro+] supE                                  Messing et al. (1977)

A. vinelandii
  UW                                                                                          Bush and Wilson (1959)

Plasmids
  pTBE              ApR cos                                                                   Grosveld et al. (1982)
  pTZ19R            ApR lacZ                                                                  Mead et al. (1986)
  pCMS1             ApR A. chroococcum hydrogenase structural genes                           K. Tibelius

[Fig. 3: nt sequence (positions 1-3800) with the deduced aa sequences of hoxK, hoxG and ORF3 shown beneath the corresponding codons; GenBank accession number M33152.]

Fig. 3. The nt sequence of A. vinelandii [NiFe]hydrogenase-encoding structural genes and ORF3, and deduced aa sequence. Potential RBS are underlined. N-terminal aa residues identified by direct aa-sequence analysis are boxed. Potential start codons for hoxK, hoxG and ORF3 are located at positions 149, 1221 and 3045, respectively. The N terminus of hoxG overlaps the C terminus of hoxK by 4 nt. Sequentially overlapping subclones were generated using the DNase I random nicking method (Labiet et al., 1987). Sequencing of the reverse strand was done using a complete series of site-specific 17-mer synthetic oligos (prepared by Genetic Designs, Inc.). Double-stranded plasmid DNA template for sequencing was prepared using a modified alkaline lysis procedure (Kraft et al., 1988), further modified by ethanol precipitation of the DNA prior to RNase treatment and by including a proteinase K treatment (50 µg/ml at 55°C for 20 min) before phenol-chloroform extraction. Sequencing was performed using the dideoxy chain-termination method (Sanger et al., 1977) with deoxyadenosine 5'-[α-35S]triphosphate (600 Ci/mmol; Amersham International) and Sequenase (U.S. Biochemical Corp.) as per manufacturers' instructions. Deoxy-7-deazaguanosine triphosphate (Boehringer-Mannheim) was used routinely in place of dGTP to eliminate compressions observed with G+C-rich DNA (Barr et al., 1986; Mizusawa et al., 1986). Sequencing reactions were run out on 5.0% or 6.0% polyacrylamide wedge gels (0.25 mm-0.75 mm) containing 7 M urea with Tris-borate-EDTA buffer (Maniatis et al., 1982). Autoradiography was done at room temperature using Kodak X-OMAT AR film. Sequence data analysis was accomplished using the University of Wisconsin GCG sequence analysis software package (Deveraux et al., 1985). Protein secondary structure and surface probability predictions were determined using the methods of Chou and Fasman (1978), Garnier et al. (1978) and Emini et al. (1985).
The nt sequence data reported in this paper has the GenBank accession number M33152.

ORF2 encodes a polypeptide of 602 aa (66.8 kDa) and displays homology to the predicted aa sequences of hydrogenase large subunits from other organisms (Fig. 4B). The calculated Mr is in close agreement with the experimentally determined value of 67 kDa for the large subunit of A. vinelandii [NiFe]hydrogenase (Seefeldt and Arp, 1986). The predicted N-terminal aa sequence matched, in all but one position (residue 2), the experimentally determined

[Fig. 4, panels A-C: aligned aa sequences of AV, BJ, RC, EC, DG and DB.]

Fig. 4. Comparison of the deduced aa sequences of the [NiFe]- and [NiFeSe]hydrogenase structural genes from A. vinelandii (AV), B. japonicum (BJ) (Sayavedra-Soto et al., 1988), R. capsulatus (RC) (Leclerc et al., 1988), E. coli (EC) (Menon et al., 1990), D. gigas (DG) (Li et al., 1987; Voordouw et al., 1989), and D. baculatus (DB) (Menon et al., 1987). Regions that are conserved in all six sequences are boxed. All the conserved Cys residues are marked with arrows. (A) Small subunit comparison; (B) large subunit comparison; (C) comparison of the third gene from A. vinelandii, hyaC from E. coli and the derived aa sequence of the region 3' to the B. japonicum large-subunit structural gene.

N-terminal aa sequence (Ser-Asn-Leu-Pro-Asn-Ala-Ser/Cys-Gln-Leu). The arrangement of the structural genes in A. vinelandii is similar to that seen for all other sequenced [NiFe]- and [NiFeSe]hydrogenases, in that the small subunit gene precedes the large subunit gene. The A. vinelandii genes have overlapping stop and start codons, indicative of translational coupling (Normark et al., 1983). Analysis of the nt sequence 5' to the small subunit gene indicated that it did not conform to any typical prokaryotic promoter consensus sequences. Further work, including transcript mapping, is required to establish the extent of this operon.

(c) Comparison with other hydrogenase-encoding structural genes

Comparisons of small and large subunits of membrane-bound and periplasmic [NiFe]- and [NiFeSe]hydrogenases from various eubacteria are shown in Fig. 4. All deduced small subunits are 35-50 aa longer at the N-terminal end than the mature small subunits isolated and characterized biochemically. This implies that some form of processing is occurring. The 'mature' subunit could be an artifact: the subunit may exist as a larger form in vivo and be converted to a smaller form after cell disruption and purification. Alternatively, it is possible that the N terminus of the unprocessed form contains a potential signal peptide which could function in directing the enzyme to the membrane or periplasm, as was originally proposed by Prickril et al. (1987) for the Desulfovibrio vulgaris periplasmic [Fe-only]hydrogenase and thereafter for all other sequenced periplasmic and membrane-bound hydrogenases. These regions all share the positively charged motif, R+R+XFXK+, a sequence that is conserved even in the [Fe-only]hydrogenase from D. vulgaris (Voordouw and Brenner, 1985).
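The length of the N-terminal extension and the position of the conserved motif can be illustrated with a few lines of Python. This is a hedged sketch rather than the procedure used for Fig. 4: `DEDUCED_N_TERM` holds the first 55 residues of the deduced hoxK product as transcribed by hand from Fig. 3 (an assumption that should be checked against GenBank M33152), `MATURE_N_TERM` is the sequence determined for the purified small subunit, and the regular expression `RR.F.K` is one simple way of writing the R+R+XFXK+ motif, with `.` standing for any residue.

```python
import re

# First 55 residues of the deduced hoxK product, hand-transcribed from Fig. 3.
# Assumption: the transcription is exact; verify against GenBank M33152.
DEDUCED_N_TERM = "MSRLETFYDVMRRQGITRRSFLKYCSLTAAALGLGPAFAPRIAHAMETKPRTPVL"

# N terminus determined for the purified small subunit:
# Met-Glu-Thr-Lys-Pro-Arg-Thr-Pro-Val-Leu
MATURE_N_TERM = "METKPRTPVL"

# The 0-based index of the mature N terminus equals the number of
# residues that precede it, i.e. the length of the extension.
leader_length = DEDUCED_N_TERM.find(MATURE_N_TERM)
leader = DEDUCED_N_TERM[:leader_length]

# R+R+XFXK+ written as a regular expression ('.' matches any residue).
motif = re.search(r"RR.F.K", leader)

print(leader_length)                    # length of the N-terminal extension
print(motif.group(), motif.start() + 1)  # motif and its 1-based start residue
```

For this transcription the mature N terminus begins at residue 46, giving a 45-aa extension, and the motif is found as RRSFLK starting at residue 18 of the extension.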
The presumptive targeting signal can be divided into two domains: an N-terminal domain consisting of predominantly polar and positively charged aa residues (excluding Desulfovibrio gigas) and a C-terminal domain consisting of a stretch of uncharged, largely hydrophobic aa residues, followed by a small stretch of polar charged aa. These sequences are unusually long for bacterial signal peptides (von Heijne, 1985), and the four membrane-bound [NiFe]hydrogenase presumptive signal sequences are more similar to mitochondrial presequences and chloroplast transit peptides, which are normally involved in directing proteins to the outer membrane or the intermembrane space of these organelles (Hurt and Van Loon, 1986; Verner and Schatz, 1988). Analysis of this region by gene fusions of the N terminus to cytoplasmic 'reporter' proteins should help clarify the role of these extensions in cellular targeting.

The C termini of the small subunits of the membrane-bound [NiFe]hydrogenases contain a region of 20-22 aa, predicted to form an α-helix with low surface probability. The N-terminal and central segments of the α-helix are largely composed of hydrophobic residues, while the remaining C-terminal aa are predominantly hydrophilic. This domain is apparently missing in the periplasmic Desulfovibrio baculatus and D. gigas hydrogenases and the cytoplasmic Methanobacterium thermoautotrophicum enzyme (Reeve et al., 1989) and could be partly responsible for the location of the membrane-bound enzymes.

Our current knowledge about the structure of this class of hydrogenases is limited. Extensive aa identity exists amongst the deduced sequences of [NiFe]hydrogenases, suggesting that divergence of these aa sequences is limited by functional constraints, or that these genes have diverged relatively recently. These comparisons may throw light on the arrangement of conserved Cys residues which are potential ligands for metal-sulfur clusters present in these proteins, and are associated with, or form an integral part of, the active site. The [NiFe]hydrogenase of A. vinelandii contains an estimated 0.68 mol Ni and 6.6 mol Fe per mol of enzyme (Seefeldt and Arp, 1986). This implies that at least seven Cys residues would be required to interact with the metal clusters. The two subunits of A. vinelandii contain a total of 21 Cys; however, only 13 of these appear to be conserved in all such proteins. The small subunit has nine conserved residues, while the large subunit has four. Of particular interest are those which participate in the motif -C-X-X-C-S/T-, which occurs twice in each subunit and in both cases lie at the N and C termini. This motif is reminiscent of the spacing of Cys that, in part, coordinate [4Fe-4S] clusters in small ferredoxins (Brushi and Guerlesquin, 1988).

(d) Identification of additional ORFs

ORF3 (720 nt) starts 16 nt downstream from ORF2 and potentially encodes a 27.7-kDa, 240-aa polypeptide. A strong possible RBS, GGAGGA, is located 7 nt upstream from the start codon.
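The relationship between a candidate RBS and the start codon can be checked with a short scan. This is a sketch under stated assumptions: the spacer is counted as the number of nt between the last nt of GGAGGA and the first nt of the start codon (the text does not state how the 7 nt was counted), the set of start codons is an assumption, and the toy DNA string is invented for illustration rather than taken from the A. vinelandii sequence.

```python
RBS = "GGAGGA"
START_CODONS = ("ATG", "GTG", "TTG")  # assumption: common bacterial start codons


def rbs_spacers(dna: str, max_spacer: int = 15) -> list[tuple[int, int, str]]:
    """Return (1-based RBS position, spacer length in nt, start codon) triples.

    For each RBS occurrence, only the nearest downstream start codon within
    max_spacer nt is reported.
    """
    dna = dna.upper()
    hits = []
    pos = dna.find(RBS)
    while pos != -1:
        rbs_end = pos + len(RBS)
        for spacer in range(max_spacer + 1):
            codon = dna[rbs_end + spacer : rbs_end + spacer + 3]
            if codon in START_CODONS:
                hits.append((pos + 1, spacer, codon))
                break
        pos = dna.find(RBS, pos + 1)
    return hits


# Hypothetical sequence, invented for illustration only.
toy_dna = "CCTGGAGGATCACGCCATGAAACGT"
print(rbs_spacers(toy_dna))
```

The scan does not check reading frame or ORF length, so a hit only indicates that an RBS-like hexamer sits at a plausible distance from a potential start codon.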
The N-terminal portion of a fourth ORF has also been identified, and is located downstream from ORF3. Sequencing of this downstream region is ongoing. It is likely that the hydrogenase structural gene operon in A. vinelandii encodes at least three ORFs. Our analysis of the published B. japonicum [NiFe]hydrogenase nt sequence indicates the presence of a potential third ORF starting 9 nt downstream from the large subunit gene (Sayavedra-Soto et al., 1988). The deduced 110-aa sequence shows 55% identity to our ORF3 (Fig. 4C). We suggest that the structural gene operon in B. japonicum probably contains at least one additional gene similar to that of A. vinelandii. The E. coli [NiFe]hydrogenase hya operon encodes four ORFs in addition to the structural genes (Menon et al., 1990). The deduced aa sequence of hyaC (ORF3) displays 51% identity with ORF3 from the A. vinelandii hox structural gene operon (Fig. 4C).

ORF3 encodes a hydrophobic polypeptide containing 53% nonpolar and 11% aromatic residues. Four major hydrophobic domains, each spanning 20-26 aa, are evenly distributed across the protein and are predicted to have very low surface probability. These properties are indicative of a membrane protein. The majority of the proteins identified on the basis of similarity in a computer search of the NBRF data base are integral membrane proteins. Many of them form part of multi-protein membrane complexes and/or are receptors, electron carriers or transport proteins. It is possible that ORF3 encodes a membrane protein that has a role in anchoring, proton transport, or as an electron carrier.
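Composition figures of this kind can be computed directly from a deduced sequence. The sketch below is illustrative only: the text does not state which residues were counted as nonpolar or aromatic, so the two sets are assumptions (one common grouping), they overlap at Phe and Trp, and the toy sequence is hypothetical.

```python
# Assumed residue classes; the paper does not define them.
NONPOLAR = set("AVLIMPGFW")
AROMATIC = set("FWY")


def composition(protein: str) -> dict[str, float]:
    """Return the percentage of nonpolar and aromatic residues in a sequence."""
    seq = protein.upper()
    if not seq:
        raise ValueError("empty sequence")
    n = len(seq)
    return {
        "nonpolar_pct": 100.0 * sum(aa in NONPOLAR for aa in seq) / n,
        "aromatic_pct": 100.0 * sum(aa in AROMATIC for aa in seq) / n,
    }


# Hypothetical 20-residue sequence, invented for illustration only.
toy_protein = "MLLAVFWGYKDESTRNQHCA"
print(composition(toy_protein))
```

Different class definitions (for example, counting Cys or Tyr as nonpolar) shift the percentages by a few points, so any comparison with the 53% and 11% values quoted above depends on matching the original grouping.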

ACKNOWLEDGEMENTS

We thank K. Tibelius for plasmid pCMS 1, L.C. Seefeldt, D.J. Arp, B. Friedrich, N.K. Menon, and A.E. Przybyla for sharing information prior to publication, and Jack Chen and L.C. Seefeldt for critically reviewing the manuscript. This work was supported by National Science Foundation grant DMB 8607528.

REFERENCES

Adams, M.W.W., Mortenson, L.E. and Chen, J.S.: Hydrogenase. Biochim. Biophys. Acta 594 (1981) 105-176.
Aguilar, O.M., Yates, M.G. and Postgate, J.R.: The beneficial effect of hydrogenase in Azotobacter chroococcum under nitrogen fixing, carbon limiting conditions in continuous and batch cultures. J. Gen. Microbiol. 131 (1985) 3141-3145.
Barr, J.P., Thayer, R.M., Laybourn, P., Najarian, R.C., Seela, F. and Tolan, D.R.: 7-Deaza-2'-deoxyguanosine-5'-triphosphate: enhanced resolution in M13 dideoxy sequencing. Biotechniques 4 (1986) 428-432.
Birnboim, H.C. and Doly, J.: A rapid alkaline extraction procedure for screening recombinant plasmid DNA. Nucleic Acids Res. 7 (1979) 1513-1525.
Boursier, P., Hanus, F.J., Papen, H., Becker, M.M., Russell, S.A. and Evans, H.J.: Selenium increases hydrogenase expression in autotrophically cultured Bradyrhizobium japonicum and is a constituent of the purified enzyme. J. Bacteriol. 170 (1988) 5594-5600.
Brushi, M. and Guerlesquin, F.: Structure, function and evolution of bacterial ferredoxins. FEMS Microbiol. Rev. 54 (1988) 155-176.
Bush, J.A. and Wilson, P.W.: A non-gummy chromogenic strain of Azotobacter vinelandii. Nature 184 (1959) 381.
Chou, P.Y. and Fasman, G.D.: Prediction of the secondary structure of proteins from their amino acid sequence. Adv. Enzymol. 47 (1978) 45-148.
Dagert, M. and Erlich, S.D.: Prolonged incubation in calcium chloride improves the competence of Escherichia coli cells. Gene 6 (1979) 23-28.
Deveraux, J., Haeberli, P. and Smithies, O.: A comprehensive set of sequence analysis programs for the VAX. Nucleic Acids Res. 12 (1985) 387-395.

Doyle, C.M. and Arp, D.J.: Regulation of H2 oxidation activity and hydrogenase protein levels by H2, O2, and C substrates in Alcaligenes latus. J. Bacteriol. 169 (1987) 4463-4468.
Eberz, G., Eitinger, T. and Friedrich, B.: Genetic determinants of a nickel specific transport system are part of the plasmid encoded hydrogenase gene cluster in Alcaligenes eutrophus. J. Bacteriol. 171 (1989) 1340-1345.
Eisenbrenner, G. and Evans, H.J.: Aspects of hydrogen metabolism in nitrogen fixing legumes and other plant-microbe associations. Annu. Rev. Plant Physiol. 34 (1983) 105-136.
Emini, E.A., Hughes, J.V., Perlow, D.S. and Boger, J.: Induction of hepatitis A virus-neutralizing antibody by a virus specific synthetic peptide. J. Virol. 55 (1985) 836-839.
Evans, H.J., Purohit, K., Cantrell, M.A., Eisenbrenner, G. and Russell, S.A.: Hydrogen losses and hydrogenases in nitrogen fixing organisms. In Gibson, A.H. and Newton, W.E. (Eds.), Current Perspectives in Nitrogen Fixation. Australian Academy of Sciences, Canberra, 1981, pp. 84-97.
Friedrich, C.G., Bowien, B. and Friedrich, B.: Formate and oxalate metabolism in Alcaligenes eutrophus. J. Gen. Microbiol. 115 (1979) 185-192.
Garnier, J., Ogusthorpe, D.J. and Robson, B.: Analysis of the accuracy and implications of simple methods for predicting secondary structure of globular proteins. J. Mol. Biol. 120 (1978) 97-120.
Gogotov, I.N.: Hydrogenases of phototrophic microorganisms. Biochimie 68 (1986) 181-187.
Grosveld, F.G., Lund, T., Murray, E.J., Mellor, A.L., Dahl, H.H.M. and Flavell, R.A.: The construction of cosmid libraries which can be used to transform eukaryotic cells. Nucleic Acids Res. 10 (1982) 6715-6732.
Hanus, F.J., Maier, R.J. and Evans, H.J.: Autotrophic growth of H2-uptake positive strains of Rhizobium japonicum in an atmosphere supplied with hydrogen. Proc. Natl. Acad. Sci. USA 76 (1979) 1788-1792.
Harker, A.R., Xu, L.-S., Hanus, F.J. and Evans, H.J.: Some properties of the nickel-containing hydrogenase of chemolithotrophically grown Rhizobium japonicum. J. Bacteriol. 159 (1984) 850-856.
Haugland, R.A., Cantrell, M.A., Beaty, J.S., Hanus, F.J., Russell, S.A. and Evans, H.J.: Characterization of Rhizobium japonicum hydrogen uptake genes. J. Bacteriol. 159 (1984) 1006-1012.
Hurt, E.C. and Van Loon, A.P.G.M.: How proteins find mitochondria and intermitochondrial compartments. Trends Biochem. Sci. 11 (1986) 204-207.
Johnson, D.A., Gautsch, J.W., Sportsman, J.R. and Elder, J.H.: Improved technique utilizing nonfat dried milk for analysis of proteins and nucleic acids transferred to nitrocellulose. Gene Anal. Tech. 1 (1984) 3-8.
Kortluke, C., Hogrefe, C., Eberz, G., Pühler, A. and Friedrich, B.: Genes of lithoautotrophic metabolism are clustered on the megaplasmid pHG1 in Alcaligenes eutrophus. Mol. Gen. Genet. 210 (1987) 122-128.
Kraft, R., Tardiff, J., Krauter, K.S. and Leinwand, L.A.: Using miniprep plasmid DNA for sequencing double stranded DNA templates with Sequenase. Biotechniques 6 (1988) 544-546.
Labiet, S., Lehrach, H. and Goody, R.S.: DNA sequencing using α-thiodeoxynucleotides. Methods Enzymol. 155 (1987) 166-177.
Leclerc, M., Colbeau, A., Cauvin, B. and Vignais, M.: Cloning and sequencing of the genes encoding the large and small subunits of H2 uptake hydrogenase (hup) of Rhodobacter capsulatus. Mol. Gen. Genet. 214 (1988) 97-107.
Li, C., Peck Jr., H.D., LeGall, J. and Przybyla, A.E.: Cloning, characterization, and sequencing of the genes encoding the large and small subunits of the periplasmic [NiFe]hydrogenase of Desulfovibrio gigas. DNA 6 (1987) 539-551.

Madigan, M.T. and Gest, H.: Growth of the photosynthetic bacterium Rhodopseudomonas capsulatus chemolithotrophically in darkness with H2 as the energy source. J. Bacteriol. 137 (1979) 524-530.
Maniatis, T., Fritsch, E.F. and Sambrook, J.: Molecular Cloning. A Laboratory Manual. Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 1982.
Mead, D.A., Szczesna-Skorupa, E. and Kemper, B.: Single stranded DNA 'blue' T7 promoter plasmids: a versatile tandem promoter system for cloning and engineering. Prot. Eng. 1 (1986) 67-74.
Menon, N.K., Peck Jr., H.D., LeGall, J. and Przybyla, A.E.: Cloning and sequencing the genes encoding the large and small subunits of the periplasmic [NiFeSe]hydrogenase of Desulfovibrio baculatus. J. Bacteriol. 169 (1987) 5401-5407 [Erratum 170 (1988) 4429].
Menon, N.K., Robbins, J., Peck Jr., H.D., Chatelus, C.Y., Choi, E.-S. and Przybyla, A.E.: Cloning and sequencing of a putative Escherichia coli [NiFe]hydrogenase operon containing six open reading frames. J. Bacteriol. 172 (1990) 1969-1977.
Messing, J., Gronenborn, B., Müller-Hill, B. and Hofschneider, P.H.: Filamentous coliphage M13 as a cloning vehicle: insertion of a HindIII fragment of the lac regulatory region in M13 replicative form in vitro. Proc. Natl. Acad. Sci. USA 74 (1977) 3642-3646.
Mizusawa, S., Nishumura, S. and Seela, F.: Improvement of the dideoxy chain termination method of DNA sequencing by using deoxy-7-deazaguanosine triphosphate in place of dGTP. Nucleic Acids Res. 14 (1986) 1319-1324.
Normark, S., Bergström, S., Edlund, T., Grundström, T., Jaurin, B., Lindberg, F.P. and Olsson, O.: Overlapping genes. Annu. Rev. Genet. 17 (1983) 499-525.
O'Brian, M.R. and Maier, R.J.: Hydrogen metabolism in Rhizobium: energetics, regulation, enzymology and genetics. Adv. Microbial Physiol. 29 (1988) 1-52.
Prickril, B.C., He, S.-H., Li, C., Menon, N., Choi, E.-S., Przybyla, A.E., DerVartanian, D.V., Peck Jr., H.D., Fauque, G., LeGall, J., Teixeira, M., Moura, I., Moura, J.J.G., Patil, D. and Huynh, B.H.: Identification of three classes of hydrogenases in the genus Desulfovibrio. Biochem. Biophys. Res. Commun. 149 (1987) 369-377.
Reeve, J.N., Beckler, G.S., Cram, D.S., Hamilton, P.T., Brown, J.W., Krzycki, J.A., Kolodziej, A.F., Alex, L., Orme-Johnson, W.H. and Walsh, C.T.: A hydrogenase-linked gene in Methanobacterium thermoautotrophicum strain ΔH encodes a polyferredoxin. Proc. Natl. Acad. Sci. USA 86 (1989) 3031-3035.
Rigby, P.W.J., Dieckmann, M., Rhodes, C. and Berg, P.: Labelling deoxyribonucleic acid to a high specific activity in vitro by nick translation with DNA polymerase I. J. Mol. Biol. 113 (1977) 237-251.
Robson, R.L., Chesshyre, J.A., Wheeler, C., Jones, R., Woodley, P.R. and Postgate, J.R.: Genomic size and complexity in Azotobacter chroococcum. J. Gen. Microbiol. 130 (1984) 1603-1612.
Robson, R.L., Woodley, P.R., Pau, R.N. and Eady, R.R.: Structural genes for the vanadium nitrogenase from Azotobacter chroococcum. EMBO J. 8 (1989) 1217-1224.

Sampaio, M.-J.A.M., deSilvia, E.M.R., Dobereiner, J., Yates, M.G. and Pedrosa, F.O.: Autotrophy and methylotrophy in Derxia gummosa, Azospirillum brasilense, and A. lipoferum. In Gibson, A.H. and Newton, W.E. (Eds.), Current Perspectives in Nitrogen Fixation. Elsevier, New York, 1981, 444 pp.
Sanger, F., Nicklen, S. and Coulson, A.R.: DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74 (1977) 5463-5467.
Sayavedra-Soto, L.A., Powell, G.A., Evans, H.J. and Morris, R.O.: Nucleotide sequence of the genetic loci encoding subunits of Bradyrhizobium japonicum uptake hydrogenase. Proc. Natl. Acad. Sci. USA 85 (1988) 8395-8399.
Seefeldt, L.C. and Arp, D.J.: Purification to homogeneity of Azotobacter vinelandii hydrogenase: a nickel and iron containing αβ dimer. Biochimie 68 (1986) 25-34.
Seefeldt, L.C., McCollum, L.C., Doyle, C.M. and Arp, D.J.: Immunological and molecular evidence for a membrane-bound, dimeric hydrogenase in Rhodopseudomonas capsulatus. Biochim. Biophys. Acta 914 (1987) 299-303.
Stormo, G.D.: Translation initiation. In Reznikoff, W. and Gold, L. (Eds.), Maximizing Gene Expression. Butterworth, Boston, 1986, pp. 195-223.
Thuring, R.W.J., Sander, J.P.M. and Borst, P.: A freeze-squeeze method for recovering DNA from agarose gels. Anal. Biochem. 66 (1973) 213-220.
Tibelius, K.H., Robson, R.L. and Yates, M.G.: Cloning and characterization of hydrogenase genes from Azotobacter chroococcum. Mol. Gen. Genet. 206 (1987) 285-290.
Verner, K. and Schatz, G.: Protein translocation across membranes. Science 241 (1988) 1307-1313.
von Heijne, G.: Signal sequences. The limits of variation. J. Mol. Biol. 184 (1985) 99-105.
Voordouw, G. and Brenner, S.: Nucleotide sequence of the gene encoding the hydrogenase from Desulfovibrio vulgaris (Hildenborough). Eur. J. Biochem. 148 (1985) 515-520.
Voordouw, G., Menon, N.K., LeGall, J., Choi, E.-S., Peck Jr., H.D. and Przybyla, A.E.: Analysis and comparison of nucleotide sequences encoding the genes for [NiFe] and [NiFeSe]hydrogenases from Desulfovibrio gigas and Desulfovibrio baculatus. J. Bacteriol. 171 (1989) 2894-2899.
Wong, T.-Y. and Maier, R.J.: H2-dependent mixotrophic growth of N2-fixing Azotobacter vinelandii. J. Bacteriol. 163 (1985) 528-533.
Xu, H.-W., Love, J., Borghese, R. and Wall, J.D.: Identification and isolation of genes essential for H2-oxidation in Rhodobacter capsulatus. J. Bacteriol. 171 (1989) 714-721.
Yates, M.G. and Robson, R.L.: Mutants of Azotobacter chroococcum defective in hydrogenase activity. J. Gen. Microbiol. 131 (1985) 1459-1466.
