Plant Molecular Biology 18: 557-566, 1992. © 1992 Kluwer Academic Publishers. Printed in Belgium.


Cloning of a cDNA for rape chloroplast 3-isopropylmalate dehydrogenase by genetic complementation in yeast

Mats Ellerström 1, Lars-Göran Josefsson 1, Lars Rask 1 and Hans Ronne 2

1 Department of Cell Research, Swedish University of Agricultural Sciences, Box 7055, S-750 07 Uppsala, Sweden; 2 Ludwig Institute for Cancer Research, Uppsala Branch, Box 595, S-751 24 Uppsala, Sweden

Received 12 June 1991; accepted in revised form 16 October 1991

Key words: Brassica napus, chloroplast, 3-isopropylmalate dehydrogenase, molecular evolution, Saccharomyces cerevisiae

Abstract

Both insect and mammalian genes have previously been cloned by genetic complementation in yeast. In the present report, we show that the method can also be applied to plants. Thus, we have cloned a rape cDNA for 3-isopropylmalate dehydrogenase (IMDH) by complementation of a yeast leu2 mutation. The cDNA encodes a 52 kDa protein which has a putative chloroplast transit peptide. The protein made in vitro is imported into chloroplasts, concomitantly with a proteolytic cleavage. We conclude that the rape cDNA encodes a chloroplast IMDH. However, Southern analysis revealed that the corresponding gene is nuclear. In a comparison of IMDH sequences from various species, we found that the rape IMDH is more similar to bacterial than to eukaryotic proteins. This suggests that the rape gene could be of chloroplast origin, but has moved to the nucleus during evolution.

Introduction

Two main strategies have so far been used for cloning plant homologues of genes from other organisms. First, genes have been isolated by low-stringency hybridization, using probes from other eukaryotes or oligonucleotides derived from conserved protein sequences. This is a powerful method for closely related species, but its versatility decreases with increasing evolutionary distance. Animal probes hybridize to plant DNA only if the sequences have been highly conserved during evolution. A second cloning method is screening of expression libraries for conserved epitopes with heterologous antibodies [19]. This method is limited by the fact that conserved epitopes are frequently non-immunogenic, since they are likely to be conserved also in the animal used for immunization. A third method, largely independent of both DNA and protein structure, is cloning by genetic complementation. The method relies on similarities in function, and has been widely used in both prokaryotes and unicellular eukaryotes [11, 34, 49]. To date, both insect and mammalian genes have been cloned by complementation in yeast [18, 30, 42]. We have now applied this method to a plant, Brassica napus (rape).

Our objective was to test whether plant cDNAs can be cloned by complementation in yeast, and also more specifically to search for DNA-binding proteins that regulate the expression of napin, a rape storage protein [12, 22]. Our strategy was based on the finding that fusions between a DNA-binding protein and the yeast activator GAL4 can activate transcription of a yeast gene that has a target site for the DNA-binding protein [7]. We therefore first made yeast strains in which the LEU2 promoter was replaced with the napin promoter. These strains are leucine-auxotrophic, though some very limited growth occurs after prolonged incubation on leucine-less plates. We conclude that the napin promoter functions very poorly in yeast. We then made a rape cDNA library in a yeast vector where the cDNA is fused to the GAL4 gene. This library was screened in the above yeast strains for plasmids able to support growth without leucine. We looked for two kinds of plasmids. First, cDNAs encoding rape 3-isopropylmalate dehydrogenase (IMDH) should complement the leu2 deficiency, if the plant enzyme can function in yeast. Second, we reasoned that cDNAs for rape proteins that bind to the napin promoter might be able to activate the hybrid LEU2 gene, if fused in frame to GAL4. No plasmids encoding trans-acting factors were found, but one cDNA for a rape IMDH was cloned, as described below.

The nucleotide sequence data reported will appear in the EMBL, GenBank and DDBJ Nucleotide Sequence Databases under the accession number X59970.

Materials and methods

Yeast strains and plasmids

The yeast strains used were isogenic to the MATa SUC2 ade2-1 can1-100 his3-11,15 leu2-3,112 trp1-1 ura3-1 strain W303-1A [51]. Strains with substitutions in the LEU2 promoter were made from U457, a HIS3 LEU2 SUP53-a derivative of W303-1A, using the pop-in/pop-out method [6]. Thus, a 251 bp Fok I-Sau3A I fragment of the napA promoter [22] was inserted between the Bgl II and Hpa I sites in the LEU2 upstream region. Two yeast strains were constructed this way. Strain H135 has three tandem forward copies of the napA fragment, whereas H139 has two tandem copies in the reverse orientation.

The cloning vector pHR50 was designed to allow expression of cDNA products from the GAPDH promoter, either as full-length proteins or as fusions to the carboxy-terminal part of GAL4. It was made in three steps from pGAL4, a plasmid which has the Bam HI-Hind III GAL4 fragment of p 1A subcloned into pUC9 [21]. First, the URA3 Hind III fragment was cloned into the Hind III site of pGAL4, producing pHR30. This plasmid was cut with Eco RI, partially digested with Pvu I, and then ligated to a 1900 bp Eco RI-Pvu I fragment carrying the 2 µm plasmid origin of replication. The resulting plasmid, pHR43, was cut with Eco RI and Bcl I and then ligated to the 400 bp Eco RI-Bam HI GAPDH promoter fragment of pYE8 [41], producing pHR50.

Construction and screening of a rape cDNA library

Total RNA [2] was prepared from immature seeds of the amphidiploid strain 20516-K, a derivative of Brassica napus cv. Svalöfs Karat. The mRNA was enriched on an oligo(dT) column [32], and cDNA was synthesized as described by Huynh et al. [19]. A mixture of Eco RI and Xho I linkers was ligated to the cDNA, to allow high-efficiency cloning into double-digested plasmids [29]. The cloning vector pHR50 was digested with Eco RI and Xho I, which removes the 5' end of GAL4, encoding the DNA-binding domain. The plasmid was separated from this fragment on a 5-20% sucrose gradient [32], and then ligated to the cDNA. The ligated DNA was transformed into Escherichia coli, producing a library of 100 000 clones. This library was amplified, and then transformed into yeast strains H135 and H139. Approximately 200 000 transformants were selected on uracil-less glucose media. The plates were replicated to leucine-less galactose media, and screened for prototrophic colonies. Such colonies were then tested for co-segregational loss of the URA3 marker and leucine prototrophy, to test whether the Leu+ phenotype was due to the plasmid. Two colonies from strain H139 displayed plasmid-dependent leucine prototrophy. Both colonies contained identical plasmids, with a cDNA insert encoding a rape IMDH.

Protein synthesis and chloroplast import

The rape cDNA was transcribed and the resulting RNA was translated in vitro as previously described [37]. The ³⁵S-labelled protein was then incubated with pea chloroplasts [5] isolated from 10-day-old plants of the cv. Timo (Svalöf AB). To reduce the starch content, the plants were first held in the dark for 18 h, followed by 1 h in the light prior to harvesting. Chloroplasts were purified on a 40/80% discontinuous Percoll gradient. Import was carried out as described by Smeekens et al. [47], but in the presence of 10 mM ATP and 1 mM methionine. Chloroplasts containing 100 µg of chlorophyll were incubated in the light for 30 min at 20 °C with 10⁶ dpm of ³⁵S-labelled protein, in a final volume of 300 µl. The chloroplasts were then repurified [9], and the protein was analysed by SDS-PAGE [28]. Prior to electrophoresis, chloroplasts were added to an equal amount in each sample, to compensate for band distortions caused by abundant chloroplast proteins.

Other methods

The methods used for yeast genetics and molecular cloning have been described [36, 37]. The cDNA insert was subcloned into pUC118 and pUC119 for sequencing [52]. The cloning junctions were sequenced on both strands in the original plasmid, using specific oligonucleotide primers. Southern blots were performed according to Mariani et al. [33]. As unspecific competitor DNA, we used a mixture of pUC19, phage lambda, and salmon sperm DNA.

Phylogenetic trees were computed from distance matrices using either the method of Fitch and Margoliash [14] as modified by Felsenstein [13], or the neighbour-joining method of Saitou and Nei [40]. The proportion of amino acid substitutions, p, was determined in pairwise comparison of aligned sequences, ignoring those positions where either sequence had a deletion. The p scores were then converted into evolutionary distances, d, using the formula of Kimura, $d = -\ln(1 - p - 0.2p^{2})$, to correct for multiple and parallel mutations [25]. These distances were used as input data for the FITCH and NJTREE programs [13, 40].
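The distance calculation described above can be illustrated with a minimal Python sketch. It is not the code used in this study: the function names `p_distance` and `kimura_distance`, the gap symbol `-`, and the toy sequences are assumptions made purely for illustration. The sketch counts substitutions over ungapped alignment columns and then applies the Kimura correction quoted in the text.

```python
import math


def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of differing residues between two aligned sequences.

    Columns where either sequence has a gap are ignored, as described
    in the text. The gap symbol is an assumption of this sketch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # skip positions with a deletion in either sequence
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return differing / compared


def kimura_distance(p):
    """Kimura's correction: d = -ln(1 - p - 0.2 * p**2)."""
    arg = 1.0 - p - 0.2 * p * p
    if arg <= 0.0:
        # The logarithm is undefined once p exceeds roughly 0.854.
        raise ValueError("p is too large for the Kimura correction")
    return -math.log(arg)


if __name__ == "__main__":
    # Toy sequences, invented for illustration only.
    p = p_distance("MKV-LTA", "MRVALSA")
    print(p, kimura_distance(p))
```

Because the correction grows faster than p, it stretches the distances between distantly related sequences more than those between close relatives, which is the intended compensation for multiple substitutions at the same site.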

Results

Our screen for cDNAs complementing the leucine auxotrophy of yeast strains with a napin promoter in front of LEU2 produced one plasmid, which encodes a rape IMDH (see below). This demonstrates the general feasibility of cloning plant genes by genetic complementation in yeast. However, the second goal of our screen was not achieved, as we failed to find cDNAs for trans-acting factors that bind to the napin promoter. It should be noted that cDNAs for transcription factors are expected to be rare. Moreover, only one out of six cDNAs will produce an in-frame fusion to the GAL4 gene. It is therefore likely that a very large number of plasmids must be screened for our strategy to be successful. For practical reasons, our screen was limited to 200 000 colonies. It is conceivable that further developments in yeast methodology could make screening of 10⁶-10⁷ plasmids possible, which might be required for a successful application of our cloning strategy.

The complementing plasmid has an insert of 1428 bp, with an open reading frame of 406 codons (Fig. 1). The rape origin of the insert was confirmed by Southern blotting. By the same method, the subcellular location of the gene was investigated. Several strongly hybridizing bands were present in total rape DNA digested with Eco RI, but absent from purified chloroplast and mitochondrial DNA (Fig. 2). We conclude that the gene is nuclear. The presence of several bands was unexpected, since there are no Eco RI sites within the probe. It could mean that the hybridizing region contains introns, or that more than one gene is present in the rape genome. The latter explanation seems probable, since Brassica napus is a hybrid between B. campestris and B. oleracea.

[Figure 1: nucleotide sequence and deduced amino acid sequence, not reproduced.]

Fig. 1. Sequence of the rape IMDH cDNA and flanking parts of the cloning vector pHR50. The Eco RI and Xho I sites at the cloning junctions, the GAPDH TATA box, and a polyadenylation signal in the cDNA are underlined. Also underlined is the putative chloroplast transit peptide in the IMDH protein sequence. Nucleotides are numbered from the 5' end of the cDNA.

Fig. 2. Southern blot with the cloned rape IMDH cDNA. Rape DNA was digested with Eco RI, separated on a 0.7% agarose gel, and hybridized to a 344 bp internal Sty I fragment from the IMDH cDNA. Lane T, 10 µg of total rape DNA; lane C, purified chloroplast DNA; lane M, purified mitochondrial DNA. The amount of organelle DNA in lanes C and M is comparable to the amount of organelle DNA in lane T.

The protein encoded by the cDNA shows extensive similarity to IMDH from yeast and bacteria, confirming that it is a rape homologue of LEU2 (Fig. 3). However, the predicted rape protein has a protruding amino terminus of 34 residues, when compared to the other IMDH sequences. This amino-terminal extension is rich in basic and hydroxylated amino acids, but lacks acidic residues, which suggests that it could be an organelle-targeting signal. In fact, the sequence has the characteristics of a chloroplast transit peptide [43]. Such peptides almost invariably have one or several alanines following the first methionine, a feature which is also found in the rape protein. Moreover, the sequence V/I-R-A/C↓A is frequently found at the cleavage site in

[Figure 3: multiple sequence alignment, not reproduced.]

Fig. 3. Alignment of IMDH sequences. Residues that are conserved in at least six species are shown below the alignment (CON), with those conserved in at least ten being underlined. The brackets mark variable regions not included in the phylogenetic analysis. The published Thermus sequence [23] has a frameshifted region of 63 nucleotides, as compared to the other sequences. The sequence shown in the alignment was derived from the shifted frame in this region. The beginning and end of the frameshift are marked by an X. The stars indicate three insertions that are unique to the eukaryotic proteins. Abbreviations: SCE, Saccharomyces cerevisiae; CUT, Candida utilis; CMA, Candida maltosa; YLI, Yarrowia lipolytica; SPO, Schizosaccharomyces pombe; BNA, Brassica napus; BSU, Bacillus subtilis; BCA, Bacillus caldotenax; BCO, Bacillus coagulans; TTH, Thermus thermophilus; ATU, Agrobacterium tumefaciens.

chloroplast transit peptides [15]. The extension of the rape IMDH ends with the sequence V-R-C-A, which perfectly matches this motif (Fig. 1). As is seen in Fig. 4, the amino-terminal extension also has some sequence similarity to chloroplast transit peptides, in particular that of ribulose-1,5-bisphosphate carboxylase. Taken together, these similarities strongly suggest that the rape cDNA encodes a chloroplast protein.

To confirm that the rape protein has a chloroplast transit peptide, we carried out an organelle import experiment. The cDNA was transcribed and translated in vitro, and the resulting protein was incubated with isolated pea chloroplasts. As

[Figure 4: alignment of transit peptide sequences, not reproduced.]

Fig. 4. Alignment of the rape IMDH amino terminus to the chloroplast transit peptides in the light-harvesting chlorophyll a/b-binding protein of photosystem II (LHCP-II) and the ribulose-1,5-bisphosphate carboxylase small subunit (RUBISCO) from Arabidopsis thaliana [26, 31]. Identical residues are enclosed within boxes. The dots mark an insertion of 20 residues in the RUBISCO sequence.

is shown in Fig. 5, the protein is imported into the chloroplasts, where it is protected from protease treatment. Concomitant with this, a processing occurs which reduces the apparent molecular mass from 52 kDa to 47 kDa. This is close to the expected change in molecular weight that would result from cleavage of the putative transit peptide. We conclude that the cDNA encodes a chloroplast protein.

Genes encoding IMDH have so far been sequenced in five bacteria [20, 23, 44, 45, 48] and five fungi [1, 10, 17, 24, 50]. The IMDH sequences are highly homologous and can be aligned with a

Fig. 5. Chloroplast import of rape IMDH. The cDNA was transcribed and translated in vitro, and the resulting ³⁵S-labelled protein was incubated with isolated pea chloroplasts. The chloroplasts were subsequently purified and their protein content was analyzed by SDS-PAGE. Lane A, IMDH prior to incubation with chloroplasts; lane B, IMDH in purified chloroplasts; lane C, IMDH in purified chloroplasts treated with thermolysin. The numbers are molecular masses of marker proteins, in kDa.

small number of insertions and deletions. This makes them suitable for a phylogenetic analysis. For this purpose, we aligned the sequences and counted the number of amino acid substitutions in pairwise comparisons. To ensure a correct alignment, the variable regions marked by brackets in Fig. 3 were excluded. This comparison shows that the five fungal sequences are related to each other, as are the five bacterial proteins. Interestingly, the rape sequence is more similar to the Bacillus proteins (60-62% identity) than to the fungal proteins (50-53% identity). This similarity is also reflected in the pattern of insertions and deletions. Thus, eukaryotic and bacterial sequences differ in three positions where all eukaryotic proteins have insertions (Fig. 3). In all three positions, the rape IMDH sequence lacks the insertion, and is thus more similar to the bacterial proteins.

To further elucidate the evolutionary position of the rape protein, the similarity scores were converted into evolutionary distances and used to construct phylogenetic trees (see Materials and methods). Figure 6 shows a tree obtained using the neighbour-joining method [40]. The Fitch-Margoliash method [14] produced a tree with identical topology and almost the same branch lengths. Within the tree, the eukaryotic sequences form a subtree with well-resolved branching order, where Candida utilis is closest to Saccharomyces, and Schizosaccharomyces most distant from the latter. The same topology was obtained when different subsets of the alignment were used for tree construction, indicating that the branching order of the eukaryotes is statistically significant. In the bacterial subtree, the three genera

[Figure 6: phylogenetic tree. Taxa, from top to bottom: Saccharomyces cerevisiae, Candida utilis, Candida maltosa, Yarrowia lipolytica, Schizosaccharomyces pombe, Bacillus subtilis, Bacillus caldotenax, Bacillus coagulans, Brassica napus (chloroplast), Agrobacterium tumefaciens, Thermus thermophilus. Scale bar: 0.1.]

Fig. 6. Phylogenetic tree for IMDH sequences. The tree was computed from the alignment in Fig. 3, using the neighbour-joining method [40]. The root was put arbitrarily midway between the two lowest branch points. Horizontal lengths are proportional to evolutionary distances [25]. The bar corresponds to a distance, d, of 10%.

Agrobacterium, Bacillus and Thermus diverge from each other close to the root. The branching order of these three genera could not be conclusively determined, as different topologies were obtained with different subsets of the alignment. The tree confirms that the rape protein is more closely related to bacterial than to eukaryotic proteins. It forms a separate deep branch among the bacterial sequences, distinct from Agrobacterium, Thermus and Bacillus.

Discussion

We have cloned a cDNA for rape chloroplast IMDH by complementation in yeast. Genetic complementation has been used extensively for cloning in unicellular organisms. Thus, the finding that the yeast HIS3 gene can complement the hisB mutation of Escherichia coli provided the starting point for molecular genetics in yeast [49]. More recently, both insect and mammalian genes have been cloned by complementation of yeast mutations [18, 30, 42]. Our extension of this method to plants provides an alternative to classical cloning strategies which has the advantage that it is independent of structural similarities. This is important, since plants are only distantly related to other eukaryotes and therefore rather divergent in DNA and protein structure. The large number of yeast mutations available will make it possible to clone plant genes involved in all aspects of metabolism and cell physiology.

IMDH is the second-to-last enzyme in the leucine biosynthetic pathway. In yeast, IMDH differs from most other enzymes in this pathway in that it resides in the cytosol rather than in the mitochondria [39]. In plants, little is known about the synthesis of the branched-chain amino acids. Acetolactate synthase, the first enzyme in the pathway, is present in the chloroplasts [35], but it is not known where in the plant cell IMDH is located. However, chloroplasts possess a high degree of metabolic autonomy, and it is therefore likely that IMDH is found in chloroplasts as well as in the cytosol. Cytosolic and chloroplastic forms of the same enzyme are frequently encoded by different nuclear genes [16]. Since our cloned cDNA encodes a chloroplast protein, it is conceivable that a second rape gene may exist, encoding a cytosolic IMDH.

The phylogenetic tree in Fig. 6 is consistent with findings, based on rRNA sequence data, that the fission yeast Schizosaccharomyces is very distantly related to the budding yeasts [8, 38]. Thus, the divergence of fission yeasts from budding yeasts is thought to be quite ancient, probably contemporary with the split between fishes and land animals [38]. The four budding yeasts are much closer to each other (Fig. 6). In particular, Saccharomyces and Candida utilis are closely related, with 88% identical sequences. It should be pointed out that current yeast taxonomy does not reflect phylogenetic relationships, which are largely unknown [4, 27]. Based on sequence similarity, it would be reasonable to assign Candida utilis to the genus Saccharomyces, Candida maltosa to a different genus in the same family, and Yarrowia lipolytica to a separate family.

Among the bacteria, the three Bacillus species are clearly interrelated, but the genera Bacillus, Thermus and Agrobacterium diverge from each other close to the root of the tree, and their branching order could not be conclusively determined (Fig. 6). This is the expected result for these three genera. Agrobacterium belongs to the same phylum as the purple bacteria, Bacillus to the Gram-positive bacteria, and Thermus to a third phylum, Deinococcus and its relatives [53]. Studies on 16S rRNA have shown that these bacterial phyla diverged at an early stage in evolution, with branch points too close together to allow a precise determination of branching orders [53].

The rape chloroplast IMDH is more similar to bacterial than to fungal proteins (Fig. 6). A likely explanation for this would be that the rape gene originally came from the chloroplast genome, which is evolutionarily related to the bacteria [53]. It has previously been noted that some nuclear-encoded chloroplast proteins are similar to bacterial proteins, and it was therefore suggested that the corresponding genes may have moved from the chloroplast to the nucleus during evolution [46]. This idea gained support from the finding that the tufA gene is chloroplastic in algae but nuclear in higher plants [3]. It is therefore conceivable that the gene for chloroplast IMDH has also moved from the chloroplast to the nucleus. The rape sequence is not closely related to any of the three bacterial phyla represented in Fig. 6. This is consistent with a chloroplast origin, since the chloroplasts derive from a separate bacterial phylum, the cyanobacteria [53].

The assumption that the rape IMDH gene originally came from the chloroplast genome could be confirmed in two ways. First, its sequence should be particularly similar to IMDH from cyanobacteria. Second, if another rape gene encoding a cytosolic form of the enzyme exists, this protein should be more similar to the fungal proteins.

Acknowledgements

We thank Monika Carlberg and Anna Karin Tibell for excellent technical assistance, Rodney Rothstein for gifts of strains and plasmids, and Maria Landgren for a gift of chloroplast DNA. We are also grateful for the advice provided by Ulf Gyllensten, Gunnar von Heijne, Jonathan Napier and Jan Olof Nehlin on tree-building, targeting peptides, organelle import and protein synthesis, respectively. This work was supported by the Swedish Natural Research Council and the Swedish Research Council for Forestry and Agriculture.

References 1. Andreadis A, Hsu Y-P, Hermodson M, Kohlhaw G, Sehimmel P: Yeast LEU2: repression of mRNA levels by leucine and primary structure of the gene product. J Biol Chem 259:8059-8062 (1984). 2. Auffray C, Rougeon F: Purification of mouse immunoglobulin heavy-chain messenger RNAs from total myeloma tumor RNA. Eur J Biochem 107:303-314 (1980). 3. Baldauf S, Palmer JD: Evolutionary transfer of the chloroplast tufa gene to the nucleus. Nature 344:262-265 (1990). 4. Barnatt JA, Payne RW, Yarrow D: Yeasts: Classification and identification. Cambridge University Press, Cambridge, U K (1983). 5. Bartlett SG, Grossman AR, Chua N-H: In vitro synthesis and uptake of cytoplasmically-synthesized chloroplast proteins. In: Edelman M, Hallick RB, Chua N-H (eds) Methods in Chloroplast Molecular Biology, pp. 1081-1091. Elsevier Biomedical Press, Amsterdam (1982). 6. Boeke JD, LaCroute E, Fink GR: A positive selection for mutants lacking orotidine-5'-phosphate decarboxylase activity in yeast: 5-fluoro-orotic acid resistance. Mol Gen Genet 197:345-346 (1984). 7. Brent R, Ptashne M: A eukaryotic transcriptional activator bearing the DNA specificity of a prokaryotic repressor. Cell 43:729-736 (1985).

8. Cedergren R, Gray MW, Abel Y, Sankoff D: The evolutionary relationships among known life forms. J Mol Evol 28:98-112 (1988).
9. Cline K, Werner-Washburne M, Lubben TH, Keegstra K: Precursors to two nuclear-encoded chloroplast proteins bind to the outer envelope membrane before being imported into chloroplast. J Biol Chem 260:3691-3696 (1985).
10. Davidow LS, Kaczmarek FS, DeZeeuw JR, Conlon SW, Lauth MR, Pereira DA, Franke AE: The Yarrowia lipolytica LEU2 gene. Curr Genet 11:377-383 (1987).
11. Elledge SJ, Mulligan JT, Ramer SW, Spottswood M, Davis RW: λYES: A multifunctional cDNA expression vector for the isolation of genes by complementation of yeast and Escherichia coli mutations. Proc Natl Acad Sci USA 88:1731-1735 (1991).
12. Ericson ML, Rödin J, Lenman M, Glimelius K, Josefsson LG, Rask L: Structure of the rapeseed 1.7S storage protein, napin, and its precursor. J Biol Chem 261:14576-14581 (1986).
13. Felsenstein J: Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39:783-791 (1985).
14. Fitch WM, Margoliash E: Construction of phylogenetic trees. Science 155:279-284 (1967).
15. Gavel Y, von Heijne G: A conserved cleavage-site motif in chloroplast transit peptides. FEBS Lett 261:455-458 (1990).
16. Gottlieb LD: Conservation and duplication of isozymes in plants. Science 216:373-380 (1982).
17. Hamasawa K, Kobayashi Y, Harada S, Yoda K, Yamasaki M, Tamura G: Molecular cloning and nucleotide sequence of the 3-isopropylmalate dehydrogenase gene of Candida utilis. J Gen Microbiol 133:1089-1097 (1987).
18. Henikoff S, Tatchell K, Hall BD, Nasmyth KA: Isolation of a gene from Drosophila by complementation in yeast. Nature 289:33-37 (1981).
19. Huynh TV, Young RA, Davis RW: Constructing and screening cDNA libraries in λgt10 and λgt11. In: Glover DM (ed) DNA Cloning vol. 1. IRL Press, Oxford (1985).
20. Imai R, Sekiguchi T, Nosoh Y, Tsuda K: The nucleotide sequence of 3-isopropylmalate dehydrogenase from Bacillus subtilis. Nucl Acids Res 15:4988 (1987).
21. Johnston SA, Salmeron JM Jr, Dincher SS: Interaction of positive and negative regulatory proteins in the galactose regulon of yeast. Cell 50:143-146 (1987).
22. Josefsson LG, Lenman M, Ericson ML, Rask L: Structure of a gene encoding the 1.7S storage protein, napin, from Brassica napus. J Biol Chem 262:12196-12201 (1987).
23. Kagawa Y, Nojima H, Nukiwa N, Ishizuka M, Nakajima T, Yasuhara T, Tanaka T, Oshima T: High guanine plus cytosine content in the third letter codons of an extreme thermophile. DNA sequence of the isopropylmalate dehydrogenase of Thermus thermophilus. J Biol Chem 259:2956-2960 (1984).
24. Kikuchi Y, Kitazawa Y, Shimatake H, Yamamoto M: The primary structure of the leu1+ gene of Schizosaccharomyces pombe. Curr Genet 14:375-379 (1988).
25. Kimura M: The Neutral Theory of Molecular Evolution. Cambridge University Press, Cambridge, UK (1983).
26. Krebbers E, Seurinck J, Herdies L, Cashmore AR, Timko MP: Four genes in two diverged subfamilies encode the ribulose-1,5-bisphosphate carboxylase small subunit polypeptides of Arabidopsis thaliana. Plant Mol Biol 11:745-759 (1988).
27. Kreger-van Rij NJW: General classification of the yeasts. In: Kreger-van Rij NJW (ed) The Yeasts. Elsevier, Amsterdam (1984).
28. Kvist S, Wiman K, Claesson L, Peterson PA, Dobberstein B: Membrane insertion and oligomeric assembly of HLA-DR histocompatibility antigens. Cell 29:61-69 (1982).
29. Kurtz DT, Nicodemus CF: Cloning of α2u globulin cDNA using a high efficiency technique for the cloning of trace messenger RNAs. Gene 13:145-152 (1981).
30. Lee M, Nurse P: Complementation used to clone a human homologue of the fission yeast cell cycle control gene cdc2+. Nature 327:31-35 (1987).
31. Leutwiler LS, Meyerowitz EM, Tobin EM: Structure and expression of three light-harvesting chlorophyll a/b-binding protein genes in Arabidopsis thaliana. Nucl Acids Res 14:4051-4064 (1986).
32. Maniatis T, Fritsch EF, Sambrook J: Molecular Cloning: A Laboratory Manual. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY (1982).
33. Mariani P, Johansson M, Ellegren H, Harbitz I, Juneja RK, Andersson L: Multiple RFLPs in the porcine calcium release channel gene (CRC): assignment to the halothane (HAL) linkage group. Animal Genetics, in press.
34. McKnight GL, McConaughy BL: Selection of functional cDNAs by complementation in yeast. Proc Natl Acad Sci USA 80:4412-4416 (1983).
35. Miflin BJ: The location of nitrate reductase and other enzymes related to amino acid biosynthesis in the plastids of root and leaves. Plant Physiol 54:550-555 (1974).
36. Nehlin JO, Carlberg M, Ronne H: Yeast galactose permease is related to yeast and mammalian glucose transporters. Gene 85:313-319 (1989).
37. Nehlin JO, Ronne H: Yeast MIG1 repressor is related to the mammalian early growth response and Wilms tumour finger proteins. EMBO J 9:2891-2898 (1990).
38. Qu L-H, Nicoloso M, Bachellerie J-P: Phylogenetic calibration of the 5' terminal domain of large rRNA achieved by determining twenty eukaryotic sequences. J Mol Evol 28:113-124 (1988).
39. Ryan ED, Tracy JW, Kohlhaw GB: Subcellular localization of leucine biosynthetic enzymes in yeast. J Bact 116:222-225 (1976).
40. Saitou N, Nei M: The neighbour-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol 4:406-425 (1987).
41. Schaber MD, DeChiara TM, Kramer RA: Yeast vectors for production of interferon. Meth Enzymol 119:416-423 (1986).
42. Schild D, Brake AJ, Kiefer MC, Young D, Barr PJ: Cloning of three human multifunctional de novo purine biosynthetic genes by functional complementation of yeast mutations. Proc Natl Acad Sci USA 87:2916-2920 (1990).
43. Schmidt GW, Mishkind ML: The transport of proteins into chloroplasts. Annu Rev Biochem 55:879-912 (1986).
44. Sekiguchi T, Ortega-Cesana J, Nosoh Y, Ohashi S, Tsuda K, Shigenori K: DNA and amino-acid sequence of 3-isopropylmalate dehydrogenase of Bacillus coagulans. Comparison with the enzymes of Saccharomyces cerevisiae and Thermus thermophilus. Biochim Biophys Acta 867:36-44 (1986).
45. Sekiguchi T, Suda M, Ishii T, Nosoh Y, Tsuda K: The nucleotide sequence of 3-isopropylmalate dehydrogenase from Bacillus caldotenax. Nucl Acids Res 15:853 (1987).
46. Shih M-C, Lazar G, Goodman HM: Evidence in favour of the symbiotic origin of chloroplasts: primary structure and evolution of tobacco glyceraldehyde-3-phosphate dehydrogenases. Cell 47:73-80 (1986).

47. Smeekens S, Bauerle C, Hageman J, Keegstra K, Weisbeek P: The role of the transit peptide in the routing of precursors toward different chloroplast compartments. Cell 46:365-375 (1986).
48. Strizhov NI, Krukov VM, Buryanov JI, Bayev AA: Gen β-izopropilmalat-degidrogenazi Agrobacterium tumefaciens C58: klonirovaniye i pervitjnaya struktura. Dokl Akad Nauk SSSR 288:481-487 (1986).
49. Struhl K, Cameron JR, Davis RW: Functional genetic expression of eukaryotic DNA in E. coli. Proc Natl Acad Sci USA 73:1471-1475 (1976).
50. Takagi M, Kobayashi N, Sugimoto M, Fujii T, Watari J, Yano K: Nucleotide sequencing analysis of a LEU gene of Candida maltosa which complements leuB mutation of Escherichia coli and leu2 mutation of Saccharomyces cerevisiae. Curr Genet 11:451-457 (1987).
51. Thomas BJ, Rothstein R: Elevated recombination rates in transcriptionally active DNA. Cell 56:619-630 (1989).
52. Vieira J, Messing J: Production of single-stranded plasmid DNA. Meth Enzymol 153:3-11 (1987).
53. Woese CR: Bacterial evolution. Microbiol Rev 51:221-271 (1987).
