Gene, 122 (1992) 255-261
© 1992 Elsevier Science Publishers B.V. All rights reserved. 0378-1119/92/$05.00

GENE 06811

Structure of the gene encoding potato cytosolic pyruvate kinase

(PCR; gene family; gene structure; plants; recombinant DNA)

K.P. Cole, S.D. Blakeley and D.T. Dennis

Department of Biology, Queen's University, Kingston, Ontario K7L 3N6, Canada

Received by R. Wu: 31 March 1992; Revised/Accepted: 15 July/16 July 1992; Received at publishers: 25 August 1992

SUMMARY

The polymerase chain reaction (PCR) has been used to generate a series of overlapping genomic clones representing 43 bp of 5' untranslated sequence, 63 bp of 3' untranslated sequence and the entire coding sequence of the gene encoding potato cytosolic pyruvate kinase (PKc). This portion of the gene is approximately 4.5 kb in length and is interrupted by three introns, one of which is present in the 5' untranslated region. Southern blot analysis indicates that PKc is encoded by a small gene family, and sequence data from a number of PCR-derived genomic clones indicate that there are as many as six PKc genes. Sequence differences between the PCR-generated genomic clones and a PKc cDNA clone are discussed with respect to the fidelity of Taq polymerase. An alignment of intron placement in the potato PKc gene with intron placement in PK genes from other sources indicates that two of the potato introns correspond to intron positions in other species.

INTRODUCTION

Pyruvate kinases (PK) catalyse the conversion of phosphoenolpyruvate and ADP to pyruvate and ATP. The kinetic and physical properties of PK have been studied extensively in mammals. In addition, the mammalian genes for PK have been isolated, sequenced and their organization determined (Lone et al., 1986; Inoue et al., 1986; Cognet et al., 1987; Noguchi et al., 1986; 1987; Takenaka et al., 1989). These genes consist of 12 exons and 11 introns (Noguchi et al., 1986; 1987; Takenaka et al., 1989). Four isozymes have been identified, M1-, M2-, L- and R-PK, all of which are homotetrameric enzymes with subunits of approximately 60 kDa. The L- and R-type PK are transcribed from a single gene using different promoters that insert either the first or second exon, producing polypeptides that differ only in a small region of their N terminus (Noguchi et al., 1987). The kinetic properties of these enzymes are, however, different (Imamura and Tanaka, 1982). The M1- and M2-type PK are also transcribed from a single gene, but in this case different mRNAs are produced through alternative RNA splicing of the primary transcript (Noguchi et al., 1986).

In plants, distinct PK isozymes are present in the plastid and the cytosolic compartments of the cell. These isozymes are physically, kinetically and immunologically distinct (DeLuca and Dennis, 1978; Ireland et al., 1979; Plaxton, 1989). They have been purified from a number of plant tissues and their subunit composition determined. The cytosolic isozyme isolated from developing endosperm or expanding leaf of the castor plant is a homotetramer with a 56-kDa subunit, similar to that of the mammalian enzymes (Plaxton, 1989). In contrast, the cytosolic isozyme from germinating castor endosperm is a heterotetramer consisting of two immunologically related 56- and 57-kDa subunits (Plaxton, 1988). The plastid isozyme of developing castor endosperm is also a heterotetramer, composed of 63.5- and 54-kDa α- and β-subunits (Plaxton, 1991). A comparison of the cDNAs encoding potato cytosolic and castor plastid PK demonstrates that these isozymes are encoded by different messages derived from distinct genes (Blakeley et al., 1990; 1991). The aa sequence of one subunit of potato cytosolic pyruvate kinase (PKc), deduced from the cDNA clone, indicates that the protein is composed of 510 aa residues (Blakeley et al., 1990). cDNA clones for plastid PK (PKp) from the castor plant indicate that this enzyme is more closely related to prokaryotic PK and is composed of two subunits with deduced sequences of 583 and 493 aa (Blakeley et al., 1991).

This report describes the organization of the gene for potato PKc and compares it with PK genes from non-plant sources. The organization of the potato cytosolic PK gene was determined using PCR.

Correspondence to: Dr. D.T. Dennis, Department of Biology, Queen's University, Kingston, Ontario K7L 3N6, Canada. Tel. (613) 545-6701; Fax (613) 545-6617.

Abbreviations: aa, amino acid(s); bp, base pair(s); kb, kilobase or 1000 bp; nt, nucleotide(s); PCR, polymerase chain reaction; PK, pyruvate kinase(s); PK (italicized), gene encoding PK; PKc, cytosolic PK; PKp, plastid PK; TPI, triosephosphate isomerase; TPI (italicized), gene encoding TPI.

RESULTS AND DISCUSSION

(a) PCR amplification, cloning and sequencing of potato PKc genomic fragments

The organization of the potato PKc gene was determined by the amplification of genomic DNA using the PCR. Oligodeoxyribonucleotide primers (17-mer) derived from the sequence of a potato PKc cDNA clone (Blakeley et al., 1990) and subsequently primers based on the sequence within introns were used in the PCR to produce a series of overlapping genomic fragments. These were cloned into pUC118 or pUC119 and sequenced (Fig. 1). In some cases, the size of the PCR fragments was greater than predicted from the cDNA sequence, indicating the presence of introns. A diagrammatic representation of the gene for potato PKc, including 43 bp of 5' untranslated sequence and 163 bp of 3' untranslated sequence, is shown in Fig. 1. This portion of the gene, which includes the coding regions and part of the untranslated sequence, is approximately 4.5 kb in length and is interrupted by three introns, one of which is present in the 5' untranslated region. The sequence presented in Fig. 2 has been compiled from the overlapping PCR-generated clones. As more than one PKc gene is present in potatoes, this may not represent a single potato PKc gene. There were, however, no discrepancies from this organization in any of the PCR fragments sequenced.

(b) PKc is encoded by a small gene family

Total genomic DNA was digested with EcoRI or HindIII, blotted and hybridized with the PKc cDNA insert (Fig. 3). Four bands were detected in the EcoRI digest and three in the HindIII digest. The presence of multiple bands is indicative of a small gene family for potato PKc. There is no EcoRI restriction site present within the PKc cDNA clone (Blakeley et al., 1990), although a single bp change in exon III has created an EcoRI site in two of the PCR-generated genomic clones. Only one HindIII restriction site is present within the cDNA, and this is also found in the

Fig. 1. PCR products of the potato PKc gene. The organization of the potato PKc gene is illustrated in the upper part; exons and introns are shown by the open boxes and lines, respectively. The hatched areas represent untranslated regions of the gene. The boxes in the lower part represent the fragments generated by PCR and the stippled area, the portion of the fragment that was cloned. The PCR was performed using the GeneAmp kit from Perkin Elmer Cetus Co. (Nepean, Ontario). Reaction mixtures were prepared according to the manufacturers' recommendations using 1 µg of genomic potato DNA (var. Kennebec) as template and primers derived from the cDNA clone for potato PKc. A restriction map is given above. Enzymes used include XbaI (X), BamHI (BH), BglII (B), HindIII (H), PstI (P), EcoRI (E), XhoI (Z). One asterisk indicates a site which is not present in all of the PCR-derived genomic clones; a double asterisk indicates a site which is present in the genomic clones but not in the cDNA clone. Oligodeoxynucleotide primers used in PCR or sequencing reactions were synthesized by the Queen's University Oligonucleotide Synthesis Laboratory. Thermocycling consisted of an initial denaturation of 2 min at 94°C followed by 25-30 cycles of denaturation for 1 min at 94°C, primer annealing for 1 min at 50°C and polymerization for 2 min at 72°C. A final polymerization step of 10 min at 72°C was performed prior to storage at 4°C.
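The thermocycling programme in the legend to Fig. 1 can be written down as data, which makes the programmed run time easy to check. The following is a minimal sketch in Python; the names `Step`, `INITIAL`, `CYCLE`, `FINAL` and `hold_minutes` are illustrative and not part of the original protocol, and ramp times between temperatures are assumed to be zero.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One programmed hold: a label, a duration in minutes, a temperature in deg C."""
    name: str
    minutes: float
    celsius: int


# Values taken from the legend to Fig. 1.
INITIAL = Step("initial denaturation", 2, 94)
CYCLE = (
    Step("denaturation", 1, 94),
    Step("primer annealing", 1, 50),
    Step("polymerization", 2, 72),
)
FINAL = Step("final polymerization", 10, 72)


def hold_minutes(n_cycles: int) -> float:
    """Total programmed hold time in minutes (ramp times are ignored)."""
    per_cycle = sum(step.minutes for step in CYCLE)
    return INITIAL.minutes + n_cycles * per_cycle + FINAL.minutes


if __name__ == "__main__":
    for n in (25, 30):  # the legend gives a range of 25-30 cycles
        print(f"{n} cycles: {hold_minutes(n)} min of programmed holds")
```

Under these assumptions the programme holds for about 112 min at 25 cycles and 132 min at 30 cycles; real runs take longer because of heating and cooling between steps.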


AGCTTGATCTGTAGCTTTTGTTAGTGACTGgtgaggtgaggtCCttttCCCCatttttttttggCtatgttataCttgtggaaatCtttggtttttgtaaattttCattttggtgatattttCtt

120

gttcttggtcatttgggtatgtgatttagtgttctttttgttgatctgtagcttctctggttattggtgagaacccttttttttttttgctatatgatgcttctataacttgtgcaaatc

240

gttggttggtagtgaagttttcattggtgattttttcttgttttggacatttgggtatttgattagtgtcttttgtagcttgtagcttctcctggtgactggtgagaccttttgtttgct

360

aagtttttaattttttggtgttttttcttgattttggtgatttgagtatttgatttagtgttctttatctgaaattccagattgagaaatggaacaagattttttcttttgatgaacttg

480

ttatgcttctaggacttgtgcaaattcttggtttgttgtaaagctctctttttggtcatttgagtatttggtttagtgttcttttacttgatctgaagtcctagattgagaaatgacaca

600

agttctttgtgtttgatttcttccttgtcacatcgttgatgatgtttcattgtttcttcaatccccctttttttgtttatttgaaagttcaagacttcacttctatatttgttctcagag

720

aaagtgttttggtttaatttttttgttatggggagatgtgtgaagtcttgatgaagattcttccaatttcccttttatccatgctattaagttgtggaagcacattgatttcttgaatca

840

cagcactgtgaaaagaacatctaggtgatagaagggccctggaatccacatatctgttgtttatattccacttgtatgctatgttgtccgtctcttacaaaatgctgctggggggttgtc

960 1060

ttCaCtCaggagCtgCttgttgattgtctgctatagaatactaaatgtttgttcatattgttaaaaaggatacatggaagtttgcctttgtataagatcagtaccctgcctctatgtaaa 1200 agtaataagtgcagttctaatttccaaagtattttttttatttgctactcagatctaaaggaatcaaggtccttttctttttctgtacatactagcactttcttggtacaaattgatgaa 1320 atgtttgcttttgcttataaaaacatgtaaaagttttctgtaatgagctggttatatgggcttctgaagaagagtgggttttgtccttttgagcagAT~TAG~G~TGGCC~CAT MAN1

AGACATAGCTGGGATCATGAGATCTCCCAAATGATGATGGCCGTATTCC~GACCMGAT~TTTGCACGCTAGGGCCATCT~CTAG~CAGTACC~~CTGGAG~GCTTCTCCGTGC DIAGIMKDLPNDGRIPKTKIVCTLGPSSRTVFMLEKLLRA TGGCATGAACGTTGCCAGGTTTAACTTTTCTTTTCTCATGGGACCCATGAGTACCATCAGGAGACATTGGAC~TCTT~GAT~CTA~CAG~TACTCAGATCC~G~TGCTGTCATGCTTGA GMNVARFNFSHGTHEYHQETLDNLKI AMQNTQILCAVMLD CACCAAGgttcgattttgaagcatattttagttttatctgttactatttttcttctatgtcaacttggtgcttgtgcagatggtttgtaattttgagctggtcgttatttcagtgtggta T K tttgcttgaggattacctcaggcatagtttgcttacagtttccttgtcaaggatgctacatttcactacttacctttgagcatcctcccccccttcccgggagaattaggtttctcaaac

1440 4 1560 44 1680 84 1800 86 1920

tagaagagtagcctttctaatcgattcaggcaaggagtgttgatgtagtcgagagagagattctcagttcccttttgtggttattatatattcctgacatggagaatgaaagtcccaact 2040 . ctttagtctcaagaaggaccatatttcctacatagttcagaagagtgttatgatcttgttttatgtggtcgtaagcaatggccttcctttagtggtggagaagtgcagtattttgattgg 2160 gctagtctttgttaaccacaagataataagtttgtttgttcgataatagtatttgtagaaagtgagtaatctgtggtaatctagtacattccaaaccttaaaattgtgtaatgtgaggtggatc 2280 agattgttgacagcttgaataacacatgctaaggtttcacttgcctctgtttaccggtatggagctttttcacacgttagtaagatctctttttttttttttgctagttttccagttatg 2400 agccactgcaggcgccagcaaactttcttagctttctcctcatttttgccagatatttaagtaaagactttgatcttttatttactatgatgttttggtaatttataaggctcgacctgg 2520 aatcattgaatggaacaacatttttatt'ctctatttctgtttctagcaactgctgttttgtttctggttctcgatggttaatgaatttgatcttaaaatgctgctattataacaagtcaa2640 gagt~Ctggtt~~~~CCttCcattt~ttt~ttCtgC~tgtgCa~CtattCCgC~gCttCagtttgCttCCttgatgttggaCtaaCatatgttttttatgcagGGGCCTGAGATTCGTACTGGT 2760 GPEIRTG 93 TTCTTAACAGATGGAAAACCGATTCAGCTTAAGGAAGG~GGTC~G~TCAC~TATCCACAGACTATACCAT~GG~TG~G~TGATCTC~TGAGCTAT~G~GTTGGTAGTG 2880 FLTDGKPIQLKEGQEITVSTDYTIKGNEEMISMSYKKLVV 133 GACTTGAAGCCCGGCAATACCATCTTGTGTGCACATGCAGATGGTACCAT~CCCTTACTGTTTTGTCA~TGATCCACCGTC~G~CGGTGAGATGTCGCTGCGAGAATTC~CCACCTTAGGA 3000 DLKPGNTILCADGTITLTVLSCDPPSGTVRCRCENSATLG 173 GAGAGGAAGAATGTAAACCTTCCAGGTGTGGT~TGGACCTTCC~CACTTACAGAG~GGAT~G~GATATACTAGAG~GGGTGTTCCT~C~CAT~ATATGATAGCGCTTTCG 3120 ERKNVNLPGVVVDLPTLTEKDKEDILEWGVPNNIDMIALS 213 TTTGTGCGTAAGGGTTCAGATCTTGTCAATGTTC~TGTTCGC~GGCTCTTGGTCCACA~CC~GCGCATTC~CT~TGTC~GgtatgCagagtaatttagtgtgtagtttagtatggttga 3240 FVRKGSDLVNVRKALGPHAKRIQLMSK 253 attttgaagcatatgaattgttaaggagagaagaacataatacttggcaatggattggtgttttctagatttttacttcttcccgaacatgtctttttcatttatcactctctttttatt 3360 (tc) cttcttttttttgggggtgggggggggtgtgaatgtgggtaagatttttgggatggtgtgttattcaatgaggggggatgtccagtgtacaaacatttgagtgctaactagttctatggt 3480 attCaCtgatgttttgattttCtgaattgtcttgtCttgaagGTTG~CC~G~GGGGT~TC~CTTTGACG~TCCTTCGTGAGACAGATTCTT~TATGGTTGCTCGAGGTGATCTCG 3600 VENQEGVINFDEILRETDSFMVARGDL 267 
GAATGGAAATTCCAGTTGAGAAGATTTTTCT~GCTCAG~~A~ATATAC~G~T~TCTTGCTGGC~GC~TGGT~CTGCCACTCAGATGC~~~TC~~ATC~GTCTC 3720 GMEIPVEKIFLAQKMMIYKCNLAGKAVVTATQMLESMIKS 307 CACGACCCACCCGTGCTGAGGCTACTGATGTGGCT~TGC~TCTTGGATGGCACTGATTGTGTTA~TT~GTGGGGAGAGTGCAGCTGGTGCTTATCCTGAGCTGGCAGT~TCA 3840 PRPTRAEATDVANAVLDGTDCVMLSGESAAGAYPELAVKI 341 TGTCACGAATTTGCATTGAGGCAGAGTCTTCACTTGAC~TGAGGCTATCTTC~GG~~ATCAGGTGTACCCCGC~CC~TGAGCCCAT~GAGAG~TTGCATCATCTGCTGTCC 3960 MSRICIEAESSLDNEAIFKEMIRCTPLPMSPLESLASSAV 387 GTACAGCTAACAAAGCTAGAGCAAAACTCATTGTTGTCATTGTTGTCC~ACACG~GTGGGAGTACAGC~GC~GTTGCC~GTATAGGCCTGCAGTTCCTATTC~TCAGTAGTCGTGCCTGTTT 4080 RTANKARAKLIVVLTRGGSTAKLVAKYRPAVPILSVVVPV 427 TGACTACAGACTCTTTTGATTGGTCCATCAGCGACGAGACCCCAGCTAGACACAGTTTAGTATATAGGGGCTTGATTCCTCTTCT~GTG~GGTTCTGC~GGCCACTG TTCTGAAT 4200 LTTDSFDWSISDETPARHSLVYRGLIPLLGEGSAKAT%SE 467 CAACTGAGGTAATCCTTGAAGCGGCCCTGAAGTCTGCCGTCTGCCGT~CGAGAGGGCTA~C~CCTGGTGATGC~TCGTGGCACTTCATCGTAT~GTTCTGCATCCGTTATC~GATT~CG 4320 STEVILEAALKSAVTRGLCKPGDAVVALHRIGSASVIKIC 507 TCTTGAAGTAATCGTCGTGTCACATAACATACATAC~TCTTG~CTCCCTCCACCTGAGCTCAGACTGATTTTCATTTATGCTTTC~GTCT~AT~TGCATTATT~TA~CTGATTTTG 4440 VV K TCACAATTGTCTTAGGATATCTAGTATTATCACCAAGGATTCT 4509

Fig. 2. Nucleotide sequence of the potato PKc gene. Exons and introns are indicated by upper- and lower-case letters, respectively. Primers used in the PCR amplification are indicated by arrows below the sequence. An EcoRI restriction site was created in the PCR primer, located between nt 3387 and 3393, by replacing nt 3395 and 3396 with those nt shown in parentheses above the sequence. The deduced aa sequence is shown in one letter code beneath the nt sequence. Those nt and aa present in the PCR-generated sequence which differ from that in the PKc cDNA (Blakeley et al., 1990) are indicated by bold type. The amplified DNA fragments were cloned into the plasmids pUC118 and pUC119 and sequenced by the dideoxy-chain termination procedure (Sanger et al., 1977) using [α-35S]dATP as label. Sequencing reactions were performed using the T7 polymerase kit supplied by Promega (Madison, WI) or the T7 polymerase kit from Pharmacia (Dorval, Quebec). The nt sequence data will appear in the EMBL, GenBank and DDBJ Nucleotide Sequence Databases under the following accession Nos.: Intron I, Z11964; Intron II, Z11969; Intron III, Z11970. The nt sequence of the potato PKc cDNA clone has the accession No. X53688 (Blakeley et al., 1990).
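Because Fig. 2 marks exons in upper case and introns in lower case, a sequence in that notation can be split into segments and the A+T content of each segment computed directly. The following is a minimal Python sketch of that idea; the function names and the short toy sequence are hypothetical and are not taken from the figure, and characters other than letters are simply skipped.

```python
import re


def split_by_case(seq: str):
    """Split a sequence into (kind, segment) pairs.

    Upper-case runs are treated as exon, lower-case runs as intron,
    following the convention of Fig. 2. Non-letter characters are ignored.
    """
    segments = []
    for match in re.finditer(r"[A-Z]+|[a-z]+", seq):
        text = match.group()
        kind = "exon" if text.isupper() else "intron"
        segments.append((kind, text))
    return segments


def at_content(segment: str) -> float:
    """Percentage of A and T residues in a non-empty segment."""
    s = segment.upper()
    if not s:
        raise ValueError("empty segment")
    return 100.0 * (s.count("A") + s.count("T")) / len(s)


if __name__ == "__main__":
    toy = "ACTGgtgaggttttcagATGG"  # hypothetical toy sequence, not from Fig. 2
    for kind, text in split_by_case(toy):
        print(kind, len(text), round(at_content(text), 1))
```

Applied to a clean copy of the deposited sequences, the same two functions would give per-exon and per-intron A+T percentages of the kind discussed in section (c).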

PCR-derived genomic clones. The multiple bands on the Southern blot are not, therefore, likely to represent fragments of a single gene. The presence of multiple genes for potato PKc is supported by sequence analysis of the PCR-generated genomic clones. A comparison of the coding sequence from the exons of the PCR-generated genomic clones and the PKc cDNA revealed 20 bp changes, affecting 19 codons (Table I). This could be a result of multiple, slightly different copies of the PKc gene or due to errors introduced by the PCR amplification procedure. Four of the bp changes appear in multiple independent clones, of which one is a non-conservative substitution in the codon for Ala309 (see next paragraph). These are likely to be genuine sequence differences between genes. Of the 16 remaining differences that were found and that are possible PCR errors, 13 are silent, two are conservative, and only one (in the codon for Val227) results in a non-conservative change of the encoded aa (Table I).

Fig. 3. Southern blot analysis of potato genomic DNA probed with the potato PKc cDNA insert. DNA (10 µg) was digested with HindIII (H) or EcoRI (E), separated on a 0.8% agarose gel, transferred to a Biotrans nylon membrane (ICN, Dorval, Quebec), and subsequently hybridized with random-primer-labelled probes (Promega, Madison, WI).

TABLE I

Nucleotide changes in the coding region of the potato PKc gene and the result of these changes on the encoded aa

Wild-type codon in cDNA^a        Codon change detected by PCR^b   aa change
GGA (Gly8)                       GGG (Gly8)                       Silent
TTA (Leu28)                      CTA (Leu28)                      Silent
GTG (Val35)                      GTA (Val35)                      Silent
ATG (Met133)                     GTG (Val133)                     Conservative
ATT (Ile141)                     ATC (Ile141)                     Silent
ACT (Thr169)                     TCT (Ser169)                     Conservative
GTT (Val227)                     GCT (Ala227)                     Non-conservative
GAT (Asp, position illegible)    GAC (Asp, position illegible)    Silent
GCA (Ala309)                     CGA (Arg309)                     Non-conservative
ATT (Ile351)                     ATC (Ile351)                     Silent
AAT (Asn361)                     AAC (Asn361)                     Silent
CCA (Pro373)                     CCG (Pro373)                     Silent
TCA (Ser385)                     TCT (Ser385)                     Silent
ACG (Thr389)                     ACA (Thr389)                     Silent
GGC (Gly404)                     GGT (Gly404)                     Silent
ACC (Thr429)                     ACT (Thr429)                     Silent
TTC (Phe433)                     TTT (Phe433)                     Silent
TTG (Leu447)                     TTA (Leu447)                     Silent
CCA (Pro454)                     CCT (Pro454)                     Silent

^a The codons and aa listed refer to those deduced from the PKc cDNA (Blakeley et al., 1990).
^b Codons which differ in the PCR-generated clones from those in the cDNA clone (Blakeley et al., 1990). These correspond to the nt and aa sequence presented in Fig. 2.

The codon present as GCA (Ala309) in the potato cDNA appears as CGA (Arg) in three genomic clones produced from two PCR reactions. Since an Arg residue appears in this position in the castor oil plant, rat M1, rat L and yeast PK, it is likely that this is an acceptable substitution at this position and suggests that the bp change does not result from a PCR error (Blakeley et al., 1990). A bp change detected in two genomic clones from the same amplification reaction occurs in the second position of the codon for Val227 in the cDNA clone and results in a non-conservative aa substitution in which Val is replaced by Ala. In this position of the potato, rat M1 and yeast PK a Val is also present, whereas castor oil plant PKp has a Lys, and the rat L-type PK has an Ala (Blakeley et al., 1990). The variability in the aa at this position from PK of different sources suggests that this change may also reflect a genuine difference between genes rather than a PCR-induced error. One oligodeoxyribonucleotide primer used in PCR amplification overlaps the Leu28 codon in which a bp change has been detected. The PCR product that was sequenced and contained this bp change was produced using a more distal primer and not the primer directly overlapping this codon. Thus, despite the problems associated with the errors introduced by Taq polymerase during PCR, the data indicate that the technique has allowed the detection of multiple forms of the gene for PKc.

Sequence differences within introns of PCR-generated genomic clones are more extensive. In addition to bp changes, numerous deletions or insertions of as much as 172 bp were found (data not shown). Five PCR-generated genomic fragments containing intron I all appear to be distinct from one another based on substantial sequence variations within the intron. There are no examples of sequence deletions or insertions within the exons of the gene. These differences within intron I are, therefore, unlikely to be the result of polymerase error but reflect differences in the template DNA. These data and the differences described above imply that there may be as many as six genes encoding PKc in potato. This is in contrast to yeast and

Aspergillus nidulans, where PK is encoded by a single gene (Burke et al., 1983; de Graaff and Visser, 1988), and to mammals, where the four PK isozymes are encoded by two genes. In mammals, one gene produces the M1- and M2-types of PK by alternative splicing (Noguchi et al., 1986), whereas the second gene produces the L- and R-isozymes of PK using different transcriptional promoters (Noguchi et al., 1987). The sequences of PKc and PKp cDNA clones from tobacco and castor oil plant demonstrate that the isozymes are also derived from separate genes (S.D.B. and D.T.D., in preparation; Blakeley et al., 1991). The genomic organization that gives rise to the two subunits of the heterotetrameric plant PKc or the multisubunit PKp (Plaxton, 1988; 1991) is still unknown. The possibility exists, however, that the two closely related subunits of PKp are also produced from a single gene by a splicing mechanism rather than from separate genes, since castor PKp α and β cDNA clones (Blakeley et al., 1991) are identical except for their extreme 5' region, and Southern blot data for castor PKp have indicated the presence of a single gene for PKp (Blakeley et al., 1991).

(c) Exon-intron junctions

The exon-intron splice junctions of the three potato PKc introns (Table II) resemble the consensus sequence of plant splice junctions (Brown, 1986). The 5' splice site of intron I, which occurs in the 5' untranslated region, forms part of a 12 bp direct repeat that is also found 307 bp within the intron. This direct repeat contains the upstream 7 nt and downstream 5 nt of what appears to be a functional splice site based on the cDNA sequence. Whether or not the second site can be recognized and utilized by the splicing mechanism is unknown; its use would lengthen the 5' untranslated sequence but would not alter the coding region of the gene.

Plant introns typically contain an A+T content that is higher than the adjacent exons, and this elevated A+T content is important for efficient pre-mRNA processing (Wiebauer et al., 1988). All the PKc exons have a 56% A+T content, whereas introns I and II have an A+T content of 65% and 63%, respectively. In contrast, intron III has an A+T content of only 56%, the same as the surrounding coding sequence. Three PCR-generated genomic clones of this intron, however, obtained from independent PCR amplifications, have a deletion which removes 57 bp of G-rich sequence, resulting in an A+T content of 67%. Whether the presence of this G-rich sequence affects the splicing of this intron or whether the presence of exon-intron splice junction signals is sufficient to allow its efficient removal is unknown.

TABLE II

Exon-intron junctions of the potato PKc gene (the letter Y represents a pyrimidine nucleotide)

Intron^a                  5' Splice junction^b   3' Splice junction^c
I                         CTG:GTGAGG             TGTCCTTTTGAGCAG:AT
II                        AAG:GTTCGA             ATGTTTTTTATGCAG:GG
III                       AAG:GTATGC             GAAATTGTCTTGAAG:GT
Published consensus^d     ?AG:GTAAGT             TTT?TT????TGCAG:GT

^a Introns of the potato cytosolic PKc gene. Refer to Fig. 2.
^b The last 3 nt of the exon sequence and first 6 nt of the intron sequence are presented as a 5' splice junction. The colon marks the junction between exon and intron sequences.
^c The last 15 nt of the intron sequence and first 2 nt of the exon sequence are presented as a 3' splice junction. The colon marks the junction between intron and exon sequences.
^d See Brown (1986). Positions shown as "?" are illegible in the source.

(d) Alignment of the introns from the potato PKc gene with those of other PK genes

An alignment of intron placement within the potato PKc gene with intron placement in PK genes from other sources is presented in Fig. 4. Intron I of the potato gene is located within the 5' untranslated region, 14 nt upstream from the translational start codon. The rat and chicken M-type PK genes also contain an intron in the 5' untranslated region at positions -13 and -19 from the ATG codon, respectively. Intron II of the potato PKc gene aligns to a location 3 nt from intron D of the A. nidulans PK gene but is 33 nt from the nearest vertebrate intron. The third intron of the potato gene has no counterpart in the A. nidulans or vertebrate genes. The position of the eleven introns of rat L (Cognet et al., 1987; Noguchi et al., 1987) and M (Takenaka et al., 1989) and of ten introns of chicken M (Lonberg and Gilbert, 1985) is highly conserved, as is the case with three of the seven introns of A. nidulans (de Graaff and Visser, 1988). In contrast, the yeast gene has no introns (Burke et al., 1983). This alignment of introns demonstrates that the potato PKc gene is less complex with respect to the number of introns than the animal or fungal PK genes, but more complex than that of yeast.

Fig. 4. Alignment of the gene for potato PKc with genes for PK from Aspergillus (de Graaff and Visser, 1988), rat M-PK (Takenaka et al., 1989), rat L-PK (Cognet et al., 1987; Noguchi et al., 1987), rat R-PK (Noguchi et al., 1987), chicken M-PK (Lonberg and Gilbert, 1985) and yeast (Burke et al., 1983). Wide boxes are used to represent the PK cDNAs. The narrow boxes represent the 5' untranslated portions of cDNA clones. The cDNAs were aligned based on their aa sequences, and the intron locations are indicated by the Y symbols.

This simple structure is not typical of all plant genes. For example, the gene for triosephosphate isomerase (TPI) from maize has eight introns compared with six in the TPI gene in chicken (Marchionni and Gilbert, 1986). Of the six introns in the chicken gene, five are located in exactly the same position as in the maize gene, and the remaining one is displaced by only 9 nt (Marchionni and Gilbert, 1986). Of the five introns in the A. nidulans TPI gene, one is conserved in a position identical to that of maize and chicken introns, whereas the first and last introns of the fungal gene are shifted by only 1 and 7 nt, respectively, from those of the maize gene (McKnight et al., 1986). Two of the fungal introns have no counterpart in the maize or chicken gene (McKnight et al., 1986). The pattern of intron conservation in the plant, animal and fungal TPI genes suggests that the ancestral gene was interrupted by introns prior to their divergence rather than there being numerous intron insertions at identical positions after the organisms had diverged during evolution (Marchionni and Gilbert, 1986; McKnight et al., 1986).

The genes encoding PK from different sources also show conservation of intron location and, as with TPI, suggest intron loss. It is improbable that the two introns which are located within the coding sequence of the potato PKc gene have arisen as a result of random intron insertions, since both these introns lie within regions of highly conserved aa sequences, which are unlikely to have tolerated such an event.

In addition, these introns lie in regions which separate units of protein secondary structure. One model of gene evolution (Gilbert, 1978) postulates that original genes were assembled by recombination events which brought together useful units of protein structure and that introns are the remnants of this process. Secondary structure analysis using the method of Garnier et al. (1978) demonstrates that intron II is positioned between two β-sheets in the plant enzyme (data not shown) and aligns to a location separating regions identified by crystallographic analysis of the cat PK enzyme as domain A, β-sheet and domain B, β-sheet (Muirhead et al., 1986). Intron III also separates regions predicted to be β-sheet and α-helix in the plant enzyme and aligns to a similar location in the cat enzyme. The presence of these introns in the plant gene is more consistent with the hypothesis of intron loss from ancestral genes rather than their introduction by intron insertion.
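The junction strings of Table II use a colon to mark the exon-intron boundary, so they can be checked mechanically. The sketch below, in Python, tests each potato intron against the widely used GT...AG rule (introns beginning with GT and ending with AG) and reports the pyrimidine fraction of the last 15 intron nt. The rule is a general convention rather than the consensus of Brown (1986), and the names `JUNCTIONS`, `follows_gt_ag` and `pyrimidine_fraction` are illustrative only.

```python
# Junctions copied from Table II: (5' splice junction, 3' splice junction).
JUNCTIONS = {
    "I": ("CTG:GTGAGG", "TGTCCTTTTGAGCAG:AT"),
    "II": ("AAG:GTTCGA", "ATGTTTTTTATGCAG:GG"),
    "III": ("AAG:GTATGC", "GAAATTGTCTTGAAG:GT"),
}


def follows_gt_ag(five_prime: str, three_prime: str) -> bool:
    """True if the intron starts with GT and ends with AG."""
    _exon_end, intron_start = five_prime.split(":")
    intron_end, _exon_start = three_prime.split(":")
    return intron_start.startswith("GT") and intron_end.endswith("AG")


def pyrimidine_fraction(three_prime: str) -> float:
    """Fraction of C or T among the intron nt shown at the 3' junction."""
    intron_end = three_prime.split(":")[0]
    return sum(base in "CT" for base in intron_end) / len(intron_end)


if __name__ == "__main__":
    for name, (five, three) in JUNCTIONS.items():
        print(name, follows_gt_ag(five, three), round(pyrimidine_fraction(three), 2))
```

All three introns satisfy the GT...AG check as listed; the pyrimidine fraction is only a rough indicator and is not a substitute for comparison with the published consensus.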

(e) Conclusions

(1) Potato PKc is encoded by a small gene family consisting of as many as six genes. Three introns are present, one of which interrupts the 5' untranslated sequence, and the portion of the gene that has been analyzed is approximately 4.5 kb in length.

(2) The structure of the PKc gene is less complex with respect to the number of introns than the corresponding gene from other eukaryotes, except for yeast. The placement of introns is highly conserved between mammalian, chicken and Aspergillus PK, and two of the introns in the plant gene also correspond with intron positions in other sequenced PKs. The third plant intron is found in a unique location. This pattern of intron placement, together with the observation that the plant introns lie in regions of highly conserved aa sequence, supports the hypothesis of an ancestral gene containing introns rather than a mechanism of intron insertion during the course of evolution.

(3) The majority of nt differences between independently derived PCR fragments and the cDNA clone are not likely to be due to errors of the Taq polymerase-catalyzed reaction. The differences found in products of independent PCR reactions are, therefore, thought to result from genuine differences in the template DNA. The distribution of nt changes present in the coding sequence suggests that, in this case, Taq polymerase has faithfully duplicated genuine differences in genomic template rather than introducing random nt substitutions.
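The silent versus replacement distinction used for the codon changes in Table I follows directly from the standard genetic code. The Python sketch below builds the standard codon table and compares a cDNA codon with its PCR-derived counterpart; the function name `compare_codons` is hypothetical. Grading a replacement as conservative or non-conservative is a judgement about side-chain chemistry and is deliberately not attempted here.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def compare_codons(cdna: str, pcr: str):
    """Return (cDNA aa, PCR aa, number of nt differences, 'silent' or 'replacement')."""
    cdna, pcr = cdna.upper(), pcr.upper()
    n_diff = sum(a != b for a, b in zip(cdna, pcr))
    aa_cdna, aa_pcr = CODON_TABLE[cdna], CODON_TABLE[pcr]
    kind = "silent" if aa_cdna == aa_pcr else "replacement"
    return aa_cdna, aa_pcr, n_diff, kind


if __name__ == "__main__":
    # Three rows taken from Table I: Gly8, Met133 and Ala309.
    for cdna, pcr in (("GGA", "GGG"), ("ATG", "GTG"), ("GCA", "CGA")):
        print(cdna, pcr, compare_codons(cdna, pcr))
```

The Ala309 row is the only one in which a single codon carries two nt differences, which is why 20 bp changes affect only 19 codons.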

ACKNOWLEDGEMENT

This work has been supported by a grant from the National Science and Engineering Research Council of Canada.

REFERENCES

Blakeley, S.D., Plaxton, W.C. and Dennis, D.T.: Cloning and characterization of a cDNA for the cytosolic isozyme of plant pyruvate kinase: the relationship between the plant and non-plant enzyme. Plant Mol. Biol. 15 (1990) 665-669.
Blakeley, S.D., Plaxton, W.C. and Dennis, D.T.: Relationship between the subunits of leucoplast pyruvate kinase from Ricinus communis and a comparison with the enzyme from other sources. Plant Physiol. 96 (1991) 1283-1288.
Brown, J.W.S.: A catalogue of splice junction and putative branch point sequences from plant introns. Nucleic Acids Res. 14(24) (1986) 9549-9559.
Burke, R.L., Tekamp-Olson, P. and Najarian, R.: The isolation, characterization, and sequence of the pyruvate kinase gene of Saccharomyces cerevisiae. J. Biol. Chem. 258 (1983) 2193-2201.
Cognet, M., Lone, Y-C., Vaulont, S., Kahn, A. and Marie, J.: Structure of the rat L-type pyruvate kinase gene. J. Mol. Biol. 196 (1987) 11-
de Graaff, L. and Visser, J.: Structure of the Aspergillus nidulans pyruvate kinase gene. Curr. Genet. 14 (1988) 553-560.
DeLuca, V. and Dennis, D.T.: Isoenzyme of pyruvate kinase in proplastids from developing castor bean endosperm. Plant Physiol. 61 (1978) 1037-1039.
Garnier, J., Osguthorpe, D.J. and Robson, B.: Analysis of the accuracy and implications of simple methods for predicting the secondary structure of globular proteins. J. Mol. Biol. 120 (1978) 97-120.
Gilbert, W.: Why genes in pieces. Nature 271 (1978) 501.
Imamura, K. and Tanaka, T.: Pyruvate kinase isozymes from rat. Methods Enzymol. 90 (1982) 150-165.
Inoue, H., Noguchi, T. and Tanaka, T.: Complete aa sequence of rat L-type pyruvate kinase deduced from the cDNA sequence. Eur. J. Biochem. 154 (1986) 465-469.
Ireland, R.J., DeLuca, V. and Dennis, D.T.: Isozymes of pyruvate kinase in etioplasts and chloroplasts. Plant Physiol. 63 (1979) 903-907.
Lonberg, N. and Gilbert, W.: Intron/exon structure of the chicken pyruvate kinase gene. Cell 40 (1985) 81-90.
Lone, Y-C., Simon, M-P., Kahn, A. and Marie, J.: Complete nucleotide and deduced aa sequences of rat L-type pyruvate kinase. FEBS Lett. 195 (1986) 97-100.
Marchionni, M. and Gilbert, W.: The triosephosphate isomerase gene from maize: introns antedate the plant-animal divergence. Cell 46 (1986) 133-141.
McKnight, G.L., O'Hara, P.J. and Parker, M.L.: Nucleotide sequence of the triosephosphate isomerase gene from Aspergillus nidulans: implications for a differential loss of introns. Cell 46 (1986) 143-147.
Muirhead, H., Clayden, D.A., Barford, D., Lorimer, C.G., Fothergill-Gilmore, L.A., Schiltz, E. and Schmitt, W.: The structure of cat muscle pyruvate kinase. EMBO J. 5 (1986) 475-481.
Noguchi, T., Inoue, H. and Tanaka, T.: The M1- and M2-type isozymes of rat pyruvate kinase are produced from the same gene by alternative splicing. J. Biol. Chem. 261 (1986) 13807-13812.
Noguchi, T., Yamada, K., Inoue, H., Matsuda, T. and Tanaka, T.: The L- and R-type isozymes of rat pyruvate kinase are produced from a single gene by use of different promoters. J. Biol. Chem. 262 (1987) 14366-14371.
Plaxton, W.C.: Purification of pyruvate kinase from germinating castor bean endosperm. Plant Physiol. 86 (1988) 1064-1069.
Plaxton, W.C.: Molecular and immunological characterization of plastid and cytosolic pyruvate kinase from castor-oil-plant endosperm and leaf. Eur. J. Biochem. 181 (1989) 443-451.
Plaxton, W.C.: Leucoplast pyruvate kinase from developing castor oil seeds. Characterization of the enzyme's degradation by a cysteine endopeptidase. Plant Physiol. 97 (1991) 1334-1338.
Sanger, F., Nicklen, S. and Coulson, A.R.: DNA sequencing with chain termination inhibitors. Proc. Natl. Acad. Sci. USA 74 (1977) 5463-5467.
Takenaka, M., Noguchi, T., Inoue, H., Yamada, K., Matsuda, T. and Tanaka, T.: Rat pyruvate kinase M gene. Its complete structure and characterization of the 5'-flanking region. J. Biol. Chem. 264 (1989) 2363-2367.
Wiebauer, K., Herrero, J.-J. and Filipowicz, W.: Nuclear pre-mRNA processing in plants: distinct modes of 3'-splice-site selection in plants and animals. Mol. Cell. Biol. 8(2) (1988) 2042-2051.
