CurrentGenetics (1984) 8:629-634

Current Genetics © Springer-Verlag 1984

The Euglena gracilis chloroplast genome: structural features of a DNA region possibly carrying the single origin of DNA replication Bibiane Schlunegger and Erhard Stutz

Laboratoire de Biochimie,Universit6de Neuch~tel,Chantemerie 18, CH-2000 Neuch~tel, Switzerland

Summary. We sequenced a BglII-HindlII DNA fragment of the Euglena gracilis chloroplast genome which most likely carries the single origin of DNA replication according to recent electronmicroscopic mapping studies (Koller and Delius 1982a; Ravel-Chapuis et al. 1982). This DNA fragment contains a polymorphlc region (Schlunegger et al. 1983) which is composed, as will be shown, of multiple tandem repeats (54 bp, 87% A+T). Furthermore we located on this DNA fragment a short inverted repeat element (96 positions) observed in the electronmicroscopic studies (Koller and Delius 1982b). Between the borders of the polymorphic region and the nearby inverted repeat (distance of 179 positions) we retrieved an exact copy of parts of the rDNA leader (105 positions) including 49 positions of the chloroplast trnW gene. A computer search for bacterial type Oriregions did not reveal any significant sequence homology. However, the polymorphic region and its immediate vicinity have the capacity to form multiple stem and loop structures which may be involved in DNA replication initiation. Key words: Chloroplast DNA - Euglena gracilis - Origin of DNA replication

Introduction

Prokaryotic and eukaryotic genome replication normally is under tight control, DNA replication initiation being an important controlling step. In some bacteria the initiation step is rather well understood (Stayton et al. 1983) and determining the nucleotide sequence of the origin Offprint requests to: E. Stutz

region and manipulating it in various ways have provided important clues. It was recognized, e.g., that the origin of DNA replication of E. coli extends over 245 nucleotides (Oka et al. 1980) and that it contains conserved recognition sequences and less or non conserved spacer sequences providing specific distances between recognition sequences. Bacterial origins of DNA replication contain short direct and indirect repeats allowing the formation of elaborate stem and loop structures. Chloroplasts are in many respects of prokaryotic nature and their circular genome replicates by forming Cairns type replicative intermediates; but also displacement loops and rolling circle replicative intermediates were observed, especially in case of higherplants (Tewari 1979). In case of chloroplast DNA of the unicellular alga Euglena gracilis Cairns replicative intermediates were mainly observed (Manning and Richards 1972). More recently Koller and Delius (1982a) and Ravel-Chapuis et al. (1982) independently mapped the apparently single origin of replication of the Euglena chloroplast genome by electronmicroscopic analysis of replication forks about 5 to 6 kbp upstream of the 5' end of the sl6S rRNA gene (Jenni and Stutz 1979) and in close vicinity of a previously described polymorphic region (Z-region) known to contain a variable number of short repeats (Jenni et al. 1981). This Z-region and its vicinity had already been mapped and cloned into pBR322 and further analysed by electronmicroscopy (Schlunegger et al. 1983). It was shown that the polymorphic region was located between two short inverted repeats (Koller and Delius 1982b) and from denaturing studies it was concluded thatthe DNA segment contains regions extremely rich in A+T. Very recently Waddel et al. (1984)mapped two origins of DNA replication on the circular chloroplast genome of Chlamydomonas reinhardii. These origins are, respectively, 10 and 16.5 kbp upstream of one of the 16S rRNA genes. The same laboratory (Wang et al.

B. Schlunegger and E. Stutz: Euglena chloroplast DNA

630 H

A

G1

_2_~

Z

"'

!.

s~6S

6....

~

~ 42

t.

B g l II

~ IR

URF

left

Z-region

part

J

il

ptrn._ W

repeats

,,,.,,,,.,,.,,...,,,,Jr...

C

i

I

I I

I

,

,

i

I

I i

I 0

I I

I 2

Map position of the sequenced region and strategy of sequencing

III

right part I I

I

Hind

IR

I

I I

Results

I

(005")

Rsa Hinf Taq Sau Xba

I

I I 3A

[

kbp

Fig. 1A-C. Map position on the Euglena graeilis chloroplast genome of the BgllI • Z DNA fragment carrying the size variable region (Z-region) and strategy of sequencing. A Map position of BgllI • Z relative to the extra 16S rRNA gene and previously mapped BgllI fragments, H, G1 and I (Jenni et al. 1981). Sizes are in kbp. B Structural features of the sequenced BgllI-HindlIl fragment. IR, inverted repeats; URF, unidentified open reading frame; Z, region composed of a variable number of repeats of 54 bp, PtrnW,49 positions of the 3'terminal part of the tRNATrp gene. C Restriction fragments which were cloned into phage M13 mp9 and sequenced according to Sanger et al. (1980). The sequenced region is subdivided into a left side part, the repeated region and the right side part (see Fig. 2A, B)

According to Koller and Delius (1982a) and RavelChapuis et al. (1982) the Ori-region is in the vicinity of the Z-region which is a single locus on the circular DNA molecule composed of a variable number of short tandem repeats (Schlunegger et al. 1983). We have previously cloned this region into pBR322 and used the recombinant pEgcH2 for subcloning parts of this region into M13 rap9. In Fig. 1, line A we show the overall map position of the Z-region within the D N A fragment BgllI • Z and on line B we show the BgllI-HindlII fragment which was sequenced according to Sanger et al. (1980) using the restriction fragments given on lines C. The DNA fragments were sequenced in both directions. We subdivide in the following the sequenced fragment in a left side part (lp) consisting of 1,322 positions, the Z-region with n tandem repeats of 54 bases, and a right side part (rp) of 1,025 positions (see line B).

Structural features of the BgllI-HindlII fragment

1984) identified an A+T rich 0.42 kbp fragment as most likely locus of one of two origins. With this in mind it became interesting to analyse the Z-region and flanking parts (about 3.1 kbp) on a nucleotide level in order to see whether this segment or parts of it could structurally qualify for an origin of DNA replication. Parallel to these structural analysis went functional tests in collaboration with other laboratories, however, with no success, at least so far.

Materials and methods pEgcH2 DNA (Schlunegger et al. 1983) was isolated as described by Graf et al. (1982). The BgllI-HindlII fragment containing the Z-region was separated on agarose gel, eluted by electrophoresis (Wienand et al. 1979), extracted with phenol and precipitated with ethanol. After digestion with appropriate enzymes (shown in Fig. 1), subfragments of BgllI-HindlII fragment were l'dled with the Klenow fragment of DNA polymerase (Wattell and Reznikoff 1980) and cloned into the HinclI site of phage M13 mp9 (Messing et al. 1980; Messing and Vierra 1982), with the ligation conditions reported by Tait et al. 1980. Transformation of E. coli JM103 was according to Cohen et al. (1972), however competent ceils were prepared in 0.02 M Tris-HC1 pH 7,2,0.1 M CaC12 and 0.001 M NaC1. The clones to be sequenced by the method of Sanger et al. (1980) were selected after T-track analysis. As primer we used the synthetic pentadecamer nucleotide from New England Biolabs.

In Fig. 2A we give the nucleotide sequence for both strands from the left side part and in Fig. 2B we show one repeat unit of the Z-region (54 positions) and the right side part. According to EM analysis there exists a short inverted repeat element framing the Z-region. Indeed, we identified on the left and right side part such elements of 96 positions having 82% sequence homology. This structural element must be identical with that observed in the EM analysis and according to Koller and Delius (1982a) the origin of replication should be close to the inverted repeat on the right side part. No other inverted repeat element of such length was found within the sequenced fragment. The Z-region (Fig. 2B) is composed of tandem repeats. The 54 base element starts with 5'-GTAC (RsaI-site) and contains a second RsaI-site at position 19. The overall T+A content of the repeat unit is extremely high (87%). The primary structure of the repeats allows extensive formation of stem and loop structures as will be discussed. On the left side part (Fig. 2A) we identified a major open reading frame (URF) potentially coding for a protein of at least Mr 41,000. Transcription would to be towards the BgllI site, however, we have no clear evidence yet whether the region is transcribed. On the right side part we retrieved a pseudo-tRNA Trp gene (Fig. 2B). 48 out of 49 positions of the 3' terminal part are an exact copy of the functional chloroplast trnW gene recently mapped as part of a tRNA gene cluster which was totally

B. Schlunegger and E. Stutz: Euglena chloroplast DNA

631

A

%AT

GATCTTGTGA CTAGAACACT

AGTAAACATT TCATTTGTAA

TTTTCTTCAG AAAAGAAGTC

GAACAAACGT CTTGTTTGCA

TCGGCCCAAA AGCCGGGTTT

TATTTGACCG ATAAACTGGC

GGTGCTGACA CCACGACTGT

TTGACTGCCC AACTGACGGG

GTATGGGTAG CATACCCATC

ACAAATCCTG TGTTTAGGAC

CGGATTAAAT GCCTAATTTA

TGTAACATAT ACATTGTATA

OATTGCTAAC GTAACGATTG

CGGCTGACCT GCCGACTGGA

CTTTTTACGG GAAAAATGCC

TOCA~ATGCA ACGT~TACGT

TTAGCTACAC AATCOATGTG

ACATACTTAA TGTATGAATT

ATAGAAACCA TATCTTTOGT

GTTTGCACGA CAAACGTGCT

CACAATAAAT GTCTTATTTA

GTTCTTGGAG CAAGAACCTC

TTAACCATAA AATTGGTATT

AGTGCTTAAT] TCACGAATTA

GAACCCTCTA CTTGGGAGAT

CTCTTCCAAT GAGAAOGTTA

AAAATTTTAT TTTTAAAATA

CAAAATCAGT GTTTTAGTCA

TGACCAAAAA ACTGCTTTTT

TCAGGATTOA AGTCCTAACT

ATCTTCAACT TAOAAGTTGA

TTGAAACTTT AACTTTGAAA

GTTCGATAOA CAAGCTATCT

AGGAATATTT TCCTTATAAA

ACTGGGTCTT TGACCCAGAA

TAAATCATCT ATTTAGTAGA

TCGGAACGAY AGCCTTGCTA

TGTCTTCAOA ACAGAAGTCT

AYTATCATTT TAATAGTAAA

TTTTGATCAT AAAACTAGTA

GAAAAAAGAA CTTTTTTCTT

GCCAAGATTT CGGTTCTAAA

TTGAAGATTC AACTTCTAAG

TTTAAGTACA AAATTCATGT

CCAAAAAATT GGTTTTTTAA

ATTAAATGCG TAATTTACGC

CAATAACGAG GTTATTGGTC

AAAAACTAAT TTTTTGATTA

TACCATAATT ATGGTATTAA

ACGGCAATTT TGCCGTTAAA

ATTCTTACTA TAAGAATGAT

TAAAATATTT ATTYTATAAA

TAATYTTTTT ATTAAAAAAA

CCAAAACTCC GGTTTTGAGG

AAAGCCTTCA TTTCGGAAGT

GTTTTTCAAA CAAAAAGTTT

ATTTCATTTT TAAAGTAAAA

TAGAAATTTC ATCTTTAAAG

AACGGACTTO TTGCCTGAAC

ATTACTTTTA TAATGAAAAT

ACCAACTTGA TGGTTGAACT

AACAAAAAAT TTGTTTTTTA

TACAGTAAAG ATGTCATTTC

CGCATATTGG GCGTATAACC

ATATAATTTT TATATTAAAA

CAATAAAGCC GTTATTTCGG

AAAAGAAAAA TTTTCTTTTT

CTTTTGGATC GAAAACCTAG

GTTTTTTTTT CAAAAAAAAA

TTTACTATCC AAATGATAGG

CCTAAGGAAA GGATTCCTTT

AAAATTAAAT TTTTAATTTA

TGTAAAAAGA ACATTTTTCT

TAAACTAGTA ATTTGATCAT

TTAACTAACA AATTGATTGT

ATAATAATTT TATTATTAAA

GTTTGAAAAA CAAACTTTTT

AATACAATAA TTATGTTATT

AATTGCATTG TTAACGTAAC

CAGGTAGCCA GTCCATCGGT

ATTAATTAAA TAATTAATTT

ATAGATTTTA TATCTAAAAT

CTTTAATTAT GAAATTAATA

ATAGTTCATA TATCAAGTAT

AAAGAGTGTA TTTCTCACAT

ATTTCTAAA@ TAAAGATTTC

AAATACATCG TTTATOTAGC

TAAAAATCAT ATTTTTAGTA

AAAATGAATA TTTTACTTAT

TAAGAAATCG ATTCTTTAGC

TTTTAATTTT AAAATTAAAA

CTTTACCTAT GAAATGGATA

TTCTATGAAT AAGATACTTA

GTGTCATAAA CACAGTATTT

AAATAAAAAA TTTATTTTTT

TTAATCAAAC AATTAGTTTG

TAAAAAAAAT ATTTTTTTTA

ACAAAAAAAA TGTTTTTTTT

AATAGATCAT TTATCTAGTA

TGAATATATA ACTTATATAT

TATTTTAATT ATAAAATTAA

GCATTGTAGG CGTAACATCC

TTAAATGCAA AATTTACGTT

AAAATTCCCO TTTTAAGGGC

AAGTAATTTT TTCATTAAAA

CAAGAATCAA GTTCTTAGTT

GGGTTATGTA CCCAATACAT

CTTATATATA GAATATATAT

CCAATGTACT GGTTACATGA

TATATATACT ATATATATGA

URF ~

60 TTCCCCCAGG AAGOGGGTCC 120 TTTGCAAAGT AAACGTTTCA 180 TTTCAGCAAT AAAOTCGTTA 240 TCCTTTTCAT AGGAAAAGTA 300 GGTCATAACC CCAGTATTGG 360 AAATTGAAAA TTTAACTTTT 420 GACTTGACTT CTGAACTGAA 48O GATTAAAAGA CTAATTTTCT 540 TAAATGCAAA ATTTACGTTT 600 TGGAAATAAA ACCTTTATTT 660 ACGAAGGGTC TGCTTCCCAG 720 ATGTTCTGTT TACAAGACAA 780 TCGAAATTAT AGCTTTAATA 840 TTAGAAAATT AATCTTTTAA 9O0 AATAAATTTT TTATTTAAAA 960 ACGAATATCG TGCTTATAGC 1020 AAAAAACAAA TTTTTTGTTT 1080 TTTTTTTTGC AAAAAAAACG 1140 ATAATAATTC TATTATTAAG 1200 TCAATAACTA AGTTATTGAT 1260 GCAAAAAACC CGTTTTTTGG 1320 ATATATACCA TATATATGGT

56

53

61

65

[3

~TACrTATAf C~AATATA

ATATT~ATGT TATAATTACA

ACTTATATAT TGAATATATA

ArTATATATA TAATATATAT

%AT

CTATATATAC GATATATATG

54 CAAT] GTTA

87 n

63

AT TA

75

GTACTTATAT CATGAATATA

GGTCTTTACT CCAGAAATGA

GAATATCTAC CTTATAGATG

ATCAGTCCCC TAGTCAGGG~

66

ACATTTACTA TAATGTGTAC ACTAAAAGCG T_GT_AA_ATGA_T=~TTA_CA_CA~ T_GA_TT_TTCGC

CGCTTTCTAG GCGAAACATC

73

00TTTT00AG CCAAAACCTC

ACCTAC@CCT TATGGTTTTT TGE~TGCGGAA~ACCAAAAA

TATTGATTAT ATAACTAATA

TTAAGTACTT AATTCATGAA

TATGGTTAAA ATACCAATTT

TTCCGTAACG AAGGCATTGC

TTTATTCTAA AAATAACATT

73

GCTTTCTATT CGAAAGATAA

TAGCTATGTG ATCGATACAC

TOTGGCTAAT ACACCGATTA

GCAT~CCTAA CGTA~GCATT

72

AAGTGAACCC TTCACTTGGG

ATTGCCGAGG TAACGGCTCC

AAATGCTTAG TTTACGAATC

ATTCTATTGC TAAGATAACG

76

CATAATCCAT GTATTAGGTA

8CATACGTCA GGTATGCAGT

AGTCCGGATG TCACGCCTAC

GAAGAACTGG CTTCTTCACC

75

AAAGATGTTT TTTCTACAAA

ACTTTCCAAG TGAAAGGTTC

ATAATATTTT TATTATAAAA

AAGGTTTAAG TTCCAAATTC

78

TATAAGTGTT ATATTCACAA

AGTTAATTAT TCAATTAATA

TACAGTTCAT ATGTCAAGTA

TATCAGAGAT ATAGTCTCTA

83

ATAATGAAAA TATTACTTTT

AATAOAAGAT TTATCTTCTA

GTTTAGAGCT CAAATCTCGA

AACAATATTT TTGTTATAAA

73

ATGTAATATA TACATTATAT

AAATTGTATG TTTAACATAC

TTTCGTAGGA AAAGCATCCT

TTTTACTTAG AAAATGAATC

87

ATATATATAT TATATATATA

AGAGTTTTTC TCTCAAAAAG

TCAATTTCTA AGTTAAAGAT

TTATTCTTGT AATAAGAACA

80

TTATTCGTGA AATAAGCACT

TGATAGATTA ACTATCTAAT

TCTTACTTAT AGAATGAATA

TATTATTATT ATAATAATAA

85

TATTCTTCTG ATAAGAAGAC

AAACGTCATA TTTGCAGTAT

TTCAAGTTAC AAGTTCAATG

TTAATATATA AATTATATAT

87

TGTACTCTAA ACATGAGATT

AAGTTCGATC TTCAAGCTAG

TCATTTTTTA AGTAAAAAAT

TGAATAAATA ACTTATTTAT

73

AAGGTATTAA TTCCATAATT

ATATTAATAA TATAATTATT

TATTTTGTAA ATAAAACATT

CTACTTTTGA GATGAAAACT

TTTATATCAA AAATATAGTT

AAATGTGGGT TTTACACCCA

AAATTATTTG TTTAATAAAC

76

GTAATACTTC CATTATGAAG 1025 AGCTT TCGAA

60 TTGGTGACCA AGCAAGGGGC _AA[C~C~G~T_T!G~T~C~CG 120 GATTCGAACC TACTACATCA CTAAGCTTGG ATGATGTAGT 180 TAACCCTGAT TTATTAATT~ ATTGGGACTA AATAATTAA~ 240 TCCAAAGAAC CACACAAACT AGGTTTCTTG GTGTGTTTGA 3OO TTTCYTAAAG GGTAAAAAGG AAAGAATTTO CCATTTTTCC 360 TCAATCAGGT TGTATCCATC AGTTAGTCCA ACATASGTAG 420 TTOGTTATTT CTGCCAGATA AACCAATAAA GACGGTCTAT 480 ATTTCAAGAT AAACAAAGTT TAAAGTTCTA TTTGTTTCAA 540 AGGTTTTTCA GAAAAATGAT TCCAAAAAGT CTTTTTACTA 600 TTACATGAGT TACATATTTA AATGTACTCA ATGTATAAAT 660 AAACAAACAO AAATGTCTAG TTTGTTTGTC TTTACAGATC 720 TCTTATTCAA TGTTAGATTA AGAATAAGTT ACAATCTAAT 780 CCTGTAATCA CTTATGCTAT GGACATTAGT GAATACGATA 840 ATTATGGTAG TATAAATGAT TAATACCATC ATATTTACTA 900 CAATACTAAA AATTAATTGT GTTATGATTT TTAATTAACA 96O TTTATTCATC TGTACTTCTT AAATAAGTAC ACATGAAGAA 1020 TATTAGTTTA ATCGGCAAAA ATAATCAAAT TAGCCGTTTT

Fig. 2. A DNA sequence of the left side part. The inverted repeat is boxed. Start and reading direction of a URF are marked. The reading frame is open beyond the BgllI site (top). B DNA sequence of the repeat unit and the right side part. The RsaI sites in the repeat unit are underlined. The inverted repeat homologous with that on the left side part is boxed. The PtrnW is solidly underlined and additional sequences identical to sequences in the rDNA leader are marked by a dashed line

140 id rp tr

trn

W

120

ii0

i00

90

G . . . . . . f~

25

80

id r9 tr

130

..TTG ....................................................... AACCATAAGGCGTAGGTCTCCAAAACCTGATGTAGTAGGTTCGAATCCT~CAAAGCGCGC .TTGG..GAA ..........................................

70

60

50

40

30

................................................ A..GTGA.A.G. TTTTAGTGTACACATTATAGTAAATGTGCCCCTTGCTTGGTCACCAAGGGGACTGATGTA *..GTT.T.TTT.T .... CT ..... T.~ .... CAT.G.CTAG.GGCCTA .... ATC.CCC

L

trn

E

sequenced and s h o w n to be c o m p o s e d o f the t R N A genes for Tyr, His, MetE, Trp, Glu and Gly (Hollingsworth and Hallick 1982). It is k n o w n for some time (Orozco et al. 1980; R o u x

Fig. 3. Sequence comparison between the DNA segment in the vicinity of the tandem repeats (rp) a segment of the rDNA leader (ld) and parts of the tRNA gene cluster (tr). Sequence indentity is marked by dots. Position 25 (D-stem)of the tRNATrp gene is marked. The 5'-3' POlarity of the tRNA genes is indicated by horizontal arrows. The end of the trnW and the beginning of trnE of the gene cluster (tr) are marked by vertical bent arrows. Nucleotide positions are numbered according to Fig. 2B, note however the opposite orientation of presentation. The rDNA leader sequence is taken from Roux et al. (1983), the tRNA gene sequences from Hollingsworth and Hallick (1982)

et al. 1983; E1-Gewely et al. 1984) that the chloroplast r D N A leader sequence ofEuglena gracilis contains several p s e u d o - t R N A gene elements. When we f o u n d the pseudo-trnW gene near the Z-region and compared this

56

61

72

68

66

56

58

78

76

80

75

82

76

78

80

80

76

B. Schluneggerand E. Stutz: Euglena chloroplast DNA

632 V

III

IV

C A II A T T

T C T-A G C G-C A-T T,G 5' - - G G C G

ATGTA

6 C G-C

C T A 6-C

A

C

T A-T T-A

G C A

C C

A A

A-T T-A G-C

G,T T-A G G A-T

G C T-A T-A C G

T-A A-T T A A

1

AC . . . .

T

GG

TAOc_G

TAOc

A-T T A G-C

A T G

C-G C-G C-G TGTGC

A T

T GG

G T

T A T-A T-A AAAGCGCGCT

AT T-A A-T T A G A A- T T-A A T T-A A-T

AA

ATTAATATATATAA

Fig. 4. Secondary structure model of the Z-region and its vicinity containing a copy of the rDNA leader includinga ptRNATrp gene. Position 25 of the tRNATrp gene as well as its 3' end are underlined (consult also Fig. 3). Black dots mark the first repeat unit

28 positions

segment with the rDNA leader we found extensive sequence homologies as shown in Fig. 3. Most remarkable are the following observations: 1) The same 48 positions of the trnW gene are conserved in the leader as well as near the Z-region (we add in Fig. 3 the sequences of the corresponding part of the tRNA gene cluster); 2) The pseudo-trnW gene starts with position 25 which precisely is the splicing site for an intron in the D-stem of the tobacco chloroplast trnG gene (Deno and Sugiura 1984); 3) A highly conserved DNA stretch of 105 positions of the rDNA leader including the pseudo-trnW gene is inserted near the Z-region; 4) This insert near the Z-region seems not to be a transposable element, lacking structural features like e.g. short terminal repeats (Calos and Miller 1980); 5) We may add that other independently cloned DNA fragments of the same genome region also carried this short insert, indicating that the insert already exists on the chloroplast genome, and is not a cloning artefact.

Discussion

Can the sequenced fragment or part of it qualify for an Ori-region? According to electronmicroscopic observations of Euglena chloroplast DNA replication forks the origin of replication is near the Z-region, i.e., near that unique structural segment which is very rich in A+T and contains a variable number of direct repeats. Koller and Delius (1982a) observed in the majority of cases that DNA synthesis proceeds unidirectionally away from this region for about 5,000 nucleotides before it starts in the opposite direction towards the rDNA region (consult Fig. 1). They place the O-site near the 96 bp palindromic sequence whereby the precision of the measurements under the prevailing experimental conditions is around +- 500 nucleotides, and therefore most likely segments of

the right side part and the Z-region must somehow be involved in the DNA replication start. We searched by computer analysis on the entire DNA segment for bacterial type Ori-regions (Zyskind et al. 1982) but we could not find any significant sequence homology. We also analyzed for secondary structure elements, like potential stem and loop structures, which are considered to be recognized by proteins involved in the initiation step (Zyskind and Smith 1980). Many short indirect repeats were found throughout the sequenced segment but the Z-region itself and the immediate flanking part on the right side showed a remarkable accumulation of cruciforms as shown in Fig. 4. Four stem and loop structures of the right side part (I to IV) and one (V) of several stem and loop structures of the Z-region are displayed. As we already mentioned loop I and II are part of the pseudo-trnW gene and loop III and IV correspond to sequences also seen in the leader part of the rDNA region. It may be fortuitous that the AT-rich Z-region contains multiple 5'-GTAC sites (RsaI) similar to Ori-regions of enteric bacteria which contain multiple 5'-GATC sites (Sau3A). Rochaix et al. (1984) reported the construction of autonomously replicating plasmids in Chlamydomonas reinhardii (ARC). All these plasmids contained short DNA inserts mapping on the chloroplast genome. One of these inserts (pCA2) seems to originate from a chloroplast region containing one of the Ori-regions (Waddel et al. 1984). Rochaix and collaborators defined two classes of short consensus sequences which were found in all inserts and may be involved in promoting autonomous replication. One of these sequences (5'ACGTGACCAAGT, element II, pCA2) is found in position 4 4 - 5 2 of the right side part in stem and loop IV (Figs. 2B, 4) reading 5'-CTTGGTCAC (complementary strand 5'-GTGACCAAG). Such a well conserved element II is not found anywhere on the 3.1 kbp segment. Vallet et al. (1984) identified four distinct DNA fragments of

B. Schlunegger and E. Stutz: EugIena chloroplast DNA

Chlamydomonas reinhardii which promote autonomous replication in yeast. These fragments contain ARS elements as defined, e.g., by Broach et al. (1983). We also see such elements in the sequenced Euglena chloroplast DNA fragment, e.g., at positions 887 and 897 on the right side part. However, ARS bioassays using cloned Euglena chloroplast DNA fragments containing the sequenced stretch did not yield positive results (J. D. Rochaix, personal communication). Combining the electronmicroscopic results (loco cir.) with our sequencing data we come to the conclusion that within the sequenced BgllI-HindlII fragment, the Z-region and its immediate vicinity are better suited than any other part to function as an origin of DNA replication. Affirmative elements certainly are the extremely high content of A+T, the capacity to form multiple stem and loop structures and maybe the conserved ARC element II. The presence of a pseudo-tRNA Trp gene in that region may be fortuitous but we cannot exclude it to have a role in DNA replication initiation as discussed in the next paragraph.

The mobility of DNA sequences within the chloroplast genome A corollary of these sequencing studies was the observation that parts of the rDNA leader were found near the Z-region, what must be the result of a DNA transfer event of the evolutionary past. We mentioned already that the string of 105 nucleotides (see Fig. 3)contains49 positions of the trnW gene which itself was transferred in an even earlier time from the trn gene cluster in EcoV-H to the rDNA leader region (E1-Gewely et al. 1984). Most remarkable is the fact that the pseudo-trnW gene starts with position 25 (see Fig. 3) wfiich precisely is the splicing site for an intron in the D-stem of the tobacco trnG (Deno and Sugiura 1984). The usual intron site in tRNA genes is placed in the anticodon stem (Sprinzl and Gauss (1983). We searched for the missing 5'terminal part of the trnW gene upstream within the right side part but without success. The pseudo-tRNA genes in the rDNA leader as well as in the Z-region are found both in the Z-strain and B-strain (bacillaris). According to a rough estimate of E1-Gewely et al. (1984) the two strains may have separated more than 150 x 106 years ago, what means that the tRNA gene transfer into the rDNA region is an old event and nevertheless 48 positions of the trnW region are very well conserved. E1-Gewely et al. (1984) argued that the pseudo-tRNA genes in front of the 16S rRNA gene may be involved in transcription regulation. By analogy we may argue that the same DNA sequence in another genome environment (Z-region) may be involved in the DNA replication initiation step. So far RNA poly-

633 merases as well as DNA polymerases and other enzymatic activities involved in DNA replication initiation are very poorly understood (Bryant 1982) and appropriate enzymology is required to set up bioassays for testing whether any parts of the sequenced BgllI-HindlII fragment do indeed qualify as origins of DNA replication.

Acknowledgments. We are grateful to H. Gut for technical assistance, P. E. Montandon for helpful advice and C. Bachmann for secretarial help. This research is supported by Fonds National Suisse de la Recherche Scientifique (to E. S. 3.183.82).

References Broach JR, Li YY, Feldman J, Jayaram M, Abraham J, Nasmith KA, Hicks JB (1983) Cold Spring Harbor Symp. Quantitative Biol 47:1165-1173 Bryant JA (1982) In: Parthier B, Boulder D (eds) Encyclopedia of plant physiology, new Series, vol 14B. Springer, Berlin, pp 75-110 Calos MP, Miller JH (1980) Cell 20:579-595 Cohen SW, Chang ACY, Hsu L (1972) Proc Natl Acad Sci USA 69:2110-2114 Deno H, Sugiura M (1984) Proc Natl Acad Sci USA 81:405-408 E1-Gewely MR, Helling RB, Dibbib JGT (1984) Mol Gen Genet 194:432 Graf L, Roux E, Stutz E, KSssel H (1982) Nucleic Acids Res 10:6369-6381 Hollingsworth MJ, Hallick RB (1982) J Biol Chem 257:1279512799 Jenni B, Stutz E (1979) FEBS Lett 102:95-99 Jenni B, Fasnacht M, Stutz E (1981) FEBS Lett 125:175-179 Koller B, Delius H (1982a) EMBO J 1:995-998 Koller B, Delius H (1982b) FEBS Lett 139:86-92 Manning JE, Richards OC (1972) Biochemistry 11:2036-2043 Messing J, Vieira J (1982) Gene 19:269-276 Messing J, Crea R, Seeberg PH (1980) Nucleic Acids Res 9:309321 Oka A, Sugimoto K, Takanami M, Hirota Y (1980) Mol Gen Genet 178:9-20 Orozco EM, Rushlow KE, Dodd JR, Hallick RB (1980) J Biol Chem 255 : 10997-11003 Ravel-Chapuis P, Heizman P, Nigon V (1982) Nature 300:78-81 Rochaix JD, van Dillewijn J, Rahire M (1984) Cell 36:925-931 Roux E, Graf L, Stutz E (1983) Nucleic Acids Res 11:19571968 Sanger F, Coulson AR, Barrel BG, Smith AJH, Roe BA (1980) J Mol Biol 143:161-178 Schlunegger B, Fasnacht M, Stutz E, Koller B, Delius H (1983) Biochim Biophys Acta 739:114-121 Sprinzl M, Gauss DH (1983) Nucleic Acids Res l l : r l r53 Stayton MM, Bertsch L, Biswas S, Burgers P, Dixon N, Flynn JE, Fuller R, Kaguni M, Low R, Ktonberg A (1983) Cold Spring Harbor Symp Quantitative Biol 47: 693-700 Tait RC, Rodriguez RL, West RW (1980) J Biol Chem 255:813815 Tewari KK (1979) In: Hall TC, Davies JW (eds) Nucleic Acids in Plants, vol I. CRC Press, Florida, pp 41 108 VaUet JM, Rahire M, Rochaix JD (1984) EMBO J 3:415-421

634 WaddeU J, Wang XM, WU M (1984) Nucleic Acids Res 12: 3843-3856 Wang XM, Chang CH, Waddell J, Wu M (1984) Nucleic Acids Res 12:3857-3872 Wartell RM, Reznikoff WS (t980) Gene 9:307-310 Wienand U, Schwarz Z, Feix G (1979) FEBS Lett 98:319-323 Zyskind JW, Harding NE, Takeda Y, Cleary JM, Smith DW (1982) ICN-UCLA Symposia on molecular and cellular biology 22:13-25

B. Schlunegger and E. Stutz: Euglena chloroplast DNA Zyskind JW, Smith DW (1980) Proc Natl Acad Sci USA 77: 2460-2464

Communicated by U. Leupold Received July 19, 1984

The Euglena gracilis chloroplast genome: structural features of a DNA region possibly carrying the single origin of DNA replication.

We sequenced a Bg1II-HindIII DNA fragment of the Euglena gracilis chloroplast genome which most likely carries the single origin of DNA replication ac...
529KB Sizes 0 Downloads 0 Views