CurrentGenetics (1984) 8:629-634
Current Genetics © Springer-Verlag 1984
The Euglena gracilis chloroplast genome: structural features of a DNA region possibly carrying the single origin of DNA replication Bibiane Schlunegger and Erhard Stutz
Laboratoire de Biochimie,Universit6de Neuch~tel,Chantemerie 18, CH-2000 Neuch~tel, Switzerland
Summary. We sequenced a BglII-HindlII DNA fragment of the Euglena gracilis chloroplast genome which most likely carries the single origin of DNA replication according to recent electronmicroscopic mapping studies (Koller and Delius 1982a; Ravel-Chapuis et al. 1982). This DNA fragment contains a polymorphlc region (Schlunegger et al. 1983) which is composed, as will be shown, of multiple tandem repeats (54 bp, 87% A+T). Furthermore we located on this DNA fragment a short inverted repeat element (96 positions) observed in the electronmicroscopic studies (Koller and Delius 1982b). Between the borders of the polymorphic region and the nearby inverted repeat (distance of 179 positions) we retrieved an exact copy of parts of the rDNA leader (105 positions) including 49 positions of the chloroplast trnW gene. A computer search for bacterial type Oriregions did not reveal any significant sequence homology. However, the polymorphic region and its immediate vicinity have the capacity to form multiple stem and loop structures which may be involved in DNA replication initiation. Key words: Chloroplast DNA - Euglena gracilis - Origin of DNA replication
Introduction
Prokaryotic and eukaryotic genome replication normally is under tight control, DNA replication initiation being an important controlling step. In some bacteria the initiation step is rather well understood (Stayton et al. 1983) and determining the nucleotide sequence of the origin Offprint requests to: E. Stutz
region and manipulating it in various ways have provided important clues. It was recognized, e.g., that the origin of DNA replication of E. coli extends over 245 nucleotides (Oka et al. 1980) and that it contains conserved recognition sequences and less or non conserved spacer sequences providing specific distances between recognition sequences. Bacterial origins of DNA replication contain short direct and indirect repeats allowing the formation of elaborate stem and loop structures. Chloroplasts are in many respects of prokaryotic nature and their circular genome replicates by forming Cairns type replicative intermediates; but also displacement loops and rolling circle replicative intermediates were observed, especially in case of higherplants (Tewari 1979). In case of chloroplast DNA of the unicellular alga Euglena gracilis Cairns replicative intermediates were mainly observed (Manning and Richards 1972). More recently Koller and Delius (1982a) and Ravel-Chapuis et al. (1982) independently mapped the apparently single origin of replication of the Euglena chloroplast genome by electronmicroscopic analysis of replication forks about 5 to 6 kbp upstream of the 5' end of the sl6S rRNA gene (Jenni and Stutz 1979) and in close vicinity of a previously described polymorphic region (Z-region) known to contain a variable number of short repeats (Jenni et al. 1981). This Z-region and its vicinity had already been mapped and cloned into pBR322 and further analysed by electronmicroscopy (Schlunegger et al. 1983). It was shown that the polymorphic region was located between two short inverted repeats (Koller and Delius 1982b) and from denaturing studies it was concluded thatthe DNA segment contains regions extremely rich in A+T. Very recently Waddel et al. (1984)mapped two origins of DNA replication on the circular chloroplast genome of Chlamydomonas reinhardii. These origins are, respectively, 10 and 16.5 kbp upstream of one of the 16S rRNA genes. The same laboratory (Wang et al.
B. Schlunegger and E. Stutz: Euglena chloroplast DNA
630 H
A
G1
_2_~
Z
"'
!.
s~6S
6....
~
~ 42
t.
B g l II
~ IR
URF
left
Z-region
part
J
il
ptrn._ W
repeats
,,,.,,,,.,,.,,...,,,,Jr...
C
i
I
I I
I
,
,
i
I
I i
I 0
I I
I 2
Map position of the sequenced region and strategy of sequencing
III
right part I I
I
Hind
IR
I
I I
Results
I
(005")
Rsa Hinf Taq Sau Xba
I
I I 3A
[
kbp
Fig. 1A-C. Map position on the Euglena graeilis chloroplast genome of the BgllI • Z DNA fragment carrying the size variable region (Z-region) and strategy of sequencing. A Map position of BgllI • Z relative to the extra 16S rRNA gene and previously mapped BgllI fragments, H, G1 and I (Jenni et al. 1981). Sizes are in kbp. B Structural features of the sequenced BgllI-HindlIl fragment. IR, inverted repeats; URF, unidentified open reading frame; Z, region composed of a variable number of repeats of 54 bp, PtrnW,49 positions of the 3'terminal part of the tRNATrp gene. C Restriction fragments which were cloned into phage M13 mp9 and sequenced according to Sanger et al. (1980). The sequenced region is subdivided into a left side part, the repeated region and the right side part (see Fig. 2A, B)
According to Koller and Delius (1982a) and RavelChapuis et al. (1982) the Ori-region is in the vicinity of the Z-region which is a single locus on the circular DNA molecule composed of a variable number of short tandem repeats (Schlunegger et al. 1983). We have previously cloned this region into pBR322 and used the recombinant pEgcH2 for subcloning parts of this region into M13 rap9. In Fig. 1, line A we show the overall map position of the Z-region within the D N A fragment BgllI • Z and on line B we show the BgllI-HindlII fragment which was sequenced according to Sanger et al. (1980) using the restriction fragments given on lines C. The DNA fragments were sequenced in both directions. We subdivide in the following the sequenced fragment in a left side part (lp) consisting of 1,322 positions, the Z-region with n tandem repeats of 54 bases, and a right side part (rp) of 1,025 positions (see line B).
Structural features of the BgllI-HindlII fragment
1984) identified an A+T rich 0.42 kbp fragment as most likely locus of one of two origins. With this in mind it became interesting to analyse the Z-region and flanking parts (about 3.1 kbp) on a nucleotide level in order to see whether this segment or parts of it could structurally qualify for an origin of DNA replication. Parallel to these structural analysis went functional tests in collaboration with other laboratories, however, with no success, at least so far.
Materials and methods pEgcH2 DNA (Schlunegger et al. 1983) was isolated as described by Graf et al. (1982). The BgllI-HindlII fragment containing the Z-region was separated on agarose gel, eluted by electrophoresis (Wienand et al. 1979), extracted with phenol and precipitated with ethanol. After digestion with appropriate enzymes (shown in Fig. 1), subfragments of BgllI-HindlII fragment were l'dled with the Klenow fragment of DNA polymerase (Wattell and Reznikoff 1980) and cloned into the HinclI site of phage M13 mp9 (Messing et al. 1980; Messing and Vierra 1982), with the ligation conditions reported by Tait et al. 1980. Transformation of E. coli JM103 was according to Cohen et al. (1972), however competent ceils were prepared in 0.02 M Tris-HC1 pH 7,2,0.1 M CaC12 and 0.001 M NaC1. The clones to be sequenced by the method of Sanger et al. (1980) were selected after T-track analysis. As primer we used the synthetic pentadecamer nucleotide from New England Biolabs.
In Fig. 2A we give the nucleotide sequence for both strands from the left side part and in Fig. 2B we show one repeat unit of the Z-region (54 positions) and the right side part. According to EM analysis there exists a short inverted repeat element framing the Z-region. Indeed, we identified on the left and right side part such elements of 96 positions having 82% sequence homology. This structural element must be identical with that observed in the EM analysis and according to Koller and Delius (1982a) the origin of replication should be close to the inverted repeat on the right side part. No other inverted repeat element of such length was found within the sequenced fragment. The Z-region (Fig. 2B) is composed of tandem repeats. The 54 base element starts with 5'-GTAC (RsaI-site) and contains a second RsaI-site at position 19. The overall T+A content of the repeat unit is extremely high (87%). The primary structure of the repeats allows extensive formation of stem and loop structures as will be discussed. On the left side part (Fig. 2A) we identified a major open reading frame (URF) potentially coding for a protein of at least Mr 41,000. Transcription would to be towards the BgllI site, however, we have no clear evidence yet whether the region is transcribed. On the right side part we retrieved a pseudo-tRNA Trp gene (Fig. 2B). 48 out of 49 positions of the 3' terminal part are an exact copy of the functional chloroplast trnW gene recently mapped as part of a tRNA gene cluster which was totally
B. Schlunegger and E. Stutz: Euglena chloroplast DNA
631
A
%AT
GATCTTGTGA CTAGAACACT
AGTAAACATT TCATTTGTAA
TTTTCTTCAG AAAAGAAGTC
GAACAAACGT CTTGTTTGCA
TCGGCCCAAA AGCCGGGTTT
TATTTGACCG ATAAACTGGC
GGTGCTGACA CCACGACTGT
TTGACTGCCC AACTGACGGG
GTATGGGTAG CATACCCATC
ACAAATCCTG TGTTTAGGAC
CGGATTAAAT GCCTAATTTA
TGTAACATAT ACATTGTATA
OATTGCTAAC GTAACGATTG
CGGCTGACCT GCCGACTGGA
CTTTTTACGG GAAAAATGCC
TOCA~ATGCA ACGT~TACGT
TTAGCTACAC AATCOATGTG
ACATACTTAA TGTATGAATT
ATAGAAACCA TATCTTTOGT
GTTTGCACGA CAAACGTGCT
CACAATAAAT GTCTTATTTA
GTTCTTGGAG CAAGAACCTC
TTAACCATAA AATTGGTATT
AGTGCTTAAT] TCACGAATTA
GAACCCTCTA CTTGGGAGAT
CTCTTCCAAT GAGAAOGTTA
AAAATTTTAT TTTTAAAATA
CAAAATCAGT GTTTTAGTCA
TGACCAAAAA ACTGCTTTTT
TCAGGATTOA AGTCCTAACT
ATCTTCAACT TAOAAGTTGA
TTGAAACTTT AACTTTGAAA
GTTCGATAOA CAAGCTATCT
AGGAATATTT TCCTTATAAA
ACTGGGTCTT TGACCCAGAA
TAAATCATCT ATTTAGTAGA
TCGGAACGAY AGCCTTGCTA
TGTCTTCAOA ACAGAAGTCT
AYTATCATTT TAATAGTAAA
TTTTGATCAT AAAACTAGTA
GAAAAAAGAA CTTTTTTCTT
GCCAAGATTT CGGTTCTAAA
TTGAAGATTC AACTTCTAAG
TTTAAGTACA AAATTCATGT
CCAAAAAATT GGTTTTTTAA
ATTAAATGCG TAATTTACGC
CAATAACGAG GTTATTGGTC
AAAAACTAAT TTTTTGATTA
TACCATAATT ATGGTATTAA
ACGGCAATTT TGCCGTTAAA
ATTCTTACTA TAAGAATGAT
TAAAATATTT ATTYTATAAA
TAATYTTTTT ATTAAAAAAA
CCAAAACTCC GGTTTTGAGG
AAAGCCTTCA TTTCGGAAGT
GTTTTTCAAA CAAAAAGTTT
ATTTCATTTT TAAAGTAAAA
TAGAAATTTC ATCTTTAAAG
AACGGACTTO TTGCCTGAAC
ATTACTTTTA TAATGAAAAT
ACCAACTTGA TGGTTGAACT
AACAAAAAAT TTGTTTTTTA
TACAGTAAAG ATGTCATTTC
CGCATATTGG GCGTATAACC
ATATAATTTT TATATTAAAA
CAATAAAGCC GTTATTTCGG
AAAAGAAAAA TTTTCTTTTT
CTTTTGGATC GAAAACCTAG
GTTTTTTTTT CAAAAAAAAA
TTTACTATCC AAATGATAGG
CCTAAGGAAA GGATTCCTTT
AAAATTAAAT TTTTAATTTA
TGTAAAAAGA ACATTTTTCT
TAAACTAGTA ATTTGATCAT
TTAACTAACA AATTGATTGT
ATAATAATTT TATTATTAAA
GTTTGAAAAA CAAACTTTTT
AATACAATAA TTATGTTATT
AATTGCATTG TTAACGTAAC
CAGGTAGCCA GTCCATCGGT
ATTAATTAAA TAATTAATTT
ATAGATTTTA TATCTAAAAT
CTTTAATTAT GAAATTAATA
ATAGTTCATA TATCAAGTAT
AAAGAGTGTA TTTCTCACAT
ATTTCTAAA@ TAAAGATTTC
AAATACATCG TTTATOTAGC
TAAAAATCAT ATTTTTAGTA
AAAATGAATA TTTTACTTAT
TAAGAAATCG ATTCTTTAGC
TTTTAATTTT AAAATTAAAA
CTTTACCTAT GAAATGGATA
TTCTATGAAT AAGATACTTA
GTGTCATAAA CACAGTATTT
AAATAAAAAA TTTATTTTTT
TTAATCAAAC AATTAGTTTG
TAAAAAAAAT ATTTTTTTTA
ACAAAAAAAA TGTTTTTTTT
AATAGATCAT TTATCTAGTA
TGAATATATA ACTTATATAT
TATTTTAATT ATAAAATTAA
GCATTGTAGG CGTAACATCC
TTAAATGCAA AATTTACGTT
AAAATTCCCO TTTTAAGGGC
AAGTAATTTT TTCATTAAAA
CAAGAATCAA GTTCTTAGTT
GGGTTATGTA CCCAATACAT
CTTATATATA GAATATATAT
CCAATGTACT GGTTACATGA
TATATATACT ATATATATGA
URF ~
60 TTCCCCCAGG AAGOGGGTCC 120 TTTGCAAAGT AAACGTTTCA 180 TTTCAGCAAT AAAOTCGTTA 240 TCCTTTTCAT AGGAAAAGTA 300 GGTCATAACC CCAGTATTGG 360 AAATTGAAAA TTTAACTTTT 420 GACTTGACTT CTGAACTGAA 48O GATTAAAAGA CTAATTTTCT 540 TAAATGCAAA ATTTACGTTT 600 TGGAAATAAA ACCTTTATTT 660 ACGAAGGGTC TGCTTCCCAG 720 ATGTTCTGTT TACAAGACAA 780 TCGAAATTAT AGCTTTAATA 840 TTAGAAAATT AATCTTTTAA 9O0 AATAAATTTT TTATTTAAAA 960 ACGAATATCG TGCTTATAGC 1020 AAAAAACAAA TTTTTTGTTT 1080 TTTTTTTTGC AAAAAAAACG 1140 ATAATAATTC TATTATTAAG 1200 TCAATAACTA AGTTATTGAT 1260 GCAAAAAACC CGTTTTTTGG 1320 ATATATACCA TATATATGGT
56
53
61
65
[3
~TACrTATAf C~AATATA
ATATT~ATGT TATAATTACA
ACTTATATAT TGAATATATA
ArTATATATA TAATATATAT
%AT
CTATATATAC GATATATATG
54 CAAT] GTTA
87 n
63
AT TA
75
GTACTTATAT CATGAATATA
GGTCTTTACT CCAGAAATGA
GAATATCTAC CTTATAGATG
ATCAGTCCCC TAGTCAGGG~
66
ACATTTACTA TAATGTGTAC ACTAAAAGCG T_GT_AA_ATGA_T=~TTA_CA_CA~ T_GA_TT_TTCGC
CGCTTTCTAG GCGAAACATC
73
00TTTT00AG CCAAAACCTC
ACCTAC@CCT TATGGTTTTT TGE~TGCGGAA~ACCAAAAA
TATTGATTAT ATAACTAATA
TTAAGTACTT AATTCATGAA
TATGGTTAAA ATACCAATTT
TTCCGTAACG AAGGCATTGC
TTTATTCTAA AAATAACATT
73
GCTTTCTATT CGAAAGATAA
TAGCTATGTG ATCGATACAC
TOTGGCTAAT ACACCGATTA
GCAT~CCTAA CGTA~GCATT
72
AAGTGAACCC TTCACTTGGG
ATTGCCGAGG TAACGGCTCC
AAATGCTTAG TTTACGAATC
ATTCTATTGC TAAGATAACG
76
CATAATCCAT GTATTAGGTA
8CATACGTCA GGTATGCAGT
AGTCCGGATG TCACGCCTAC
GAAGAACTGG CTTCTTCACC
75
AAAGATGTTT TTTCTACAAA
ACTTTCCAAG TGAAAGGTTC
ATAATATTTT TATTATAAAA
AAGGTTTAAG TTCCAAATTC
78
TATAAGTGTT ATATTCACAA
AGTTAATTAT TCAATTAATA
TACAGTTCAT ATGTCAAGTA
TATCAGAGAT ATAGTCTCTA
83
ATAATGAAAA TATTACTTTT
AATAOAAGAT TTATCTTCTA
GTTTAGAGCT CAAATCTCGA
AACAATATTT TTGTTATAAA
73
ATGTAATATA TACATTATAT
AAATTGTATG TTTAACATAC
TTTCGTAGGA AAAGCATCCT
TTTTACTTAG AAAATGAATC
87
ATATATATAT TATATATATA
AGAGTTTTTC TCTCAAAAAG
TCAATTTCTA AGTTAAAGAT
TTATTCTTGT AATAAGAACA
80
TTATTCGTGA AATAAGCACT
TGATAGATTA ACTATCTAAT
TCTTACTTAT AGAATGAATA
TATTATTATT ATAATAATAA
85
TATTCTTCTG ATAAGAAGAC
AAACGTCATA TTTGCAGTAT
TTCAAGTTAC AAGTTCAATG
TTAATATATA AATTATATAT
87
TGTACTCTAA ACATGAGATT
AAGTTCGATC TTCAAGCTAG
TCATTTTTTA AGTAAAAAAT
TGAATAAATA ACTTATTTAT
73
AAGGTATTAA TTCCATAATT
ATATTAATAA TATAATTATT
TATTTTGTAA ATAAAACATT
CTACTTTTGA GATGAAAACT
TTTATATCAA AAATATAGTT
AAATGTGGGT TTTACACCCA
AAATTATTTG TTTAATAAAC
76
GTAATACTTC CATTATGAAG 1025 AGCTT TCGAA
60 TTGGTGACCA AGCAAGGGGC _AA[C~C~G~T_T!G~T~C~CG 120 GATTCGAACC TACTACATCA CTAAGCTTGG ATGATGTAGT 180 TAACCCTGAT TTATTAATT~ ATTGGGACTA AATAATTAA~ 240 TCCAAAGAAC CACACAAACT AGGTTTCTTG GTGTGTTTGA 3OO TTTCYTAAAG GGTAAAAAGG AAAGAATTTO CCATTTTTCC 360 TCAATCAGGT TGTATCCATC AGTTAGTCCA ACATASGTAG 420 TTOGTTATTT CTGCCAGATA AACCAATAAA GACGGTCTAT 480 ATTTCAAGAT AAACAAAGTT TAAAGTTCTA TTTGTTTCAA 540 AGGTTTTTCA GAAAAATGAT TCCAAAAAGT CTTTTTACTA 600 TTACATGAGT TACATATTTA AATGTACTCA ATGTATAAAT 660 AAACAAACAO AAATGTCTAG TTTGTTTGTC TTTACAGATC 720 TCTTATTCAA TGTTAGATTA AGAATAAGTT ACAATCTAAT 780 CCTGTAATCA CTTATGCTAT GGACATTAGT GAATACGATA 840 ATTATGGTAG TATAAATGAT TAATACCATC ATATTTACTA 900 CAATACTAAA AATTAATTGT GTTATGATTT TTAATTAACA 96O TTTATTCATC TGTACTTCTT AAATAAGTAC ACATGAAGAA 1020 TATTAGTTTA ATCGGCAAAA ATAATCAAAT TAGCCGTTTT
Fig. 2. A DNA sequence of the left side part. The inverted repeat is boxed. Start and reading direction of a URF are marked. The reading frame is open beyond the BgllI site (top). B DNA sequence of the repeat unit and the right side part. The RsaI sites in the repeat unit are underlined. The inverted repeat homologous with that on the left side part is boxed. The PtrnW is solidly underlined and additional sequences identical to sequences in the rDNA leader are marked by a dashed line
140 id rp tr
trn
W
120
ii0
i00
90
G . . . . . . f~
25
80
id r9 tr
130
..TTG ....................................................... AACCATAAGGCGTAGGTCTCCAAAACCTGATGTAGTAGGTTCGAATCCT~CAAAGCGCGC .TTGG..GAA ..........................................
70
60
50
40
30
................................................ A..GTGA.A.G. TTTTAGTGTACACATTATAGTAAATGTGCCCCTTGCTTGGTCACCAAGGGGACTGATGTA *..GTT.T.TTT.T .... CT ..... T.~ .... CAT.G.CTAG.GGCCTA .... ATC.CCC
L
trn
E
sequenced and s h o w n to be c o m p o s e d o f the t R N A genes for Tyr, His, MetE, Trp, Glu and Gly (Hollingsworth and Hallick 1982). It is k n o w n for some time (Orozco et al. 1980; R o u x
Fig. 3. Sequence comparison between the DNA segment in the vicinity of the tandem repeats (rp) a segment of the rDNA leader (ld) and parts of the tRNA gene cluster (tr). Sequence indentity is marked by dots. Position 25 (D-stem)of the tRNATrp gene is marked. The 5'-3' POlarity of the tRNA genes is indicated by horizontal arrows. The end of the trnW and the beginning of trnE of the gene cluster (tr) are marked by vertical bent arrows. Nucleotide positions are numbered according to Fig. 2B, note however the opposite orientation of presentation. The rDNA leader sequence is taken from Roux et al. (1983), the tRNA gene sequences from Hollingsworth and Hallick (1982)
et al. 1983; E1-Gewely et al. 1984) that the chloroplast r D N A leader sequence ofEuglena gracilis contains several p s e u d o - t R N A gene elements. When we f o u n d the pseudo-trnW gene near the Z-region and compared this
56
61
72
68
66
56
58
78
76
80
75
82
76
78
80
80
76
B. Schluneggerand E. Stutz: Euglena chloroplast DNA
632 V
III
IV
C A II A T T
T C T-A G C G-C A-T T,G 5' - - G G C G
ATGTA
6 C G-C
C T A 6-C
A
C
T A-T T-A
G C A
C C
A A
A-T T-A G-C
G,T T-A G G A-T
G C T-A T-A C G
T-A A-T T A A
1
AC . . . .
T
GG
TAOc_G
TAOc
A-T T A G-C
A T G
C-G C-G C-G TGTGC
A T
T GG
G T
T A T-A T-A AAAGCGCGCT
AT T-A A-T T A G A A- T T-A A T T-A A-T
AA
ATTAATATATATAA
Fig. 4. Secondary structure model of the Z-region and its vicinity containing a copy of the rDNA leader includinga ptRNATrp gene. Position 25 of the tRNATrp gene as well as its 3' end are underlined (consult also Fig. 3). Black dots mark the first repeat unit
28 positions
segment with the rDNA leader we found extensive sequence homologies as shown in Fig. 3. Most remarkable are the following observations: 1) The same 48 positions of the trnW gene are conserved in the leader as well as near the Z-region (we add in Fig. 3 the sequences of the corresponding part of the tRNA gene cluster); 2) The pseudo-trnW gene starts with position 25 which precisely is the splicing site for an intron in the D-stem of the tobacco chloroplast trnG gene (Deno and Sugiura 1984); 3) A highly conserved DNA stretch of 105 positions of the rDNA leader including the pseudo-trnW gene is inserted near the Z-region; 4) This insert near the Z-region seems not to be a transposable element, lacking structural features like e.g. short terminal repeats (Calos and Miller 1980); 5) We may add that other independently cloned DNA fragments of the same genome region also carried this short insert, indicating that the insert already exists on the chloroplast genome, and is not a cloning artefact.
Discussion
Can the sequenced fragment or part of it qualify for an Ori-region? According to electronmicroscopic observations of Euglena chloroplast DNA replication forks the origin of replication is near the Z-region, i.e., near that unique structural segment which is very rich in A+T and contains a variable number of direct repeats. Koller and Delius (1982a) observed in the majority of cases that DNA synthesis proceeds unidirectionally away from this region for about 5,000 nucleotides before it starts in the opposite direction towards the rDNA region (consult Fig. 1). They place the O-site near the 96 bp palindromic sequence whereby the precision of the measurements under the prevailing experimental conditions is around +- 500 nucleotides, and therefore most likely segments of
the right side part and the Z-region must somehow be involved in the DNA replication start. We searched by computer analysis on the entire DNA segment for bacterial type Ori-regions (Zyskind et al. 1982) but we could not find any significant sequence homology. We also analyzed for secondary structure elements, like potential stem and loop structures, which are considered to be recognized by proteins involved in the initiation step (Zyskind and Smith 1980). Many short indirect repeats were found throughout the sequenced segment but the Z-region itself and the immediate flanking part on the right side showed a remarkable accumulation of cruciforms as shown in Fig. 4. Four stem and loop structures of the right side part (I to IV) and one (V) of several stem and loop structures of the Z-region are displayed. As we already mentioned loop I and II are part of the pseudo-trnW gene and loop III and IV correspond to sequences also seen in the leader part of the rDNA region. It may be fortuitous that the AT-rich Z-region contains multiple 5'-GTAC sites (RsaI) similar to Ori-regions of enteric bacteria which contain multiple 5'-GATC sites (Sau3A). Rochaix et al. (1984) reported the construction of autonomously replicating plasmids in Chlamydomonas reinhardii (ARC). All these plasmids contained short DNA inserts mapping on the chloroplast genome. One of these inserts (pCA2) seems to originate from a chloroplast region containing one of the Ori-regions (Waddel et al. 1984). Rochaix and collaborators defined two classes of short consensus sequences which were found in all inserts and may be involved in promoting autonomous replication. One of these sequences (5'ACGTGACCAAGT, element II, pCA2) is found in position 4 4 - 5 2 of the right side part in stem and loop IV (Figs. 2B, 4) reading 5'-CTTGGTCAC (complementary strand 5'-GTGACCAAG). Such a well conserved element II is not found anywhere on the 3.1 kbp segment. Vallet et al. (1984) identified four distinct DNA fragments of
B. Schlunegger and E. Stutz: EugIena chloroplast DNA
Chlamydomonas reinhardii which promote autonomous replication in yeast. These fragments contain ARS elements as defined, e.g., by Broach et al. (1983). We also see such elements in the sequenced Euglena chloroplast DNA fragment, e.g., at positions 887 and 897 on the right side part. However, ARS bioassays using cloned Euglena chloroplast DNA fragments containing the sequenced stretch did not yield positive results (J. D. Rochaix, personal communication). Combining the electronmicroscopic results (loco cir.) with our sequencing data we come to the conclusion that within the sequenced BgllI-HindlII fragment, the Z-region and its immediate vicinity are better suited than any other part to function as an origin of DNA replication. Affirmative elements certainly are the extremely high content of A+T, the capacity to form multiple stem and loop structures and maybe the conserved ARC element II. The presence of a pseudo-tRNA Trp gene in that region may be fortuitous but we cannot exclude it to have a role in DNA replication initiation as discussed in the next paragraph.
The mobility of DNA sequences within the chloroplast genome A corollary of these sequencing studies was the observation that parts of the rDNA leader were found near the Z-region, what must be the result of a DNA transfer event of the evolutionary past. We mentioned already that the string of 105 nucleotides (see Fig. 3)contains49 positions of the trnW gene which itself was transferred in an even earlier time from the trn gene cluster in EcoV-H to the rDNA leader region (E1-Gewely et al. 1984). Most remarkable is the fact that the pseudo-trnW gene starts with position 25 (see Fig. 3) wfiich precisely is the splicing site for an intron in the D-stem of the tobacco trnG (Deno and Sugiura 1984). The usual intron site in tRNA genes is placed in the anticodon stem (Sprinzl and Gauss (1983). We searched for the missing 5'terminal part of the trnW gene upstream within the right side part but without success. The pseudo-tRNA genes in the rDNA leader as well as in the Z-region are found both in the Z-strain and B-strain (bacillaris). According to a rough estimate of E1-Gewely et al. (1984) the two strains may have separated more than 150 x 106 years ago, what means that the tRNA gene transfer into the rDNA region is an old event and nevertheless 48 positions of the trnW region are very well conserved. E1-Gewely et al. (1984) argued that the pseudo-tRNA genes in front of the 16S rRNA gene may be involved in transcription regulation. By analogy we may argue that the same DNA sequence in another genome environment (Z-region) may be involved in the DNA replication initiation step. So far RNA poly-
633 merases as well as DNA polymerases and other enzymatic activities involved in DNA replication initiation are very poorly understood (Bryant 1982) and appropriate enzymology is required to set up bioassays for testing whether any parts of the sequenced BgllI-HindlII fragment do indeed qualify as origins of DNA replication.
Acknowledgments. We are grateful to H. Gut for technical assistance, P. E. Montandon for helpful advice and C. Bachmann for secretarial help. This research is supported by Fonds National Suisse de la Recherche Scientifique (to E. S. 3.183.82).
References Broach JR, Li YY, Feldman J, Jayaram M, Abraham J, Nasmith KA, Hicks JB (1983) Cold Spring Harbor Symp. Quantitative Biol 47:1165-1173 Bryant JA (1982) In: Parthier B, Boulder D (eds) Encyclopedia of plant physiology, new Series, vol 14B. Springer, Berlin, pp 75-110 Calos MP, Miller JH (1980) Cell 20:579-595 Cohen SW, Chang ACY, Hsu L (1972) Proc Natl Acad Sci USA 69:2110-2114 Deno H, Sugiura M (1984) Proc Natl Acad Sci USA 81:405-408 E1-Gewely MR, Helling RB, Dibbib JGT (1984) Mol Gen Genet 194:432 Graf L, Roux E, Stutz E, KSssel H (1982) Nucleic Acids Res 10:6369-6381 Hollingsworth MJ, Hallick RB (1982) J Biol Chem 257:1279512799 Jenni B, Stutz E (1979) FEBS Lett 102:95-99 Jenni B, Fasnacht M, Stutz E (1981) FEBS Lett 125:175-179 Koller B, Delius H (1982a) EMBO J 1:995-998 Koller B, Delius H (1982b) FEBS Lett 139:86-92 Manning JE, Richards OC (1972) Biochemistry 11:2036-2043 Messing J, Vieira J (1982) Gene 19:269-276 Messing J, Crea R, Seeberg PH (1980) Nucleic Acids Res 9:309321 Oka A, Sugimoto K, Takanami M, Hirota Y (1980) Mol Gen Genet 178:9-20 Orozco EM, Rushlow KE, Dodd JR, Hallick RB (1980) J Biol Chem 255 : 10997-11003 Ravel-Chapuis P, Heizman P, Nigon V (1982) Nature 300:78-81 Rochaix JD, van Dillewijn J, Rahire M (1984) Cell 36:925-931 Roux E, Graf L, Stutz E (1983) Nucleic Acids Res 11:19571968 Sanger F, Coulson AR, Barrel BG, Smith AJH, Roe BA (1980) J Mol Biol 143:161-178 Schlunegger B, Fasnacht M, Stutz E, Koller B, Delius H (1983) Biochim Biophys Acta 739:114-121 Sprinzl M, Gauss DH (1983) Nucleic Acids Res l l : r l r53 Stayton MM, Bertsch L, Biswas S, Burgers P, Dixon N, Flynn JE, Fuller R, Kaguni M, Low R, Ktonberg A (1983) Cold Spring Harbor Symp Quantitative Biol 47: 693-700 Tait RC, Rodriguez RL, West RW (1980) J Biol Chem 255:813815 Tewari KK (1979) In: Hall TC, Davies JW (eds) Nucleic Acids in Plants, vol I. CRC Press, Florida, pp 41 108 VaUet JM, Rahire M, Rochaix JD (1984) EMBO J 3:415-421
634 WaddeU J, Wang XM, WU M (1984) Nucleic Acids Res 12: 3843-3856 Wang XM, Chang CH, Waddell J, Wu M (1984) Nucleic Acids Res 12:3857-3872 Wartell RM, Reznikoff WS (t980) Gene 9:307-310 Wienand U, Schwarz Z, Feix G (1979) FEBS Lett 98:319-323 Zyskind JW, Harding NE, Takeda Y, Cleary JM, Smith DW (1982) ICN-UCLA Symposia on molecular and cellular biology 22:13-25
B. Schlunegger and E. Stutz: Euglena chloroplast DNA Zyskind JW, Smith DW (1980) Proc Natl Acad Sci USA 77: 2460-2464
Communicated by U. Leupold Received July 19, 1984