J. Mol. Biol. (1991) 222, 525-536

Mitochondrial-like DNA Sequences Flanked by Direct and Inverted Repeats in the Nuclear Genome of Toxoplasma gondii

Pilar N. Ossorio†, L. David Sibley and John C. Boothroyd‡

Department of Microbiology and Immunology, Stanford University School of Medicine, Fairchild Building, Stanford, CA 94305-5402, U.S.A.

(Received 14 February 1991; accepted 17 July 1991)

In the course of our genetic studies on Toxoplasma gondii, it was discovered that one cosmid hybridized to a repetitive element. The hybridization pattern observed for the enzyme BglII indicated that this cosmid hybridized to a large number of discrete, but related, elements. Four BglII fragments were subcloned from the cosmid, and each was shown to hybridize with all the others, as well as to numerous dispersed sequences in genomic DNA. Three subclones were sequenced in their entirety, and shown to contain fragments of the genes for cytochrome oxidase subunit I and apocytochrome b, complete and functional copies of which have been found only in mitochondrial genomes. All the subcloned fragments were bounded at both ends by a 91 base-pair sequence, which contains a site for BglII. This 91 base-pair sequence could be found as either a direct or inverted repeat. It was determined that the BglII elements are arrayed downstream from a single copy nuclear gene. Comparison of genomic and cosmid DNAs confirmed that the cosmid faithfully reflects the nuclear genome. Although the mitochondrial genome of Toxoplasma has not been characterized, these nuclear mitochondrial-like sequences appear to be internally rearranged with respect to known, functional mitochondrial genomes, and with respect to each other. The finding of short repeated sequences flanking these elements may be a clue to the mechanism of their dissemination.

Keywords: Toxoplasma gondii; gene transfer; mitochondrial-like DNA; direct/inverted repeats; cytochromes

† Present address: Yale University School of Medicine, LCI rm. 810, 333 Cedar St., New Haven, CT 06510, U.S.A.
‡ Author to whom all correspondence should be addressed.
§ Abbreviations used: mt, mitochondrial; ct, chloroplast; n, nuclear; bp, base-pair(s); kb, 10^3 base-pairs; SDR, short dispersed repeat; nt, nucleotide(s).

0022-2836/91/230525-12 $03.00/0 © 1991 Academic Press Limited

1. Introduction

Nearly all eukaryotic cells contain genetic information in two types of cellular compartments, nuclei and mitochondria. Plants include a third genetic compartment, the plastid. The genomes of mitochondria and plastids are smaller than nuclear genomes and encode only a fraction of the products necessary for their biogenesis and continuing function. The remainder of the necessary gene products are encoded by nuclear DNA, synthesized on cytoplasmic ribosomes and transported into the organelles. The prevailing theory on the origin of mitochondria and chloroplasts states that these organelles are derived from bacteria-like endosymbionts (for reviews, see Gray, 1989a,b). An accumulating body of data suggests that mitochondria are most closely related to bacteria of the α-subdivision of the purple photosynthetic bacteria, while plastids are most closely related to cyanobacteria (Cedergren et al., 1988; Gray, 1989b; Sogin et al., 1986). Thus, nuclear, mitochondrial and plastid genomes are thought to have been derived from discernibly different lineages.

In each of several different and widely separated phyla, DNA sequences which share an apparently common and recent ancestry are found in more than one cellular compartment. To date, there are examples of mitochondrial-like DNA (mt-like DNA§) in the nucleus (Farrelly & Butow, 1983; Fukuda et al., 1985; Gellissen et al., 1983; Gellissen & Michaelis, 1987; Hadler et al., 1983; Jacobs et al.,

1983; Kamimura et al., 1989; Kemble et al., 1983; Thorsness & Fox, 1990), chloroplast-like DNA (ct-like DNA) in mitochondria and in the nucleus (Joyce & Gray, 1989; Stern & Lonsdale, 1982; Stern & Palmer, 1984; Timmis & Scott, 1983), and nuclear-like DNA (n-like DNA) in mitochondria (Schuster & Brennicke, 1987). These findings suggest that transfer of genetic material between cellular compartments has occurred and that transfer has happened repeatedly over the course of evolution.

Here, we report the discovery of mt-like sequences which are physically linked to a nuclear gene of the protozoan parasite Toxoplasma gondii. These mt-like sequences are unusual in that they are bounded by 91 base-pair (bp) direct or inverted repeats. T. gondii, an obligate intracellular parasite, is a member of the phylum Apicomplexa, a phylum that includes other medically and agriculturally important parasites such as Plasmodium and Eimeria.

2. Materials and Methods

(a) T. gondii growth

Toxoplasma gondii RH, P and C strain tachyzoites were grown on human foreskin fibroblast monolayers in MEM (Eagle) medium (Gibco) supplemented with 3% (v/v) fetal bovine serum or 10% (v/v) Nu serum (Collaborative Research). Parasites were purified by passage over a CF-11 cellulose column (Whatman) as described by Tanabe et al. (1977). DNA was prepared as described by Burg et al. (1988).

(b) Oligonucleotides

(1) COXI oligonucleotide: 5'-AGGATCCAAGAACTAACGCGATCTCC-3'. This oligonucleotide represents sense sequence near the 3' end of the cytochrome oxidase subunit I gene fragment. Nucleotides 2, 3 and 4 of this oligonucleotide do not correspond to the coding sequence, but were included to introduce a BamHI site.
(2) COB oligonucleotide: 5'-GAAGTGGTGTTACGAACCGG-3'. This is an antisense probe corresponding to sequences in the central region of the cytochrome b gene fragment.
(3) TERMINAL oligonucleotide: 5'-AACCAGTAGTCCAACTCG-3'. This probe corresponds to sequences immediately 3' of the BglII site in the 91 bp repeats (see Results).

(c) Southern blot analysis

For Southern transfers, DNA was cleaved with restriction endonucleases and resolved on 1% (w/v) agarose gels in TBE (89 mM-Tris, 89 mM-borate, 2 mM-EDTA, pH 8.3). The agarose gels were soaked in denaturing solution (1.5 M-NaCl, 0.5 M-NaOH) for 2 periods of 30 min, followed by neutralizing solution (1 M-ammonium acetate, 0.02 M-NaOH) for 2 periods of 30 min. A 0.4 µm nitrocellulose membrane (Schleicher and Schuell) was wetted with deionized water and soaked in denaturing solution for 30 min. Transfers were performed overnight without a reservoir. 32P-radiolabeled probes were produced using the Random Primed DNA Labeling Kit of Boehringer Mannheim. Blots were hybridized at 65°C overnight in a solution of 4× standard saline citrate (SSC), 2× Denhardt's, 1% (w/v) dry skim milk and 20 to 60 µg calf thymus DNA/ml (SSC is 150 mM-NaCl, 15 mM-sodium citrate; 50× Denhardt's is 1% (v/v) Ficoll 400, 1% (v/v) polyvinylpyrrolidone, 1% (w/v) bovine serum albumin). Filters were washed twice in 2× SSC, 0.1% (w/v) SDS at ambient temperature for 30 min and then in 0.2× SSC, 0.1% SDS at 65°C for 30 min.

Oligonucleotides were labeled using polynucleotide kinase (New England Biolabs) and [γ-32P]ATP. They were hybridized in 0.9 M-sodium chloride, 0.16 M-sodium phosphate (pH 6.8), 7 mM-EDTA, 0.1% SDS, 100 µg tRNA/ml, and 1% dry skim milk. Calf thymus DNA was omitted from this solution to avoid the possibility that the oligonucleotides would hybridize to the blocking DNA. The dissociation temperature (T_d) was calculated for each probe according to the formula T_d = 2(A + T) + 4(G + C) (where A + T is the total number of adenosine and thymidine residues and G + C the total number of guanosine and cytidine residues in the oligonucleotide) and hybridization was performed at intermediate stringencies of T_d - 10°C or T_d - 12°C. The linker nucleotides were not included in calculating the T_d for COXI. Following hybridization, blots were washed at ambient temperature for 15 min with a solution of 2× SSC, 1% SDS and then for 15 min at 42°C, also in 2× SSC, 1% SDS.
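The dissociation-temperature formula above can be applied directly to the three probes listed in section (b). The following is a minimal sketch in Python, not the authors' procedure: the function names and the `skip_positions` parameter are hypothetical, and it assumes that the "linker nucleotides" excluded for COXI are exactly nucleotides 2, 3 and 4 of that oligonucleotide.

```python
def dissociation_temperature(oligo, skip_positions=()):
    """T_d in degrees C from T_d = 2(A + T) + 4(G + C).

    skip_positions holds 1-based positions left out of the count
    (assumed here to be the linker nucleotides of the COXI probe).
    """
    skipped = set(skip_positions)
    bases = [b for i, b in enumerate(oligo.upper(), start=1) if i not in skipped]
    at = sum(b in "AT" for b in bases)   # adenosine + thymidine residues
    gc = sum(b in "GC" for b in bases)   # guanosine + cytidine residues
    return 2 * at + 4 * gc


def hybridization_temperatures(t_d):
    """The two intermediate stringencies named in the text: T_d - 10 and T_d - 12."""
    return t_d - 10, t_d - 12


# Probe sequences as given in Materials and Methods, section (b).
COXI = "AGGATCCAAGAACTAACGCGATCTCC"
COB = "GAAGTGGTGTTACGAACCGG"
TERMINAL = "AACCAGTAGTCCAACTCG"

if __name__ == "__main__":
    probes = {
        "COXI": dissociation_temperature(COXI, skip_positions=(2, 3, 4)),
        "COB": dissociation_temperature(COB),
        "TERMINAL": dissociation_temperature(TERMINAL),
    }
    for name, t_d in probes.items():
        print(name, t_d, hybridization_temperatures(t_d))
```

Under these assumptions the rule gives 68°C for COXI, 62°C for COB and 54°C for TERMINAL; the temperatures actually used in the experiments are not stated beyond the T_d - 10°C and T_d - 12°C offsets.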

(d) Nucleotide sequence analysis

All fragments to be sequenced were cloned into one of the BlueScript series vectors (Stratagene) and sequenced using the dideoxy chain terminating method (Sanger et al., 1977). The enzyme and reagents used were part of the Sequenase Kit (Stratagene) or the Taquence Kit (United States Biochemical Corporation). Deletion clones were generated as described by Henikoff (1984). Sequence data were analyzed using the Sequence Analysis Software Package of the University of Wisconsin Genetics Computer Group and the DNA Strider 1.1™ program of Christian Marck. Sequence comparisons were performed using the FASTA and TFASTA programs (Pearson & Lipman, 1988) and the NBRF and GENBANK databases.

(e) Slot blot hybridization

DNA was brought up to 30 µl in water and boiled for 10 min. The tubes were placed on wet ice and 30 µl of 1 M-NaOH were added. After incubation at ambient temperature for 30 min, the tubes were transferred to wet ice and 30 µl of 1 M-ammonium acetate, 0.02 M-NaOH were added. Samples (45 µl) were blotted onto pre-soaked nitrocellulose filters using a vacuum blotting apparatus (HybriSlot Manifold, Bethesda Research Laboratories). Nitrocellulose was soaked in 20× SSC for 30 min prior to blotting. After each sample had been applied, the well was washed with 600 µl of 10× SSC. After blotting, the filters were washed in 2× SSC and baked for 2 h at 60°C. Filters were hybridized according to the Southern blot hybridization method described above. The genomic DNA used for these experiments was treated with RNase A for 60 min at 37°C. Dilutions of plasmid were mixed with carrier calf thymus DNA at a final concentration of calf thymus DNA of 0.1 mg/ml.

[Figure 1: Southern blot panels, one per probe: 1099 bp, 1884 bp and 961 bp]

Figure 1. T. gondii genomic digests probed with repeat fragments. DNA was electrophoresed through a 1% agarose gel in TBE and transferred to nitrocellulose (see Materials and Methods). Lanes labeled Tg contain 2 µg of T. gondii genomic DNA digested with the enzyme indicated following the slash. Lanes labeled Hu contain 1.5 µg of human DNA digested with the restriction enzyme BglII. Human DNA is used as a control because the parasites are grown in human cells. Each set of 4 lanes was probed with one of the repeat probes (see Fig. 4) as indicated at the top of the Figure. Size markers were provided by λ DNA digested with PstI.

3. Results

The cosmid cROP1 was isolated from an RH strain cosmid library (Burg et al., 1988) during the investigation of the single copy gene ROP1. The ROP1 gene encodes a protein found in the apical secretory organelles termed rhoptries (Ossorio et al., 1991). During the course of a project to screen for restriction fragment length polymorphisms in the Toxoplasma genome, it was noted that cROP1 hybridized to a dispersed repetitive sequence: when BglII or BamHI digests of T. gondii genomic DNA were probed, strong, discrete bands were seen, whereas the nine other digests (AvaI, ClaI, EcoRI, HindIII, PstI, PvuII, SacI, SacII and SalI) gave a heterogeneous smear (data not shown, but see Fig. 1).

Digestion of the cosmid with BglII generated a pattern with four bands between approximately 900 bp and 2150 bp and a large amount of material that migrated at 12 kb or greater. The four small BglII bands, of 961 bp, 1099 bp, 1884 bp and approximately 2140 bp (sizes based on sequence determination, see below), were isolated and used to probe BglII, PvuII and SalI digests of genomic DNA from RH strain parasites (Fig. 1). Each probe apparently recognized the same family of dispersed repeats, although minor differences in hybridization could be observed. To rule out the possibility that the hybridization patterns observed in Figure 1 were due to partial digestion or degradation of the DNA, the blots used for this Figure were washed and re-probed with a fragment containing the α-tubulin gene. No degradation or partial digestion was observed (data not shown).

The cROP1 cosmid was double-digested with various combinations of restriction enzymes, and probed with the 961 bp fragment (Fig. 2). The results indicated that the four BglII fragments are not randomly distributed in the cosmid, inasmuch as they are all contained within one large ClaI fragment, one large EcoRI fragment and one large EcoRV fragment. However, there are other sequences in the cosmid, beyond those contained in the four BglII fragments, that hybridize to this 961 bp probe (for example, in Fig. 2 the faint band observed in the lane digested with EcoRI and BglII). In a similar experiment, the 1099 bp fragment hybridized to all of the bands recognized by the 961 bp probe, as well as to some additional bands (data not shown).

[Figure 2: Southern blot; size labels 2140, 1884, 1250, 1099 and 961 bp]

Figure 2. The 4 BglII fragments are linked in the cosmid. Single and double digests of the cosmid cROP1 were electrophoresed through a gel of 1% agarose in TBE (see Materials and Methods). The gel was blotted and probed with the 961 bp fragment (see Fig. 4). The enzyme or combination of enzymes used is indicated at the top of the Figure. Sizes are based on direct sequence determination for each cloned repeat-containing fragment.

[Figure 3: nucleotide sequences of the REP clones with predicted amino acid translations; panels (a) to (d)]

Figure 3. Sequence from the REP clones. Amino acids are indicated for open reading frames that have homology to portions of COXI and COB. Nucleotides which are part of the terminal BglII sites are underlined. Stop codons are indicated by ***. The termini of the short dispersed repeats (SDRs) are marked. (a) Sequence of REP1. (b) Sequence of REP2, including an unusual SDR terminus. (c) Sequence of REP3. (d) Complete sequence of the SDR. The BglII and BamHI sites are underlined. * The 5' terminus of the 3' SDR of REP2, which is 6 nt shorter than other SDRs.
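The translations shown beneath the nucleotide sequence in Figure 3, with stop codons written as ***, can be reproduced with a short sketch such as the one below. It is an illustration only: it assumes the standard genetic code (in which TGA is a stop, as the Figure's *** annotations imply), the `translate` helper is hypothetical, and the two codon runs are taken from the REP1 reading frame of Figure 3(a).

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA string codon by codon; '*' marks a stop codon.

    Translation continues past stops, as in Figure 3, and any
    incomplete trailing codon is ignored.
    """
    dna = dna.upper().replace(" ", "")
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


if __name__ == "__main__":
    # A stretch from within the COXI-like open reading frame of REP1.
    print(translate("ATG TTC TTA ATG CCG GCT TTG TAC GGA GGA TAT GGT AAC TTC TTT"))
    # The region where the reading frame runs into stop codons.
    print(translate("CTG ATC CTA AGT ACG CAG TGA GCT AAA TAG"))
```

The first call yields MFLMPALYGGYGNFF and the second LILSTQ*AK*, matching the corresponding REP1 residues in the alignment of Figure 5(a).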

(a) Sequence and characterization

The four repeat-containing BglII fragments of cROP1 were cloned into the unique BamHI site of the BlueScript KS+ vector and were designated REP1 through REP4 in order of ascending size (Fig. 4). Using the dideoxy chain terminating method (see Materials and Methods), REP1, REP2 and REP3 were sequenced completely (Fig. 3). Approximately 300 bp of sequence was determined for each end of REP4. Information from the mapping and sequencing is summarized in Figure 4, where the REP fragments are graphically represented. The 5' end of REP1 was defined by an open reading frame on this fragment (discussed below) and all other fragments were oriented in relation to REP1.

Each of the four REP fragments is terminated by one of two possible sequences, labeled block A and block B in Figure 4. Block B has a BamHI site within it, and BamHI digestion of cROP1 yields a pattern similar to that described for BglII (data not shown). This suggested that blocks A and B are parts of a single larger repeat. To confirm this, the BamHI fragments were cloned and partially sequenced. The results confirm that blocks A and B represent two halves of one repeat. This was verified by restriction mapping (data not shown). For the remainder of this discussion, the 91 bp repeat composed of blocks A and B will be referred to as the short dispersed repeat (SDR). The SDR was found both as a direct repeat, as observed for REP2, and as an inverted repeat, as observed for REP1. All four SDRs have identical sequences except that the SDR at the 3' end of REP2 is six base-pairs shorter than those of the other three (see Fig. 3(d)).
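The distinction between a direct and an inverted repeat comes down to whether the second copy reads the same as the first on the same strand, or matches its reverse complement. A minimal sketch of that test follows; the function names are hypothetical, and the TERMINAL oligonucleotide from Materials and Methods stands in for a piece of the SDR purely for illustration.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def repeat_orientation(left, right):
    """Classify two flanking sequences read 5' to 3' on the same strand.

    'direct'   : both copies read identically
    'inverted' : the second copy is the reverse complement of the first
    'unrelated': neither relationship holds
    A palindromic sequence satisfies both tests and is reported as 'direct'.
    """
    left, right = left.upper(), right.upper()
    if left == right:
        return "direct"
    if left == reverse_complement(right):
        return "inverted"
    return "unrelated"


if __name__ == "__main__":
    terminal = "AACCAGTAGTCCAACTCG"  # TERMINAL oligonucleotide
    print(reverse_complement(terminal))
    print(repeat_orientation(terminal, terminal))
    print(repeat_orientation(terminal, reverse_complement(terminal)))
    # The BglII (AGATCT) and BamHI (GGATCC) recognition sites are palindromes,
    # so they read the same in either orientation of the repeat.
    print(reverse_complement("AGATCT"), reverse_complement("GGATCC"))
```

Because the restriction sites themselves are palindromic, they are cut regardless of the orientation of the repeat, so BglII-bounded fragments are recovered whether the flanking SDRs are direct or inverted.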

[Figure 4: schematic maps of REP1 (961 bp), REP2 (1099 bp), REP3 (1884 bp), REP4 (approximately 2140 bp) and the SDR (91 bp)]

Figure 4. Schematic representation of REP clones. The 4 REP fragments and the short dispersed repeat (SDR) are displayed from 5' to 3' with respect to the COXI gene fragment. All boxes with the same shading or pattern represent sequences that are identical to each other. Blocks A and B of the SDR are labeled above the appropriate boxes. Regions labeled COXI show clear similarity to a portion of the known mitochondrial gene for cytochrome oxidase subunit I, and regions labeled COB show a clear similarity to a portion of the known mitochondrial gene for apocytochrome b. All fragments were cloned as BglII fragments and, therefore, a BglII site is found at the end of each, although this is not denoted on the diagram. Other restriction sites are BamHI, Bm; TaqI, T; HindIII, H; NcoI, N; KpnI, K. The lower case letters and the line associated with them represent the oligonucleotides that were used as probes: t, terminal oligonucleotide; x, COXI oligonucleotide; and b, COB oligonucleotide. Box B at the 3' end of REP2 is starred because it is 6 bp shorter than other B boxes; however, it is otherwise identical in sequence to its counterparts.

The SDR was common to all of the cloned REP fragments, and each of the REP fragments contained additional sequences that were found in at least one of the other three clones (see Fig. 4). For example, the entire sequence of REP1 can be found split between REP3 and REP4. (It is not clear how far the homology between REP1 and REP4 extends.) A comparison of the homology blocks from any two clones revealed that the sequences were identical as far as the homology extended; homology did not taper off gradually, but instead showed an abrupt transition whereby two previously homologous sequences became completely dissimilar. A comparison of REPs 1, 2 and 3, all of which contain several hundred base-pairs of identical sequence, indicates that the transition from conserved to divergent sequence does not happen in the same region in each clone, nor is there any obvious similarity between the sequences found at the points of divergence. Thus, within the region of homology among these three clones, there does not appear to be one specific sequence that is prone to rearrangements.

The REP nucleotide sequences were compared to the GenBank and EMBL databases and the predicted amino acid sequences were compared to the NBRF database (see Materials and Methods). The protein database search revealed two open reading frames with strong homology to known proteins: one was cytochrome oxidase subunit I (COXI) and the other cytochrome b (COB) (Fig. 5). The most highly homologous DNA sequences in the database were a Plasmodium falciparum sequence which apparently encoded several mitochondrial proteins (Vaidya et al., 1989) and the Chlamydomonas reinhardtii mitochondrial DNA sequence (Boer & Gray, 1986; data not shown). The transcribed genes for COXI and COB have been found only in mitochondrial DNA, although fragments of these genes (presumably non-functional) have been found in other nuclear genomes (see Introduction).

Homology to COXI was encoded in a 417 bp open reading frame found in both REP1 and REP3 and was partially represented in REP2 (Figs 3 and 4). REP1 and REP3 had coding potential for about the amino-terminal one-third of a normal COXI protein. Following the open reading frame there was a sharp drop off in homology. Analysis of the other two reading frames for REP1 showed no significant homology to COXI.

Homology to COB was encoded in REP3 by a 252 bp open reading frame that does not begin with an ATG. The REP3 open reading frame would account for only the carboxy-terminal one-third of a normal COB protein. The homology begins abruptly at nt 982 (residue 60 in Fig. 5(b)) and continues until a stop codon is reached in the Toxoplasma sequence (nt 1133 in Fig. 3(c) and amino acid residue 107 in Fig. 5(b)).

REP3 also shows homology to COXI, identical to that seen in REP1, with a sequence of 268 bp separating the COXI and COB gene fragments. Neither the full coding potential for a normal COXI protein nor the full coding potential for a normal COB protein would fit in this 268 bp. The sequence of this intervening region did not show homology to any known protein or DNA sequences.

[Figure 5: amino acid alignment panels (a) and (b)]

Figure 5. COXI and COB amino acid homologies. (a) The alignment between the predicted amino acid sequences of REP1 and the sequences of COXI from Plasmodium yoelii, Homo sapiens, Chlamydomonas reinhardtii and Aspergillus nidulans. The portion of REP1 shown begins with a methionine residue at nt 186. All other COXI sequences are shown beginning at their amino termini. * indicates stop codons. Dots indicate amino acid identity with the T. gondii sequence and dashes indicate a space introduced into a sequence to facilitate alignment. (b) The alignment of sequences in REP3 with COB sequences from Plasmodium yoelii, Mus musculus, Drosophila yakuba and Bos primigenius taurus. Details are as for (a).

(b) REP sequences are linked to ROP1 in the nuclear genome

The ROP1 gene is thought to be a nuclear gene based on several pieces of information. First, it encodes a protein that is localized in the secretory organelles (rhoptries) of the Toxoplasma cell (Schwartzman, 1986; Leriche & Dubremetz, 1991). No such secretory protein has ever been found encoded by mitochondrial or other organellar genomes: indeed, no mitochondrial genome has ever been found to encode a protein not localized within the mitochondrion (for a review, see Tzagoloff & Myers, 1986). Second, the gene encodes a typical eukaryotic signal sequence, as expected for a nuclearly encoded secretory protein (Ossorio et al., 1991). Third, the ROP1 gene is present in a single copy per cell (Ossorio et al., 1991); although the Toxoplasma mitochondrial genome has not been characterized, the precedent from other organisms suggests that mitochondrial genes are present in multiple copies. Finally, based on hybridization to Toxoplasma DNA separated by pulsed-field gel electrophoresis, the ROP1 gene is present on a chromosome that migrates with an apparent size of greater than 6 × 10^6 bp (L.D.S., unpublished results).

The finding that the REP clones encoded fragments of mitochondrial genes raised the possibility that the cROP1 cosmid was a chimera, an artifact containing both single copy nuclear sequences and mitochondrial DNA. To examine this possibility it was necessary to identify the region of transition from single copy to repeated sequences. Fortuitously, a 1206 bp EcoRV to ClaI fragment (RC1.2) downstream from the ROP1 gene (Fig. 6(a)) was shown to hybridize to the four BglII bands in the cosmid (Fig. 6(b)). While the RC1.2 fragment hybridized with REPs 1 to 4, it did not actually contain those fragments, but presumably contained related sequences upstream from the cloned REPs. It has been shown that the ROP1 gene is a single copy gene and that ROP1 probes do not recognize repetitive DNA (Ossorio et al., 1991). Thus, the transition from single copy to repetitive sequences must occur either within the RC1.2 fragment, or in the approximately 500 bp separating the ClaI site of RC1.2 from the 3' end of the ROP1 gene.

According to the map of cROP1 (Fig. 6(a)), there are several restriction fragments which should include both the ROP1 gene and some repetitive DNA. These fragments include an NcoI fragment of approximately 4200 bp, a SalI fragment of approximately 3800 bp and a SacII fragment of approximately 4700 bp. If the cosmid accurately reflects the nuclear genome, then two predictions can be made: (1) when the ROP1.1 fragment is used to probe genomic and cROP1 DNAs, both digested with the diagnostic enzymes mentioned above, a single hybridizing fragment in the genomic digest should co-migrate with a single hybridizing fragment in the equivalent cosmid digest; and (2) if the RC1.2 fragment is used to probe the same digests as in (1), then it should hybridize to the same cosmid bands as the ROP1.1 probe, as well as to additional cosmid fragments containing repeats; and in digests of genomic DNA it should hybridize to many fragments of various molecular weights, producing a smear. If the cosmid is a cloning artifact, then the ROP1.1 probe would hybridize to restriction fragments whose molecular weights were different in cosmid digests and genomic digests. The results presented in Figure 7 indicate that both predictions (1) and (2) above were fulfilled.

[Figure 6: (a) restriction map of cROP1 showing the ROP1.1 and RC1.2 probes, scale bar 1 kb; (b) Southern blot, size labels 2838, 2140 and 961 bp]

Figure 6. (a) A partial restriction map of the cosmid cROP1. The hatched box represents the ROP1 gene. Lines above the map indicate the 2 probes that were used for the experiments in Fig. 7. The ROP1.1 probe is from a previously described 1640 bp cDNA and the RC1.2 probe is an EcoRV to ClaI fragment subcloned from the cosmid. The arrow beneath the map indicates the region believed to contain repetitive DNA. While the RC1.2 fragment clearly contains repetitive DNA, the exact beginning of these sequences has not been localized and, therefore, this region is indicated with a broken line. (b) RC1.2 hybridizes to REP fragments. Restriction enzyme digests of the cosmid cROP1 were electrophoresed on a 1% agarose gel in TBE and blotted onto nitrocellulose. The blot was probed with random-prime-labeled RC1.2 fragment. RC1.2 hybridizes with the 4 small BglII bands.

Side-by-side lanes of T. gondii genomic DNA and cROP1 DNA, each pair digested with one of the three diagnostic enzymes, were probed with ROP1.1. A single band in each genomic digest co-migrated with a single band in the paired cosmid digest. Therefore, prediction (1) was fulfilled. A duplicate blot was probed with RC1.2. In all cases, the major hybridizing band in the cosmid digest was the same band that hybridized to ROP1.1. RC1.2 also hybridized to additional cosmid bands and generated a smear on genomic digests. Thus, prediction (2) was also fulfilled. From these experiments we concluded that the arrangement of sequences in the cosmid is not an artifact, and that there is mitochondrial-like DNA present in the nuclear genome of Toxoplasma.

(c) Sequences homologous to COXI, COB and the SDR are all multiply repeated

In order to analyze further the mt-like DNA in the Toxoplasma genome and to determine whether the various sequences in the REP clones contributed more or less equally to the genomic hybridization pattern observed with REP probes, oligonucleotides to three regions were hybridized to restriction digests of genomic DNA (Fig. 8). The oligonucleotides corresponded to the 3' region of the COXI gene fragment, the 5' region of the COB gene

Figure 7. Duplicate Southern blot probed with ROP1.1 and RC1.2. T. gondii genomic DNA (2 µg/lane, treated with RNase A) and cROP1 (~80 ng/lane) were digested with the restriction enzymes NcoI, SacII and SalI. Genomic and cosmid digests were electrophoresed side-by-side through 1% agarose and transferred in duplicate to nitrocellulose filters (duplicate blot). One of the duplicate filters was probed with the EcoRV to ClaI fragment (R-C), and the other was probed with the ROP1.1 fragment (ROP) (see Fig. 6(a) for probes).

[Figure labels: enzymes NcoI, SalI and SacII; probes R-C and ROP; lanes T and c; size markers include 4507 and 2838.]

fragment and the region of the SDR immediately downstream from the BglII site (see Figs 3 and 4; and Materials and Methods). Each of the oligonucleotides hybridized with the expected bands in the cosmid digest, with the exception that the terminal (SDR) oligonucleotide did not hybridize with REP1 (discussed below). By sequence analysis, the portion of COXI homologous to the COXI oligonucleotide is not present in REP2; therefore, although REP2 contains a COXI gene fragment, it was not detected by the probe used in these experiments. On genomic digests, the COXI probe hybridized with numerous fragments, generating a pattern similar to that seen when the entire REP1 fragment is used as a probe. The COB oligonucleotide also hybridized with numerous bands in the genomic digest, but while the two patterns showed significant overlap, they were not identical. The terminal oligonucleotide showed the least extensive hybridization, although numerous bands were recognized. The pattern of hybridization observed suggested that sequences homologous to all of the oligonucleotides could be found in many genomic environments.

The terminal (SDR) oligonucleotide was expected to hybridize quite strongly to the four REP fragments in the cosmid; however, little or no hybridization to REP1 was observed in several experiments. REP1 is bordered by the SDR in an inverted repeat arrangement, and therefore it was possible that the two ends of the fragment were hybridizing to each other (snap-back) and precluding hybridization of the oligonucleotide. That snap-back was occurring was demonstrated by showing that the terminal oligonucleotide does not hybridize well to the intact REP1 fragment, but does hybridize strongly to two fragments when cloned REP1 is split by digestion with the enzyme TaqI (data not shown). Thus, while the hybridization patterns observed with the COXI and COB oligonucleotides can be expected to


Figure 9. Quantitation of REP sequences. A slot blot of Toxoplasma genomic DNA (in lanes marked Tg), and dilutions of the pREP1 plasmid (in lanes marked REP1), was hybridized to 32P-labeled REP1 fragment. The numbers above the Tg lanes refer to quantities of DNA in µg. The numbers above the REP1 lanes refer to molar equivalents of plasmid DNA for a single copy gene relative to 3 µg of DNA.
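The molar equivalents in this caption rest on a simple mass ratio: the plasmid mass equivalent to one copy per genome equals the genomic DNA mass multiplied by the ratio of plasmid length to genome length. The sketch below illustrates only that arithmetic for the dilution series used on the slot blot (10 to 5000 copies). The genome size, the plasmid length and the function name are assumptions made for illustration and are not values reported in the text.

```python
# Sketch of the copy-number reconstruction arithmetic behind a slot blot standard curve.
# GENOME_BP and PLASMID_BP are assumed values for illustration only.

GENOME_BP = 8.0e7      # assumed haploid genome size in base pairs
PLASMID_BP = 5.0e3     # assumed total length of the pREP1 plasmid in base pairs
GENOMIC_DNA_UG = 3.0   # reference mass of genomic DNA, as stated in the caption


def plasmid_mass_ng(copies_per_genome, genomic_ug=GENOMIC_DNA_UG,
                    plasmid_bp=PLASMID_BP, genome_bp=GENOME_BP):
    """Mass of plasmid (ng) containing as many molecules of the cloned sequence
    as `copies_per_genome` copies per genome would supply in `genomic_ug`
    micrograms of genomic DNA."""
    genomic_ng = genomic_ug * 1000.0
    # Equal molecule numbers: mass scales with the ratio of molecule lengths.
    return copies_per_genome * genomic_ng * plasmid_bp / genome_bp


for copies in (10, 100, 500, 1000, 5000):
    print(f"{copies:>5} copies per genome -> {plasmid_mass_ng(copies):.3f} ng plasmid")
```

Comparing the signal from a fixed mass of genomic DNA with the signals from such a dilution series brackets the copy number between the two nearest standards, which is how a range such as "between 100 and 500 copies" is obtained.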

Figure 8. Sequences homologous to COXI, COB and the SDR are all multiply repeated. Southern blot of a 1% agarose gel electrophoresed in TBE (see Materials and Methods). Lanes containing T. gondii genomic DNA (2 µg/lane, treated with RNase A) were electrophoresed alongside lanes containing cROP1 (~80 ng/lane). All DNA samples were digested with the restriction enzyme BamHI. Nitrocellulose strips were probed with oligonucleotides specific for the COXI gene fragment, the COB gene fragment or the SDR. Note that the terminal oligonucleotide does not recognize the restriction fragment corresponding to REP1. This is due to snap-back hybridization of the inverted repeats at the ends of the REP1 fragment (see the text).

accurately represent the genomic distribution of homologous sequences, the hybridization pattern observed using the terminal oligonucleotide can be expected to under-represent the number of sequences homologous to the SDR. Hence, the appearance that COXI and COB homologous sequences are more numerous than SDR sequences is probably misleading.

(d) Copy number

REP1 was hybridized to slot blots containing T. gondii genomic DNA and dilutions of pREP1 representing the equivalent of 10, 100, 500, 1000 and 5000 copies per cell (Fig. 9). The results indicate that there are between 100 and 500 copies per cell of REP1-homologous sequences. The extent to which mitochondrial DNA contributes to this number is unknown.

4. Discussion

We describe the discovery of repetitive, mitochondrial-like DNA (REP sequences) in the nuclear genome of T. gondii. That this DNA was first isolated and characterized as a family of BglII fragments was due to the presence of a 91 bp short dispersed repetitive sequence (SDR), which contains a BglII site and is present as either direct or inverted repeats throughout the mt-like DNA. Sequences similar to the SDRs have not been described previously. Pichersky & Tanksley (1988) reported that chloroplast-like DNA in the third intron of the Cab-7 gene of tomatoes has short, direct repeats probably derived from the 3' end of the inserted fragments. However, in their report the repeats were no more than 11 bp long and different terminal repeats bounded each of two chloroplast-like fragments, in contrast to the arrangement in SDRs. It is tempting to speculate that the SDRs might play some role in the generation or dispersal of the REP elements, in a manner akin to better-characterized repeats at the ends of known transposable elements. No sequence similarities were found between SDRs and other terminal repeats such as retroviral LTRs.

Currently, we describe all of the DNA bounded by SDRs as mitochondrial-like, implying that all of it originated in the mitochondria and was subsequently transferred to its present location in the nucleus. This is almost certainly the case for the COXI and COB gene fragments; however, because the mitochondrial genome of T. gondii has not been characterized, this remains an assumption for the remainder of each repeat, including the SDRs.

The cloned REP fragments showed an interesting mosaicism with respect to conserved and nonconserved sequences. In comparing REPs 1, 2 and 3 it appears that one sequence, found at the 5' ends of


all three of these clones, has recombined in three different places with three different sequences. Thus, these three REP fragments all begin with an SDR, which is followed by 140 nt of sequence of unknown origin, in turn followed by a COXI gene fragment. However, REP2 abruptly diverges from REPs 1 and 3 at nt 369 in the COXI homologous sequence. REPs 1 and 3 diverge from each other at nt 744. The divergence appears to result from recombinational events because of the complete lack of homology between the previously identical sequences. The overall impression is one of several sequences having been mixed together randomly. Further characterization of other REP fragments may reveal a pattern governing the various combinations of sequences. In a similar finding, Kamimura et al. (1989) described mt-like DNA in the human nuclear genome that contained gene fragments from three genes which are not contiguous in the human mitochondrial genome.

While it is generally believed that pseudogenes acquire mutations at random positions at a rate equal to the mutation rate of the genome, there are exceptions. For example, the members of an Aspergillus nidulans 5 S rRNA pseudogene family each contain a region which appears to have accumulated mutations randomly and another which is highly conserved. It has been suggested that the conserved block is subject to the same mechanism of concerted evolution that maintains homogeneity of the functional 5 S genes (Borsuk et al., 1988). The mechanism by which the conserved regions of the REP fragments are maintained as identical, rather than similar, is not obvious. It is possible that gene conversion events are responsible for homogenizing the conserved regions. A less likely explanation is that there is some strong selective pressure against mutation of certain sequences. The fact that a portion of the COXI gene fragment which is identical between REP1 and REP3 is missing from REP2
would appear to argue against this second explanation. A third possibility is that these repeats have arisen quite recently. We have observed that two other unrelated strains of T. gondii (P and C) contain dispersed repetitive DNA that hybridizes to the REP elements, giving patterns indistinguishable from those observed for RH (data not shown). They are, therefore, not so recent as to be different in different strains. This result also indicates they are not highly "mobile".

Closed circular DNA molecules of 12 µm and 24 µm have been isolated from T. gondii (Borst et al., 1984) and the possibility exists that these are mitochondrial DNA. Similar molecules have been observed for several species of Plasmodium (Gardner et al., 1988; Kilejian, 1975; Williamson et al., 1985). However, in Plasmodium, genes apparently encoding COXI, COB and COXIII have been found on discrete linear DNA fragments of approximately 6 kb, which do not appear to cross-hybridize with the large, circular molecules (Aldritt et al., 1989; Joseph et al., 1989; Vaidya et al., 1989). The 6 kb element of P. gallinaceum was shown to hybridize to a smear or large cluster of bands in HindIII digests of T. gondii genomic DNA, whereas it hybridizes to one or a few bands in similar digests of Babesia, Theileria and several Plasmodium species (Joseph et al., 1989). As the REP elements contain gene fragments for some of the same genes found in the 6 kb element, it is probable that hybridization to REP sequences is responsible, at least in part, for the large degree of hybridization observed when the 6 kb Plasmodium element is used to probe the T. gondii genome. Whether T. gondii possesses molecules similar to the 6 kb Plasmodium element has not yet been determined.

The unusual finding of SDRs flanking the mt-like DNA in a direct or inverted orientation may provide some insight into the generation of this and other dispersed gene families or pseudogenes. What role such sequences may play remains unclear, but further analysis of REP sequences, including comparison with their presumptive mitochondrial progenitors, might yield some clues.

Data have been deposited with GenBank under accession numbers: X60240 (REP1); X60241 (REP2); X60243 (REP3).

We thank Dr

between spinach nuclear and chloroplast genomes. Nature (London), 305, 65-67.

Tzagoloff, A. & Myers, A. M. (1986). Genetics of mitochondrial biogenesis. Annu. Rev. Biochem. 55, 249-285.

Vaidya, A. B., Akella, R. & Suplick, K. (1989). Sequences similar to genes for two mitochondrial proteins and portions of ribosomal RNA in tandemly arrayed 6-kilobase-pair DNA of a malarial parasite. Mol. Biochem. Parasitol. 35, 97-107.

Williamson, D. H., Wilson, R. J., Bates, P. A., McCready, S., Perler, F. & Qiang, B. U. (1985). Nuclear and mitochondrial DNA of the primate malarial parasite Plasmodium knowlesi. Mol. Biochem. Parasitol. 14, 199-209.

Edited by [...] Cohen
