Molecular and Biochemical Parasitology, 41 (1990) 221-232


Elsevier MOLBIO 01357

Transcription of the heat shock 70 locus in Trypanosoma brucei

Mary Gwo-Shu Lee and Lex H. T. Van der Ploeg

Department of Genetics and Development, Columbia University, New York, NY, U.S.A.

(Received 22 December 1989; accepted 7 March 1990)

The 23-kb heat shock 70 locus of the protozoan parasite Trypanosoma brucei encodes several tightly clustered genes. From 5' to 3', a cognate hsp70 gene (gene 1) (see preceding paper, Lee et al.) is separated from a cluster of five identical hsp70 genes (genes 2-6) by about 6 kb of DNA that encodes several RNAs of unknown function. Polycistronic transcripts could be detected in the tandem array of hsp70 genes 2-6, and the maturation of the hsp70 pre-mRNAs involves addition of mini-exons to at least two alternate 3' splice acceptor sites. Potential heat shock transcription factor binding sites are present upstream of hsp70 gene 2 and in the intergenic regions of hsp70 genes 2-6, but could not be found upstream of the cognate hsp70 gene 1.

Key words: Heat shock protein; Mini-exon; Trans-splicing; Polycistronic transcription; Post-transcriptional control

Introduction

Transcription of protein coding genes and mRNA maturation in unicellular flagellated trypanosomes and other kinetoplastida involve mechanisms that are drastically different from those in most eukaryotes (for review see refs. 1 and 2). This is illustrated by the fact that every mRNA in trypanosomes consists of two exons, a 5' mini-exon and a main coding exon, which are transcribed from two separate genes. The 5' 39-nt mini-exon, which is common to every mRNA, is derived from a 140-nt mini-exon donor RNA (medRNA) encoded by a gene that is present in 200 copies per cell [3-9]. The mini-exon is joined via trans-splicing to the main coding exon in a process that bears a striking resemblance to cis-splicing [10-14]. This unusual feature, taken together with the fact that protein coding genes in trypanosomes are frequently arranged in tandem arrays and may be transcribed polycistronically, has hampered the identification of their transcription initiation sites [1, 15-22]. Polycistronic transcription of genes could occur either from promoters located upstream of the tandem arrays of genes or from promoters located in front of each individual gene. Analysis of the phosphoglycerate kinase (PGK) genes, which are differentially expressed in insect and bloodstream form trypanosomes, revealed that the steady-state mRNA levels are most likely controlled post-transcriptionally [15]. To study transcriptional control of other genes transcribed by RNA polymerase II in trypanosomes we analyzed a 23-kb region spanning the Trypanosoma brucei hsp70 locus containing hsp70 genes 1-6. T. brucei probably contains 7 hsp70 genes [23]. hsp70 genes 2-6 (Fig. 1) are identical and arranged in a tandem array, while a related hsp70 gene (gene 1) [37] is located 6 kb upstream of the tandem array. A seventh gene, which appears to be more diverged, is located elsewhere in the genome. We chose to analyze hsp70 genes since in those eukaryotes tested to date, temperature-sensitive transcription initiation results from the binding of a well defined heat shock transcription factor (hstf) to conserved 14-nt DNA sequences in the promoter region (heat shock element, HSE, binding sites), leading to transcriptional activation of hsp genes. Comparing the response to heat shock of the genes in the 23-kb hsp70 locus of T. brucei and searching for putative HSE binding sites may thus facilitate the identification of transcription initiation sites. To study the regulatory mechanism for gene expression of the hsp70 locus in T. brucei, we performed a detailed analysis of the structure of the hsp70 locus and the nature of steady-state RNA transcribed from this locus.

Correspondence to: Mary Gwo-Shu Lee, at present address: Division of Tropical Medicine, School of Public Health, Columbia University, 630 West 168th Street, New York, NY 10032, U.S.A.

Note: Nucleotide sequence data reported in this paper have been submitted to the GenBank™ data base with the accession number M32140.

Abbreviations: Hsp, heat shock protein; nt, nucleotide.

0166-6851/90/$03.50 © Elsevier Science Publishers B.V. (Biomedical Division)

Materials and Methods

Description of strains. Strains used are described in the preceding article [37].

DNA sequence analysis. DNA sequence analysis was performed by the dideoxy chain termination method [24]. Two approaches were used to generate DNA fragments for subcloning into the phage M13 mp19 and M13 mp18: DNA was cleaved with restriction enzymes and ligated into the appropriate M13 vectors; BAL 31 deletion series were prepared of some of the DNA subclones, which were subsequently ligated into M13 vectors.

RNA preparation and Northern analysis. This was done as described in the preceding paper [37].

S1 protection analysis and primer extension. S1 mapping was performed essentially as described by Dudler and Travers [25]. Primer extensions and direct RNA sequencing were performed essentially as described [26].

Analysis of nascent RNA. Synthesis of nascent RNA in isolated trypanosome nuclei was performed as described by Kooter et al. [18, 27]. Nuclei were isolated from trypanosomes by disrupting cells using a Stansted cell disrupter, followed by storage of the nuclei at -140°C. Run-on transcription in the isolated nuclei was allowed to proceed for 5 min in the presence of [32P]UTP at 37°C. Labeled nascent RNA was hybridized with slot blots containing cloned band-isolated DNA fragments. Following the hybridization, blots were washed to a final stringency of 0.1 x SSC, 0.1% SDS at 65°C. Quantitation of signal intensities was performed by cutting the labeled slots from the nitrocellulose filters and determining the amount of label bound to the filter by counting in 5 ml of Econofluor in a liquid scintillation counter.

Description of the different probes. The different restriction enzyme fragments used as probes for analysis of steady-state RNA and used in the nuclear run-on assay are labeled A-F in Fig. 2. These fragments represent: (A) a PstI-HindIII restriction fragment of 597 bp derived from the 3' coding sequence of hsp70 gene 1 (see Fig. 2 in the previous paper [37] and Fig. 2 of this paper); (B) a 2.8-kb HindIII fragment located immediately downstream of hsp70 gene 1; (C) a 2.95-kb XbaI-HindIII fragment spanning ORF I, located just upstream of gene 2; (D) a 462-bp fragment, extending from 104 bp upstream of the first (major) 3' splice site of hsp70 gene 2 to the XbaI restriction enzyme site located downstream of ORF I; (F) a 554-bp HindIII fragment of the coding region of the identical genes 2-6; (G) a 197-bp fragment derived from the intergenic region between genes 2 and 3, extending from the first (major) 3' splice site of gene 3 to an SspI restriction enzyme site located immediately downstream of the poly-(A) addition site of gene 2 [23].

Results

Structure and expression of the hsp70 locus; nucleotide conservation of the intergenic regions. We had previously shown that a cognate hsp70 gene (gene 1) is closely linked to the temperature-sensitive hsp70 genes (genes 2-6) [23, 37]. In order to search for sequences that might function in transcriptional control we analyzed a region up to 3522 bp upstream of the temperature sensitively transcribed hsp70 genes 2-6 and compared this region with the nucleotide sequence preceding hsp70 gene 1, the nucleotide sequence preceding the hsp70 genes from Leishmania major and the hsp83 genes from Trypanosoma cruzi. The sequences were aligned at the 3' AG splice acceptor site of each hsp gene. In the T. brucei

genes 2-6 and the L. major hsp70 genes, the 3' AG splice acceptor sites have been determined [23, 26]. We have not yet accurately located the 3' splice acceptor site for mini-exon addition at hsp70 gene 1. The putative AG 3' splice acceptor site for the T. cruzi hsp83 gene was located on the basis of its sequence homology with the splice acceptor consensus sequence, since only the position of the ATG initiation and TAA termination codon of that gene had been published [28]. Fig. 1A shows the DNA nucleotide sequence of the region directly upstream of hsp70 gene 2 and the intergenic regions of genes 2-6. The comparison of the relevant sequences located directly upstream of these hsp70 genes with the hsp70 genes in L. major and hsp83 genes in T. cruzi is shown in Fig. 1B. Conserved nucleotide sequences could not be detected when compared to the 330-bp sequence preceding the ATG translation initiation codon of hsp70 gene 1. We have therefore not included hsp70 gene 1 from T. brucei in the remainder of the comparison. Three important nucleotide sequence blocks among the remaining hsp genes could be identified. Firstly, the 5' flanking sequence of gene 2 was identical to the intergenic region sequences of genes 2-6 up to 73 bp upstream of the major 3' splice acceptor site [23]. The significance of this finding is unclear but it may indicate that the hsp70 genes of the tandem array undergo gene conversions, leading to sequence homogenization extending into the region 5' of the splice acceptor site. Secondly, immediately upstream of the conserved DNA sequence in front of gene 2, between nt position -224 and -286 upstream of the 3' splice acceptor site, three potential HSE sequences, CAAGCAGATACTCG, GGGGAAATTTCAAA and GAAAAAAATTCCAG, are closely clustered together. Each of these three potential HSEs shares 6 of the 8 nt that make up the consensus sequence of HSE binding sites of other eukaryotes (consensus HSE: CNNGAANNTTCNNG) [29]. Similarly, the L. major hsp70 and T. cruzi hsp83 genes are preceded by potential HSEs (Fig. 1B). The level of HSE conservation found upstream of these genes has been shown to be sufficient for the regulation of temperature-sensitive transcription initiation in other eukaryotes [30]. We had previously described that the hsp70 intergenic regions of genes 2-6 each contain two additional HSE-related sequences which share 5 out of 8 nt with the HSE consensus (only one is shown in Fig. 1B). Such potential HSE binding sites could not be found in the remainder of the 3236 bp of sequence upstream of gene 2 (Fig. 1A) or in the sequence upstream of gene 1. A third conserved sequence is a polypyrimidine tract of about 20 bp, located between nt -37 and -55 upstream of the 3' splice site, and potential NYY(R)AY branch-point sites preceding the 3' splice acceptor sites.
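The "6 out of 8" comparison with the HSE consensus can be reproduced with a short script. The following is a minimal sketch, not the procedure used in the paper: it assumes that only the eight defined (non-N) positions of CNNGAANNTTCNNG are scored, and the function name `hse_score` is illustrative.

```python
# Consensus heat shock element (HSE) as quoted in the text; N = any nucleotide.
HSE_CONSENSUS = "CNNGAANNTTCNNG"


def hse_score(candidate, consensus=HSE_CONSENSUS):
    """Count matches at the defined (non-N) positions of the consensus."""
    if len(candidate) != len(consensus):
        raise ValueError("candidate and consensus must have the same length")
    return sum(
        1
        for base, ref in zip(candidate.upper(), consensus)
        if ref != "N" and base == ref
    )


# The three candidate HSEs reported upstream of hsp70 gene 2.
candidates = ["CAAGCAGATACTCG", "GGGGAAATTTCAAA", "GAAAAAAATTCCAG"]

# Number of defined positions in the consensus (assumed scoring basis).
defined = sum(1 for ref in HSE_CONSENSUS if ref != "N")

for seq in candidates:
    print(seq, f"{hse_score(seq)}/{defined}")
```

Under this scoring assumption, each of the three sequences matches six of the eight defined consensus positions, in agreement with the text.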

Location of steady-state RNA in the 23-kb hsp70 locus. Most genes in trypanosomes are found in multiple copies which are organized in tandem arrays and are separated from each other by short intergenic sequences. Several of these tightly linked genes have been shown to be polycistronically transcribed. In the hsp70 locus of T. brucei, gene 1 is separated from the tandem array of hsp70 genes 2-6 by about 6 kb of single copy DNA of unknown function (Fig. 1 and data not shown). To determine whether polycistronic transcription occurs and to determine the number of transcription units in the 23-kb hsp70 locus, we first examined steady-state RNA from the 23-kb locus, then determined the distribution of nascent RNA and searched for putative polycistronic precursor mRNAs (pre-mRNAs) which encode more than one coding exon. The analysis of steady-state RNA derived from the 23-kb hsp70 locus demonstrated the presence of tightly clustered genes in the hsp70 locus, since all probes detected transcripts (Fig. 2). Equal amounts of RNA from procyclic (lanes P) or bloodstream form (lanes B) trypanosomes were loaded in the lanes. Firstly, slightly higher hybridization intensities could be detected in the bloodstream form RNA samples using probes specific for hsp70 gene 1 (A and B). As discussed in the preceding paper [37], this difference in the level of expression of gene 1 between insect and bloodstream form trypanosomes is dependent on the batch of cells used for RNA preparation. Secondly, the level of steady-state RNA generated from gene 1 and the region between gene 1 and gene 2 is very low when compared to that of

[Fig. 1A, sequence panel. hsp70 gene 1 is followed by 2.5 kb of undetermined sequence and then by the sequence below, which starts at a HindIII site (H).]

H AAGCTTCGTGATCCCAATGAAATATTCTACCAGCCGAGTGGTCTTGCACTGCAGCATCAG GGCTTTGTCAAACCGCTCGCTGCGGGTTCACTCGACAACTGTGGGCCGCCACTTTTGAAC TCATCACTGAACCTTCACGCCCCCGTAGAGGATTACGGTAAATTACTCCTCTTATCTCTT GATGCTATTAGGCACGCGCGAAAGGAATTGGGTGAATTCGACTCTAACTCGGGGGCTATA CCTTCGTACCCACATTACGACTTTGGCGTTGAGTGGTTGGACACTGGACGCAGACTTCAA CTCACCCGACGAGTTTTGGGTATTGATTATATACCAACTGCATCGTCGTTCCGCTACAGC TGCGAACACGACTTGGGTTGTTTTGGTATATGTAACTGTGGCACACGCGATGCATGTCTT CTTGGAAACACCATCTCGCGCGTCATCCAGCATCTTTTTGTAAAACACATCATTGAAAAG GGTGTCAACCGTGAAGGGACCAAACCTCGACAACCCCAACGAAGGGGAAAGTGAAACGGA GCTGAAGTTCCGGAAGTAGTAGACGAGCAGAAGTACACAAGTGTCTTCAAGAAACACGAT GCGCATACACGATTTTAATATTGTTACTCTGATCAGTTATAGGGAGCAGCGACGAAAGCA GGGGTAACGAGCTCAATTTCATTTTCGTGAGGCCTCTTTCCTCGTGTGCCCAAATGAGCA GACACTGAAGAATGGTGTCTATTCCTTAGTTATAGCGTATCTGGGGGCTATAGTTTACTT TTACTGCGGCTGTGGTGGAACTGACAGCAGCCACACTTATCCACACTATGCCCTTTTTAG CGTGTAAGATGGAAAGAGATGCAACGGTAACGGATATCTCCATTGCCCTTTATCGTTTCA CATTACGTTTTGCATCCTTTTCTTTGTGTCAACTCCCCCTCTTTTCTTCCCATCGTTTGT AGCCGCCTTCCCTTTCATCTGAAGCTCTTATTTGAATGTCTGCGCGGAGACGTTGTGTGT GACACCGCTCATCTGCAAAGTGGCGCGTTGCCCCTGAGGTTGTGGTGAGGTGTGCGGAGG TTCCGGGGTACTTGTGTAAGTCTGTATGCAGGTGTTGATACCGCAATCGCGAGGAGCCTG TGCGGACGCGGTGGTGGTGGTGTTGGACGCATCAGTGGATGTGTTTCGCTTCTTTAACAG CCATCACAAGAATGGCTCATTGTCCTCTTTGTTACGACGGAATGGTTAACCTGGCTGTCG CCCCGTAATAATGGAGGTGGGATGTTGTTAACCTTTTGTTTTCACGGTGTCAAATGCTTC ACCATGTTATCACTCTCAACTGGTCGTATCTCCTACGGAATTCTTCGCCAAACCACTGTG GTTGTGGCAATCTTCCTTGCACCTCGCGGCGTATGTGTGTGTGTGTGTGTGTGTGTGTTA GCACTCCACTGAGATGGCATTCGTGCCTGACTATCCGTCTGTTTGCACTGCCACTCGCGT GATTGTCAGCACAGCTTCGGGTCATCATGTGTTTTCTCGGTAGTGTGCAGTGTCCGAACT GGAGGCTGAAGAAAAAGATCAGCAAGGAACCAACTCTGTTCGGTGGGCGTCTAGTACTTT TTTCACCTGTCTTAGTATTTGTTTCCGTTATCATCACGTATGCTTGCACGTTTTAACCCC ACCTTTTTGGTCGTCGTCGCCGTATGTCTGCAGGAAGTATCATTCTACAGCTTGCGGTAC GAAGAAACAGTTGGGGACACTGACCACTGGTTTAACGTGTAAGCTGGCCCTTTCTCTTTC ATCAGTTGTGTTTCGGCAAAAACATTTATTACCCCCACTTCTTCACGCAACCGCCTTCGT TGCCGTATCTGCTAATATCGTTATTATTATTTTGCATGAGGGAGTGGGACGTTTCGCTGC 
CCGTTCCCCCTTTTCCCTTTCCTACCACACGGAGCGAATGGTCTGTGGAGCGTGTGCACA GAACAATAATTTTAGCGGCGGTCACTTTTCACATAGAAAAATAGGCAACGGAAGAACAAA GTAAGGCGAACGACGGGAGAACGACGAAAACAAGAGGATAATAGAC

[ORF I]

ATG GCA CAA AGG CTT GAG CCC ACC AAA CAC GCA ACC GAG ATC GTT ATT GAG GAA

ACG TCA AAG TTG AAA TTA TGG ACA GCA ACG ATA AAC TCT AGC CCC TGC GTA AGT

AGG AAA GTG CTC CCT GCA AAG GCG GAC CAA GAG GAC AGA GAG CGT AAC GAA CAC GGA AAT GTG GGC GGT GAT AGT GAA ACT AAT AGC GAG TTG TGC CCT CGG CCA GAT CAT GTG GAG ATG ACG AGA TCT GTT GAG TCG TAC CAA CCA CTA AGC CTC ACT AGC

GCC CTG GAG AAC GAC GGG AGA CTT CGT CAG CTC GCA TAC ACC CGC TCT AAC ATC

GGT AAC TTA TCA GCA CAC TCC TCG GTG CAA TCA AAC TGC ACA GGC TTG GTA ACA

GAT TAC AGT GCA CTC TGC GTG CCG AAC GAC GCA ATA GCG CAG ACC CCA GAA ATC

ATG CTC CGC CTA CTT ACA CCC GTG ATA CTT TCG GAA AAT ACC ATG TCG ACG CCT

TGG CGT GCA CTG TGC ACA GCC TTC AGT CAT TCA ACA GGT GTC ACA AAA GCG CGC

GAC ACG CAA CTA AAA AAT GGC AAA GAA AAT GAA TTC GAG GCT ACA GCC ATT AGT

TCA GCA GAG GAG GTA GAT AGT GAT CAA CCA ACG CAT TGT TAC ATG GGT CAC

CGA CAA AGA AAT CAA ATA CTT AAT GCC AAG TTA CGT CGG GAG AAC GCT CAC

GTA GAG CAA TTG GTA CAT GAA CTG TTC TTA GTT GGA GGG AGC GTT AAT AGG ACG GAA GAA ATT AAC AGC ACA TCA CCT CGT CGT ATT TAC TTT TTC GAC CCC

TGATGAGAATG~GATGAAGAAAGACAAGACGCAAAAACCGTCTCCATATAATAAACATAA CAGTACCTTCTAGAACATGTGAAGAGAGGGACACGCTACTTCAAGGGGTGCTGCGTTATT GTTTAGCAGACCCAATACCATGACGACAGGACATCACGATAACCTTGAAACAAAACTTAA CCGACGACCTGTATAACAGCACCCAATTAAAATGCOAAATAATCGGGACGACTGTCAC TGCTGTACAGTGCGAAACATGACTTAACCAACACCTTTTGCAACCCATAAAGTAGGAACA GAGCAGTAAATGCGAAAATACCATTAAAACATTAGAAAAAACACTTTGGCAAGCAGATAC TCGTTTACCTCGCGGGTTTAACGGGGAAATTTCAAAAGAAAAAAATTCCAGCAGTAAAAA AGGAAAAATAAATAAACACTGAAATCATAATAAACAAAAAAATAACTGAAAAAATAAAAT TATCTCAACGAACAAAAAAGAATTGATTAATAAACACAAATTTTCCTTCGAATTGCAATA

[The sequence immediately upstream of the hsp70 gene 2 ATG, which carries the marked 3' splice acceptor sites, is not legible in the source.]

[hsp70 gene 2: ATG ... TAA]

GTTCCCAGGTGTATTTCGGACCGGTGCTGCAGTCGTCTCACTGTCTACGGTGATGGTCCG CCGCCACATTTTCTCTTCGTCTCTGTAGCATTTAGGAACCCTCGTTGCGGGAAAGATGCG TGACACATTGATGCTACTACTATTATTAATACTCTACTATTATTATTATTATTATTATTT

[hsp70 genes 3-6: ATG ...]

[Fig. 1B, alignment panel. Rows: T. cruzi hsp83, T. brucei hsp70 gene 2, T. brucei hsp70 genes 3-6, L. major hsp70. Marked features: previous gene, (TTA)n, HSE, 3' splice site. The aligned sequences are not legible in the source.]

Fig. 1. (A) Structure and nucleotide sequence of the hsp70 locus in T. brucei. The putative HSE sequences in front of gene 2 and in the intergenic regions of genes 2-6 are underlined. The arrows indicate the major and minor 3' splice acceptor sites. ORF I represents the open reading frame found 635 bp upstream of gene 2. A short stretch of DNA (73 bp long), located directly upstream of the 3' splice acceptor site of gene 2, is identical to the hsp70 intergenic region sequence (located between genes 2 and 3). The intergenic region of genes 2-3 is boxed. A schematic physical map indicating the relative positions of the genes is shown to the right of the DNA sequence. Abbreviations are: X, XbaI; H, HindIII; Xh, XhoI. (B) Comparison of intergenic region sequences of T. brucei hsp70, L. major hsp70 and T. cruzi hsp83. The DNA nucleotide sequences upstream of the ATG translation start sites are compared for the region upstream of the hsp83 gene of T. cruzi (line 1), the region upstream of the T. brucei hsp70 gene 2 (line 2), the intergenic region of the T. brucei hsp70 genes 2-6 (line 3) and the intergenic region of the L. major hsp70 genes (line 4). Identical nucleotide sequences are indicated with double dots. Large regions of homology are boxed. The small boxed sequences indicate HSE-related sequences. Poly A indicates the polyadenylation site of the 3' end of the preceding gene. These regions are surrounded by a (TTA)n sequence in the T. cruzi hsp83 and T. brucei hsp70 genes. The 3' splice acceptor sites are underlined. Numbers indicate distances in nucleotides for areas with no obvious nucleotide sequence homology.

mRNAs derived from genes 2-6 (Fig. 2) [37]. We had previously shown that the steady-state mRNA of the cognate hsp70 gene 1 is overall unaffected after cell differentiation or in response to a heat shock [37]. The steady-state mRNA levels for ORF I and hsp70 genes 2-6 are higher in bloodstream form than in insect form trypanosomes. Given the differences in mRNA levels between bloodstream form and insect form trypanosomes, it should be possible to determine whether transcriptional or post-transcriptional control affects the mRNA levels of the genes in the 23-kb hsp70 locus. Thirdly, some of the RNA transcribed from sequences between genes 1 and 2 might encode functional proteins, since the partial nucleotide sequence analysis of the 6-kb region which separates hsp70 gene 1 from hsp70 gene 2 identified a single open reading frame (ORF I) of 798 nt, located 635 bp upstream of hsp70 gene 2. In the remainder of the sequence upstream of ORF I, no large open reading frames were found.

Transcripts derived from the tandem array of hsp70 genes 2-6 included a 4600-nt RNA, in addition to the 2300-nt mature hsp70 mRNA (Fig. 2, probe F, band marked with an arrow). This 4600-nt RNA could not be detected with probes that are derived from sequences located upstream of hsp70 gene 2 (Fig. 2) or downstream of hsp70 gene 6 (data not shown). The 4600-nt RNA may therefore represent a polycistronic RNA derived from the hsp70 genes 2-6.
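A search for open reading frames such as ORF I can be sketched as follows. This is an illustrative outline only: the paper does not state how the sequence was scanned, the function `longest_orf` is a hypothetical helper, it examines only the three frames of the strand supplied, and the toy sequence is not taken from Fig. 1A.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(dna):
    """Return (start, length_nt) of the longest ATG-initiated ORF.

    Only the three reading frames of the given strand are scanned.
    The length excludes the stop codon, and ORFs that lack an
    in-frame stop codon are ignored.
    """
    dna = dna.upper()
    best = (None, 0)
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG in this frame opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                length = i - start
                if length > best[1]:
                    best = (start, length)
                start = None  # look for the next ATG downstream
    return best


# Toy sequence (an assumption for illustration): one short ORF in frame 2.
toy = "CCATGGCACAAAGGTGATT"
print(longest_orf(toy))
```

To cover both strands, the same function could be applied to the reverse complement of the sequence as well.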

Polycistronic transcription of hsp70 genes 2-6. In order to demonstrate that the 4600-nt transcript detected with the coding region specific probe of hsp70 genes 2-6 (probe F) is indeed a dimeric hsp70 RNA, hybridizations were performed with a probe specific for the intergenic regions (Fig. 2, probe G). Probe G (197 bp in size) extends from the 3' splice acceptor site of hsp70 gene 3 to an SspI restriction enzyme site located downstream of the poly-(A) addition site of hsp70 gene 2 [23] (see Fig. 1A and Fig. 3 for probe location). As shown in Fig. 3A, this intergenic region probe detects the 4600-nt RNA. The intergenic region probe G is specific for the hsp70 intergenic regions and it does not cross-hybridize with other DNA sequences, even under relaxed stringency conditions of hybridization (65°C, 3 x SSC; data not shown). Therefore, the 4600-nt RNA molecule may encode two hsp70 genes joined by an intergenic region sequence. In addition to the 4600-nt RNA, probe G also hybridized to a second, 2300-nt RNA molecule. This RNA molecule was also detected with synthetic oligonucleotide probes homologous to a region located just upstream of the major 3' splice acceptor site of the hsp70 mRNA (data not shown). This 2300-nt RNA molecule may therefore have a 5' and/or a 3' end that is different from the mature hsp70 mRNA molecule, which incidentally also measures 2300 nt. We have not determined the exact location of the 5' end of this RNA.

Fig. 2. Northern analysis of steady-state RNA in the 23-kb hsp70 locus. Equal amounts of total RNA isolated from procyclic trypanosomes and bloodstream form trypanosomes were size-separated in agarose gels, as described in Materials and Methods, and transferred to nitrocellulose filters. The filters were hybridized with the labeled probes derived from different regions of the hsp70 locus indicated below the physical map by the horizontal lines labeled A-F. Following the hybridization, filters were washed at a final stringency of 0.1 x SSC, 0.1% SDS at 65°C.

Analysis of transcripts by S1 protection. To confirm that the intergenic regions of hsp70 genes 2-6 are transcribed, we performed S1 protection analysis of DNA-RNA hybrids (Fig. 3B). In the S1 protection analysis, we used a 32P end-labeled 348-nt BstEII restriction enzyme fragment as a probe, which starts in the hsp70 gene 3 coding sequence and extends throughout the intergenic region [23, 37] (Fig. 3B; lane I shows the input fragment). Treatment of DNA-RNA hybrids with S1 nuclease (lane 1, procyclic RNA; lane 2, bloodstream form RNA; lane 3, tRNA only) resulted in the accumulation of several different protected fragments. Firstly, two major protected fragments (fragments labeled splice site 1 and 2) of 158 nt and 151 nt mapped to two 3' splice acceptor sites (see next section). These fragments are presumably derived from protection of RNA transcribed from any of the hsp70 genes 2-6, and not from gene 1, since the nucleotide sequence in this region is specific for genes 2-6. Secondly, a protected fragment (indicated splice acceptor site 3) of 106 nt locates the end of an RNA molecule at a potential 3' splice acceptor site, one nt upstream of the ATG translation initiation codon. This RNA fragment could be derived from any of genes 1-6, since the nucleotide sequence is identical in this region for all six hsp70 genes. This 106-nt protected fragment may thus identify an hsp70 mRNA which lacks the usual 5' untranslated extension. Thirdly, several larger protected fragments ranging from the full length protected fragment of 348 nt (present only faintly) to 310 nt were detected. The 5' end of the 348-nt fragment is located in an A+T-rich region and we assume that instability of DNA-RNA hybrids may lead to artificial cleavage in this region by the S1 nuclease, resulting in accumulation of the broad band around 310 nt. Lowering the temperature of the S1 incubation reaction or lowering the concentration of S1 nuclease indeed resulted in a decrease of the intensity of the broad band at 310 nt and an increase of intensity in the band

[Fig. 3. Top panel: map of hsp70 genes 2-6 showing the 3' splice sites, the poly A addition site, the 197-bp intergenic region probe G and the 348-bp fragment used for S1 protections. (A) Northern blots hybridized with probes F and G. (B) S1 protection gel, lanes 1-3 with a TCGA sequencing ladder; bands marked 3' splice 1, 2 and 3. (C) Table with columns Probe, Size (bp), Copy #, Counts and Relative rates. The panel contents are not legible in the source.]

Fig. 3. Evidence for the polycistronic transcription of the hsp70 genes 2-6. The top panel indicates the detailed map of intergenic region probe G and the DNA fragment used in the S1 protection experiment. (A) Northern analysis of precursor RNAs. Total RNAs were isolated from bloodstream form trypanosomes (B) and procyclic form trypanosomes (P), separated on a 1% formaldehyde gel and transferred onto nitrocellulose filters. These filters were then hybridized with the coding region probe (F) of genes 2-6 and the intergenic region probe (G). The arrows indicate the 4600-nt precursor RNA. (B) S1 protection analysis in the intergenic regions of hsp70 genes 2-6. A 5' 32P-end labeled DNA fragment (32P BstEII site in coding sequence to the SspI site in the intergenic region; see ref. 23, and the top panel) which spans the entire intergenic region of genes 2-3 was used in the S1 protection analysis with RNA from insect form (lane 1) as well as bloodstream form trypanosomes (lane 2; variant 118 clone 1). Following digestion of the DNA-RNA hybrids with 1500 units ml⁻¹ S1 nuclease at 27°C for 30 min, the reaction products were size-separated in a 6% denaturing polyacrylamide gel. Lane 3 shows the S1 incubation using tRNA as a carrier with input DNA only (348-nt band in lane 1). TCGA denotes the sequencing reaction of a phage M13 DNA used as a size standard. Two different exposures (lanes 1, 2 and 3, long exposure; and lanes 1*, 2* and 3*, short exposure; Fig. 3B) of the same S1 protection experiments are shown. (C) The elongation efficiencies of nascent RNAs in isolated nuclei. The 32P-labeled nascent RNA, elongated and isolated from nuclei of variant 118 clone 1 bloodstream form trypanosomes, was hybridized with filters containing equal amounts of the different DNAs listed in the first column. The amount of radioactivity bound to each probe was measured (cpm) and listed in the fourth column. The relative elongation efficiencies were calculated per bp, per single copy DNA element. The labeling efficiency of the α-β tubulin gene was arbitrarily set at 1. Abbreviations are: T-α-β, α-β tubulin coding sequence [31]; puc, plasmid pUC 18; 5S, 5S rRNA gene [38]; rRNA, the rRNA repeat clone [23].

at 348 nt (data not shown). It is possible that the S1 protected band at 310 nt results from protection up to the 5' end of the 2300-nt hsp70 RNA, which extends upstream of the 3' splice acceptor site. Two different exposures (lanes 1, 2 and 3, long exposure; and lanes 1*, 2* and 3*, short exposure; Fig. 3B) of the same S1 protection experiments are shown to visualize the relative intensity differences in the protected bands. The S1 protection experiment demonstrates the presence of larger RNA molecules that extend throughout the intergenic region. These results, together with the Northern analysis of steady-state RNA, indicate that the intergenic regions of the hsp70 genes are transcribed. We cannot exclude the possibility that processing of RNA occurs in the intergenic region, resulting in the detection of the 310-nt fragment. In addition, since all hsp70 genes appear to be identical, it is not possible to determine whether only two of the genes or perhaps all of the genes are transcribed.
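The spacing of the mapped RNA 5' ends follows directly from the protected fragment lengths. The sketch below is a worked arithmetic example under one assumption: because the probe is labeled at a fixed end inside the coding sequence, a longer protected fragment corresponds to an RNA 5' end that lies further upstream. The dictionary keys are illustrative labels.

```python
# Protected fragment lengths (nt) reported for the end-labeled BstEII probe.
fragments = {"splice site 1": 158, "splice site 2": 151, "splice site 3": 106}

# Splice site 3 is reported to map one nt upstream of the ATG, so it is used
# here as the reference point for the other two sites (assumed geometry).
reference = fragments["splice site 3"]
for name, length in fragments.items():
    print(f"{name}: {length - reference} nt upstream of splice site 3")

# Distance between the two major protected ends.
spacing = fragments["splice site 1"] - fragments["splice site 2"]
print("spacing between sites 1 and 2:", spacing, "nt")
```

The 7-nt spacing between the two major fragments agrees with the primer extension result reported in this paper, which places the alternate 3' splice acceptor site 7 nucleotides downstream of the first.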


Analysis of nascent RNA. To characterize the transcription of the hsp70 genes 2-6 further, we measured the transcription of the hsp70 coding region and the intergenic region using nuclear run-on assays. 32P-labeled nascent RNAs, elongated in vitro in nuclei isolated from bloodstream form trypanosomes, were hybridized to filters containing subcloned genomic hsp70-derived fragments (Fig. 3C). Control genes used were the RNA Pol I transcribed rRNA genes [32], the RNA Pol II transcribed α-β tubulin genes [33] and the RNA Pol III transcribed 5S rRNA genes [38]. The amount of radioactivity that hybridized to the 554-bp HindIII fragment (coding sequence; probe F) and the 197-bp sequence of the intergenic region (probe G) was quantitated and normalized per bp. The quantitation showed that the intergenic regions are transcribed at an efficiency similar to that of the hsp70 coding sequence (Fig. 3C; the relative amounts of incorporated radioactivity in fragments F and G were 1.95 and 2.87, respectively). Based on the efficiency of elongation of nascent RNA in the hsp70 genes 2-6 and the S1 protection and Northern analysis of steady-state RNA, we concluded that transcription proceeds through the array of hsp70 genes 2-6.
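The normalization described for the run-on data (counts per bp, per single copy DNA element, with the tubulin probe set at 1) can be written as a small function. This is a sketch under stated assumptions: the function name `relative_rates` and the probe names are hypothetical, and the numbers in `example` are invented for illustration and are not the values of Fig. 3C.

```python
def relative_rates(probes, reference):
    """Normalize run-on signals per bp and per gene copy.

    probes maps a probe name to (cpm, size_bp, copies); the result maps
    each name to its rate relative to the reference probe (reference = 1).
    """
    per_unit = {
        name: cpm / (size_bp * copies)
        for name, (cpm, size_bp, copies) in probes.items()
    }
    ref = per_unit[reference]
    return {name: value / ref for name, value in per_unit.items()}


# Hypothetical numbers for illustration only; NOT the data of Fig. 3C.
example = {
    "tubulin": (1000.0, 2000, 10),
    "probe_X": (300.0, 500, 6),
}
rates = relative_rates(example, reference="tubulin")
for name, rate in rates.items():
    print(f"{name}: {rate:.2f}")
```

Dividing by both probe length and copy number makes short, low-copy fragments comparable with long, multi-copy control genes, which is why the intergenic probe G can be compared with the coding probe F.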

Alternate 3' splice sites for mini-exon addition. The S1 protection experiment located the 5' ends of steady-state hsp70 mRNA at several potential 3' splice acceptor sites. To confirm that alternate 3' splice sites were used in the maturation of hsp70 mRNAs, we performed direct dideoxy nucleotide sequencing of the 5' ends of hsp70 mRNA by primer extension using reverse transcriptase. A 30-mer oligonucleotide complementary to the sequence between nucleotides 11 and 40, just upstream of the ATG translation initiation codon in the 5' untranslated extension of the mRNA, was used as a primer. The dideoxy nucleotide sequence revealed two different sets of mini-exon sequences at the 5' ends of the hsp70 mRNA (Fig. 4). One set of mini-exon sequences was added onto the first AG 3' splice acceptor site as described in a previous paper [23]. The level of this hsp70 mRNA as measured by primer extension and S1 protection indicates that it represents the major population of steady-state hsp70 mRNAs. The second mini-exon sequence mapped to an alternate 3' splice acceptor site located 7 nucleotides downstream of the first 3' splice acceptor site. The significance of the use of alternate 3' splice acceptor sites in the maturation of hsp70 mRNA is not clear. Our preliminary data do not show a difference in the use of either splice acceptor site when insect and bloodstream form trypanosomes are compared.

Fig. 4. Determination of alternate 3' splice sites for mini-exon addition. Direct RNA sequencing was performed as described in Materials and Methods. A 30-mer oligonucleotide complementary to the sequence of 11 to 40 bp located upstream of the ATG of the hsp70 genes 2-6 was used as a primer (DNA sequence underlined). Two primer-extended products, showing the presence of mini-exon sequences, can be read from the sequence (highlighted in boxes). The two different sequences are labeled with asterisks and dots, respectively, in the different lanes with the sequencing reactions, which are labeled TCGA (containing ddT, ddC, ddG and ddA, respectively) on the bottom right of the panel. On the left-hand side, a sequencing ladder is shown as a size standard, labeled TCGA on top of the different lanes. The lanes labeled 1, 2 and 3 represent primer extensions with total RNA in the absence of any ddNTP, from insect form trypanosomes, bloodstream form trypanosomes and insect form trypanosomes heat-shocked for 2 h at 41°C, respectively. At the extreme right, the two sets of mini-exon complementary sequences are indicated.

Discussion

The hsp70 locus in T. brucei contains several tightly linked genes, since steady-state RNAs could be detected throughout the 23-kb locus. Some of these genes may be transcribed polycistronically. Evidence in favor of polycistronic transcription of hsp70 genes 2-6 included: (i) transcription of coding regions and intergenic regions in nuclear run-on assays; and (ii) the detection of large hsp70 pre-mRNAs, derived from transcription of two tandemly linked hsp70 genes, leading to a dimeric hsp70 RNA. These RNA molecules are, however, present in low concentrations, and we have not determined their putative precursor-product relationship with the hsp70 mRNA. An alternative explanation for the organization of transcription units in the hsp70 locus is therefore that these polycistronic RNAs result from a rare event caused by readthrough into immediately adjacent hsp70 genes. Each of the hsp70 genes might then be transcribed from its own promoter. Polycistronic transcription has previously been described for other trypanosome genes arranged in tandem arrays (VSG genes, refs. 18, 19; calmodulin genes, ref. 22; α-β tubulin genes, ref. 34; PGK genes, ref. 15). The analysis of the polycistronically transcribed phosphoglycerate kinase genes led to the proposal that the mRNA levels of these genes may be controlled post-transcriptionally.

To address transcriptional versus post-transcriptional control of mRNA levels, we have attempted to map the 5' ends of RNAs that might point to transcription initiation sites at putative hsp70 promoters. S1 protection analysis and primer extensions on total RNA and poly(A)+ RNA, with synthetic oligonucleotides located upstream of the 3' splice acceptor site (data not shown), failed to locate the 5' ends of primary transcripts and identified the 3' splice acceptor sites only. Presumably the primary transcripts are present in low amounts, are processed co-transcriptionally on nascent RNA, or cannot be detected with the probes used.

The induction of hsp synthesis following a heat shock or other stress signals is, in most other eukaryotes tested, initially controlled at the transcriptional level, through the activation of heat shock transcription factors which bind to HSEs in the hsp70 promoters (the palindromic sequence element CNNGAANNTTCNNG) [30, 33, 35]. Consequently, transcription might initiate immediately downstream of the HSE sites. In view of this conserved mechanism, we searched for potential promoter sequences in front of the T. brucei hsp70 genes. Such controlling elements could be found in the intergenic regions of the hsp70 genes [23]. We now show that potential HSEs with a 6 out of 8 nt identity to the HSE consensus sequence are also located in the region that extends up to 286 nt upstream of the first hsp70 gene (gene 2) of the tandem array. Additional potential HSEs could not be detected in the region that extended up to 3236 bp upstream of gene 2 or upstream of the cognate hsp70 gene 1.
It is therefore possible that these elements direct the temperature-sensitive control of transcription initiation, since they are located specifically in front of the T. brucei hsp70 genes as well as upstream of the T. brucei ubiquitin genes and the L. major hsp70 genes [26, 36]. We are currently testing the validity of this model by footprinting DNA binding proteins in the intergenic regions and in the region directly upstream of hsp70 gene 2.
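
The kind of consensus search described above can be illustrated with a short sketch. It is not the procedure used in this study; it only assumes that "6 out of 8 nt identity" refers to the eight fixed (non-N) positions of the consensus CNNGAANNTTCNNG. The function name `find_hse_candidates` and the example sequence are hypothetical and are not taken from the hsp70 locus. Because the consensus is its own reverse complement, scanning one strand is sufficient.

```python
# Palindromic HSE consensus quoted in the text; N stands for any nucleotide.
HSE_CONSENSUS = "CNNGAANNTTCNNG"


def find_hse_candidates(seq, consensus=HSE_CONSENSUS, min_matches=6):
    """Scan seq and return (offset, window, matches) for every window that
    agrees with the consensus at >= min_matches of its fixed positions.

    Hypothetical helper: the scoring rule is an assumption, not the
    authors' documented method.
    """
    seq = seq.upper()
    # Only the non-N positions of the consensus are scored (8 for the HSE).
    fixed = [(i, base) for i, base in enumerate(consensus) if base != "N"]
    hits = []
    for start in range(len(seq) - len(consensus) + 1):
        window = seq[start:start + len(consensus)]
        matches = sum(1 for i, base in fixed if window[i] == base)
        if matches >= min_matches:
            hits.append((start, window, matches))
    return hits


if __name__ == "__main__":
    # Made-up test sequence with one perfect HSE embedded at offset 4.
    example = "TTTTCATGAAGCTTCAAGTTTT"
    print(find_hse_candidates(example))  # -> [(4, 'CATGAAGCTTCAAG', 8)]
```

Raising `min_matches` to 8 restricts the output to perfect matches, whereas the default of 6 corresponds to the threshold quoted in the text.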


Acknowledgements

We thank all colleagues from the laboratory for critical reading of the manuscript. This work was supported by NIH grant AI 21784 to L.H.T.V.D.P. and by a grant from the John D. and Catherine T. MacArthur Foundation.

References

1 Borst, P. (1986) Discontinuous transcription and antigenic variation in trypanosomes. Annu. Rev. Biochem. 55, 701-732.
2 Van der Ploeg, L.H.T. (1986) Discontinuous transcription and splicing in trypanosomes. Cell 47, 479-480.
3 Van der Ploeg, L.H.T., Liu, A.Y.C., Michels, P.A.M., De Lange, T., Borst, P., Majumder, H.K., Weber, H., Veeneman, G.H. and Van Boom, J. (1982) RNA splicing is required to make the messenger RNA for a variant surface antigen in trypanosomes. Nucleic Acids Res. 10, 3591-3604.
4 Boothroyd, J.C. and Cross, G.A.M. (1982) Transcripts for different variant surface glycoproteins of Trypanosoma brucei have a short, identical exon at their 5' end. Gene 20, 281-289.
5 Campbell, D.A., Thornton, D.A. and Boothroyd, J.C. (1984) Apparent discontinuous transcription of Trypanosoma brucei variant surface antigen genes. Nature 311, 350-355.
6 De Lange, T., Liu, A.Y.C., Van der Ploeg, L.H.T., Borst, P., Tromp, M.C. and Van Boom, J. (1983) Tandem repetition of the 5' mini-exon of variant surface glycoprotein genes; a multiple promoter for VSG gene transcription? Cell 34, 891-900.
7 Dorfman, D. and Donelson, J. (1984) Characterization of the 1.35 kilobase DNA repeat unit containing the conserved 35 nucleotides at the 5' termini of variable surface glycoprotein mRNAs in Trypanosoma brucei. Nucleic Acids Res. 12, 4907-4920.
8 Nelson, R.G., Parsons, M., Barr, P.J., Stuart, K., Selkirk, M. and Agabian, N. (1983) Sequences homologous to the variant antigen mRNA spliced leader are located in tandem repeats and variable orphons in Trypanosoma brucei. Cell 34, 901-909.
9 Parsons, M., Nelson, R.G., Watkins, K.P. and Agabian, N. (1984) Trypanosome mRNAs share a common 5' spliced leader sequence. Cell 38, 309-316.
10 Murphy, W.J., Watkins, K.P. and Agabian, N. (1986) Identification of a novel Y branch structure as an intermediate of trypanosome mRNA processing: evidence for trans splicing. Cell 47, 517-525.
11 Sutton, R.E. and Boothroyd, J.C. (1986) Evidence for trans splicing in trypanosomes. Cell 47, 527-535.
12 Laird, P.W., Kooter, J.M. and Borst, P. (1985) Mature mRNAs of Trypanosoma brucei possess a 5' cap acquired by discontinuous RNA synthesis. Nucleic Acids Res. 13, 4253-4266.
13 Ralph, D., Huang, J. and Van der Ploeg, L.H.T. (1988) Physical identification of branched intron side-products of splicing in Trypanosoma brucei. EMBO J. 7, 2539-2545.
14 Van Doren, K. and Hirsh, D. (1988) Trans-splicing leader RNA exists as small nuclear ribonucleoprotein particles in Caenorhabditis elegans. Nature 335, 556-559.
15 Gibson, W.C., Swinkels, B.W. and Borst, P. (1988) Post-transcriptional control of the differential expression of phosphoglycerate kinase genes in Trypanosoma brucei. J. Mol. Biol. 201, 315-325.
16 Gonzalez, A., Lerner, T.J., Huecas, M., Sosa-Pineda, B., Nogueira, N. and Lizardi, P.M. (1985) Apparent generation of a segmented mRNA from two separated tandem gene families in Trypanosoma cruzi. Nucleic Acids Res. 13, 5789-5804.
17 Johnson, P.J., Kooter, J.M. and Borst, P. (1987) Inactivation of transcription by UV irradiation of T. brucei provides evidence for a multicistronic transcription unit including a VSG gene. Cell 51, 273-281.
18 Kooter, J.M., Van der Spek, H.J., Wagter, R., d'Oliveira, C.E., Van der Hoeven, F., Johnson, P.J. and Borst, P. (1987) The anatomy and transcription of a telomeric expression site for variant-specific surface antigens in Trypanosoma brucei. Cell 51, 261-272.
19 Shea, C., Lee, G.-S. M. and Van der Ploeg, L.H.T. (1987) VSG gene 118 is transcribed from a cotransposed pol I-like promoter. Cell 50, 603-612.
20 Imboden, M.A., Laird, P.W., Affolter, M. and Seebeck, Th. (1987) Transcription of the intergenic regions of the tubulin gene cluster of Trypanosoma brucei: evidence for a polycistronic transcription unit in a eukaryote. Nucleic Acids Res. 15, 7357-7368.
21 Van der Ploeg, L.H.T. (1987) Control of variant surface antigen switching in trypanosomes. Cell 51, 159-161.
22 Tschudi, C. and Ullu, E. (1987) Polygene transcripts are precursors to calmodulin mRNAs in trypanosomes. EMBO J. 6, 455-463.
23 Glass, D.J., Polvere, R.I. and Van der Ploeg, L.H.T. (1986) Conserved sequences and transcription of the hsp70 gene family in Trypanosoma brucei. Mol. Cell. Biol. 6, 4657-4666.
24 Sanger, F., Coulson, A.R., Barrell, B.G., Smith, A.J.H. and Roe, B. (1980) Cloning in single-stranded bacteriophage as an aid to rapid DNA sequencing. J. Mol. Biol. 143, 161-178.
25 Dudler, R. and Travers, A.A. (1984) Upstream elements necessary for optimal function of the hsp70 promoter in transformed flies. Cell 38, 391-398.
26 Lee, G.-S. M., Atkinson, B.L., Giannini, S.H. and Van der Ploeg, L.H.T. (1988) Structure and expression of the hsp70 gene family of Leishmania major. Nucleic Acids Res. 16, 9567-9585.
27 Kooter, J.M. and Borst, P. (1984) Alpha-amanitin-insensitive transcription of variant surface glycoprotein genes provides further evidence for discontinuous transcription in trypanosomes. Nucleic Acids Res. 12, 9457-9472.
28 Dragon, E.A., Sias, S.R., Kato, E.A. and Gabe, J.D. (1987) The genome of Trypanosoma cruzi contains a constitutively expressed tandemly arranged multicopy gene homologous to a major heat shock protein. Mol. Cell. Biol. 7, 1271-1275.
29 Topol, J., Ruden, D.M. and Parker, C.S. (1985) Sequences required for in vitro transcription activation of a Drosophila hsp70 gene. Cell 42, 527-537.
30 Pelham, H.R.B. (1986) Speculation on the functions of the major heat shock and glucose-regulated proteins. Cell 46, 956-961.
31 Thomashow, L.S., Milhausen, M., Rutter, W.J. and Agabian, N. (1983) Tubulin genes are tandemly linked and clustered in the genome of Trypanosoma brucei. Cell 32, 35-43.
32 White, T.C., Rudenko, G. and Borst, P. (1986) Three small RNAs within the 10-kb trypanosome rRNA transcription unit are analogous to domain VII of other eukaryotic 28S rRNAs. Nucleic Acids Res. 14, 9471-9489.
33 Banerji, S.S., Berg, L. and Morimoto, R.I. (1986) Transcription and post-transcriptional regulation of avian hsp70 gene expression. J. Biol. Chem. 261, 15740-15745.
34 Muhich, M.L. and Boothroyd, J.C. (1988) Polycistronic transcripts in trypanosomes and their accumulation during heat shock: evidence for a precursor role in mRNA synthesis. Mol. Cell. Biol. 8, 3837-3846.
35 Pelham, H.R.B. (1985) Activation of heat-shock genes in eukaryotes. Trends Genet. 1, 31-35.
36 Swindle, J., Ajioka, J., Eisen, H., Sanwal, B., Jacquemot, C., Browder, Z. and Buck, G. (1988) The genomic organization and transcription of the ubiquitin genes of Trypanosoma cruzi. EMBO J. 7, 1121-1127.
37 Lee, G.-S. M., Polvere, R.I. and Van der Ploeg, L.H.T. (1990) Evidence for segmental gene conversion between a cognate hsp70 gene and the temperature-sensitively transcribed hsp70 genes of Trypanosoma brucei. Mol. Biochem. Parasitol. 41, 213-220.
38 Lenardo, M.J., Dorfman, D.M., Reddy, L.V. and Donelson, J.E. (1985) Characterization of the Trypanosoma brucei 5S ribosomal RNA gene and transcript: the 5S ribosomal RNA is a spliced leader-independent species. Gene 35, 131-141.
