Molecular Biology Reports 16: 217-227, 1992. © 1992 Kluwer Academic Publishers. Printed in Belgium.


Review

RNA editing in trypanosomes The us(e) of guide RNAs

Rob Benne
E.C. Slater Institute for Biochemical Research, University of Amsterdam, Academic Medical Centre, Meibergdreef 15, 1105 AZ Amsterdam, The Netherlands

Received 11 May 1992; accepted 18 May 1992

Key words: guide RNA, mitochondrion, RNA editing, trypanosomes

Abbreviations: gRNA - guide RNA

Abstract

Guide RNAs are encoded in maxicircle and minicircle DNA of trypanosome mitochondria. They play a pivotal role in RNA editing, a process during which the nucleotide sequence of mitochondrial RNAs is altered by U-insertion and deletion. Guide RNAs vary in length from 35 to 78 nucleotides, which correlates with the variation in length of the three functionally important regions of which they are composed: (i) a 4-14 nucleotide 'anchor' sequence embedded in the 5' region, which is complementary to a target sequence on the pre-edited RNA downstream of an editing domain, (ii) a middle part containing the editing information, which ranges from guiding the insertion of just one U into one site to that of the insertion of 32 Us into 10 sites, and (iii) a 5-24 nucleotide 3' terminal oligo[U] extension. Moreover, a variable uridylation site creates gRNAs containing a varying segment of editing information for the same domain. Comparison of different guide RNAs demonstrates that, besides the U-tail, they have no obvious common primary and secondary sequence motifs, each particular sequence being unique. The occurrence in vivo and the synthesis in vitro of chimeric molecules, in which a guide RNA is covalently linked through its 3' U-tail to an editing site of a pre-edited RNA, suggests that RNA editing occurs by consecutive transesterification reactions and is evidence that the guide RNAs not only provide the genetic information, but also the Us themselves.

Introduction

Mitochondrial (mt) DNA of trypanosomes consists of a network of catenated maxicircles and minicircles (reviewed in [1,2]). The maxicircle contains genes for the mt ribosomal RNAs and for a number of subunits of the respiratory chain complexes ([3, 4]; see Fig. 1). A number of maxicircle genes are expressed in a very unconventional manner, since the nucleotide sequence of primary transcripts is altered post-transcriptionally via a U-insertion/deletion process, which we have called RNA editing ([5], reviewed in [6-8]). A systematic analysis of mt transcripts of the insect trypanosome C. fasciculata and the lizard trypanosome L. tarentolae has revealed a

Fig. 1. Composite gene map of the trypanosome maxicircle (scale bar: 1 kb). The position of the genes is indicated by boxes. Genes labelled on the map: 12S, ND8, 9S, ND7, G2, CYb, cox3, MURF1, ND1, MURF4, G3, MURF2, cox2, ND4, rpS12, cox1, G4, G5, ND5. Abbreviations: 12S and 9S code for the mt ribosomal RNAs; cox, cytochrome c oxidase; ND, NADH dehydrogenase; CYb, apocytochrome b; rpS, small subunit ribosomal protein; MURF, Maxicircle Unidentified Reading Frame. The arrows indicate RNA segments that are edited in C. fasciculata (see Fig. 2); grey areas produce (extensively) edited RNAs in T. brucei. G = G-rich; these areas are most likely not yet identified cryptic genes, the transcripts of which undergo extensive editing. The variable region [VR] has been omitted.

number of edited RNAs, for which editing is limited to relatively small segments [7,9,10] (see Fig. 2). In the African trypanosome T. brucei, however, editing can be much more spectacular since, besides RNAs with limited editing, five extensively edited RNAs have been found, of which around 50% of the nucleotides are derived from editing (Table 1) [11-15]. Such an RNA has recently also been identified in L. tarentolae [16]. The nucleotide sequence of these RNAs is totally different from that of the corresponding 'cryptic' [7] gene and it could only be found by

Fig. 2. RNA editing in C. fasciculata (scale bar: 100 nts). For abbreviations, see legend to Fig. 1. The sequence of the edited segments is indicated; - = deleted U; CAPITAL letters: original sequence; lower case letters: inserted u's.

Table 1. Editing in T. brucei

Transcript    Number of Us inserted/deleted    Number of editing sites
cox2          4/0                              3
MURF2         26/4                             11
CYb           34/0                             13
rpS12         132/28                           77
ND8           259/46                           127
ATPase6       448/28                           185
cox3*         558/40                           > 200
ND7           551/88                           291

* The sequence of the 5' end of edited cox3 RNA has not been determined.
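The figures in Table 1 lend themselves to a quick arithmetic check. The following is a minimal Python sketch that re-enters the table values and derives, per transcript, the net change in length and the mean number of U events per editing site. The names TABLE_1 and summarize are illustrative only, and the cox3 site count, given in the table only as "> 200", is entered as its lower bound, which is an assumption.

```python
# Table 1 (editing in T. brucei): (Us inserted, Us deleted, editing sites).
# Assumption: cox3 is listed as "> 200" sites, so 200 is used as a lower bound.
TABLE_1 = {
    "cox2": (4, 0, 3),
    "MURF2": (26, 4, 11),
    "CYb": (34, 0, 13),
    "rpS12": (132, 28, 77),
    "ND8": (259, 46, 127),
    "ATPase6": (448, 28, 185),
    "cox3": (558, 40, 200),
    "ND7": (551, 88, 291),
}


def summarize(inserted, deleted, sites):
    """Return (net length change in nt, mean U events per editing site)."""
    net = inserted - deleted
    per_site = (inserted + deleted) / sites
    return net, per_site


for name, row in TABLE_1.items():
    net, per_site = summarize(*row)
    print(f"{name:8s} net +{net} nt, {per_site:.2f} U events per site")
```

For cox3 the per-site value is an upper estimate, since the true number of sites exceeds 200.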

making use of existing, partially-edited RNAs as a 'go-between'. This illustrates the difficulties investigators have when attempting to identify a candidate gene with only the mt DNA sequences in hand and it predicts that gene maps such as the one shown in Fig. 1 have only temporary value and will be made obsolete by the identification of more cryptic genes. The characteristics of some of the above-mentioned, partially edited RNAs suggest that editing occurs in a 3' to 5' direction in a stepwise fashion, one single site at a time and, within a site, one single nucleotide at a time [17]. An unexpected bonus from these analyses has been the identification, both in L. tarentolae and T. brucei maxicircle DNA, of a gene encoding a small subunit ribosomal protein. This puts the trypanosomes at the level of other eukaryotic microbes, such as yeast [18] and Neurospora [19], which also possess a mitochondrially encoded mitoribosomal protein, and it adds some more direct evidence to the hypothesis that trypanosomes have a mt protein synthetic machinery, although mt ribosomes have still not been identified (see [20]).

Guide RNA molecules providing the editing information are transcribed from maxicircles and minicircles

The information for the editing process is provided by guide (g)RNAs. The first evidence for

their existence was provided by Blum et al., who showed in a pioneering study that seven small, mostly intergenic sections of the L. tarentolae maxicircle contain sequences which are complementary to edited RNA segments if G:U base pairing is allowed [21] (see Fig. 4, next section). They further demonstrated that these regions are transcribed into RNA species about 70 nucleotides long, which were called 'guide' RNAs, in a cautious attempt to best describe their supposed role in RNA editing (see below). Based on the complementarity principle, a large number of gRNAs and/or their genes have been discovered since then, not only in L. tarentolae [21-23], but also in other trypanosome species, such as C. fasciculata [24], T. brucei [12-15, 25, 26], T. equiperdum [27] and T. evansi [28], and this number is growing rapidly. The ease with which the gRNAs and their coding sequences can be found, together with the fact that the location of gRNA genes and the potential for base pairing in gRNA:mRNA hybrids are conserved between L. tarentolae and C. fasciculata, species with very similar patterns of editing [24], strongly indicates that gRNAs participate in the editing process. Some more direct evidence and a discussion of possible mechanisms of gRNA function will be presented in the following sections. One of the more spectacular results of the search for gRNAs has been the discovery that they are often encoded in minicircle DNA, the smaller of the two classes of mt DNA circle. Historically, the function of minicircles has long been enigmatic, as they were thought to play a role in the ordered segregation of maxicircles during mt DNA replication [1, 29]. In L. tarentolae, the organism in which the first minicircle encoded gRNA was identified [22], one single gRNA gene is found per minicircle at a distance of about 250 nucleotides from a conserved dodecamer sequence [22, 23] (Fig. 3A).
This dodecamer, which is present in all species, is envisioned as playing a role in minicircle and maxicircle replication [29, P. Sloof et al.; unpublished observations]. Both minicircle- and maxicircle-encoded gRNAs seem to be primary

Fig. 3. Guide RNA gene map of L. tarentolae and T. brucei minicircles. Minicircles from L. tarentolae (A) and T. brucei (B). The universal dodecamer sequence (GGGGTTGGTGTA) is possibly involved in minicircle replication and present in all trypanosomes. It is part of a larger area of species-specific conservation, which is not indicated. The DNA bend contains a regularly phased homopolymeric oligo dA:dT motif, which causes anomalous electrophoretic behaviour of the DNA fragment that contains it. The transcriptional direction of the gRNA genes is indicated.

transcripts, as judged from the fact that they can be capped with guanylyltransferase from Vaccinia [25,27,30], an enzyme which needs a 5'-di- or triphosphate. It is estimated that in L. tarentolae a maximum of 20 different minicircle classes exists [32], suggesting that the total number of minicircle gRNAs in this organism is of the same order. Minicircles of Trypanosoma species contain three gRNA genes, each flanked by 18 bp inverted repeats separated by approximately 110 nucleotides (Fig. 3B). Their gRNAs are also primary transcripts. In T. brucei a minimum of 250 different minicircle sequence classes exists [32], bringing the number of possible gRNA genes to at least 750 [25]. The high gRNA coding potential of the T. brucei minicircle correlates well with the large number of editing sites within maxicircle-encoded RNAs (compare Fig. 2 to Table 1). Strikingly, the maxicircle in all species studied so far exclusively encodes gRNAs involved in editing of RNAs that display limited editing, which implies that the gRNA genes whose products function in the editing of extensively edited transcripts are located in minicircle DNA. The reason for this division of labor is unclear at present. Very little is known about the sequences that regulate gRNA transcription. Alignment of the

five minicircle gRNA genes identified thus far in L. tarentolae has revealed the presence of two potentially interesting sequence motifs, CCAAT around -95 with respect to the start of transcription, and CGATAGGTTGTA at about -35 [23]. These motifs are not absolutely conserved with respect to sequence and location and they are absent from the upstream region of maxicircle genes. Transcription of the Trypanosoma minicircle gRNAs studied so far initiates at the first purine within the sequence 5'-RYAYA-3', about 31 nucleotides from the upstream repeat [15, 25-28]. However, one of the three hypothetical maxicircle gRNA genes of T. brucei [24] lacks an RYAYA sequence in the 5' region, and none of them is found in the vicinity of the repeat sequence (P. Sloof et al.; unpublished observations). L. tarentolae and C. fasciculata gRNAs do not contain an RYAYA motif at the 5' end. Another potentially interesting motif is a phased homopolymeric oligo dA:dT motif, resulting in a DNA bend (Fig. 3). This motif, often found in regions that are important for transcription initiation [33], is present in both the L. tarentolae and T. brucei minicircles. It is attractive to assume that it also plays a role in minicircle gRNA transcription. It is absent, however,

from the coding region of the T. brucei maxicircle (P. Sloof et al.; unpublished observations). The consistency with which possible minicircle regulatory motifs fail to show up upstream of maxicircle gRNA genes could mean that different regulatory sequences are employed for maxicircle and minicircle gRNA transcription. Alternatively, it could simply underline our ignorance of the gRNA transcription process. Only time and further experiments can tell. Some trypanosome species, such as T. equiperdum and T. evansi, do not have a functional mitochondrion. In T. equiperdum a deletion in the maxicircle has removed essential genes [27, 34]; in T. evansi the maxicircle is missing altogether [35]. The minicircle population is homogeneous in both species. Strikingly, the T. equiperdum and the T. evansi minicircles contain the same three gRNA genes, flanked by the inverted repeats mentioned above [27,28]. These gRNA genes are virtually 100% conserved between the two species, whereas the remainder of the minicircles is much more divergent in sequence. The question arises why a few minicircle gRNA genes are conserved in species which lack many or all maxicircle RNAs. At present the answers can only be speculative. If the function of gRNAs is exclusively related to editing, then the conclusion must be that in T. equiperdum and even in T. evansi they are utilized for that purpose. The identity and origin of their target RNA(s) (mitochondrially or even nuclearly encoded) remain to be determined.
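The start-site rule quoted in this section for Trypanosoma minicircle gRNAs (initiation at the first purine within 5'-RYAYA-3') can be written as a simple pattern search. The sketch below is only an illustration in Python: the function name find_ryaya_starts and the test sequences are hypothetical, the search covers one strand only, and a match says nothing about whether a site is actually used for transcription.

```python
import re

# IUPAC codes: R = purine (A or G), Y = pyrimidine (C or T).
# A lookahead is used so that overlapping motifs are all reported.
RYAYA = re.compile(r"(?=[AG][CT]A[CT]A)")


def find_ryaya_starts(dna):
    """Return the 0-based positions of every 5'-RYAYA-3' match on this strand.

    Each position is that of the leading R, i.e. the first purine of the motif.
    """
    return [m.start() for m in RYAYA.finditer(dna.upper())]


# Hypothetical sequences, for illustration only.
print(find_ryaya_starts("TTGTATACC"))
print(find_ryaya_starts("ATATATA"))
```

In a fuller analysis one would also check the distance of each match from the upstream 18 bp inverted repeat (about 31 nucleotides according to the text), which is not attempted here.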

gRNAs are highly variable

As outlined in the previous section, gRNAs possess a stretch of nucleotides complementary to an edited RNA segment. This region can be subdivided into two parts: (i) a 5' end segment containing the gRNA anchor sequence capable of base pairing with unedited RNA downstream of an editing domain, and (ii) a region situated immediately 3' to the anchor containing the information for the editing of that domain. This is illustrated in Fig. 4, which shows a schematic

representation of a gRNA together with a number of gRNA sequences derived from our work on C. fasciculata [24]. The variation in size of both these areas is substantial. For example, the CYb-II gRNA only possesses a marginal anchor of 4 (+ 1?) nucleotides, the 5' end segment being only 9 nucleotides long. Other gRNAs, including the minicircle-encoded gRNAs that can form an anchor duplex with unedited RNAs, have more solid anchor sequences - up to a maximum of 14 nucleotides - and a total possible length of the 5' segment of 22 nucleotides. Minicircle-encoded gRNAs are mostly involved in the editing of extensively edited RNAs [12-16,25,26]. Since a large part of the sequence of these RNAs is generated during the editing process, requiring many different gRNAs, it is difficult at present to estimate the length of most gRNA anchors, as this would require precise knowledge of the extent to which the gRNA sequences overlap (see [25,26]). Determination of the 5' end with the aid of reverse transcriptase has revealed one major extension product for each gRNA studied, indicating that they have a homogeneous 5' end [21-27]. The variation in size of the informational part of the gRNAs is even more pronounced, particularly for the maxicircle-encoded gRNAs. Two of the extremes are shown in Fig. 4B. The MURF2-I gRNA informational part, in fact, consists of only one nucleotide guiding the insertion of just one U. The informational part of CYb-II gRNA, on the other hand, measures 44 nucleotides and guides the insertion of 32 Us into 10 different sites. Intermediate values are found for the other gRNAs, with the restriction that the transition point between anchor and informational part cannot be precisely determined for most minicircle-encoded gRNAs, as explained above. It should be noted that the informational part of some gRNAs contains nucleotides that cannot base pair with the corresponding nucleotide in the fully edited mRNA. Two of the C. 
fasciculata gRNA:mRNA hybrids contain C:A mismatches, with the C in the gRNA (as checked by direct RNA sequencing, see Fig. 4B and [24]). Occasional odd base combinations have also

Fig. 4. Important functional elements of gRNAs. The nucleotide sequence of three gRNA:mRNA hybrids in C. fasciculata (MURF2-I, ND7 [5'] and CYb-II) is given in (B) and is schematically represented in (A), which shows the editing domain on the mRNA and the information and anchor regions of the gRNA. Conventional base pairing is indicated by vertical lines and G:U base pairing by asterisks. A C:A base pair in the ND7 [5'] and CYb-II hybrids has been printed in bold type. Lower case Us in the mRNA are derived from insertion guided by the lower case As and Gs in the gRNA sequence. A U-rich tri- or tetranucleotide, possibly involved in transcription termination and/or uridylation, is underlined. Uridylation sites found for ND7 [5'] gRNA are indicated by black triangles. Two gRNAs are required for editing of the MURF2 and CYb RNAs in C. fasciculata (and L. tarentolae). They have been called I and II, respectively, in line with the assumption that the gRNAs involved in editing of the 3' part of an editing domain act first (see [21]). Dashed lines at the ends of a gRNA indicate that the 3' or 5' terminal nucleotide has not been precisely identified.
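The pairing notation of Fig. 4 (vertical lines for conventional pairs, asterisks for G:U pairs) can be reproduced for any ungapped duplex with a few lines of Python. The sketch below is illustrative only: the function name pairing_line is hypothetical, and the two sequences are a short model duplex transcribed from the editing-model figure (Fig. 5), so they should be treated as an assumption rather than as verified sequence data.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def pairing_line(mrna_5to3, grna_3to5):
    """Annotate an ungapped duplex position by position.

    '|' marks a conventional pair, '*' a G:U pair and ' ' a non-pairing
    combination (for example the C:A combinations mentioned in the text).
    """
    if len(mrna_5to3) != len(grna_3to5):
        raise ValueError("both strands must have the same length")
    marks = []
    for m, g in zip(mrna_5to3.upper(), grna_3to5.upper()):
        if (m, g) in WATSON_CRICK:
            marks.append("|")
        elif (m, g) in WOBBLE:
            marks.append("*")
        else:
            marks.append(" ")
    return "".join(marks)


# Model duplex (assumed, see lead-in); lower case = inserted u / guiding a, g.
mrna = "GuAuuuUGAuAGAUUAGAUUA"  # edited mRNA, written 5'->3'
grna = "CGUaagACUaUCUAAUCUGAU"  # gRNA, written 3'->5'
print(mrna)
print(pairing_line(mrna, grna))
print(grna)
```

Counting the asterisks in the returned string gives the number of G:U pairs, and a blank indicates a position that the editing machinery would have to tolerate as a mismatch.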

been observed in a number of minicircle-encoded gRNA:mRNA duplexes in T. brucei [14,15,26]. Although current models of editing do not take this into account (see next section), the editing machinery is apparently capable of coping with a limited number of mismatches. Direct sequence analysis of a number of gRNAs in L. tarentolae [30] and T. equiperdum [27] has demonstrated that they contain stretches of non-genomically encoded U residues

at the 3' terminus varying between 5 and 24 nucleotides. U tails appear to be present on all gRNAs, which has allowed us to synthesize oligo dA-primed cDNAs, and amplify them with the aid of PCR and gRNA-specific 5'-oligonucleotides (G.J. Arts et al.; unpublished observations). These studies have revealed yet another source of heterogeneity in gRNA length. As shown in Fig. 4 for C. fasciculata ND7 [5'] gRNA, the U-tail is found to be hooked up to the

remainder of the sequence at many different positions. This is not a PCR artifact since, in control experiments with synthetic gRNAs, no sequences other than input sequences were generated. Rather, one might be dealing with a sloppy uridylation enzyme that lacks a fixed site of action. Indeed, comparison of the 3' end of different gRNAs in L. tarentolae and C. fasciculata failed to provide evidence for a clear consensus motif. A U-rich tri- or tetranucleotide sequence, which might be considered as potentially interesting in this respect, is variable in sequence and in location relative to the uridylation sites observed [21, 24] (underlined in Fig. 4). After all this, it comes as no surprise that considerable differences in length exist between different gRNAs on Northern blots and that considerable size heterogeneity occurs, even within one gRNA species (35-78 nucleotides) [21-27,30, G.J. Arts et al.; unpublished observations]. It is not clear, in most cases, what the precise contribution of each of the composing gRNA regions is to the total length of an individual gRNA. For example, a small informational part could be compensated for by a longer than average oligo[U] tail and/or longer 5' and 3' spacer sequences (and vice versa). The message from these considerations is clearly that, besides the U-tail, gRNAs may have little in common, since the anchor and the informational part contain sequences that are specific for each individual gRNA. Moreover, possible common secondary structures, in which the U-tail is base paired to the purine-rich gRNA sections [30], can only be drawn for those gRNAs with long informational segments. Nevertheless, however variable they may be, the editing machinery must recognize all gRNAs as such. Possible models for gRNA function that attempt to take this into account will be discussed in the next section.

The mechanism of action of gRNAs; the properties of chimeric molecules

A number of models for the mechanism of action of gRNAs in the editing process have been put

forward during the last couple of years [21,30,36-38]. All of these models agree on a crucial role for the gRNA anchor sequence in the initial phase of the editing process, via the formation of a duplex with the corresponding sequence of the pre-edited RNAs immediately downstream of an editing site (Fig. 5A, B, step 1). In a later adaptation of this model it was proposed that the U-tail of the gRNAs might also take part in the recognition process, since it can base pair with the purine-rich section containing the editing sites of pre-edited RNA [30]. The RNAs with small editing regions are not particularly purine-rich in this area, however, so the contribution of the U-tail might not be essential. In the first model of Blum et al. [21], it was further proposed that the U-insertion process may proceed through the consecutive action of enzymes, such as an endonuclease cutting the pre-edited RNA at the first non-base paired nucleotide 5' of the annealed anchor sequence, a terminal uridylyl transferase adding a U (or Us) derived from UTP to the 3' end of the 5' cleavage product, and an RNA ligase to join the two halves (Fig. 5A). For a deletion event a similar scheme can be drawn, the difference lying in the particular sequences of the pre-edited mRNAs, from which, following the endonuclease cut, excess U-residues would be removed by specific U-exonucleases (not shown in Fig. 5A, see [21]). As a result, the duplex between the pre-mRNA and the gRNA is extended by a number of base pairs, either via direct guiding of the insertion/deletion process by A:U and G:U base pairing [21], or via a random insertion/deletion process during which base pairing between gRNA and correctly edited sequences stops further editing [36]. The stepwise addition of Us visualized in Fig. 5A is predicted by the properties of the partially edited RNAs mentioned in the introduction. The reactions would be repeated until all information present in the gRNA has been utilized.
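The outcome of the directly guided variant of this scheme can be sketched as a walk along the pre-mRNA from the anchor in the 3' to 5' direction. The Python below is a deliberately simplified illustration, not a description of the enzymology: the names pairs and guided_edit are hypothetical, the walk is greedy (any base that can pair is taken as paired), each cleavage, U-addition and ligation cycle is collapsed into one step, and the sequences are a model duplex transcribed from Fig. 5 and should be regarded as assumed.

```python
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def pairs(m, g):
    """True for A:U, G:C and G:U combinations."""
    return (m, g) in PAIRS


def guided_edit(pre_mrna, grna):
    """Edit pre_mrna (5'->3') as dictated by grna (5'->3').

    The 5' end of the gRNA is assumed to pair with the 3' end of the mRNA
    segment (the anchor). Inserted Us are returned in lower case.
    """
    m = pre_mrna.upper()[::-1]  # walk the mRNA from its 3' end
    out = []
    i = 0
    for base in grna.upper():   # walk the gRNA from its 5' end
        # Deletion: trim mRNA Us that face a gRNA base they cannot pair with.
        while i < len(m) and m[i] == "U" and base not in "AG" and not pairs(m[i], base):
            i += 1
        if i < len(m) and pairs(m[i], base):
            out.append(m[i])    # duplex extended by one base pair
            i += 1
        elif base in "AG":
            out.append("u")     # insertion of one U guided by A or G
        else:
            break               # nothing this gRNA base can guide
    out.extend(m[i:])           # mRNA upstream of the duplex is left unedited
    return "".join(out)[::-1]


pre_mrna = "GAUGAAGAUUAGAUUA"               # model pre-edited segment, 5'->3'
grna = "CGUAAGACUAUCUAAUCUGAU"[::-1]        # figure shows 3'->5'; reversed here
print(guided_edit(pre_mrna, grna))
```

With these model sequences the walk inserts one U at the first site, three at the second and one at the third, counting from the anchor, in line with the stepwise 3' to 5' progression described above.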
Fig. 5. Models for gRNA functioning in RNA editing. (A) Enzymatic model (endonuclease, U-transferase with UTP, RNA ligase). (B) Transesterification model, proceeding via a chimeric molecule. For details, see text.

In support of the enzymatic model, some of the required enzymatic activities appear to be present in (mt) extracts from trypanosomes [39,40,41]. Later models, inspired by further experimentation and evolutionary considerations, have been put forward independently by Blum et al. [37] and Cech [38]. The essence of these models is shown in Fig. 5B. The initial stages and (evidently) the end result of the editing process are essentially the same as those of the model of Fig. 5A, predicting the formation of gRNA:pre-mRNA hybrids through base pairing with the anchor sequence and extension of the base paired region by the editing process. The basic difference is the assumption that RNA strands are broken and resealed via a sequence of transesterification reactions in which both the U-tail of the gRNA and the pre-mRNA participate (Fig. 5B). In the edited RNA the inserted Us are derived from the U-tail of the gRNA and not from UTP (at least not directly). A number of observations argue in favour of the transesterification model. First, since RNA splicing is also carried out via consecutive transesterification reactions, the implication would be that RNA editing and splicing are carried out

according to the same mechanistic principles. It is attractive to assume that editing and splicing have a common evolutionary background (summarized in [37,38]). In this view U-deletion would be analogous to splicing and U-insertion to reverse splicing. In the splicing process, too, small RNAs help to select the relevant RNA segments, the difference, of course, being that only a few snRNAs participate in the splicing process at numerous sites and that, as far as we know, the snRNAs themselves do not directly participate in transesterification reactions. A second point that strongly supports the model of Fig. 5B is that the predicted intermediates in the transesterification reactions, i.e. chimeric molecules consisting of a guide RNA covalently linked through its 3' U-tail to an editing site of a pre-edited RNA, do indeed exist in trypanosome mt RNA ([37], G.J. Arts et al.; unpublished observations). A sample of chimeric molecules of ND7 [5'] gRNA and pre-mRNA as found via PCR analysis in mt RNA from C.

fasciculata is shown in Fig. 6. As for chimeric molecules in L. tarentolae [37], the predominant linkage site between gRNA and pre-edited RNA is the first editing site, which might indicate that the processing of these molecules to other products is, in fact, rate limiting. Chimeric molecules are not formed artificially during the mt RNA isolation procedure, since they also occur in RNA directly obtained from living trypanosomes. Neither are they a PCR artifact, since the link between guide and pre-mRNA is found almost exclusively in editing sites. Unexpected combinations between the wrong partners only occur if some form of base pairing is possible between the gRNA and the pre-mRNA: 'false anchors'. Importantly, gRNAs with only part of the guiding information required for an editing domain are also found in the collection, indicating that they do indeed participate in the editing process. This could imply that the partially edited RNAs mentioned above are not the result of a prematurely terminated stepwise editing process with a full length gRNA, but rather of the action of one of these shorter gRNAs. Last, but not least, it should be noted that in some of the chimeric molecules (e.g. those shown in Fig. 6) the number of Us in the connecting sequence is less than the number required for a particular site

Fig. 6. Chimeric molecules for ND7 [5'] in C. fasciculata. Sequences shown are derived from cloned PCR products. The edited mRNA sequence with numbered editing sites is shown at the bottom of the figure, and the gRNA sequence at the top. The horizontal lines represent single phosphodiester bonds and are used for alignment purposes only. Oligonucleotides used in the PCR-amplification reactions are indicated by arrows.

(and can even be zero). The model of Fig. 5B cannot explain these molecules and they could therefore be non-functional side products. Their existence does suggest, however, that they are active in transesterification reactions in vivo. Recent experiments with synthetic CYb-I gRNA and pre-mRNA have shown that small gRNA molecules of 29 nucleotides can form chimerics in vitro upon addition of a T. brucei mt extract [42]. This gRNA contains a long anchor of 17 nucleotides, the information for only 6 Us, and has a 6 nucleotide U-tail. It has no other sequences. The results from experiments in vitro imply that such small, 'minimal' guide RNAs are indeed functional and that no other sequences are required. Surprisingly, C addition to the tail of the synthetic gRNA did not significantly affect its activity in vitro in the transesterification assay, as long as a free 3' OH group remains available. This supports the inference from the U-less chimeric molecules discussed above, that a 3' terminal U residue is not per se required in the transesterification reaction. Similar conclusions were drawn by Koslowsky et al. [43], who studied in vitro chimeric formation with endogenous gRNAs and synthetic pre-edited RNA with mt extracts from T. brucei.
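The bookkeeping implied by the transesterification model of Fig. 5B, in which the inserted Us come from the gRNA tail, can be made explicit with a small Python sketch. Everything named here is hypothetical (the GuideRNA class, the function insert_by_transesterification, and the sequences); the sketch only tracks which residues end up where after the two reactions and makes no claim about catalysis or about the proteins involved.

```python
from dataclasses import dataclass


@dataclass
class GuideRNA:
    body: str     # anchor plus informational part, written 5'->3'
    tail_us: int  # length of the 3' oligo[U] tail


def insert_by_transesterification(pre_mrna, site, grna, n_us):
    """Insert n_us Us at index `site` of the 5'->3' pre-mRNA.

    Step 1: the 3' OH of the gRNA U-tail attacks the editing site, releasing
            the mRNA 5' fragment and forming a gRNA/3'-fragment chimera.
    Step 2: the 3' OH of the 5' fragment attacks the chimera n_us residues
            upstream of the junction, so that n_us Us stay with the mRNA.
    Returns (chimera, edited mRNA, gRNA with shortened tail).
    """
    if not 0 <= n_us <= grna.tail_us:
        raise ValueError("the U-tail cannot donate that many Us")
    five_prime, three_prime = pre_mrna[:site], pre_mrna[site:]
    chimera = grna.body + "u" * grna.tail_us + three_prime  # after step 1
    edited = five_prime + "u" * n_us + three_prime          # after step 2
    spent = GuideRNA(grna.body, grna.tail_us - n_us)
    return chimera, edited, spent


# Hypothetical example: one U inserted at the first site of a model segment.
grna = GuideRNA(body="UAGUCUAAUCUAUCAGAAUGC", tail_us=6)
chimera, edited, spent = insert_by_transesterification("GAUGAAGAUUAGAUUA", 5, grna, 1)
print(chimera)
print(edited)
print(spent.tail_us)
```

Under this accounting the tail shortens by exactly the number of Us inserted, which is why the text suggests that an activity such as a terminal uridylyl transferase might be needed to recharge the tail. The composition of the minimal synthetic gRNA described above is consistent in the same simple sense: 17 + 6 + 6 = 29 nucleotides.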

Although the extent to which all the gRNA variants can participate in the complete sequence of editing reactions remains to be established, the in vitro results are in full support of the transesterification model of Fig. 5B. It is unclear at present what, if any, the function of proteins would be in the transesterification process. Possibilities range from a completely protein-dominated procedure, in which 'cut and paste' enzymes similar to the ones postulated to act in the enzymatic model of Fig. 5A would run the whole process, to the other extreme of an essentially RNA-catalysed process, in which maybe other small RNAs participate, besides gRNAs. Even in this last case, proteins would still be needed, e.g. in order to keep the RNAs in the correct conformation, to recharge the tail of the gRNAs during insertion (TUTase!) or to trim away the excess Us from the pre-mRNA in the case of a deletion event (U-exonuclease). For a more detailed discussion, see [37, 38]. The in vitro transesterification experiments [42] do indeed indicate that proteins from an mt extract of T. brucei are required, but their precise identity and role remain to be worked out.

Perspectives

The elucidation of the mechanism of RNA editing has long depended exclusively on the study of nucleotide sequences of DNA and RNA. Notwithstanding that, a lot of information has been gathered and at least the initial mystery concerning the origin of the edited sequence has been solved. However, the next logical step must be to put the ideas to the experimental test and perform a biochemical analysis similar to that performed for the other basic processes of gene expression, such as DNA replication and transcription, RNA splicing and translation, etc. Although tantalizingly difficult, the situation is not without prospects, now that a system for the transesterification reaction in vitro has been developed, in which important cis-acting elements of gRNAs and pre-mRNAs can be further identified. On top of that one could start to isolate and purify other essential trans-acting protein and RNA factors. With purified components one might be able to fully reconstitute the RNA editing process in vitro, and with the genes of these components in hand one could hope to manipulate the editing process in vivo, now that reliable simple transformation techniques have also been developed for trypanosomes [44,45]. Furthermore, the sequences of the components involved might possibly reveal more about the evolutionary origin of the editing process, for example when homologies are found to components of the splicing process. The next couple of years (decades?) promise to be exciting.

Acknowledgements

I thank the former and present members of the trypanosome group for their dedication, Professors P. Borst and H.F. Tabak for continuing interest and stimulating discussions, Dr. P. Sloof and D. Speijer for critical reading of the manuscript and E. Vlugt-van Daalen and W. van Noppen for expert help in the preparation of the manuscript. I am greatly indebted to G.J. Arts for allowing me to cite some of his unpublished work and to H. van der Spek who designed the first version of some of the figures and of Table 1. This research is supported by the Netherlands Foundation for Chemical Research (SON), which is subsidized by the Netherlands Foundation for Scientific Research (NWO).

References

1. Borst P & Hoeijmakers JHJ (1979) Plasmid 2: 20-40.
2. Simpson L (1986) Int. Rev. Cytol. 99: 119-179.
3. Benne R (1985) Trends Genet. 1: 117-121.
4. Simpson L (1987) Ann. Rev. Microbiol. 41: 363-382.
5. Benne R, Van den Burg J, Brakenhoff JPJ, Sloof P, Van Boom JH & Tromp MC (1986) Cell 46: 819-826.
6. Benne R (1990) Trends Genet. 6: 177-181.
7. Simpson L & Shaw JM (1989) Cell 57: 355-366.
8. Stuart K (1991) Trends Biochem. Sci. 16: 68-72.
9. Van der Spek H, Van den Burg J, Croiset A, Van den Broek M, Sloof P & Benne R (1988) EMBO J. 7: 2509-2514.
10. Van der Spek H, Speijer D, Arts GJ, Van den Burg J, Van Steeg H, Sloof P & Benne R (1990) EMBO J. 9: 257-262.
11. Feagin JE, Abraham JM & Stuart K (1988) Cell 53: 413-422.
12. Bhat GJ, Koslowsky DJ, Feagin JE, Smiley BL & Stuart K (1990) Cell 61: 885-894.
13. Koslowsky DJ, Bhat GJ, Perrolaz AL, Feagin JE & Stuart K (1990) Cell 62: 901-911.
14. Read LK, Myler PJ & Stuart K (1992) J. Biol. Chem. 267: 1123-1128.
15. Souza AE, Myler PJ & Stuart K (1992) Mol. Cell. Biol. 12: 2100-2107.
16. Maslov DA, Sturm NR, Niner BM, Gruszynski ES, Paris M & Simpson L (1992) Mol. Cell. Biol. 12: 56-67.
17. Sturm NR & Simpson L (1990) Cell 61: 871-878.
18. Butow RA, Perlman PS & Grossman LI (1985) Science 228: 1496-1501.
19. Burke JM & RajBhandary UL (1982) Cell 31: 509-520.
20. Benne R & Sloof P (1987) Evolution of the mitochondrial protein synthetic machinery. BioSystems 21: 51-68.
21. Blum B, Bakalara N & Simpson L (1990) Cell 60: 189-198.
22. Sturm NR & Simpson L (1990) Cell 61: 879-884.
23. Sturm NR & Simpson L (1991) Nucleic Acids Res. 19: 6277-6281.
24. Van der Spek H, Arts GJ, Zwaal RR, Van den Burg J, Sloof P & Benne R (1991) EMBO J. 10: 1217-1224.
25. Pollard VW, Rohrer SP, Michelotti EF, Hancock K & Hajduk SL (1990) Cell 63: 783-790.
26. Koslowsky DJ, Riley GR, Feagin JE & Stuart K (1992) Mol. Cell. Biol. 12: 2043-2049.
27. Pollard VW & Hajduk SL (1991) Mol. Cell. Biol. 11: 1668-1675.
28. Gajendran N, Vanbecke D, Bajyana Songa E & Hamers R (1992) Nucleic Acids Res. 20: 614.
29. Englund PT, Hajduk SL & Marini JC (1982) Ann. Rev. Biochem. 51: 695-726.
30. Blum B & Simpson L (1990) Cell 62: 391-397.
31. Muhich ML & Simpson L (1986) Nucleic Acids Res. 14: 5531-5556.
32. Steinert M & Van Assel S (1980) Plasmid 3: 7-17.
33. Hagerman PJ (1990) Ann. Rev. Biochem. 59: 755-781.
34. Frasch A, Hajduk S, Hoeijmakers J, Borst P, Brunel F & Davison J (1980) Biochim. Biophys. Acta 607: 397-410.
35. Borst P, Fase-Fowler F & Gibson WC (1987) Parasitol. 23: 31-38.
36. Decker CJ & Sollner-Webb B (1990) Cell 61: 1001-1011.
37. Blum B, Sturm NR, Simpson AM & Simpson L (1991) Cell 65: 543-550.
38. Cech TR (1991) Cell 64: 667-669.
39. Bakalara N, Simpson AM & Simpson L (1989) J. Biol. Chem. 264: 18679-18686.
40. Harris ME, Moore DR & Hajduk SL (1990) J. Biol. Chem. 265: 11368-11376.
41. Zwierzynski T, Widmer G & Buck GA (1989) Nucleic Acids Res. 17: 4647-4660.
42. Harris ME & Hajduk SL (1992) Cell 68: 1-20.
43. Koslowsky DJ, Göringer HU, Morales TH & Stuart K (1992) Nature 356: 807-809.
44. Bellofatto V & Cross GAM (1989) Science 244: 1167-1169.
45. Ten Asbroek ALMA, Ouellette M & Borst P (1990) Nature 348: 174-175.
