PERSPECTIVES

Intron phylogeny: a new hypothesis

T. CAVALIER-SMITH

The three major classes of intron are clearly of unequal antiquity. Structured (often self-splicing and sometimes mobile) introns are the most ancient, probably dating (at least for group I) from the ancestral (eubacterial) cell 3500 million years ago, and were originally restricted to tRNA. Protein-spliced introns (usually in tRNA) probably evolved from them by a radical change in splicing mechanism in the common ancestor of eukaryotes and archaebacteria, perhaps only about 1700 million years ago. Spliceosomal introns probably evolved from group-II-like self-splicing introns after the origin of the nucleus between 1700 and 1000 million years ago, and were probably mostly inserted into previously unsplit protein-coding genes after the origin of mitochondria 1000 million years ago.

The discovery of introns immediately led to two opposing theories of their evolutionary origin: either they were derived from transposable elements that inserted themselves into previously unsplit genes1,2, or alternatively they were involved in the initial assembly of primordial protein-coding genes during precellular evolution3,4, and genes were split from the outset. In order to decide between these two contrasting theories we have to understand the phylogenetic relationships among the three major classes of intron and to explain their present phylogenetic distribution, both among organisms and in different sorts of genes. The recent exciting discovery in Jeff Palmer's and David Shub's laboratories of group I self-splicing introns in a variety of eubacterial Leu tRNA genes5,6 suggests a radically new interpretation of intron phylogeny. When coupled with the recent clear demonstration by gene duplication molecular phylogenies that eubacteria are phylogenetically older than either archaebacteria or eukaryotes7, it gives strong support to the view put forward some years ago2,8,9 that self-splicing tRNA introns are much older than and are ancestral to nuclear spliceosomal introns and to the protein-spliced introns of eukaryotic and archaebacterial tRNA genes.

Introns are of unequal antiquity

To frame the intron origins debate in terms of 'introns early' versus 'introns late' is an unfortunate oversimplification, for the phylogenetic evidence appears to indicate that self-splicing tRNA introns arose very early (probably around 3500 million years ago), whereas spliceosomal introns (i.e. the nuclear introns spliced by RNA-containing snRNPs) arose much more recently (possibly between 1700 and 1000 million years ago). Protein-spliced introns, which are restricted to nuclear tRNA genes and to the tRNA and rRNA genes of archaebacteria (Table 1), may be of intermediate antiquity. Since group I introns are present in tRNAs of a wide diversity of eubacterial phyla, some of which probably also have tRNA introns more like group II introns10, it is possible that the divergence of group I and II introns (and perhaps other as yet uncharacterized types of self-splicing intron) occurred before the divergence of the eubacterial phyla, which probably occurred about 3500 million years ago9. The uncertainty about the time of origin of spliceosomal introns, which are never found in bacteria, arises because so little is known about the structure of protein-coding genes of the more primitive eukaryotes. What has become increasingly clear is the general nature of those early eukaryotes: eukaryotes are fundamentally divisible into two great superkingdoms (the primitively amitochondrial Archezoa, and the more recently derived mitochondrion-containing Metakaryota)8,9,11. Metakaryotes are the 'typical' aerobic eukaryotes (e.g. animals, plants, fungi, protozoa), with 80S ribosomes, mitochondria and peroxisomes. Archezoa by contrast contain only three little known groups: archamoebae such as Mastigamoeba or Pelomyxa, metamonads such as Giardia, and the microsporidia. Like bacteria, they have 70S ribosomes and lack both mitochondria and peroxisomes even though they have well-developed nuclei, cytoskeleton and endomembrane systems, and therefore are clearly eukaryotes. Evidence is accumulating that archezoa evolved into metakaryotes by evolving mitochondria (and probably also peroxisomes) from eubacterial symbionts, and that archezoa and archaebacteria evolved from a common ancestor that was in turn derived from a eubacterium. Archezoa probably originated about 700 million years before mitochondria12. A key question for understanding the origin of spliceosomal introns is whether or not they are present in archezoa. Their absence would indicate that they probably arose only after the origin of mitochondria in the first metakaryote (i.e. about 1000 million years ago), rather than soon after the origin of nuclei (around 1700 million years ago) as suggested earlier. No introns have been found in the seven or so protein-coding genes so far sequenced in the archezoan Giardia (mostly unpublished) even though these include four genes (encoding α- and β-tubulin, actin and glyceraldehyde phosphate dehydrogenase) that very commonly have introns in metakaryotes.
I predict that spliceosomal introns will prove to be absent from all archezoa, and suggest that they may have evolved from group-II-like self-splicing introns that were introduced into the first metakaryote by the eubacterial ancestors of mitochondria and/or peroxisomes about 1000 million years ago. By contrast, I expect protein-spliced introns to be present in at least some archezoan tRNA genes, since these are present in both archaebacteria and metakaryotes and were therefore probably also present in the first eukaryote. Figure 1 summarizes the phylogenetic implications of these suggestions.

TIG MAY 1991 VOL. 7 NO. 5 ©1991 Elsevier Science Publishers Ltd (UK) 0168-9479/91/$02.00

TABLE 1. Distribution of the major classes of intron

Bacteria
Splicing mechanism
Eubacteria Archaebacteria
Archezoa
Metakaryotes Nuclei
3'-phosphate/5'-OH
tRNA rRNA
~
tRNA
?
rRNAb
Mitochondria
3'-OH/5'-phosphate
Structured (commonly self-splicing) group I Leu tRNAa
mRNA rRNA
group II
Val tRNAc
~
?
group III
?
Spliceosomal
~
-
mRNA
Leu tRNA rRNA mRNA tRNA mRNA tRNA mRNA
mRNA

a Also in protein-coding (mRNA) genes of Escherichia coli T2, T4 and T6 phages and in Bacillus subtilis phage SPO1.
b Protozoa, fungi and algae only.
c Preliminary observations10 suggest the presence of non-group I structured introns, but whether they are really group II or some new type is unclear.

Origin and spread of spliceosomal introns

Since spliceosomal introns share a 3'-OH lariat splicing mechanism with group II self-splicing introns and splice site sequence similarities with group III introns, it is very probable that they evolved from a similar type of self-splicing intron13. Since the fossil record indicates that eukaryotes are only half as old as eubacteria, it is possible that group II self-splicing introns existed for about 1750 million years before they (or a similar self-splicing type) eventually evolved into spliceosomal introns and into the RNA component of snRNPs14. I earlier argued8 that the slow splicing of spliceosomal introns would have made it difficult for them to evolve in bacteria (or in mitochondria or chloroplasts) because if splicing is slow the coexistence of DNA and functional ribosomes in the same cell compartment would allow ribosomes to translate unspliced premessengers and make incorrect proteins with intron sequences or, if the introns had stop codons, truncated and chimeric proteins. But as soon as the nuclear envelope and pores evolved sufficiently to exclude ribosomes, spliceosomal splicing could evolve within the nucleus if suitable precursors were present8. If, as suggested above, archezoa lack both spliceosomal and self-splicing introns, such evolution could not have preceded the transfer of genes containing self-splicing introns into the nucleus from the eubacterial symbiont (derived from the α subdivision of the purple eubacteria) that was converted into the ancestral mitochondrion. Since several group II introns code for reverse transcriptase-like proteins, it is likely that retroposition was in some way involved in the spread and insertion of introns into new genes. Group II introns may be absent from the nucleus because, though originally present, they were all eventually converted into spliceosomal introns and genes for snRNA used in splicing, whereas group I introns remained in nuclear rRNA in at least some protozoa.
The ribozymal activity of introns would have quickly been lost by degenerative mutations as soon as they could be spliced in trans by snRNPs; if their ability to insert themselves into genes depended on reverse splicing, as is possible, this degeneration would have put a stop to further spreading. Some introns (i.e. those in highly conserved positions across the largest phylogenetic distances within metakaryotes) probably originated during the earliest phases of metakaryote/protozoan evolution, but others (e.g. in actin and tubulin15) seem to have been inserted relatively late in protozoan evolution or, in many cases, later still after the origin and diversification of the animal kingdom13. The conversion of group-II-like self-splicing introns to spliceosomal introns (which would require very few mutations13) therefore was probably spread over many hundreds of millions of years. The persistence of group I self-splicing introns in eubacteria for 3500 million years (shown for example by the presence of Leu tRNA group I introns in both Thermotoga and cyanobacteria, which probably diverged during the primary eubacterial radiation 3500 million years ago9,16) makes the view that spliceosomal introns were originally widely present in bacterial protein-coding genes, but were totally lost by 'streamlining' of the genome3,4, very much less tenable than it was when it was first proposed.

FIG. 1. The phylogeny of organisms and the five major steps in intron evolution. [Diagram: Empire Bacteria (Kingdom Eubacteria, Kingdom Archaebacteria) and Empire Eukaryota (Superkingdom Archezoa, Superkingdom Metakaryota) plotted against a time axis marked 0, 700, 1000, 1750 and 3500, with the labels 'Self-splicing introns', 'Protein-spliced introns' and 'snRNP-spliced introns'.] (1) Origin of group I and II and other types of 3'-OH/5'-phosphate self-splicing introns by diversification from primordial ribozymes and their selfish insertion into certain tRNA genes in the ancestral eubacterium (or its precellular ancestors9). (2) Conversion of self-splicing tRNA introns into 3'-phosphate/5'-OH introns spliced by proteins in the common ancestor of eukaryotes and archaebacteria. (3) Transfer of self-splicing introns into eukaryotes by the purple bacterium ancestor of mitochondria and their spread into protein-coding and rRNA genes of the reduced mitochondrial genome. (4) Evolution of 3'-OH/5'-phosphate spliceosomal introns and of spliceosomal RNA from group II self-splicing introns, probably in the ancestral metakaryote from introns supplied by the purple bacterium (but possibly instead from the presumed symbiotic ancestor of peroxisomes or in the ancestral archezoan from group II introns inherited directly from its eubacterial ancestor). (Metakaryota includes five kingdoms: Protozoa, Plantae, Animalia, Fungi, Chromista9.) (5) Transfer of additional self-splicing introns into eukaryotes by the cyanobacterial ancestor of chloroplasts followed by their spread from tRNA genes into chloroplast protein-coding genes. The phylogeny and dates are based on a synthesis of molecular phylogenetics, ultrastructural cladistics, and the fossil record12.

Origin of 3'-phosphate/5'-OH protein-spliced introns

I argue that the ability of introns to self-splice may have been eliminated from the common ancestor of archaebacteria and archezoa by the evolution of a new substitute type of splicing, by proteins that made cuts generating 3'-phosphate and 5'-OH ends instead of the 3'-OH/5'-phosphate cuts generated by spliceosomes or self-splicing. The fact that cyanobacterial and some other eubacterial Leu tRNA genes have a group I self-splicing intron in the anticodon loop, whereas some archaebacterial Leu tRNA genes have a non-self-splicing 3'-phosphate/5'-OH intron (i.e. the intron type characteristic of archaebacterial and nuclear tRNA genes) in the same position17, favours the hypothesis that 3'-phosphate/5'-OH-spliced introns evolved from 3'-OH/5'-phosphate self-splicing introns by evolving a chemically novel splicing mechanism (using only protein enzymes) in the common ancestor of eukaryotes and archaebacteria. The selective advantage for this changeover would have been the great reduction in intron size from several hundred to a handful of nucleotides. This would have been especially important for tRNA, which makes up several per cent of the total cellular RNA and in which a self-splicing intron is several times the length of the mature tRNA itself.
But this change occurred only once, and neither eubacteria nor archaebacteria ever totally lost tRNA introns. This shows that selection for streamlining has not been able to eliminate them, presumably because selection is powerless in the absence of enabling mutations, which must clearly have been exceedingly rare for tRNA introns. It is therefore not reasonable to suggest that most bacterial protein-coding genes were once copiously split by spliceosomal introns3,4 since this requires that they were later totally eliminated from bacteria, but only billions of years later, after they gave rise to eukaryotes. It makes little sense to assume that they persisted for 1800 million years in at least one lineage of bacteria but were eventually lost from all.

If all group I and II self-splicing introns were converted into 3'-phosphate-spliced introns in the common ancestor of eukaryotes and archaebacteria, the most primitive eukaryotes (Archezoa) probably entirely lacked both self-splicing and spliceosomal introns; but if only some self-splicing introns were thus converted then archaebacteria and archezoa may both turn out to have self-splicing introns as well. As mentioned previously, no introns have yet been found in Giardia, the only archezoan for which any protein-coding genes have been sequenced. If spliceosomal introns are found in Archezoa, however, this would place step (4) in Fig. 1 lower down the eukaryote tree.

The insertional origin of introns

The fact that bacterial tRNA introns are located at different positions in different tRNA genes17, and may lie within the anticodon, argues for an insertional origin1,2,13,18, either directly into DNA by means of a DNA topoisomerase or indirectly by reverse splicing19, reverse transcriptase and gene conversion, or a similar mode of integration by homologous recombination. This may be the mode of insertion for the optional introns of chloroplasts and mitochondria, which are located in such a disparate variety of genes that they must have been mobile in the past2. Although RNA splicing may date from the RNA world, the initial insertion of introns into tRNA genes probably took place after the origin of DNA; if it had taken place earlier, in the RNA world where RNA genes replicated, the introns are unlikely to have persisted, because the spliced version would have been at a replicative advantage. It is the inability of RNA to replicate that allows introns to persist.

Intron spread and exon shuffling

Contrary to what is sometimes asserted, the insertional theory of intron origins1,2,8,9,13,15,18,20 does not postulate that introns were inserted entirely at random, but assumes sequence-specific mechanisms. Nor does it deny the importance of exon shuffling of protein domains21 in the evolution of eukaryote protein-coding genes. Indeed exon shuffling by homologous DNA recombination would have been much easier after the explosive insertion of large numbers of homologous spliceosomal introns (or strictly speaking their self-splicing ancestors) into nonhomologous genes than it would have been for eukaryotes according to the primordial intron theory. But the only well-established cases of exon shuffling are in metakaryotes, in particular in animals; no examples are yet known for intracellular proteins22, and there is no evidence at all that primordial proteins were assembled by exon shuffling. What the insertional theory does argue is that the evolution of slow multimolecular spliceosomal splicing and of the spread of such introns widely into protein-coding genes would have been much easier after the origin of the nuclear envelope and the separation of transcription and translation into separate compartments8. Whether this occurred in the ancestral eukaryote as I suggested earlier
