Molec. gen. Genet. 145, 119-123 (1976) © by Springer-Verlag 1976

Molecular Evolution of 5S RNA

Hiroshi Hori

Department of Biochemistry and Biophysics, Research Institute for Nuclear Medicine and Biology, Hiroshima University, Hiroshima, Japan 734

Summary. Based on comparative analyses of the primary structure of 5S RNAs from 19 organisms, a secondary structure model of 5S RNA is proposed. 5S RNA has essentially the same structure among all prokaryotic species. The same is true for eukaryotic 5S RNAs. Prokaryotic and eukaryotic 5S RNAs are also quite similar to each other, except for a difference in a specific region. By comparing the nucleotide alignment from the juxtaposed 5S RNA secondary structures, a phylogenic tree of nineteen organisms was constructed. The time of divergence between prokaryotes and eukaryotes was estimated to be 2.5 × 10^9 years ago (minimum estimate: 2.1 × 10^9).

Introduction

In a previous paper (Hori, 1975), we compared nucleotide sequences of 5S RNA from seventeen organisms to calculate the rate of nucleotide substitution during the course of evolution and to construct a phylogenic tree from the substitution data. The method compared two given 5S RNA molecules by first matching the GAAC or GAUC sequences at about positions 40-45 (probably the tRNA binding site; Richter, Erdmann and Sprinzl, 1973) and then estimating the homology of the rest of the sequences mathematically to obtain the "best match alignments". This kind of comparison is almost purely statistical and does not necessarily guarantee the comparison of homologous regions throughout the molecules, except for the GAAC (or GAUC) sequence. It is therefore probable that the rate of nucleotide substitution so obtained is considerably smaller than the real one. In the present paper, we take a more reasonable approach to this problem.

5S RNA Secondary Structure

Constructing the alignment from the juxtaposed 5S RNA secondary structures allows the comparison of homologous sequences throughout most of the molecules. The rate of nucleotide substitution calculated from such a comparison should be more reasonable than that previously obtained, since the apparent rate of nucleotide substitution must be strongly affected by regional structural differences. A secondary structural model of 5S RNA was recently proposed by Fox and Woese (1975). We have independently done similar work to detect the possible intramolecular helical regions common to all the known 5S RNAs and have constructed their secondary structures.

To obtain the alignments of the nineteen 5S RNA sequences, the structurally corresponding helical regions, which had been computed by the method of Tinoco, Uhlenbeck and Levine (1971), were matched first, and the "best match alignment" with minimum gap insertions was obtained for most of the non-paired regions (see Hori, 1975). Between positions 63 and 77, the alignment was done manually. The AGU of positions 74-76 and the GGU of positions 70-72 of human 5S RNA were matched first with AGU and with Pu-Pu-Pyr of the other 5S RNAs in the respective regions (Fig. 1).

Our structural model is similar to that of Fox and Woese. However, one important difference is that in our model of eukaryotic 5S RNA, the hairpin structure that exists in prokaryotic 5S RNA at positions 82-94 (in the case of E. coli) is lacking. In contrast, eukaryotic 5S RNA has a well conserved loop at positions 83-94 (in the case of human 5S RNA) which is lacking in prokaryotic 5S RNA (Fig. 2). The existence of the loop, specific to eukaryotic 5S RNAs, may be supported by the fact that positions 85-90 in Chlorella (Jordan, Galling and Jourdan, 1974) and positions 89-92 in yeast (Nishikawa and Takemura, 1974) were one

[Figure 1: aligned 5S RNA sequences of the nineteen organisms, with position numbers along the top; sequence rows omitted]

Fig. 1. Alignment of 5S RNAs. The squared-off sequences: helix regions in secondary structures. Underline: the sequence probably interacts with the common GTΨC loops in tRNA. Dotted underline: the sequence probably interacts with 23S RNA (Herr and Noller, 1975). Sequence in parentheses: sequences were not completely determined. Abbreviations: KB (human KB cell), XK (Xenopus laevis, kidney cell, African toad), XO (Xenopus laevis, ovary cell), CC (Chlorella, cytoplasmic), SC (Saccharomyces cerevisiae, yeast), TU (Torulopsis utilis, yeast), SB (Saccharomyces carlsbergensis, yeast), AN (Anacystis nidulans, blue-green alga), BM (Bacillus megaterium), BS (Bacillus stearothermophilus), PF (Pseudomonas fluorescens), PH (Photobacterium sp. 8265), EC (Escherichia coli), ST (Salmonella typhimurium), AA (Aerobacter aerogenes), EA (Erwinia aeroideae), SM (Serratia marcescens), PM (Proteus mirabilis), YP (Yersinia pestis)¹. Sequences of ST, AA, EA, SM, PM and YP were not completely determined and were therefore reconstituted from oligonucleotide maps.

of the most RNase-sensitive regions in the 5S RNA molecules. The difference in secondary structure between the 5S RNAs mentioned above suggests the nonlinear evolution of 5S RNA from prokaryotes to eukaryotes; drastic changes such as deletions and insertions in certain regions (positions 63-100) would have occurred upon eukaryote emergence.

The Rate of the Nucleotide Substitution in 5S RNA

To obtain the rate of nucleotide substitution in the course of evolution, two 5S RNA sequences obtained from the above alignment were compared to each other only at the non-gapped sites (Kimura and Ohta, 1973). Deletions and insertions (= gaps) were not taken into account even where they exist. The number of sites which differ from each other was counted and the rate of nucleotide substitution per site, K(nu), in 5S RNA was calculated as follows:

$$K(\mathrm{nu}) = -\frac{3}{4}\,\ln\left(1 - \frac{4}{3}\lambda\right) \tag{1}$$

where λ is the fraction of sites which differ from each other. The rate of substitution per site per year (k) may be calculated as

$$k = \frac{K(\mathrm{nu})}{2T} \tag{2}$$

where T is the number of years that have elapsed since the evolutionary divergence of the two polynucleotides from their common ancestor. These equations are essentially the same as those introduced by Kimura and Ohta (1971, 1973) and used by us (Hori, 1975).

The value deduced from the average K(nu) between human and Xenopus was used for the evolutionary clock. This K(nu) was about 0.108. It is known from paleontological studies that the common ancestor of the amphibians and the mammals appeared about 300 Myr ago (Harland, 1967); the minimum estimate is about 250 Myr ago (Simpson, 1950). It then follows from Equation (2) that k is 1.8 × 10^-10 per site per year (2.2 × 10^-10 with the minimum time estimate).

¹ References for all sequences were cited in Fox and Woese (1975) and Hori (1975).
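As an illustration of Equations (1) and (2), the following minimal Python sketch counts the differing sites at non-gapped positions, applies Equation (1), and converts the result to a rate per site per year. The function names and the `-` gap symbol are assumptions made for this example; they do not come from the original computation.

```python
import math


def fraction_different(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Fraction of differing sites (lambda), counted only at non-gapped sites."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        # Sites where either sequence carries a gap are ignored entirely.
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no non-gapped sites to compare")
    return differing / compared


def k_nu(lam: float) -> float:
    """Equation (1): K(nu) = -(3/4) ln(1 - (4/3) lambda)."""
    if not 0.0 <= lam < 0.75:
        raise ValueError("lambda must lie in [0, 0.75) for the logarithm to exist")
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * lam)


def rate_per_site_per_year(k_nu_value: float, years_since_divergence: float) -> float:
    """Equation (2): k = K(nu) / (2T)."""
    return k_nu_value / (2.0 * years_since_divergence)
```

With the human-Xenopus value K(nu) = 0.108 and T = 300 Myr, `rate_per_site_per_year(0.108, 300e6)` gives about 1.8 × 10^-10, in line with the figure quoted above.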

Phylogenic Tree

The phylogenic tree constructed from K(nu) values deduced from the 5S RNA sequences for different organisms is shown in Figure 3. The lowest node represents the divergence of two populations, one of

[Figure 2: secondary structure diagrams; panels: E. coli 5S RNA, Chlorella 5S RNA, A. nidulans 5S RNA, human KB cell 5S RNA]

Fig. 2. Secondary structural models of prokaryotic and eukaryotic 5S RNAs

[Figure 3: phylogenic tree of the nineteen organisms; abscissa: time in billion years]

Fig. 3. Phylogenic tree. The K(nu) values were used for the construction of the phylogenic tree. All pairs of organisms were arranged in order of increasing K(nu) value, and the pair with the smallest K(nu) value was chosen first. The value of 1/2 K(nu) of the pair was taken to set the branching point between them. The branching points between two or more pairs were determined from the average of 1/2 K(nu) between the pairs. In this figure, the time scale calculated from the K(nu) values by Equation (2) is used on the abscissa. The values in parentheses are the minimum estimates.
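The procedure described in the caption of Fig. 3 resembles average-linkage clustering. The sketch below is one possible reading of it, not the author's program: the function names are hypothetical, the three-taxon distance table is invented purely for illustration, and averaging over all leaf pairs between two clusters is an assumption about how the "average" was taken.

```python
from itertools import combinations

# Hypothetical K(nu) values for three placeholder taxa (illustration only).
EXAMPLE_K = {
    frozenset(("A", "B")): 0.1,
    frozenset(("A", "C")): 0.4,
    frozenset(("B", "C")): 0.5,
}


def build_tree(names, dist):
    """Repeatedly join the two clusters with the smallest average K(nu).

    dist maps frozenset({a, b}) to K(nu) for every pair of leaves.
    Returns a list of (cluster_1, cluster_2, height) merges, where
    height is half the average K(nu) between the two clusters.
    """
    clusters = [(name,) for name in names]
    merges = []

    def average_k(c1, c2):
        # Mean K(nu) over every leaf pair drawn from the two clusters.
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: average_k(*pair))
        merges.append((c1, c2, average_k(c1, c2) / 2.0))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


def height_to_years(height, k):
    """Equation (2) rearranged: T = K(nu) / (2k), with height = K(nu) / 2."""
    return height / k
```

Calling `build_tree(["A", "B", "C"], EXAMPLE_K)` first joins A and B, then attaches C; passing each merge height to `height_to_years` with k = 1.8 × 10^-10 places the branching points on a time axis of the kind used in Fig. 3.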

which had developed into the kingdom Monera (bacteria and blue-green algae) and the other eventually into an ancestor of the eukaryotes. The average K(nu) between prokaryotes and eukaryotes was 0.906. From Equation (2), the time of their divergence was estimated to be 2.5 × 10^9 years ago (minimum estimate: 2.1 × 10^9).

The structure of eukaryotic 5S RNA is quite similar to that of the prokaryotic type, and yet the alignment (Fig. 1) shows an apparent discontinuity between them. This causes some difficulty in evolutionary studies of this kind. To avoid this difficulty, the stable regions in the secondary structures (in the case of human, positions 1-80 and 111-120) were selectively compared and the K'(nu) values of these parts were calculated. The average K'(nu) between prokaryotes and eukaryotes was 0.867, which is almost equal to the value from the whole-sequence comparison. This would mean that the changes which occurred in certain regions of 5S RNA (positions 63-100) upon eukaryote emergence did not seriously influence the k value.

The time of divergence between prokaryotes and eukaryotes, or more generally between two given organisms, calculated here was considerably longer than that obtained previously (Kimura and Ohta, 1973; Hori, 1975). This is to be expected, since in the previous papers, as discussed in the Introduction, the rate of nucleotide substitution was calculated from the "best match alignments" data without considering reasonable sequence homology between two 5S RNA species.

The branching order of the organisms obtained in this paper was in essential agreement with that reported previously (Hori, 1975). In the eukaryotic kingdom, however, the previous order was Fungi, Planta and Animalia, while the order in this paper is Planta, Fungi and Animalia, which is in accordance with the order obtained from cytochrome c data (McLaughlin and Dayhoff, 1973).

Acknowledgements. I wish to express my sincere thanks to Prof. S. Osawa for his support and critical reading of the manuscript. This work was supported by a grant from the Scientific Research Funds of the Ministry of Education, Japan (No. 94815).


References

Fox, G.E., Woese, C.R.: 5S RNA secondary structure. Nature (Lond.) 256, 505-507 (1975)
Harland, W.B.: The fossil record. London: Geological Society 1967
Herr, W., Noller, H.F.: A fragment of 23S RNA containing a nucleotide sequence complementary to a region of 5S RNA. FEBS Letters 53, 248-252 (1975)
Hori, H.: Evolution of 5S RNA. J. molec. Evol. 7, 75-86 (1975)
Jordan, B.R., Galling, G., Jourdan, R.J.: Sequence and conformation of 5S RNA from Chlorella cytoplasmic ribosomes: Comparison of other 5S RNA molecules. J. molec. Biol. 87, 205-226 (1974)
Kimura, M., Ohta, T.: Theoretical aspects of population genetics, pp. 16-32. Princeton: Univ. Press 1971
Kimura, M., Ohta, T.: Eukaryotes-prokaryotes divergence estimated by 5S ribosomal RNA sequences. Nature (Lond.) New Biol. 243, 199-200 (1973)
McLaughlin, P.J., Dayhoff, M.O.: Eukaryote evolution: A view based on cytochrome c sequence data. J. molec. Evol. 2, 99-116 (1973)
Nishikawa, K., Takemura, S.: Nucleotide sequence of 5S RNA from Torulopsis utilis. FEBS Letters 40, 106-109 (1974)
Richter, D., Erdmann, V.A., Sprinzl, M.: Specific recognition of GTΨC loop (loop IV) of tRNA by 50S ribosomal subunits from E. coli. Nature (Lond.) New Biol. 246, 132-135 (1973)
Simpson, G.G.: The meaning of evolution: A study of the history of life and its significance for man. New Haven and London: Yale Univ. Press 1950
Tinoco, I., Jr., Uhlenbeck, O.C., Levine, M.D.: Estimation of secondary structure in ribonucleic acids. Nature (Lond.) 230, 362-367 (1971)

Communicated by H.G. Wittmann

Received November 3, 1975/February 4, 1976
