BioSystems, 9 (1977) 229--243 © Elsevier/North-Holland Scientific Publishers Ltd.


THE ORIGIN OF THE PROTEIN SYNTHESIS MECHANISM*

MASAHIRO ISHIGAMI, KEI NAGANO and NOBUKO TONOTSUKA

Laboratory of Biology, Jichi Medical School, Minami Kawachi-machi, Tochigi-ken 329-04, Japan

(Received May 25th, 1977) (Revised version received August 8th, 1977)

The origin and development of the protein synthesis mechanism is considered in four successive steps. The genetic code is supposed to be controlled by the relative amount (availability) of various amino acids and nucleotides on the one hand, and utility of each amino acid in the polypeptide, on the other hand. Thus, more simple (inutile) and abundant amino acids tended to correspond to codons which were rich in the less frequent base species, G and C. Features of primitive tRNA in the discrimination of amino acid are discussed. Primitive tRNA is proposed to have a discriminator site for amino acid and, separated from it, an anticodon site for interaction with nucleotides. A hypothetical course of subdivision of various nucleic acid species is proposed. In the scheme, mRNA and ribosomal RNA (rRNA) were derived from more primitive insoluble RNA. DNA appeared in the late, not first, step of the development. Several other aspects of evolutionary development of the whole protein synthesis mechanism, e.g., role of the discriminator site on primitive tRNA, modification and subdivision of code catalogue into a more precise specification of amino acids, and possible primordial interactions between tRNA and tRNA-binding sites on insoluble rRNA, are discussed.

1. Introduction

The origin and development of the protein synthesis mechanism is one of the principal problems in the study of the origin of life, since it represents a key event in the transition from chemical to biological evolution. Protein synthesis in its established form requires both catalytic activity of relevant enzymes for assembling of amino acids and also nucleic acids as the templates as well as the appliances for the assemblage. Simulation experiments in chemical evolution suggest that the catalytic activity of polypeptides (Fox and Dose, 1972) and the duplicating reproduction of polyribonucleotides (Sulston et al., 1968a,b; 1969) already existed before the appearance of a more or less complete protein synthesis mechanism. Moreover, prebiological selection possibly occurred among various coacervate droplets formed in the initial organic soup as suggested by Oparin (1938). The protein synthesis mechanism must have gradually developed through combination and improvement of these primordial elements.

In this paper we propose a model for the sequential process leading to establishment of the present protein synthesis mechanism. The process is considered as consisting of four steps. (1) The first step was activation of amino acids and acceleration of polypeptide formation by ribonucleotides. (2) The second step was the preferential incorporation of less common amino acids into polypeptides aided by the action of both primitive tRNAs and insoluble polyribonucleotides. The genetic code system, although still incomplete, supposedly began to work at this step. (3) In the third step, primitive rRNA and mRNA were differentiated from insoluble RNA, and template-directed polypeptide synthesis started. (4) In the final step, some proteins, e.g., ribosomal proteins or aminoacyl tRNA synthetase, began to take part in the protein synthesis mechanism.

* Presented in part at the Fifth International Conference on the Origin of Life, Kyoto, Japan, April 5--10, 1977.

[Fig. 1: two-page chart, drawings and codon catalogues not reproduced. Columns: 1st, 2nd, 3rd and 4th steps. Rows: general concept of protein synthesis mechanism; differentiation of polynucleotides; evolution of primitive tRNA; evolution of polypeptides; elaboration of genetic code. Legible panel text follows.]

1st step: Activation of amino acid by mono- or polyribonucleotide. Polypeptide formation was accelerated by combination of amino acid with nucleotide.

2nd step: Selective incorporation of less common amino acids into polypeptide by the action of primitive tRNAs and insoluble polynucleotides. Selective survival among primitive tRNAs. Origin of genetic code. Soluble RNA (primitive tRNA) and insoluble RNA. Appearance of primitive tRNA with triplet anticodon (code letter specification not strict) and amino acid binding region (binding site plus discriminator site). Amino acid was bound on the 5' phosphate. tRNA was duplicated. Relative amount of various amino acids in polypeptides became different from that of the environment. Primitive tRNA discriminated 2nd letter clearly and 1st (and 3rd) letters vaguely.

3rd step: Origin of template-directed polypeptide synthesis on the specific binding region for tRNA on primitive rRNA. Primitive rRNA ("ribosome" without protein); primitive mRNA.

4th step: Some template-directed polypeptides began to take part in the protein synthesis mechanism. Ribosome (with peptidyltransferase and other proteins); mRNA; DNA (nucleotide reductase appeared).

Further notes in the 3rd and 4th step panels: Appearance of ARSase and switchover of discriminating mechanism from polynucleotide to protein. Conformationally fit binding region for amino acid evolved. Amino acid began to combine to 3' terminal as a result of appearance of peptidyltransferase. Polynucleotide sequence-determined polypeptide appeared. Evolved enzymes and proteins. A new discrimination mechanism for 3rd letter. Arg and Trp were incorporated into the code.

Fig. 1. Development of the protein synthesis mechanism summarized in four steps.

2. The proposed process

2.1. The first step: Activation of amino acids with nucleotide

The possibility of polycondensation of amino acids activated as aminoacyladenylates has been examined successfully (Paecht-Horowitz and Katchalsky, 1967; Lewinsohn et al., 1967). Aminoacyladenylate was formed from amino acid and adenosine monophosphate in the presence of dicyclohexylcarbodiimide as a condensing agent (Berg, 1958). It was also formed from amino acid and adenosine 5'-phosphorimidazolide (Lohrmann et al., 1975). Although at present we have no definite evidence indicating the presence of such condensing compounds under primordial conditions, these results may suggest spontaneous formation and accumulation of aminoacyladenylates in the prebiotic soup. As for nucleotides, ribonucleotides might be predominant, since ribose is easier to obtain in simulation experiments than other pentoses including deoxyribose (Mariani and Torraca, 1953; Pfeil and Ruckert, 1961), although deoxyribose is also formed in small amounts (Oró and Cox, 1962). Thus, the activation of amino acids and the consequent acceleration of peptide bond formation by mono-, oligo- or polyribonucleotides were possibly the first step of interaction between amino acids and nucleotides. Several types of interactions between polynucleotides through base pairing, which were already probable at this early stage, are illustrated in Fig. 1 (1st step-A). The pairing between bases would enhance the velocity of the polypeptide formation process by bringing the aminoacyl nucleotides together.

2.2. The second step: Selective incorporation of less common amino acids into polypeptides

(i) Primitive tRNA and insoluble RNA. Among primitive polynucleotides, longer polyribonucleotides were perhaps insoluble owing to their high molecular weight, and

shorter ones were soluble. The latter, being soluble, had more chance to form complexes with amino acids. On the other hand, the insoluble polynucleotides might be convenient adsorbing matrices for the soluble aminoacyl polynucleotides, thus bringing the latter into a more condensed state than in the uniformly solubilized state and therefore providing a greater chance for reactions to produce peptide bonds between aminoacyl moieties (Fig. 1, 2nd step-A). Thus, the soluble polynucleotides could behave as primitive tRNAs and insoluble ones could assume functions like both mRNA and rRNA.

The primitive tRNAs are depicted (Fig. 1, 2nd step-A) as having a binding site for amino acid and a primitive "anticodon" site. The number of purine and pyrimidine bases in the "anticodon" region of the primitive tRNA is depicted as variable. However, associating and dissociating tendencies between two interacting RNA strands have been calculated to be suitably balanced when the interacting part is composed of three adjacent base pairs (Eigen, 1971). Two or one base pair(s) would be insufficient to hold the primitive tRNA on the insoluble polynucleotide, while more than four base pairs lining up would exert too much attracting force and would result in retardation of replacement of the amino acid-disengaged tRNA by a substituting charged tRNA. A single hairpin structure with several base pairings between two portions of the same molecule (Fig. 1, 2nd step-A) is postulated as the shape of the primitive tRNA molecule. This structure fulfills the conditions required for acceleration of peptide bond formation suggested above. The number of base pairings (4 to 5 pairs), however, does not have particular experimental support. When a primitive tRNA formed an anhydride complex with an amino acid, it lost one negative charge on the phosphate group and acquired one positive charge as an amino group. Thus it would adsorb relatively more easily on the insoluble polynucleotide which was also highly negatively charged. When the

amino acid was removed from the tRNA to be incorporated into the elongating polypeptide at the neighboring position (Fig. 1, 2nd step-A), the phosphate of the tRNA again assumed mono-ester form with two negative charges, and electrostatic repulsion between the two RNAs became larger and made them again more apt to dissociate. In this step, the primitive tRNA is supposed to combine with amino acid on its 5'-terminal phosphate since the amino acid attached to this position forms a peptide bond without help of enzymatic activity (Paecht-Horowitz et al., 1970).
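The charge argument of this subsection can be written out as a toy bookkeeping sketch. It assumes integer formal charges only (a simplification introduced here, not a measured quantity): a free 5'-phosphate mono-ester carries two negative charges, the anhydride with an amino acid carries one, and the bound amino group carries one positive charge. All names are illustrative.

```python
# Toy formal-charge bookkeeping for the 5' terminus of a primitive tRNA.
# Assumption: integer formal charges; amino acid side chains are ignored.

FREE_PHOSPHATE_MONOESTER = -2  # uncharged tRNA: mono-ester with two negative charges
ACYL_PHOSPHATE_ANHYDRIDE = -1  # one negative charge is lost on anhydride formation
AMINO_GROUP = +1               # positive charge brought in by the amino group


def terminal_charge(aminoacylated):
    """Return the net formal charge at the 5' terminus."""
    if aminoacylated:
        return ACYL_PHOSPHATE_ANHYDRIDE + AMINO_GROUP
    return FREE_PHOSPHATE_MONOESTER


def charge_change_on_release():
    """Charge change at the terminus when the amino acid leaves the tRNA."""
    return terminal_charge(False) - terminal_charge(True)


if __name__ == "__main__":
    print("aminoacylated terminus:", terminal_charge(True))
    print("free terminus:", terminal_charge(False))
    print("change on release:", charge_change_on_release())
```

In this simplified picture the terminus is neutral while the amino acid is attached and regains two negative charges on release, which is the direction of change the text invokes for adsorption and dissociation.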

(ii) Statistical relation between the primordial occurrence frequency of various amino acids and the ratio of different bases in codons. We have already pointed out (Ishigami, 1974; Ishigami and Nagano, 1975) that amino acids which are produced in higher yield in simulation experiments correspond to the codons with high G plus C content, while other amino acids produced with lower yield in such experiments correspond to the codons with high U plus A. The correspondence is summarized graphically in Fig. 2. The values of amino acid yield in the figure are adopted from the experiments of Yoshino et al. (1971) and Harada and Fox (1964). The two sets of experiments show rather similar amino acid distributions, in spite of different starting materials (CO, H2 and NH3 in Yoshino et al.; CH4, NH3 and H2O in Harada's experiment) and different reaction conditions. For comparison, the relative amounts of various amino acids in the bulk protein of some contemporary microorganisms are plotted in the same way against codon G-C content (Fig. 3). There, a clear correlation between amino acid frequency and G-C content, if present, is difficult to observe. The possible basis of the apparent correlation is discussed in the next section. For the purpose of additional confirmation, however, we quote several thermodynamic and other data here. Standard entropy changes of

formation of various amino acids (Hutchens, 1976) show similar parallelism to the G-C content of the respective codons (Fig. 4). Standard entropy change of formation is closely related to the formation ratios of each amino acid in equilibrium state at higher temperatures where the entropy change predominates in the total free energy change. Primordial amino acid synthesis is considered to take place in such a high temperature state,

[Fig. 2: plot not reproduced. Horizontal axis: G-C content (%).]

Fig. 2. Correlation between the relative amount of amino acids in simulation experiments and GC content of their codons. One symbol: calculated from experimental results by Yoshino et al. (1971). CO, H2 and NH3 were allowed to react at 200-700°C for several hours in the presence of alumina, baked meteorites or other metal compounds as catalysts. Each plot is the average of 56 runs. Other symbol: calculated from experimental results by Harada and Fox (1964). CH4, NH3 and H2O were allowed to flow through silica powder at 950-1050°C. Products were recovered in NH4OH solution. Each plot is the average of 3 runs.

[Fig. 3: plot not reproduced. Horizontal axis: G-C content (%).]

Fig. 3. Correlation between the relative amount of amino acids in bulk protein from contemporary microorganisms (values by Sueoka, 1961) and GC content of their codons. Two bacterial species showing rather extreme GC content of DNA are chosen for calculation. One symbol: Micrococcus lysodeikticus protein, GC content of DNA = 72%. Other symbol: Bacillus cereus protein, GC content of DNA = 35%.

whether the energy was supplied actually as thermal energy (Yoshino et al., 1971; Harada and Fox, 1964) or as energies of electric discharge (Miller, 1953), ultraviolet radiation (Pavlovskaya and Pasynskii, 1959; Terenin, 1959), or shock waves produced by meteorites (Hochstim, 1963). Thus we have again similar parallelisms as seen in Fig. 2, although in some of these cases, it is difficult to consider that amino acids were formed in an equilibrium state. When the molecular weight of various amino acids is plotted against G-C content of codons for these amino acids, a general

[Fig. 4: plot not reproduced. Horizontal axis: G-C content (%).]

Fig. 4. Correlation between the standard entropy change for formation of amino acids and GC content in their codons. * HCl salt.

tendency of correlation is discernible although arginine and tryptophan are exceptions (Fig. 5). This is not unusual since the frequency distribution of naturally occurring amino acids is expected to be inversely proportional to the molecular weight for the following reasons. First, if the primordial amino acid synthesis proceeded under thermally equilibrated conditions, a roughly parallel relationship should exist between molecular size of an amino acid and its entropy change of formation, while the latter and the G-C content of its codons are linearly correlated as shown above (Fig. 4). On the other hand, if the synthesis went on under unequilibrated conditions, the number

[Fig. 5: plot not reproduced. Horizontal axis: G-C content (%).]

Fig. 5. Correlation between the molecular weight of amino acids and GC content in their codons.

of small units (CO2, CO, CH4, N2, NH3, H2O, etc.) to be combined to make an amino acid would restrict its formation rate, and hence its occurrence frequency in the primordial environment. Here again the latter is, as shown above (Fig. 5), in direct proportionality with the G-C content of codons.

(iii) How has this correlation been attained? We have already proposed a possible interpretation for the correlation between the primordial frequency of an amino acid and the G-C content of codons which correspond to this amino acid (Ishigami, 1974; Ishigami and Nagano, 1975). The outline is summarized here as the establishment of this correlation represents an essential step in the development of the protein synthesis mechanism. A primitive tRNA molecule had an amino acid discriminator site and, distinguished from it, a triplet (Eigen, 1971; see above) anticodon site. Amino acid specification by each tRNA molecule was, however, not strict. The

third (and perhaps the first) letters of anticodons might be mere space fillers, and a group specification mechanism might prevail as suggested in Fig. 1, 2nd step-E. Availabilities of different base species were not equal, and more abundant bases would be proportionally more often incorporated into the polymers. Thus in the primitive "anticodons" A and U would be far more frequently encountered than G and C, since A is the main purine species obtained in simulation synthesis experiments (Oró and Kimball, 1961; Ponnamperuma, 1965) and U would be selectively incorporated into polynucleotide as the anti-base of A. Insoluble high-molecular-weight RNAs also contained much more A and U than G and C, and primitive tRNAs containing U and A in their "anticodon" sites had more chance to be held on the insoluble RNA by means of base pairing. Throughout the gradual establishment of more exact correspondence between an amino acid and its codons, the precise process of which is still unknown to us, several guiding principles must have been operating. We suppose that one of the principles was the tendency of G and C to be incorporated into the codons for more common, and at the same time smaller and simpler (Figs. 2-5), amino acids. The reasons for this supposition have been discussed elsewhere (Ishigami and Nagano, 1975), but the basic idea is that, by combination of more infrequent base species and more common amino acids, the resulting protein would contain a far smaller proportion of the common amino acids than expected from mere chance. If a common amino acid (e.g., glycine) was specified by codon(s) containing common base species (A and U) in the primitive protein synthesis mechanism, highly monotonous proteins containing a large proportion of glycine (and alanine, and other common and "less worthy" amino acids) would result (Fig. 6).
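The dilution effect described above can be illustrated numerically. The sketch below assumes that each code letter is drawn independently from the base pool, and it uses made-up base frequencies (A = U = 0.4, G = C = 0.1) and a two-letter code purely for illustration; none of these numbers comes from the paper.

```python
# Illustration only: base frequencies are assumed, not taken from the paper.
from itertools import product

BASE_FREQ = {"A": 0.4, "U": 0.4, "G": 0.1, "C": 0.1}


def doublet_probability(doublet, freq=BASE_FREQ):
    """Chance of meeting a given two-letter code in a random polynucleotide,
    assuming every position is drawn independently from `freq`."""
    p = 1.0
    for base in doublet:
        p *= freq[base]
    return p


def class_share(bases, freq=BASE_FREQ):
    """Total probability of all doublets built only from the given bases."""
    return sum(doublet_probability(a + b, freq) for a, b in product(bases, repeat=2))


if __name__ == "__main__":
    print("P(AA) =", round(doublet_probability("AA"), 4))
    print("P(GG) =", round(doublet_probability("GG"), 4))
    print("share of A/U-only doublets =", round(class_share("AU"), 4))
    print("share of G/C-only doublets =", round(class_share("GC"), 4))
```

Under these assumed frequencies a G/C-only doublet is sixteen times rarer than an A/U-only doublet, so an abundant amino acid tied to G/C-rich codes would be called for far less often than its abundance alone would suggest.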
[Fig. 6: diagram not reproduced. Legible labels: primitive tRNA and insoluble RNA; polypeptide; probiont. Gly paired with A or U: increase of Gly, selectively disadvantageous. Gly paired with G or C: decrease of Gly, selectively advantageous.]

Fig. 6. Selection of primitive tRNAs.

On the contrary, if combination between less frequent amino acids, which were often more active in biochemical functions, and more frequent bases (A and U) had been established, the resulting proteins would be more varied and have more versatile properties which must have been advantageous in the course of pre-biological selection. Several more concrete criteria for "advantageous" polypeptides will be briefly discussed here. Polypeptides containing a high ratio of hydrophobic amino acids might be important in the formation of membranous structures among themselves and also with other hydrophobic small molecules, especially various kinds of lipids. They also worked as good catalytic adsorbents for condensation reactions. Thus hydrophobic amino acids were advantageous. Incorporation of complex polar amino acids (e.g., histidine and lysine) in the primitive polypeptide would also make the probiont advantageous through enhancing pre-metabolic catalyzed reactions. In the case of aspartic and glutamic acids the criteria of being uncommon and advantageous seem to be incongruent since they are among the most common amino acids formed in simulation experiments, while they are

highly polar and, in the present context, may be advantageous. That the codons for aspartic and glutamic acids have A at the second position (consequently U at the second position of their anticodons) may suggest their frequent incorporation into the primordial proteins as into the present day proteins. The correspondences between amino acids and anticodons are supposed to become gradually more exact by the combined effect of these criteria.
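The "G-C content of codons" plotted in Figs. 2-5 can be recomputed from the standard code table. The paper does not state here how the percentage was obtained, so the sketch below is an assumption: it averages the share of G and C over all synonymous codons of an amino acid, by default over the first two letters only (the third being treated as a space filler), with the positions left adjustable. Names are illustrative.

```python
# Sketch: G-C content (%) of the codons of one amino acid in the standard code.
# Assumption: unweighted mean over synonymous codons; positions are selectable.
from itertools import product

BASES = "UCAG"
# Standard genetic code in U, C, A, G order for all three letters ('*' = stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def gc_content_percent(amino_acid, positions=(0, 1)):
    """Mean G-C content (%) over the chosen codon positions (0-based)."""
    codons = [c for c, aa in CODON_TABLE.items() if aa == amino_acid]
    if not codons:
        raise ValueError("unknown one-letter amino acid code: %r" % amino_acid)
    gc = sum(codon[i] in "GC" for codon in codons for i in positions)
    return 100.0 * gc / (len(codons) * len(positions))


if __name__ == "__main__":
    for aa in "GAPRSLFK":
        print(aa, round(gc_content_percent(aa), 1))
```

With the first two letters only, glycine, alanine and proline come out at 100% and phenylalanine and lysine at 0%; passing `positions=(0, 1, 2)` includes the third letter instead.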

(iv) Base ratio of primitive RNA. The relative amount of each nucleotide in the primitive RNA has been calculated based on our hypothesis (Ishigami and Nagano, 1975). The values are shown again in Table 1. In this calculation, the relative amounts of various amino acids were considered to shift from the primordial distribution pattern to that of contemporary organisms through the pressure of primitive molecular selection. As the amino acid composition values of proteins in modern organisms, those of two kinds of bacteria which were used in Fig. 3 were

TABLE 1
Estimated relative amount (p') of four base species in primitive RNA (revised from Ishigami and Nagano, 1975).

Data used in calculation (a)                                        Estimated base distribution
                                                                    (per cent of total bases)
Primordial amino acid     Contemporary amino acid
distribution              distribution                              U       A       G       C

Yoshino et al. (1971)     Bacillus cereus (GC = 35%)                72.6    23.5    7.8     (-3.8)
Yoshino et al. (1971)     Micrococcus lysodeikticus (GC = 72%)      67.9    19.0    16.0    (-3.0)
Harada and Fox (1964)     Bacillus cereus (GC = 35%)                39.3    33.4    16.0    11.2
Harada and Fox (1964)     Micrococcus lysodeikticus (GC = 72%)      30.0    30.2    19.0    20.8

(a) For details of p' calculation, see Ishigami and Nagano (1975).

TABLE 2
Correspondence of physicochemical grouping of various amino acids to their codes (second letter) and to their discriminator nucleotides.

Amino acid    2nd letter of anticodon    Grouping (a)    Discriminator (b)

Phe           A                          I               A
Leu           A                          I               A
Ile           A                          I               A
Met           A                          I               A
Val           A                          I,II            A

Tyr           U
Asn           U
Lys           U
His           U
Gln           U
Asp           U
Glu           U

Ser           G,C                        II,III
Thr           G                          II
Pro           G                          II
Ala           G                          II

Cys           C
Trp           C
Arg           C
Gly           C

[Grouping and discriminator entries for the remaining cells are not aligned to rows in the source. In source order they read: I / A III / G G, / IV III,IV III,IV III III,IV / G G G G A A V / I IV II / U C / U A,G A,G A, / U U.]

(a) Adopted from Hasegawa and Yano (1975). Criteria of groups are: I, non-polar and large amino acids; II, non-polar or slightly polar and small; III, polar and small; IV, polar and large; V, specific.
(b) For definition of a discriminator nucleotide, see Crothers et al. (1972).

adopted. Relative amounts of each prebiotic amino acid are calculated from the results of Yoshino et al. (1971) and Harada and Fox (1964) (Fig. 2). Amino acid distribution might be somewhat different between the primitive environment and in the primitive polypeptides, since polypeptide formation experiments both by thermal synthesis (Fox and Waehneldt, 1968) and by condensing agent (Steinmann and Cole, 1967) showed different incorporation among different amino acids. Thus, we must consider our calculation as an approximation. The calculating procedure was described previously (Ishigami and Nagano, 1975). The results show that A and U must be more abundant than G and C. Moreover, in the calculation based on the experimental values of Harada and Fox (1964), the relative amount of A is nearly equal to that of U, and G is nearly equal to C. These coincidences are consistent with the notion that duplication of polynucleotides had occurred already at that stage. This may be, however, only an accidental coincidence since we obtain somewhat different values based on the other simulation experiments (Yoshino et al., 1971). Especially, the yield of basic amino acids varies widely depending on the nature and amount of the N-source materials added to the system. As another problem, we have scanty data on the primordial frequency of cysteine and methionine since in most experiments S-source materials are not included. Plausible amino acid patterns in the prebiotic soup may be an important point to be investigated further.
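The near-equalities noted above (A against U, G against C) can be checked directly against the Table 1 values. The sketch below hard-codes those values under the assumption that the columns read U, A, G, C in that order and that the bracketed entries are negative estimates; the helper name is illustrative.

```python
# Values as read from Table 1; the column order (U, A, G, C) is an assumption.
TABLE_1 = {
    ("Yoshino et al. (1971)", "Bacillus cereus"): (72.6, 23.5, 7.8, -3.8),
    ("Yoshino et al. (1971)", "Micrococcus lysodeikticus"): (67.9, 19.0, 16.0, -3.0),
    ("Harada and Fox (1964)", "Bacillus cereus"): (39.3, 33.4, 16.0, 11.2),
    ("Harada and Fox (1964)", "Micrococcus lysodeikticus"): (30.0, 30.2, 19.0, 20.8),
}


def pairing_gaps(u, a, g, c):
    """Return (|A - U|, |G - C|) in percentage points; small gaps are what
    complementary duplication of polynucleotides would lead one to expect."""
    return abs(a - u), abs(g - c)


if __name__ == "__main__":
    for (primordial, contemporary), row in TABLE_1.items():
        au_gap, gc_gap = pairing_gaps(*row)
        print("%s / %s: |A-U| = %.1f, |G-C| = %.1f, A+U = %.1f"
              % (primordial, contemporary, au_gap, gc_gap, row[0] + row[1]))
```

In every row A plus U exceeds G plus C, while the two gaps are small only for the rows based on Harada and Fox (1964), in line with the caution expressed in the text.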

(v) Evolution of the discriminating region. The tendency for physicochemically akin amino acids to be coded by similar codons has been pointed out by Epstein (1966). Recently Hasegawa and Yano (1975) have divided 20 amino acids in the codon catalogue into 5 groups based on Grantham's (1974) criteria, where elemental composition, polarity and molecular volume were taken into consideration.

We have noticed the correlation between the 2nd letter of genetic codons (and consequently anticodons) and the five groups of amino acids (Table 2). As the second letter of anticodons, A corresponds to group I, U to groups III and IV, and G to group II. These relations have usually been explained by the "Lethal Mutation" model of Sonneborn (1965) or the "Translation Error-Ambiguity" model of Woese (1965) (see Woese, 1969). We try to understand the correspondence, however, by supposing the evolution of amino acid discriminating sites on the primitive tRNAs. At the first stage of evolutionary elaboration of the discriminating mechanism, amino acids were only coarsely discriminated based on the five groupings discussed above (Table 2). At this stage, only the second letter of the triplet was unambiguously read in the codon-anticodon correspondence. The third (and perhaps the first) letters were discriminated only vaguely. On the primitive tRNA molecule, we postulate that there were two separate regions which were concerned in amino acid discrimination. The first was the anticodon site repeatedly referred to above. The second, which was directly involved in amino acid recognition, was the discriminator site near the "tail" of the RNA molecule (cf. Fig. 1, 2nd step-A). Crothers et al. (1972) have suggested some important implications of the 4th nucleotide from the 3'-end of contemporary tRNA in the discriminating function for amino acid. The nucleotide species of the 4th position and that of the second letter of the anticodon triplet are interrelated: A, G or U in the 4th position corresponds to A, U or C in the 2nd letter of the anticodon, respectively (Table 2). We interpret the correspondence as a vestige of the primitive discriminating mechanism. Before appearance of the enzyme, aminoacyl-tRNA synthetase, the direct recognition and discrimination of an amino acid by tRNA molecule must have depended on the supposed

discriminator site. The site might take a cagelike conformation fitting for the "right" kind(s) of amino acids and reject other "wrong" amino acids. Differential affinity of oligonucleotides for various amino acids is not a mere imaginary assertion but is supported by some experimental results (Saxinger and Ponnamperuma, 1971; 1974). The site, however, had not enough exactness to discriminate each of the whole array of amino acids used in the primitive protein synthesis mechanism.

A tentative codon catalogue in this stage is summarized in Fig. 1 (2nd step-E). In the table we arrange the four bases in the order of U, A, C and G (only for the first and second letters) instead of the widely adopted order of U, C, A and G. This arrangement seems more convenient in the situation where U and A prevailed over the other two base species. Moreover, this new order has several accidental advantages concerning the present discussion. For instance, serine and threonine, a pair of homologous amino acids, come to juxtaposed positions in the table. For another, split positions for serine and arginine in the usual arrangement are brought together. A disadvantage, however, is the position for leucine, which is now split into two in the new arrangement. The usual order of U, C, A and G is adopted for the third letter of the anticodon, since this arrangement gives a better coherence in the table as exemplified by the well-known equivalency between two purines and that of two pyrimidines in the code catalogue. This suggests that different principles might be working in code determination of the second and first letters, and of the third letter of primitive tRNA, respectively.
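The effect of reordering the code table can be checked with a small grid calculation. The sketch below places each four-codon box at (index of first letter, index of second letter) and measures how far apart two boxes lie under the usual order U, C, A, G and under the order U, A, C, G. The distance measure (Chebyshev distance, so that diagonal neighbours count as adjacent) and the function names are assumptions made for illustration.

```python
# Sketch: positions of codon boxes (first two letters) under two base orders.
USUAL_ORDER = "UCAG"
PROPOSED_ORDER = "UACG"


def box_position(box, order):
    """Grid position of a two-letter codon box, e.g. 'UC' for the UCN codons."""
    first, second = box
    return order.index(first), order.index(second)


def box_distance(box_a, box_b, order):
    """Chebyshev distance between two boxes; 1 means they touch in the table."""
    (r1, c1), (r2, c2) = box_position(box_a, order), box_position(box_b, order)
    return max(abs(r1 - r2), abs(c1 - c2))


if __name__ == "__main__":
    pairs = {
        "Ser (UCN) vs Thr (ACN)": ("UC", "AC"),
        "Ser (UCN) vs Ser (AGU/AGC)": ("UC", "AG"),
        "Leu (UUA/UUG) vs Leu (CUN)": ("UU", "CU"),
    }
    for label, (a, b) in pairs.items():
        print(label,
              "usual:", box_distance(a, b, USUAL_ORDER),
              "proposed:", box_distance(a, b, PROPOSED_ORDER))
```

Under this measure the serine and threonine boxes and the two serine boxes touch only in the U, A, C, G order, while the two leucine boxes touch only in the usual order.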

2.3. The third step: Initiation of template-directed polypeptide synthesis

Let us suppose that some single-stranded portions were scattered on the insoluble RNA as considered above, and that aminoacylated primitive tRNAs were sometimes attracted to

these portions by base pair interactions. These portions could be said to play the role of primitive mRNA. The interactions, however, were rather random and sporadic before specific binding sites for primitive tRNA appeared on the insoluble RNA. After appearance of these binding sites, the mechanism of systematic information transfer from polynucleotide to polypeptide became an object of evolutionary improvement, and differentiation of mRNA and ribosomal RNA was promoted. mRNA had to become an independent polynucleotide strand separated from the bulk insoluble RNA (primitive ribosome) in order to assure one-directional flow of molecular information written in this strand as the nucleotide sequence. The separation of the mRNA strand and the primitive ribosome is assumed to have been accomplished at the third step in Fig. 1.

The primitive system must have been favored by several environmental conditions if polypeptide synthesis by this system was to proceed without support from specific enzymes. Especially, condensation reactions must proceed at an appreciable rate for the synthesis of polynucleotides from mononucleotides and also for the elongation of polypeptide chains by accepting aminoacyl residues from aminoacyl tRNAs. The condensing activity could be provided by some condensing reagents which might be adsorbed and accumulated on the membranous surface of probionts through hydrophobic interactions.

Under these circumstances, we could have a tentative image of the primordial polypeptide formation mechanism as follows. First, the binding region for tRNA consisted of two juxtaposed binding sites. The two sites had much higher affinity for aminoacyl-carrying tRNA molecules than free tRNAs because, as discussed above, of the diminished negative charge at the aminoacylated 5'-phosphate, and consequently of the attenuated electrostatic repulsion between tRNA and insoluble RNA molecules. Second, the two binding sites differed in their affinity for tRNA. We allot,

[Fig. 7: diagram not reproduced. Legible labels: H-site; L-site; amino acid; polypeptide.]

Fig. 7. Scheme of function of the specific binding region for aminoacyl tRNA on a primitive ribosome.

seeking tacit conformity with the present day ribosomal system, the high-affinity site (H-site) to be nearer the 5'-terminal of the primitive mRNA (Fig. 7), and the low-affinity site (L-site) nearer the 3'-terminal. The function of these specific sites should depend on the polynucleotides in the primitive ribosomes. Although polypeptides were also possibly included in the primitive ribosomes, the specific sites could have appeared only on the reproducible polynucleotides. These nonspecific polypeptides initially could not have joined in the specific function of the binding sites. Based on the above assumptions, the

following nonenzymatic mechanism of polypeptide elongation will ensue (Fig. 7). (1) The two sites are occupied by two molecules of primitive aminoacyl-tRNA (Fig. 7A). The amino acid species is specified by primitive mRNA, although the specification may not yet be sufficiently strict at this stage. (2) One of the aminoacyl residues is transferred to the other to make a peptide bond. Transfer from H-site to L-site (Fig. 7B) is more favorable to the operation of the whole system because the resulting free tRNA is expelled from the H-site (see above, the first assumption), and the peptidyl tRNA is attracted from the L-site to the H-site by the higher affinity of the latter. A coupled shift of mRNA on the rRNA will also occur, and a new aminoacyl tRNA, specified by the mRNA, will occupy the L-site. If the direction of aminoacyl transfer were reversed, the assembly line would not flow and the system would come to a standstill or, at best, would produce a homopolypeptide corresponding to the same mRNA code, which continues to occupy the L-site. (3) By attraction of the peptidyl tRNA to the H-site from the L-site, the latter becomes vacant and another synthetic cycle will start. The H-site and L-site on the primitive rRNA may be considered as precursors of the P-site and A-site on contemporary ribosomes, respectively. Initiation and termination of peptide elongation are not directly suggested by the mechanism outlined in Fig. 7. Here we must assume the appearance of initiating and terminating codes, which were not necessarily the same as those in the contemporary code list but had similar roles.
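The cycle described above can be illustrated with a minimal sketch in Python. It is a toy model and not part of the original proposal: the function name `elongate`, the four-codon message, and the use of present-day codon assignments in place of the primitive code are all assumptions made for illustration. The sketch only tracks which residues are carried by the tRNA at the H-site, and shows why transfer from H-site to L-site lets the message be read, whereas the reversed transfer yields at best a homopolypeptide.

```python
from typing import Dict, List, Optional

# Illustrative assignments only (taken from the present-day code);
# the primitive code is assumed, not known, to resemble them.
CODE: Dict[str, str] = {"GGC": "Gly", "GCC": "Ala", "GAC": "Asp", "GUC": "Val"}


def elongate(codons: List[str], code: Dict[str, str],
             reversed_transfer: bool = False,
             cycles: Optional[int] = None) -> List[str]:
    """Toy version of the three-step cycle; returns the chain held at the H-site."""
    if len(codons) < 2:
        raise ValueError("two codons are needed to fill the H-site and the L-site")
    n_cycles = len(codons) - 1 if cycles is None else cycles
    h_chain = [code[codons[0]]]      # step (1): aminoacyl-tRNA at the H-site
    l_index = 1                      # codon currently facing the L-site
    for _ in range(n_cycles):
        if l_index >= len(codons):
            break                    # message exhausted
        l_chain = [code[codons[l_index]]]  # step (1): aminoacyl-tRNA at the L-site
        if reversed_transfer:
            # Residue moves L -> H: the peptidyl-tRNA never leaves the H-site,
            # the mRNA does not shift, and the same codon is read again.
            h_chain = h_chain + l_chain
        else:
            # Step (2): chain moves H -> L; the freed tRNA leaves the H-site.
            l_chain = h_chain + l_chain
            # Step (3): peptidyl-tRNA is drawn from L to H; mRNA shifts one codon.
            h_chain = l_chain
            l_index += 1
    return h_chain


message = ["GGC", "GCC", "GAC", "GUC"]
print("-".join(elongate(message, CODE)))
print("-".join(elongate(message, CODE, reversed_transfer=True)))
```

With this message the forward direction gives Gly-Ala-Asp-Val, while the reversed direction keeps reading the second codon and appends alanine repeatedly.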

(ii) Role of the third position of codons in amino acid discrimination. We have pointed out above that the second letter of codons corresponded to the gross physicochemical classification of amino acids (Table 2). The first and the third letters show poorer correspondence than the second letter. To explain this fact, let us note that the anticodon

nucleotides were arrayed on a loop of primitive tRNA, while they had to make contact with a straight-line nucleotide sequence of insoluble RNA. As a result, the contact was strongest at the center, that is, the second letter, of a triplet. Weaker restrictions should be imposed on the first and third letters. This different degree of restriction between the second letter and the first and third letters may be reflected in the stricter specification by the second letter than by the other two letters. The third letter, however, seems to be under the influence of some other particular restriction, in that purines and pyrimidines specify different amino acids in many cases (e.g., purines for lysine and pyrimidines for asparagine). The precise character of the restriction, if present, is unknown to us. In Fig. 7, the restriction is symbolized by a slight inclination of tRNA from the direction perpendicular to the mRNA. This asymmetry would impose a restriction on the third letter, and would lead to the establishment of the particular rule observed in the third position of today's code list.
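The purine/pyrimidine regularity of the third letter can be checked directly against the present-day code table. The following Python sketch is an illustration added here and not the authors' analysis: it builds the standard code from the conventional UCAG ordering and classifies each of the 16 groups sharing the first two letters by whether its purine-ending and pyrimidine-ending codons specify different amino acids. The names `CODON_TABLE` and `classify_boxes` are hypothetical.

```python
from collections import Counter
from itertools import product
from typing import Dict

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... GGG.
# One-letter amino acid symbols; "*" marks a terminating codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE: Dict[str, str] = {
    "".join(triplet): aa for triplet, aa in zip(product(BASES, repeat=3), AMINO)
}


def classify_boxes(table: Dict[str, str]) -> Dict[str, str]:
    """Classify each first-two-letter group by its third-letter behaviour."""
    result = {}
    for first, second in product(BASES, repeat=2):
        prefix = first + second
        pyrimidine = {table[prefix + b] for b in "UC"}  # third letter U or C
        purine = {table[prefix + b] for b in "AG"}      # third letter A or G
        if pyrimidine == purine and len(purine) == 1:
            result[prefix] = "unsplit"                  # third letter irrelevant
        elif len(pyrimidine) == 1 and len(purine) == 1:
            result[prefix] = "purine/pyrimidine split"
        else:
            result[prefix] = "irregular"
    return result


boxes = classify_boxes(CODON_TABLE)
print(Counter(boxes.values()))
print("AA group:", boxes["AA"], "| AAA/AAG ->", CODON_TABLE["AAA"],
      "| AAU/AAC ->", CODON_TABLE["AAU"])
```

In the standard table, 8 groups are unsplit, 6 are divided cleanly between purines and pyrimidines (including AA, with lysine for purines and asparagine for pyrimidines), and 2 (UG and AU) are irregular.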

2.4. The fourth step: Further elaboration of the protein synthesizing system

(i) Participation of some functional proteins in the system. In the earlier steps of evolution of the protein synthesizing system, there would have been no basis for determining the fitness or unfitness of the proteins produced, since no specific functions had yet been established for proteins, although there might have been some criteria for general superiority (e.g., relative abundance of functionally active amino acids in the protein, as discussed above). Fitness of a protein molecule for some specific function perhaps became significant only as late as the third step in the development of the protein synthesis mechanism, when a definite amino acid sequence began to be more or less stably reproduced. Several proteins specifically participating in protein synthesis itself might have begun to take part in the system at this step of the

development. For instance, enzymes such as aminoacyl-tRNA synthetase, and nonenzymatic proteins such as most ribosomal proteins, might have had their origin at the third step and started their full-scale evolution at the fourth step. We suspect also that two kinds of soluble proteins, T-factor and G-factor, for binding of aminoacyl-tRNA to ribosomes and for translocation of tRNA from the A-site to the P-site of ribosomes, respectively, appeared in this stage, and that polypeptide elongation began to be driven by the energy of GTP, while the activation of amino acids as aminoacyl tRNA still depended on ATP. As these proteins proceeded on their course of evolution, the role of amino acid discrimination, which was originally assigned to primitive tRNAs, was gradually transferred from them, especially to aminoacyl-tRNA synthetase. Amino acid activation and peptide bond formation also became enzyme-catalyzed reactions, and the appearance of peptidyltransferase ensured the formation of polypeptides from the amino acids bound to the 3'-hydroxyl group of tRNA.

(ii) Inclusion of arginine and tryptophan. A general idea in this paper is that the number and kinds of amino acid species included in the code list have not changed much since the primordial code system was established, but that the specification of amino acids by codons gradually became stricter, finally leading to the state in which each codon specifies only one amino acid, as seen in the present system. There seem, however, to be two important exceptions to this generalization: arginine and tryptophan. Jukes (1974) suggested that arginine was included in the code later in place of another amino acid, ornithine. The supposition of arginine as a later intruder fits well with the theory outlined in the present paper. Moreover, tryptophan seems to be a similar latecomer, although in this case a predecessor amino acid (such as ornithine in the case of arginine) is difficult to suppose. Reasons for this interpretation for arginine and tryptophan are briefly summarized below. (1) Both amino acids have rather complicated structures and are not among the members found in simulation synthesis experiments. (2) They are irregular in the sense that the spots corresponding to them are isolated in both graphs representing standard entropy change for information vs. GC content (Fig. 4) and molecular weight vs. GC content (Fig. 5).
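Point (2) can be made concrete by computing the GC content of the codons assigned to an amino acid. The Python sketch below is an added illustration. It assumes that GC content means the fraction of G and C among all letters of the present-day codons for that amino acid, which may differ from the measure plotted in Figs. 4 and 5. The molecular weights are rounded textbook values for the free amino acids, and only a few amino acids are included for comparison. The names `gc_fraction` and `MOL_WEIGHT` are hypothetical.

```python
from itertools import product
from typing import Dict

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE: Dict[str, str] = {
    "".join(triplet): aa for triplet, aa in zip(product(BASES, repeat=3), AMINO)
}

# Rounded molecular weights (g/mol) of a small, arbitrary sample of amino acids.
MOL_WEIGHT: Dict[str, float] = {
    "G": 75.07, "A": 89.09, "P": 115.13, "K": 146.19,
    "F": 165.19, "R": 174.20, "W": 204.23,
}


def gc_fraction(amino_acid: str) -> float:
    """Fraction of G and C among all letters of the codons for one amino acid."""
    codons = [c for c, aa in CODON_TABLE.items() if aa == amino_acid]
    gc = sum(c.count("G") + c.count("C") for c in codons)
    return gc / (3 * len(codons))


# List the sample from lightest to heaviest with its codon GC fraction.
for aa in sorted(MOL_WEIGHT, key=MOL_WEIGHT.get):
    print(f"{aa}  MW = {MOL_WEIGHT[aa]:6.2f}  GC fraction = {gc_fraction(aa):.2f}")
```

Under this assumption, arginine (13/18) and tryptophan (2/3) combine a high molecular weight with a high GC fraction, whereas lysine and phenylalanine, the other heavy amino acids in this small sample, have a fraction of 1/6.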

(iii) Appearance of DNA. DNA has sometimes been mentioned as the first important macromolecule in the primordial living systems (Rich, 1962), based on its unique information-carrying role in present-day organisms. Of the two pentoses found in nucleic acids, however, ribose is far more often reported in simulation experiments (Mariani and Torraca, 1953; Pfeil and Ruckert, 1961) than deoxyribose. Deoxyribose, and consequently DNA, may have joined the protein synthesis mechanism only after an enzymatic system for the reduction of ribose to deoxyribose appeared. The switchover from ribose to deoxyribose probably had several advantages, one of which might be retreat from the usual metabolic route of continuing formation and decay. Only after the participation of DNA in the system can we speak of the preservation of genetic information in its intrinsic sense, and this event may be reckoned to have essentially completed the protein synthesis mechanism.

(iv) Unity and universality of the genetic code. One of the most remarkable features of the genetic code is its universality throughout all living organisms. This feature has led some investigators to suppose an invasion of extraterrestrial germs and their spreading out as the monophyletic first ancestors of all living things thereafter (Crick and Orgel, 1973). Our present proposal is based on gradual and statistical establishment of the genetic code. If this was really the case, we might expect some statistical fluctuation or plurality of the coding rule, which is not, however, observed among contemporary organisms. A clue may be found in Orgel's (1973) suggestion that very severe competition with regard to some biological traits between remote ancestors cut down the surviving lines to only one, and that all living things thereafter are offspring of this single ancestor, the unique rule of its genetic code being transmitted up to now. On the other hand, we can also suppose that selection among various code systems has been extraordinarily strict and that the contemporary code has survived by virtue of its merits alone. For this line of argument to be possible, however, our knowledge about the primordial protein synthesis system still seems to be too poor.

References

Berg, P., 1958, The chemical synthesis of amino acyl adenylates. J. Biol. Chem. 233, 608.
Crick, F.H.C. and L.E. Orgel, 1973, Directed panspermia. Icarus 19, 341.
Crothers, D.M., T. Seno and D.G. Söll, 1972, Is there a discriminator site in transfer RNA? Proc. Natl. Acad. Sci. USA 69, 3063.
Eigen, M., 1971, Selforganization of matter and the evolution of biological macromolecules. Naturwissenschaften 58, 465.
Epstein, C.J., 1966, Role of the amino-acid 'code' and of selection for conformation in the evolution of proteins. Nature 210, 25.
Fox, S.W. and K. Dose, 1972, Molecular evolution and the origin of life (Freeman, San Francisco) pp. 167--171.
Fox, S.W. and T.V. Waehneldt, 1968, The thermal synthesis of neutral and basic proteinoids. Biochim. Biophys. Acta 160, 246.
Grantham, R., 1974, Amino acid difference formula to help explain protein evolution. Science 185, 862.
Harada, K. and S.W. Fox, 1964, Thermal synthesis of natural amino-acids from a postulated primitive terrestrial atmosphere. Nature 201, 335.
Hasegawa, M. and T. Yano, 1975, Classification of amino acids and its implication to the genetic code. Viva Origino 4, 11.
Hochstim, A.R., 1963, Hypersonic chemosynthesis and possible formation of organic compounds from impact of meteorites on water. Proc. Natl. Acad. Sci. USA 50, 200.
Hutchens, J.O., 1976, Handbook of Biochemistry and Molecular Biology, G.D. Fasman (ed.) (CRC Press, Cleveland) Vol. 1, pp. 109.
Ishigami, M., 1974, The origin of the modern codon catalog. Viva Origino 2, 35.
Ishigami, M. and K. Nagano, 1975, The origin of the genetic code. Origins of Life 6, 551.
Jukes, T.H., 1974, On the possible origin and evolution of the genetic code. Origins of Life 5, 331.
Lewinsohn, R., M. Paecht-Horowitz and A. Katchalsky, 1967, Polycondensation of amino acid phosphoanhydrides. III. Polycondensation of alanyl adenylate. Biochim. Biophys. Acta 140, 24.
Lohrmann, R., R. Ranganathan, H. Sawai and L.E. Orgel, 1975, Prebiotic peptide-formation in solid state. I. Reaction of benzoate ion and glycine with adenosine 5'-phosphorimidazolide. J. Mol. Evolution 5, 57.
Mariani, E. and G. Torraca, 1953, The composition of formose. A chromatographic study. Intern. Sugar J. 55, 309.
Miller, S.L., 1953, A production of amino acids under possible primitive earth conditions. Science 117, 528.
Oparin, A.I., 1938, The Origin of Life, translated by S. Morgulis (MacMillan, New York).
Orgel, L.E., 1973, The Origins of Life: Molecules and natural selection (John Wiley, New York).
Oró, J. and A.C. Cox, 1962, Non-enzymic synthesis of 2-deoxyribose. Federation Proc. 21, 80.
Oró, J. and A.P. Kimball, 1961, Synthesis of purines under possible primitive earth conditions. I. Adenine from hydrogen cyanide. Arch. Biochem. Biophys. 94, 217.
Paecht-Horowitz, M. and A. Katchalsky, 1967, Polycondensation of amino acid phosphoanhydrides. II. Polymerization of proline adenylate at constant phosphoanhydride concentration. Biochim. Biophys. Acta 140, 14.
Paecht-Horowitz, M., J. Berger and A. Katchalsky, 1970, Prebiotic synthesis of polypeptides by heterogeneous polycondensation of amino-acid adenylates. Nature 228, 636.
Pavlovskaya, T.E. and A.G. Pasynskii, 1959, The original formation of amino acids under the action of ultraviolet rays and electric discharges, IUB Symposium Series, The Origin of Life on the Earth, A.I. Oparin (ed.) (Pergamon, Oxford) pp. 151--157.
Pfeil, E. and H. Ruckert, 1961, Die Bildung von Zuckern aus Formaldehyd unter der Einwirkung von Laugen. Ann. 641, 121.
Ponnamperuma, C., 1965, Abiological synthesis of some nucleic acid constituents, The Origin of Prebiological Systems and of their Molecular Matrices, S.W. Fox (ed.) (Academic Press, New York) pp. 221--236.
Rich, A., 1962, On the problems of evolution and biochemical information transfer, Horizons in Biochemistry, M. Kasha and B. Pullman (eds.) (Academic Press, New York) pp. 103.
Sonneborn, T.M., 1965, Degeneracy of the genetic code: Extent, nature, and genetic implications, Evolving Genes and Proteins, V. Bryson and H.J. Vogel (eds.) (Academic Press, New York) pp. 377.
Steinman, G.D. and M.N. Cole, 1967, Synthesis of biologically pertinent peptides under possible primordial conditions. Proc. Natl. Acad. Sci. USA 58, 735.
Sulston, J., R. Lohrmann, L.E. Orgel and H.T. Miles, 1968a, Nonenzymatic synthesis of oligoadenylates on a polyuridylic acid template. Proc. Natl. Acad. Sci. USA 59, 726.
Sulston, J., R. Lohrmann, L.E. Orgel and H.T. Miles, 1968b, Specificity of oligonucleotide synthesis directed by polyuridylic acid. Proc. Natl. Acad. Sci. USA 60, 409.
Sulston, J., R. Lohrmann, L.E. Orgel, H. Schneider-Bernloehr and B.J. Weimann, 1969, Non-enzymic oligonucleotide synthesis on a polycytidylate template. J. Mol. Biol. 40, 227.
Saxinger, C. and C. Ponnamperuma, 1971, Experimental investigation on the origin of the genetic code. J. Mol. Evolution 1, 63.
Saxinger, C. and C. Ponnamperuma, 1974, Interactions between amino acids and nucleotides in the prebiotic milieu. Origins of Life 5, 189.
Sueoka, N., 1961, Correlation between base composition of deoxyribonucleic acid and amino acid composition of protein. Proc. Natl. Acad. Sci. USA 47, 1141.
Terenin, A.N., 1959, Photosynthesis in the shortest ultraviolet, I.U.B. Symposium Series, The Origin of Life on the Earth, A.I. Oparin (ed.) (Pergamon Press, Oxford) pp. 136--139.
Woese, C., 1965, On the evolution of the genetic code. Proc. Natl. Acad. Sci. USA 54, 1546.
Woese, C., 1969, Models for the evolution of codon assignments. J. Mol. Biol. 43, 235.
Yoshino, D., R. Hayatsu and E. Anders, 1971, Origin of organic matter in early solar system. III. Amino acids: Catalytic synthesis. Geochim. Cosmochim. Acta 35, 927.
