J. theor. Biol. (1975) 50, 203-212

A Concept of Amino Acid Archaeorelation: Origin of Life and the Genetic Code

H. N. MIKELSAAR

Institute of Experimental Biology of the Academy of Sciences of the Estonian S.S.R., Laboratory of Bioorganic Chemistry of Moscow State University, U.S.S.R.†

(Received 11 April 1974)

Using certain assumptions, an attempt has been made to obtain a quantitative evaluation of the similarity of the bulk amino acid compositions of various organisms, and of the quantitative ratios of amino acids in the products of presumed prebiological reactions and in natural proteins. The results of a comparison between the molar ratios of amino acids in synthesized products and in natural proteins and the quantitative distribution of triplets among amino acids in the existing genetic code are given. Proceeding from the correlations found and the data reported in the literature, a concept is suggested amounting, in essence, to the assertion that the structure of the genetic code (the quantitative distribution of the triplets) is in a certain way determined by the preexisting, primary ratio (archaeorelation) of amino acids.

1. Introduction

This study attempts to combine reported experimental data concerning the origin of life with data on the structure of the genetic code and the protein composition of various organisms, in order to demonstrate that the organic world of today bears a profound imprint of prebiological times, and to put forward a probable mechanism for the preservation of such traces up to the present era.

2. Amino Acid Composition of Living Organisms

This concept is based on the evidence that the bulk amino acid compositions of diverse organisms are similar. Table 1 presents the data on the bulk amino acid composition for bacteria, green algae and protozoa in

† Permanent address: Institute of Experimental Biology of the Academy of Sciences of the Estonian S.S.R., Harku 203051, Estonian S.S.R., U.S.S.R.


TABLE 1

Comparison of the bulk proteins of various organisms with the averaged protein

Type of comparison         Source of protein          No. pairs with   No. pairs with   Pairs with
                                                      positive         negative         positive
                                                      correlation      correlation      correlation (%)

Hydrophilic/hydrophilic    Bacillus subtilis                32                4               89
                           Escherichia coli                 32                4               89
                           Chlorella vulgaris               31                5               86
                           Tetrahymena pyriformis           26                2               93

Hydrophobic/hydrophobic    Bacillus subtilis                26                2               93
                           Escherichia coli                 34                2               94
                           Chlorella vulgaris               33                3               92
                           Tetrahymena pyriformis           23                5               82

Hydrophilic/hydrophobic    Bacillus subtilis                67                5               93
                           Escherichia coli                 70               11               86
                           Chlorella vulgaris               68               13               84
                           Tetrahymena pyriformis           55                9               86

Total                      Bacillus subtilis               125               11               92
                           Escherichia coli                136               17               89
                           Chlorella vulgaris              132               21               86
                           Tetrahymena pyriformis          104               16               87

The correlations were analyzed in the following way. For each of the so-called bulk proteins, the molar percentages of all the amino acids were compared with one another in pairs. The number of amino acid pairs thus obtained (n) is n = a(a - 1)/2, where a is the number of amino acids taken into account; for instance, if a = 18, then n = 153. The same procedure was applied to the amino acid molar percentages of the averaged protein. Two sets of relations are thus obtained, for instance:
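As an illustration only (the specific inequalities are assumed here, chosen to agree with the pairs named in the next paragraph), two such sets of relations might read:

```latex
\begin{aligned}
&\text{Ala}_1 > \text{Val}_1, \qquad \text{Ala}_1 > \text{Leu}_1, \qquad \ldots \\
&\text{Ala}_2 > \text{Val}_2, \qquad \text{Ala}_2 < \text{Leu}_2, \qquad \ldots
\end{aligned}
```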

The subscripts 1 and 2 designate the bulk and averaged protein, respectively. Comparing these sequences, we obtain results belonging to two classes: with positive correlation (in the above example, the Ala/Val pair) and with negative correlation (the Ala/Leu pair). In addition, in some cases a neutral correlation is obtained, when for one pair the ratio is unity and for the other it is more (or less) than unity. For the sake of simplicity such cases are classified as having negative correlation. The data on the amino acid composition of the total protein are taken from the reports of Woese (1967) and Kvenvolden (1973). The hydrophobic and hydrophilic amino acids are indicated according to the classification of Volkenstein (1967).
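The counting procedure of this footnote can be sketched in a few lines of Python. This is a minimal sketch and not the author's own calculation: the function names and the three-amino-acid compositions are hypothetical, and the case where both ratios equal unity (not covered by the footnote) is assumed here to count as positive.

```python
from itertools import combinations


def sign(x, y):
    """Return +1, 0 or -1 when x is greater than, equal to or less than y."""
    return (x > y) - (x < y)


def count_pair_correlations(protein_1, protein_2):
    """Count amino acid pairs ordered the same way (positive) or not (negative)."""
    shared = sorted(set(protein_1) & set(protein_2))
    positive = negative = 0
    for a, b in combinations(shared, 2):
        if sign(protein_1[a], protein_1[b]) == sign(protein_2[a], protein_2[b]):
            # Same ordering in both proteins (assumption: two ratios that are
            # both unity also land here).
            positive += 1
        else:
            # Opposite ordering, or a "neutral" case where only one ratio is
            # unity; both are counted as negative, as in the footnote.
            negative += 1
    percent = 100.0 * positive / (positive + negative)
    return positive, negative, percent


# Hypothetical molar per cent values, for illustration only.
bulk = {"Ala": 10.0, "Val": 7.0, "Leu": 8.0}
averaged = {"Ala": 8.0, "Val": 6.0, "Leu": 9.0}

positive, negative, percent = count_pair_correlations(bulk, averaged)
print(positive, negative, round(percent))
```

With a = 3 amino acids there are n = 3 pairs; in this made-up example Ala/Val and Leu/Val are positive and Ala/Leu is negative.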


comparison with the averaged protein of vertebrates. The amino acid compositions of the bulk proteins are indicative of the relative amounts of amino acids in intact organisms, not allowing for the differences in the quantitative ratios of the individual proteins. The averaged protein reported by Yčas (1969) was determined by analyzing the amino acid composition of 74 different purified proteins of vertebrates. The averaged composition gives equal statistical weight to each protein species. It may be seen that the degree of correlation, established as indicated in the footnote to Table 1, is indeed high. According to Yčas (1969), this is quite natural since the proteins of even quite distant species have to perform similar functions. However, Fox (1969) and Krampitz & Fox (1969) found that natural proteins and proteinoids formed in a purely chemical fashion from a mixture of equal amounts of individual amino acids have much in common, suggesting that the roots of the similarity lie in the prebiological stage of evolution.

3. Similarity of Proteins and Proteinoids

Up till now the similarity in the composition of chemically synthesized mixtures of amino acids and proteinoids on the one hand, and of natural proteins on the other, has been estimated at a purely qualitative level. Given in Table 2 are the results of our calculations, based on certain assumptions and aimed at obtaining a quantitative estimate of the similarity between the amino acid compositions of the synthesized products and of the averaged protein. It may be seen that our calculations do not contradict the estimates given by Fox. It may be noted that the ratios closest to the “natural” are obtained with the thermal synthesis of amino acids from ammonia, methane and water. With other sources of energy and different compositions of the substrates, a similar trend reveals itself. In condensation experiments, the proteinoids most similar to the averaged protein were synthesized from amino acid adenylates.

Summing up these results, it may be said that even at the earliest stages of chemical evolution some features could have emerged which later became inherent in the organic world. This must have been due to a number of chemical processes rather than a single reaction. This idea is consistent with Oparin’s theory of the origin of life (Oparin, 1957).

The above statement is purely phenomenological. It would be interesting to find a possible mechanism of transfer of the primary relations (archaeorelations) of the amino acids to the existing proteins, which are rigorously determined by the nucleic acids. This question evidently bears on the problem of the origin of the genetic code.

TABLE 2

Comparison of synthesized amino acid mixtures and proteinoids with the averaged protein

Starting materials, reaction conditions   No.     No. pairs       No. pairs       Pairs with        References
(source of energy, catalyst)              amino   with positive   with negative   positive
                                          acids   correlation     correlation     correlation (%)
                                                  I       II      I       II      I       II

Synthesis of amino acids
CH4, NH3, H2O
  950°C, quartz                           12      48      52      18      14      73      79        Harada & Fox (1965)
  950°C, silica gel                       12      48      52      18      14      73      79        Harada & Fox (1965)
  1050°C, silica gel                      12      49      53      17      13      74      80        Harada & Fox (1965)
  1000°C, silica gel                      11      32      30      23      25      58      55        Taube et al. (1967)
  1030°C                                  10      34      27      11      18      76      60        Oro (1965)
CH4, NH3, H2O, electric discharge         12      41      37      25      29      62      56        Czuchajowski & Zawadzki (1968)
CH4, N2, NH3, H2O, electric discharge     10      32      32      13      13      71      71        Ring et al. (1972)
HCN-polymer, H2O (4C)†                    11      26      28      29      27      47      51        Matthews & Moser (1967)
HCN-polymer, H2O (6C)                     12      34      36      32      30      52      55        Matthews & Moser (1967)
HCN-trimer, H2O, NH4OH (5)                10      26      28      19      17      58      62        Moser et al. (1968a)
HCN-trimer, H2O (2B)                      12      41      43      25      23      62      65        Moser et al. (1968b)
HCN-tetramer, H2O (2C)                    12      47      45      19      21      71      68        Moser et al. (1968b)

Synthesis of peptides (proteinoids)‡
Adenylates of amino acids§                16          96              24              80            Fox et al. (1971)
                                          14          68              23              75            Banda & Ponnamperuma (1971)
Amino acids, heating
  170°C                                   17          68              68              50            Fox & Nakashima (1967)
  192-194°C                               11          40              15              73            Fox & Waehneldt (1968)
  192-194°C                               17          75              61              55            Fox & Waehneldt (1968)

Comparison with the averaged protein has been carried out using the procedure described in the footnote to Table 1. I, calculations carried out under the assumption that the amides of aspartic and glutamic acids are not formed. II, it is assumed that the amides are also synthesized (formation of the amides has not been considered in the studies of the chemical synthesis of amino acids).
† The symbols in brackets designate the sets of experiments and the peptide fractions formed concurrently with the amino acid synthesis.
‡ The starting molar fractions of the free amino acids are the same.
§ Figures in the second row are means for four samples.


4. Formulation of the Concept

We propose the following concept (see Fig. 1). We assume that the quantitative structure of the genetic code (the quantitative distribution of triplets between amino acids) is in a certain way determined by a preexisting primary relation (archaeorelation) of amino acids.† The code reflecting

[Fig. 1: diagram. Recoverable labels: “Evolution on statistical protein level”, “Closing of Eigen’s hypercycle”, “Evolution on concrete protein level”, “Contemporary proteins”.]

FIG. 1. A hypothetical diagram of the mechanism of transfer of the amino acid archaeorelation to the existing proteins (see text).

the archaeorelation of amino acids allowed proteins to be synthesized from random nucleotide sequences, whose composition in turn (secondarily) reflected the archaeorelation of the amino acids.

† Jukes (1966) and Fox, Yuki, Waehneldt & Lacey (1971) were the first authors to put forward the general concept of the primary flow of information from proteins to nucleic acids. Many studies support the concept of all the existing tRNAs having had a common precursor (the proto-tRNA) (Dayhoff & Eck, 1968; Dayhoff, 1971; Jukes & Holmquist, 1972).

5. Motivation of the Concept

The hypothesis is based on the following facts and arguments. It is well known that the number of codons corresponding to one amino acid is


different for different amino acids. We have compared the ratios of the numbers of triplets coding for individual amino acids with the molar ratios of these amino acids in the averaged and bulk proteins. For the sake of brevity, this will be referred to below as “correlation of codons (triplets) and amino acids” or “correlation of the protein composition to the degree of the code degeneracy”. The results of this comparison are given in Table 3.

TABLE 3

Comparison of the amino acid ratios of the bulk proteins, averaged protein, synthesized amino acid mixtures and proteinoids with the degree of code degeneracy

Analyzed material                   No. pairs       No. pairs       Pairs with
                                    with positive   with negative   positive
                                    correlation     correlation     correlation (%)
                                    I       II      I       II      I       II

Averaged protein                       117              23              84
Bulk proteins
  Bacillus cereus                       76              26              75
  Bacillus subtilis                     75              27              74
  Escherichia coli                      94              24              80
  Sarcina lutea                         69              21              77
  Serratia marcescens                   76              26              75
  Micrococcus lysodeikticus             70              20              78
  Mycobacterium phlei                   74              16              82
  Chlorella vulgaris                    93              25              79
  Tetrahymena pyriformis                62              28              69
Amino acids†
  950°C, quartz                     30      32      19      11      61      74
  950°C, silica gel                 30      32      19      11      61      74
  1050°C, silica gel                30      32      19      11      61      74
  1000°C, silica gel                29      27      13      11      69      71
  1030°C                            25      24      10       9      71      73
Proteinoids from amino acid
adenylates                              55               ?              60
                                        47              18              72

† Amino acids formed by heating of a mixture of CH4, NH3 and H2O (in the presence of the catalyst indicated).
The procedure of comparison is similar to that described in the footnote to Table 1; the only difference is that one series of amino acid ratios was replaced by a series of the ratios of codon numbers determining the amino acids under consideration. All pairs of amino acids having different numbers of coding triplets have been taken into consideration. I, II: see the footnote to Table 2. The data on the amino acid composition of the bulk proteins have been taken from the works by Woese (1967) and Kvenvolden (1973). The reports on the thermal synthesis of amino acids and on the synthesis of proteinoids from amino acid adenylates are referred to in Table 2.
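The comparison with code degeneracy can be sketched as follows. This is an illustrative sketch, not the author's own calculation: the codon numbers are those of the standard genetic code, while the function names are hypothetical. As a consistency check, the 20 amino acids give 140 pairs that differ in codon number, which equals the 117 + 23 pairs listed for the averaged protein.

```python
from itertools import combinations

# Number of codons per amino acid in the standard genetic code (61 sense codons).
CODONS = {
    "Leu": 6, "Ser": 6, "Arg": 6,
    "Ala": 4, "Gly": 4, "Pro": 4, "Thr": 4, "Val": 4,
    "Ile": 3,
    "Phe": 2, "Tyr": 2, "His": 2, "Gln": 2, "Asn": 2,
    "Lys": 2, "Asp": 2, "Glu": 2, "Cys": 2,
    "Met": 1, "Trp": 1,
}


def comparable_pairs(amino_acids):
    """Pairs of amino acids that differ in their number of coding triplets."""
    return [(a, b) for a, b in combinations(sorted(amino_acids), 2)
            if CODONS[a] != CODONS[b]]


def compare_with_code(composition):
    """Count pairs whose concentration ordering follows the codon-number ordering."""
    positive = negative = 0
    for a, b in comparable_pairs(composition):
        more_codons = CODONS[a] > CODONS[b]
        if composition[a] == composition[b]:
            negative += 1  # neutral case, counted as negative (Table 1 footnote)
        elif (composition[a] > composition[b]) == more_codons:
            positive += 1
        else:
            negative += 1
    return positive, negative


print(len(comparable_pairs(CODONS)))  # 140
```

Calling `compare_with_code` on a dictionary of molar percentages returns the two counts from which the percentage column of Table 3 is formed.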


The degree of correlation is rather high. The same conclusion may be reached by arranging the arithmetical means of the concentrations of amino acids with the same number of codons according to the number of triplets. The following regularity is observed: the amino acids having a larger number of codons occur in proteins more frequently. The above conclusion is in line with the results of a number of authors (MacKay, 1967; Kimura, 1968; King & Jukes, 1969; see also Jukes, 1973).

On the other hand, as shown above, similarity is observed between the amino acid ratios in natural proteins, proteinoids and synthesized mixtures of amino acids. Therefore, in the latter case, too, we may assume the existence of a regularity of the same type: the larger the number of codons, the higher the concentration of the amino acid. The data presented in the lower half of Table 3 are indicative of this regularity. Since the given concentration ratio of amino acids should have formed independently of the presence of nucleic acids, and obviously earlier than the coding ratios could emerge in the course of evolution, it may well be that this ratio was the factor predetermining the quantitative structure of the genetic code.

Our conclusion contradicts neither the fact that in the present-day translation vocabulary amino acids are differentiated with respect to the affinity of the side chains (Volkenstein, 1967, 1968; Woese, 1965; Epstein, 1966) nor the idea put forward by Volkenstein (1967, 1968) and Woese (1965, 1967, 1970) about the primary group coding of amino acids. Furthermore, it may be demonstrated that the most complete correlation is found when the code and amino acids in the first coding group (the second nucleotide of the triplets is uridylic acid), which codes exclusively for hydrophobic amino acids, are compared. This only indicates that both the concentration and the steric factor were, apparently, involved in the development of the code.
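The grouping by codon number mentioned at the start of this section can be sketched as below. It is a minimal sketch under stated assumptions: the function name is hypothetical and the molar percentages are invented for illustration; only the codon numbers come from the standard genetic code.

```python
from collections import defaultdict
from statistics import mean


def mean_by_codon_number(composition, codons):
    """Average the concentrations of amino acids sharing the same codon number."""
    groups = defaultdict(list)
    for amino_acid, mol_percent in composition.items():
        groups[codons[amino_acid]].append(mol_percent)
    # Sort by codon number so that any trend can be read off directly.
    return {n: mean(values) for n, values in sorted(groups.items())}


# Codon numbers from the standard genetic code; the mol % values are hypothetical.
codons = {"Leu": 6, "Ser": 6, "Ala": 4, "Met": 1, "Trp": 1}
composition = {"Leu": 9.0, "Ser": 7.0, "Ala": 8.0, "Met": 2.0, "Trp": 1.0}

print(mean_by_codon_number(composition, codons))
```

The regularity described in the text would show up as means that tend to rise with the codon number.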
But, taking into account the correlation of triplets and the concentrations of amino acids, we believe that the existence of rigorous stereochemical relationships between the nucleotide triplets forming codons (anticodons) and the amino acids seems unlikely. Evidently, other parts of the precursor tRNA molecule should also have been responsible for the specificity of interaction (Crick, 1968).

6. Necessary Assumption

Our concept requires nucleic acids to have had more or less similar proportions of randomly positioned nucleotides by the time the type of coding which exists now was beginning to form. It is only in such molecules that the numbers of all the triplets are statistically equal; synthesis on templates of this type makes it possible to obtain peptides whose amino


acid composition corresponds to the starting quantitative distribution of codons between amino acids. It is important to note that the quantitative distribution of the amino acids is the same irrespective of the position of the initial point and of the direction of read-out of the above templates (Mikelsaar & Mikelsaar, 1972).

7. Discussion: Evolution of Proteins
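The assumption of Section 6 can be illustrated numerically before turning to the discussion. The sketch below is not taken from the paper: it generates a random template with equal proportions of the four nucleotides, translates it with the standard genetic code, and compares the resulting amino acid frequencies with the share of sense codons assigned to each amino acid. The function names are hypothetical, amino acids are given in one-letter code, and skipping stop codons is an assumption of the sketch.

```python
import random
from collections import Counter

BASES = "UCAG"
# Standard genetic code in UCAG order of first, second and third base; "*" is stop.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
        for i, b1 in enumerate(BASES)
        for j, b2 in enumerate(BASES)
        for k, b3 in enumerate(BASES)}


def expected_frequencies():
    """Share of the 61 sense codons assigned to each amino acid."""
    counts = Counter(aa for aa in CODE.values() if aa != "*")
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}


def observed_frequencies(template, start=0, reverse=False):
    """Amino acid frequencies read from a template at a given start and direction."""
    seq = template[::-1] if reverse else template
    triplets = (seq[i:i + 3] for i in range(start, len(seq) - 2, 3))
    counts = Counter(CODE[t] for t in triplets)
    counts.pop("*", None)  # assumption: stop codons are simply skipped
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}


def max_deviation(observed, expected):
    """Largest absolute difference between observed and expected frequencies."""
    return max(abs(observed.get(aa, 0.0) - p) for aa, p in expected.items())


if __name__ == "__main__":
    rng = random.Random(0)
    template = "".join(rng.choice(BASES) for _ in range(300_000))
    expected = expected_frequencies()
    for start, reverse in [(0, False), (1, False), (2, True)]:
        observed = observed_frequencies(template, start, reverse)
        print(start, reverse, max_deviation(observed, expected))
```

For a long template the printed deviations should be small for every starting point and for both directions of read-out, in line with the statement attributed to Mikelsaar & Mikelsaar (1972).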

The problem is, why do these “primordial” amino acid ratios occur in the proteins of existing organisms? What advantage, if any, do proteins gain in catalytic terms from the archaeorelation of amino acids? That the amino acid archaeorelation is functionally preferential is indirectly evidenced by the fact that the existing proteins, including enzymes, have an amino acid composition close to the archaeorelation. Hence, it cannot be ruled out that the prebiological proteinoids could have had, quite accidentally, an amino acid composition functionally close to the optimal. Thus, it is easy to understand why any noticeable deviation of the amino acid composition of the proteins from the archaeorelation would be unfavourable for the primitive organism and why the genetic code reflects the archaeorelation of amino acids.

On the other hand, another explanation seems to be no less relevant. In their time these relations of amino acids might have provided the organism with a kind of kinetic perfection (see Shnol, 1973; Eigen, 1971), determining the perpetuation of such organisms. At present these ratios are but rudimentary. In other words, the archaeorelation of the amino acids in the existing proteins may be a reflection of a once formed genetic code (King & Jukes, 1969).

Let us discuss this in detail. We believe it may be that in the process of evolution the genetic code fixed, for some reason, the amino acid ratios of the statistical protein, that is, the archaeorelation of amino acids. Further, the evolutionary pathways of the proteins strictly determined by the nucleic acids were, in general, radial, that is, the amino acid compositions were gradually diverging from the archaeorelation in various directions. The relative amount of some amino acids in certain proteins was increasing, in other proteins it was decreasing, whereas in some proteins it could remain more or less constant. The sum total of these variations should have tended to zero. In other words, the amino acid composition of the existing proteins averaged over the mole per cent should correlate with triplet ratios (the degree of reflection of the amino acid archaeorelation) to a greater extent than most individual proteins per se. A comparison of the amino acid compositions of 40 various proteins with the degree of degeneracy of the amino acid code has demonstrated that all the proteins have amino acid compositions correlated, to a greater or lesser extent, with the code. The arithmetical


mean for the degrees of correlation is 74%. Calculations for the averaged protein on the basis of the above 40 proteins yield a higher value of 85%. It is evident that these results to a certain extent support our idea of the radial evolution of the existing proteins from the protoproteins having the archaeorelation of amino acids. All the above does not rule out the possibility of alternative or additional reasons for conservation of the amino acid archaeorelation in the existing proteins.

The author is deeply grateful to M. B. Berkinblit, R. Vilu, G. A. Zavarzin, D. I. Kahnin, I. A. Kozlov, B. M. Mednikov, S. A. Ostroumov, V. P. Skulachev, V. M. Stepanov, S. E. Shnol, and A. A. Jasaitis for valuable advice and discussion and to Miss T. I. Kheifets for translating the paper into English.

REFERENCES

BANDA, P. W. & PONNAMPERUMA, C. (1971). Space Life Sci. 3, 54.
CRICK, F. H. C. (1968). J. molec. Biol. 38, 367.
CZUCHAJOWSKI, L. & ZAWADZKI, W. (1968). Roczn. Chem. 42, 697.
DAYHOFF, M. O. (1971). In Chemical Evolution and the Origin of Life (R. Buvet & H. C. Ponnamperuma, eds) Vol. 1, p. 392. Amsterdam: North-Holland.
DAYHOFF, M. O. & ECK, R. V. (1968). Atlas of Protein Sequence and Structure 1967-1968. Silver Springs, Maryland: National Biomedical Research Foundation.
EIGEN, M. (1971). Naturwissenschaften 58, 465.
EPSTEIN, J. C. (1966). Nature, Lond. 210, 25.
FOX, S. W. (1969). Naturwissenschaften 56, 1.
FOX, S. W. & NAKASHIMA, T. (1967). Biochim. biophys. Acta 140, 155.
FOX, S. W. & WAEHNELDT, T. V. (1968). Biochim. biophys. Acta 160, 246.
FOX, S. W., YUKI, A., WAEHNELDT, T. V. & LACEY, J. C., Jr. (1971). In Chemical Evolution and the Origin of Life (R. Buvet & H. C. Ponnamperuma, eds) Vol. 1, p. 252. Amsterdam: North-Holland.
HARADA, K. & FOX, S. W. (1965). In The Origins of Prebiological Systems and of their Molecular Matrices (S. W. Fox, ed.) p. 187. New York: Academic Press.
JUKES, T. H. (1966). Molecules and Evolution, p. 187. New York: Columbia University Press.
JUKES, T. H. (1973). Biochem. biophys. Res. Commun. 53, 709.
JUKES, T. H. & HOLMQUIST, R. (1972). Biochem. biophys. Res. Commun. 49, 212.
KIMURA, M. (1968). Genet. Res. 11, 247.
KING, J. L. & JUKES, T. H. (1969). Science, N.Y. 164, 788.
KRAMPITZ, G. & FOX, S. W. (1969). Proc. natn. Acad. Sci. U.S.A. 62, 399.
KVENVOLDEN, K. A. (1973). Space Life Sci. 4, 60.
MACKAY, A. L. (1967). Nature, Lond. 216, 159.
MATTHEWS, C. N. & MOSER, R. E. (1967). Nature, Lond. 215, 1230.
MIKELSAAR, H. N. & MIKELSAAR, R. N. (1972). Biofizika 17, 218.
MOSER, R. E., CLAGGETT, A. R. & MATTHEWS, C. N. (1968a). Tetrahedron Lett. 1605.
MOSER, R. E., CLAGGETT, A. R. & MATTHEWS, C. N. (1968b). Tetrahedron Lett. 1599.
OPARIN, A. I. (1957). The Origin of Life on Earth. Moscow: Izdatelstvo Akademii Nauk SSSR.
ORO, J. (1965). In The Origins of Prebiological Systems and of their Molecular Matrices (S. W. Fox, ed.) p. 137. New York: Academic Press.
RING, D., WOLMAN, Y., FRIEDMAN, N. & MILLER, S. L. (1972). Proc. natn. Acad. Sci. U.S.A. 69, 765.
SHNOL, S. E. (1973). Zh. Obshch. Biol. 34, 331.
TAUBE, M., ZDROJEWSKI, ST. Z., SAMOCHOCKA, K. & JEZIERSKA, K. (1967). Angew. Chem. 79, 239.
VOLKENSTEIN, M. V. (1967). Physics of Enzymes. Moscow: Nauka.
VOLKENSTEIN, M. V. (1968). Vestnik APN Nauchnaya Misl, March, April.
WOESE, C. R. (1965). Proc. natn. Acad. Sci. U.S.A. 54, 1546.
WOESE, C. R. (1967). The Genetic Code. New York, Evanston and London: Harper & Row.
WOESE, C. R. (1970). Bioscience 20, 471.
YČAS, M. (1969). The Biological Code, p. 144. Amsterdam and London: North-Holland.
