Proc. Natl. Acad. Sci. USA Vol. 74, No. 6, pp. 2321-2324, June 1977

Biochemistry

The complete amino acid sequence of prochymosin (protease/primary structure/homology)

BENT FOLTMANN, VIBEKE BARKHOLT PEDERSEN, HENNING JACOBSEN*, DOROTHY KAUFFMAN†, AND GRITH WYBRANDT‡

Institute of Biochemical Genetics, University of Copenhagen, Ø. Farimagsgade 2A, DK-1353 Copenhagen K, Denmark

Communicated by Hans Neurath, March 18, 1977

ABSTRACT The total sequence of 365 amino acid residues in bovine prochymosin is presented. Alignment with the amino acid sequence of porcine pepsinogen shows that 204 amino acid residues are common to the two zymogens. Further comparison and alignment with the amino acid sequence of penicillopepsin shows that 66 residues are located at identical positions in all three proteases. The three enzymes belong to a large group of proteases with two aspartate residues in the active center. This group forms a family derived from one common ancestor.

Present addresses: * Novo Research Institute, Bagsvaerd, Denmark; † Dept. of Oral Biology, University of Washington, Seattle; and ‡ Dept. of Zoophysiology, University of Copenhagen.

Chymosin (EC 3.4.23.4) is the major proteolytic enzyme in the stomach of the preruminant calf. Like pepsin, chymosin belongs to a group of acidic proteases in which two aspartate residues participate in the catalytic mechanism (1, 2). In analogy with the term serine proteases, the term aspartate proteases has been suggested (3). Chymosin is secreted as an inactive precursor, prochymosin, consisting of a single peptide chain with 365 amino acid residues. The zymogen is irreversibly converted into active enzyme by limited proteolysis during which a total of 42 amino acid residues are released from the amino-terminal part of the peptide chain. Prochymosin contains six half-cystine residues forming three disulfide bridges, all located in the enzyme part of the molecule.

Investigations on the primary structure began with determination of the amino acid sequences around the disulfide bridges (4). In subsequent papers (5, 6) we reported the sequence of the activation segment and the sequence of the first 61 residues of the active enzyme. This communication presents the complete amino acid sequence of prochymosin B. It is shown that this zymogen is highly homologous to porcine pepsinogen A (7-10). To facilitate the comparison, we have chosen to number the amino acid residues from the NH2-terminus of prochymosin and then continue by counting gaps where such occur in prochymosin relative to pepsinogen. The comparison is further extended to penicillopepsin, the only other aspartate protease of which the sequence is almost completely known (11).

METHODS

Prochymosin and chymosin used in these investigations were prepared in our laboratory according to the methods described previously (12). The sequence was obtained after a series of degradation experiments carried out in parallel. In most cases the first steps were enzymatic cleavage with trypsin or chemical cleavage with cyanogen bromide.

To improve solubility of the digest and to reduce the number of fragments, we used maleylated or citraconylated preparations for the digestions with trypsin. The digestions were carried out at 12° (pH 8, for 15-30 min) in order to avoid unspecific, chymotrypsin-like cleavages (13). After such treatment the large fragments were purified by gel filtration on Sephadex G-100 in 0.05 M NH4HCO3, pH 8, with 8 M urea.

After cleavage of chymosin with cyanogen bromide the fragments were purified by gel filtration on Sephadex G-100 in 25% acetic acid. The best results were obtained if cleavage was performed on enzyme with intact disulfide bridges. By such treatment two of the large fragments, CB(211-302) and CB(314-373), are held together and separated from the fragment next in size, CB(45-126). After reduction and aminoethylation the two former fragments were separated by repeated gel filtration.

Each of the large fragments was digested either with chymotrypsin, elastase, thermolysin, or staphylococcal protease, or, after deblocking of amino groups, with trypsin or Armillaria mellea protease and then with one of the other proteases. Low-molecular-weight peptides were mainly purified by high-voltage paper electrophoresis/paper chromatography.

Amino acid analyses were performed with BioCal 201 or Durrum D-500 amino acid analyzers. The sequences of the purified peptides were determined by Edman degradation (14) followed either by dansylation (15) or by identification of liberated phenylthiohydantoin derivatives (16, 17). The locations of amides were determined from electrophoretic mobilities of peptides (18) or by thin-layer chromatography of phenylthiohydantoin derivatives (16).

RESULTS

Fig. 1 shows in schematic form the relative location of fragments obtained after cyanogen bromide cleavage or tryptic digestion of chymosin with blocked amino groups. The final sequence of prochymosin is presented in Fig. 2.

Throughout the structure most of the amino acid residues have been identified in two or more independent peptides. Exceptions are some residues from no. 230 to no. 250. Due to the apolar character of this region the peptides have been difficult to isolate. Except for phenylalanine no. 163, all residues have been located in peptides with overlaps of at least two residues. However, overlaps around phenylalanine no. 158 and the amino acid composition of CB(127-169) make no other sequence possible.

If we add up the molecular weights of all residues in prochymosin and chymosin, molecular weights of 40,777 and 35,652 are found. These are higher than the molecular weights published previously (19); but if the original amino acid compositions (20) are recalculated on the basis of the molecular weights found by determination of the sequence, there is a satisfactory agreement (Table 1).

DISCUSSION

Crystalline chymosin is resolved into three different components by chromatography on columns of DEAE-cellulose (21).


Table 1. Amino acid compositions of prochymosin B and chymosin B*

               Prochymosin B†       Chymosin B‡
Amino acid     AAA       Seq.       AAA       Seq.
Asx            37.6      37         34.8      36
Thr            23.0      24         21.4      23
Ser            34.4      35         31.0      31
Glx            40.9      39         33.6      33
Pro            15.7      16         14.0      15
Gly            32.1      32         28.4      28
Ala            16.9      17         14.9      15
½Cys            5.7       6          5.7       6
Val            25.7      26         24.3      26
Met             7.6       8          7.7       8
Ile            21.4      22         17.6      19
Leu            28.8      29         21.5      23
Tyr            20.7      22         17.9      19
Phe            18.6      19         16.0      17
His             5.6       6          4.8       5
Lys            14.6      15          9.4       9
Arg             7.9       8          5.6       6
Trp             4.6       4          4.8       4
Amide          38.3      39         35.6      36

* Comparison between amino acid analyses (AAA) (20) and compositions determined from the sequences (Seq.).
† Molecular weight: 40,777.
‡ Molecular weight: 35,652.
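The sequence-derived columns of Table 1 can be checked against the chain lengths quoted in the text. The short Python sketch below is an illustration added for that purpose, not part of the original analysis; the variable and function names are hypothetical, and the counts are simply the Seq. columns of the table. The amide row is left out of the totals because it counts Asn plus Gln, which are already included in Asx and Glx.

```python
# Seq. columns of Table 1 (amide row omitted: it is a subset of Asx + Glx).
RESIDUES = ["Asx", "Thr", "Ser", "Glx", "Pro", "Gly", "Ala", "Half-Cys", "Val",
            "Met", "Ile", "Leu", "Tyr", "Phe", "His", "Lys", "Arg", "Trp"]
PROCHYMOSIN_SEQ = [37, 24, 35, 39, 16, 32, 17, 6, 26, 8, 22, 29, 22, 19, 6, 15, 8, 4]
CHYMOSIN_SEQ = [36, 23, 31, 33, 15, 28, 15, 6, 26, 8, 19, 23, 19, 17, 5, 9, 6, 4]


def total_residues(counts):
    """Total number of residues implied by a composition column."""
    return sum(counts)


def activation_segment_composition():
    """Residues lost on activation: zymogen count minus enzyme count, per class."""
    diff = {}
    for name, zymogen, enzyme in zip(RESIDUES, PROCHYMOSIN_SEQ, CHYMOSIN_SEQ):
        if zymogen != enzyme:
            diff[name] = zymogen - enzyme
    return diff


prochymosin_total = total_residues(PROCHYMOSIN_SEQ)
chymosin_total = total_residues(CHYMOSIN_SEQ)
released = activation_segment_composition()

print(prochymosin_total, chymosin_total, prochymosin_total - chymosin_total)
print(released)
```

The difference between the two totals should equal the number of residues released from the amino-terminal part of the chain on activation, and the per-class differences give the composition of that released segment.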

FIG. 1. Diagrammatic presentation of fragments of chymosin obtained by cyanogen bromide cleavage (CB) and by tryptic digestion of chymosin with modified amino groups (TM). The fragments are characterized by the number of first and last residues (for zymogen numbering, see Fig. 2). Residue no. 187 is methionine. It was recovered as free homoserine after cleavage with cyanogen bromide but is not marked as a separate fragment. NH2-termini of the fragments are shown. All CB fragments except CB(314-373) have COOH-terminal homoserine; all TM fragments except TM(363-373) have COOH-terminal arginine.
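The fragment boundaries in Fig. 1 follow from where the cleavable residues sit in the chain: cyanogen bromide cleaves after methionine, and trypsin, once the lysine amino groups are blocked, cleaves after arginine. The Python sketch below illustrates that bookkeeping under idealized, complete cleavage. It is not the authors' procedure; the function name is hypothetical, and the methionine and arginine positions are those read from the sequence in Fig. 2 (zymogen numbering, with active chymosin spanning positions 45 to 373).

```python
def fragments(first, last, cleavage_sites):
    """Split positions first..last after every cleavage site.

    Returns (start, end) pairs in zymogen numbering, assuming complete cleavage
    on the COOH-terminal side of each listed position.
    """
    pieces = []
    start = first
    for site in sorted(cleavage_sites):
        if first <= site < last:
            pieces.append((start, site))
            start = site + 1
    pieces.append((start, last))
    return pieces


# Positions within active chymosin, as read from Fig. 2 (assumed transcription).
MET_POSITIONS = [126, 169, 186, 187, 201, 210, 302, 313]
ARG_POSITIONS = [101, 105, 189, 203, 354, 362]

cb = fragments(45, 373, MET_POSITIONS)   # cyanogen bromide fragments
tm = fragments(45, 373, ARG_POSITIONS)   # tryptic fragments, lysines blocked

print("CB:", cb)
print("TM:", tm)
```

Under these assumptions the predicted list contains the fragments named in the text, CB(45-126), CB(127-169), CB(211-302), CB(314-373), and TM(363-373), and the one-residue piece (187, 187) corresponds to the methionine recovered as free homoserine. Real digests can deviate from this idealized picture through incomplete or unspecific cleavage.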

Of these, the component designated chymosin C is partly autolyzed enzyme, while separate zymogens exist for chymosins A and B. Prochymosin B is the more abundant of the two, and all residues have been identified in this protein. The fragments obtained by tryptic digestion and by cyanogen bromide cleavage of chymosin A separate on gel filtration in a manner that is indistinguishable from fragments of chymosin B. Furthermore, the individual fragments from chymosins A and B have very similar amino acid compositions. A total of 119 residues have been identified in chymosin A, and among these we have found only one that is different from the corresponding residue in chymosin B: at position no. 290 chymosin A has aspartic acid, while chymosin B has glycine. There may be more differences, but we can conclude that the two structures are very similar.

However, more important features of an investigation like this appear when the primary structure is compared with primary structures of different but related enzymes. In Fig. 2 we indicate residues that are common to bovine prochymosin B, porcine pepsinogen A, and penicillopepsin. Out of 373 positions for amino acid residues in the two gastric zymogens, 204 are occupied by common residues. In chymosin B, 24 out of 28 glycines, 12 out of 15 prolines, and all three disulfide bridges are located in the same positions as in pepsin A. The similarity goes even further than demonstrated in Fig. 2. Of the remaining residues a large part represents conservative substitutions in which the polarity of the side chains is maintained, and two gaps in the peptide chain of prochymosin (41-42 and 338-340) correspond to proline sequences in pepsinogen. All this implies that the tertiary structures of the two proteins are very similar, and that the gaps occur in bendings of the peptide chain which in prochymosin are shorter by a few residues.

As indicated in Fig. 2, the comparison moreover shows that penicillopepsin has 66 residues identical to those in the two gastric proteases. These identities are not scattered at random along the peptide chain. They are mainly clustered in six short sections with extensive homology: Phe77 to Ser88; Ser118 to Ser125; Asp164 to Ala170; Gly214 to Leu225; Ile259 to Thr264; and Gly349 to Asp361. All together these six sections account for about 20% of the peptide chain, but they include two-thirds of the identical residues.

If we compile the information from determination of sequence, activation of zymogens, and inactivation of the enzymes, we can begin to discern outlines of a relationship between structure and function of the aspartate proteases and their zymogens. A zymogen has not been found for penicillopepsin (22).
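The alignment figures quoted above reduce to simple arithmetic, sketched here in Python as an illustration. The names are hypothetical and only numbers stated in the text are used: 373 alignment positions, 365 residues in prochymosin, 42 residues released on activation, 204 and 66 identities, and the six section boundaries. Which chain length the phrase "about 20%" refers to is not stated, so the sketch reports the coverage against two candidate denominators rather than asserting one.

```python
ALIGNMENT_POSITIONS = 373      # positions in the two gastric zymogens, gaps counted
PROCHYMOSIN_RESIDUES = 365
ACTIVATION_SEGMENT = 42
COMMON_TO_ZYMOGENS = 204       # prochymosin B vs. porcine pepsinogen A
COMMON_TO_ALL_THREE = 66       # including penicillopepsin

# The six sections of extensive homology (first and last position, inclusive).
SECTIONS = [(77, 88), (118, 125), (164, 170), (214, 225), (259, 264), (349, 361)]

gap_positions = ALIGNMENT_POSITIONS - PROCHYMOSIN_RESIDUES
chymosin_residues = PROCHYMOSIN_RESIDUES - ACTIVATION_SEGMENT
section_lengths = [last - first + 1 for first, last in SECTIONS]
section_total = sum(section_lengths)

zymogen_identity = COMMON_TO_ZYMOGENS / ALIGNMENT_POSITIONS
coverage_of_alignment = section_total / ALIGNMENT_POSITIONS
coverage_of_chymosin = section_total / chymosin_residues

print(f"gap positions in prochymosin numbering: {gap_positions}")
print(f"identity between the two zymogens: {zymogen_identity:.1%}")
print(f"positions in the six sections: {section_total}")
print(f"coverage: {coverage_of_alignment:.1%} of the alignment, "
      f"{coverage_of_chymosin:.1%} of active chymosin")
print(f"two-thirds of the shared identities: {2 * COMMON_TO_ALL_THREE // 3}")
```

The six sections cover a small share of the chain under either denominator while carrying roughly two-thirds of the 66 shared identities, which is the clustering the text describes.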
  1  Ala-Glu-Ile-Thr-Arg-Ile-Pro-Leu-Tyr-Lys-Gly-Lys-Ser-Leu-Arg-Lys-Ala-Leu-Lys-Glu
 21  His-Gly-Leu-Leu-Glu-Asp-Phe-Leu-Gln-Lys-Gln-Gln-Tyr-Gly-Ile-Ser-Ser-Lys-Tyr-Ser
 41  gap-gap-Gly-Phe-Gly-Glu-Val-Ala-Ser-Val-Pro-Leu-Thr-Asn-Tyr-Leu-Asp-Ser-Gln-Tyr
 61  Phe-Gly-Lys-Ile-Tyr-Leu-Gly-Thr-Pro-Pro-Gln-Glu-Phe-Thr-Val-Leu-Phe-Asp-Thr-Gly
 81  Ser-Ser-Asp-Phe-Trp-Val-Pro-Ser-Ile-Tyr-Cys-Lys-Ser-Asn-Ala-Cys-Lys-Asn-His-Gln
101  Arg-Phe-Asp-Pro-Arg-Lys-Ser-Ser-Thr-Phe-Gln-Asn-Leu-Gly-Lys-Pro-Leu-Ser-Ile-His
121  Tyr-Gly-Thr-Gly-Ser-Met-Gln-Gly-Ile-Leu-Gly-Tyr-Asp-Thr-Val-Thr-Val-Ser-Asn-Ile
141  Val-Asp-Ile-Gln-Gln-Thr-Val-Gly-Leu-Ser-Thr-Gln-Glu-Pro-Gly-Asp-Val-Phe-Thr-Thr
161  Ala-Glu-Phe-Asp-Gly-Ile-Leu-Gly-Met-Ala-Tyr-Pro-Ser-Leu-Ala-Ser-Glu-Tyr-Ser-Ile
181  Pro-Val-Phe-Asp-Asn-Met-Met-Asn-Arg-His-Leu-Val-Ala-Gln-Asp-Leu-Phe-Ser-Val-Tyr
201  Met-Asp-Arg-Asp-Gly-Gln-Glu-gap-Ser-Met-Leu-Thr-Leu-Gly-Ala-Ile-Asp-Pro-Ser-Tyr
221  Tyr-Thr-Gly-Ser-Leu-His-Trp-Val-Pro-Val-Thr-Val-Gln-Gln-Tyr-Trp-Gln-Phe-Tyr-Val
241  Asp-Ser-Val-Thr-Ile-Ser-Gly-Val-Val-Val-Ala-Cys-Glu-Gly-Gly-Cys-Gln-Ala-Ile-Leu
261  Asp-Thr-Gly-Thr-Ser-Lys-Leu-Val-Gly-Pro-Ser-Ser-Asp-Ile-Leu-gap-Asn-Ile-Gln-Gln
281  Ala-Ile-Gly-Ala-Thr-Gln-Asn-Gln-Tyr-Gly-Glu-Phe-Asp-Ile-Asp-Cys-Asp-Asn-Leu-Ser
301  Tyr-Met-Pro-Thr-Val-Val-Phe-Glu-Ile-Asn-Gly-Lys-Met-Tyr-Pro-Leu-Thr-Pro-Ser-Ala
321  Tyr-Thr-Ser-Gln-Asp-Gln-Gly-Phe-Cys-Thr-Ser-Gly-Phe-Gln-Ser-Glu
337  Asn, His, Ser, and four gap positions (338-340 and one other); the exact placement within positions 337-343 is not recoverable from this copy
344  Gln-Lys-Trp-Ile-Leu-Gly-Asp-Val-Phe-Ile-Arg-Glu-Tyr-Tyr-Ser-Val-Phe
361  Asp-Arg-Ala-Asn-Asn-Leu-Val-Gly-Leu-Ala-Lys-Ala-Ile

Position 290 is Gly in chymosin B and Asp in chymosin A. Gap positions 208 and 276 are inferred from the position numbering. The italic and boldface marking of conserved residues is lost in this text copy.

FIG. 2. The amino acid sequence of prochymosin. To facilitate the comparison with porcine pepsinogen A, gaps are marked and counted where such occur relative to porcine pepsinogen A. The NH2-terminus of active chymosin is Gly45. Disulfide bridges connect Cys91 and Cys96, Cys252 and Cys256, and Cys296 and Cys329. Residues in italics are common to prochymosin and porcine pepsinogen. Residues in boldface are common to the gastric enzymes and to penicillopepsin. (Gaps or insertions in the latter alignment are not marked.)

With regard to the zymogens for the gastric proteases, we shall briefly draw attention to some points raised in other papers (7, 21, 23, 24). In the amino-terminal part of the peptide chain (the so-called activation segment or propart) a pattern of basic

amino acid residues with spacers of apolar residues is common for all zymogens of this group analyzed so far. This part of the structure is essential to anchor the zymogen molecule in inactive conformation at neutral pH. The conformation is mainly stabilized by electrostatic interactions between positive charges of basic amino acids in the propart of the peptide chain and negative charges in the enzyme part of the molecule. When the pH is lowered, the electrostatic interactions are weakened, the zymogen molecules rearrange into an active conformation (21, 25, 26), and irreversible activation takes place by limited proteolysis.

In the alignment of peptide chains the NH2-termini of the active enzymes start at different positions: penicillopepsin at residue no. 43, chymosin at no. 45, and porcine pepsin at position no. 47. There is little homology in this region of the primary structure. This indicates that the NH2-terminus of the active enzyme does not participate directly in formation of the active center.

The enzymes are inhibited when aspartate no. 78 is esterified by reaction with substrate-like epoxides (27). Inhibition also occurs if aspartate no. 261 is esterified by reaction with active-site-directed diazo compounds (28). For this reaction diazoacetyl-norleucine-methylester has been used in many experiments (29). In all three enzymes these aspartate residues are located in sections of highly homologous sequences. This is consistent with current concepts of protein evolution, that residues that carry a special function are conserved in homologous surroundings throughout evolution. If we extend this reasoning to the four other sections of extensive homology, we may expect that they, in some way, are essential for the tertiary structure and thereby for the biological activity. X-ray crystallography on the aspartate proteases is in progress (30-32) and will undoubtedly shed light on these problems.

In addition to the three enzymes considered in this paper, we have fragmentary information about sequences of human (33), bovine (34), horse (35), seal (24), and dogfish (24) pepsinogens or pepsins; all results show a high degree of homology. Furthermore, proteases like cathepsin D (36), renin (37), and several microbial proteases (22) are inhibited by diazoacetyl-norleucine-methylester, and where the sequences of peptides containing the reactive aspartate have been determined, the results indicate homology. It is premature to discuss the evolutionary relationship in detail, but we may conclude that all these enzymes are derived from a common ancestor, and we may add the aspartate proteases to the list of protein superfamilies.

Armillaria mellea protease was obtained as a gift from Dr. D. Smyth; staphylococcal protease was a gift from Dr. G. Drapeau. Both are gratefully acknowledged. The investigations have been supported by the Carlsberg Foundation and by the Danish Natural Science Research Council.

The costs of publication of this article were defrayed in part by the payment of page charges from funds made available to support the research which is the subject of the article. This article must therefore be hereby marked "advertisement" in accordance with 18 U. S. C. §1734 solely to indicate this fact.

1. Knowles, J. R. (1970) Phil. Trans. Roy. Soc. (London) B 257, 135-146.

2. Fruton, J. S. (1976) Adv. Enzymol. 44, 1-44.
3. Kovaleva, G. G., Shimanskaya, M. P. & Stepanov, V. M. (1972) Biochem. Biophys. Res. Commun. 49, 1075-1081.
4. Foltmann, B. & Hartley, B. S. (1967) Biochem. J. 104, 1064-1074.
5. Pedersen, V. B. & Foltmann, B. (1975) Eur. J. Biochem. 55, 95-103.
6. Pedersen, V. B. & Foltmann, B. (1973) FEBS Lett. 35, 250-254.
7. Ong, E. B. & Perlmann, G. E. (1968) J. Biol. Chem. 243, 6104-6109.
8. Pedersen, V. B. & Foltmann, B. (1973) FEBS Lett. 35, 255-256.
9. Sepulveda, P., Marciniszyn, J., Liu, D. & Tang, J. (1975) J. Biol. Chem. 250, 5082-5088.
10. Morávek, L. & Kostka, V. (1974) FEBS Lett. 43, 207-211.
11. Cunningham, A., Wang, H-M., Jones, S. R., Kurosky, A., Rao, L., Harris, C. I., Rhee, S. H. & Hofmann, T. (1976) Can. J. Biochem. 54, 902-914.
12. Foltmann, B. (1970) in Methods in Enzymology, eds. Perlmann, G. E. & Lorand, L. (Academic Press, New York), Vol. 19, pp. 421-436.
13. Hapner, K. D. & Wilcox, P. E. (1970) Biochemistry 9, 4470-4480.
14. Gray, W. R. (1967) in Methods in Enzymology, ed. Hirs, C. H. W. (Academic Press, New York), Vol. 11, pp. 469-475.
15. Hartley, B. S. (1970) Biochem. J. 119, 805-822.
16. Kulbe, K. D. (1974) Anal. Biochem. 59, 564-573.
17. Smithies, O., Gibson, D., Fanning, E. M., Goodfliesh, R. M., Gilman, J. G. & Ballantyne, D. L. (1971) Biochemistry 10, 4912-4921.
18. Offord, R. E. (1966) Nature 211, 591-593.
19. Djurtoft, R., Foltmann, B. & Johansen, A. (1964) C. R. Trav. Lab. Carlsberg 34, 287-298.
20. Foltmann, B. (1964) C. R. Trav. Lab. Carlsberg 34, 275-286.
21. Foltmann, B. (1966) C. R. Trav. Lab. Carlsberg 35, 143-231.
22. Hofmann, Th. (1974) Adv. Chem. Ser. 136, 146-185.
23. Harboe, M., Andersen, P. M., Foltmann, B., Kay, J. & Kassell, B. (1974) J. Biol. Chem. 249, 4487-4494.
24. Klemm, P., Poulsen, F., Harboe, M. K. & Foltmann, B. (1976) Acta Chem. Scand. Ser. B 30, 979-984.
25. McPhie, P. (1972) J. Biol. Chem. 247, 4277-4281.
26. Al-Janabi, J., Hartsuck, J. A. & Tang, J. (1972) J. Biol. Chem. 247, 4628-4632.
27. Chen, K. C. S. & Tang, J. (1972) J. Biol. Chem. 247, 2566-2574.
28. Knowles, J. R. & Wybrandt, G. B. (1968) FEBS Lett. 1, 211-212.
29. Rajagopalan, T. G., Stein, W. H. & Moore, S. (1966) J. Biol. Chem. 241, 4295-4297.
30. Subramanian, E., Swan, I. D. A. & Davies, D. R. (1976) Biochem. Biophys. Res. Commun. 68, 875-880.
31. Jenkins, J. A., Blundell, T. L., Tickle, I. J. & Ungaretti, L. (1975) J. Mol. Biol. 99, 583-590.
32. Andreeva, N. S., Fedorov, A. A., Gushchina, A. E., Shutskever, N. E., Riskulov, R. R. & Vol'nova, T. V. (1976) Dokl. Akad. Nauk SSSR 228, 480-483.
33. Sepulveda, P., Jackson, K. W. & Tang, J. (1975) Biochem. Biophys. Res. Commun. 63, 1106-1112.
34. Harboe, M. K. & Foltmann, B. (1975) FEBS Lett. 60, 133-136.
35. Stepanov, V. M., Lavrenova, G. I., Rudenshaya, G., Gouschar, M. V., Lobareva, L. C., Kotolova, E. K., Strongin, A. Ya., Baratova, L. A. & Belyanova, L. P. (1976) Biokhimiya 41, 1285-1290.
36. Keilova, H. (1970) FEBS Lett. 6, 312-314.
37. Inagami, T., Misono, K. & Michelakis, A. M. (1974) Biochem. Biophys. Res. Commun. 56, 503-509.
