Protein Engineering vol.5 no.7 pp.679-691, 1992

Homology modeling of a heme protein, lignin peroxidase, from the crystal structure of cytochrome c peroxidase

Ping Du, Jack R.Collins and Gilda H.Loew

Molecular Research Institute, 845 Page Mill Road, Palo Alto, CA 94304, USA

A 3-dimensional model of lignin peroxidase (LiP) was constructed based on its sequence homology with other peroxidases, particularly cytochrome c peroxidase, the only protein with a known crystal structure in the peroxidase family. The construction of initial conformations of insertions and deletions was assisted by secondary structure predictions, amphipathic helix predictions, and consideration of the specific protein environment. A succession of molecular dynamics simulations of these regions with surrounding residues as constraints was carried out to relax the bond lengths and angles. Full protein molecular dynamics simulations with explicit consideration of bound waters were performed to relax the geometry and to identify dynamically flexible regions of the successive models for further refinement. Among the important functionally relevant structural features predicted are: (i) four disulfide bonds are predicted to be formed between Cys3 and Cys15, Cys14 and Cys285, Cys34 and Cys120, and Cys249 and Cys317; (ii) a glycosylation site, Asn257, was located on the surface; (iii) Glu40 was predicted to form a salt bridge with Arg43 on the distal side of the heme and was considered as a possible origin for the pH dependence of compound I formation; and (iv) two candidate substrate binding sites with a cluster of surface aromatic residues and flexible backbones were found in the refined model, consistent with the nature of known substrates of LiP. Based on these predicted structural features of the model, further theoretical and experimental studies are proposed to continue to elucidate the structure and function of LiP.

Key words: cytochrome c peroxidase/homology modeling/horseradish peroxidase/lignin peroxidase/manganese peroxidase

Introduction

Peroxidases are heme proteins that catalyze the oxidation of substrates by hydrogen peroxide and substituted peroxides (Bosshard et al., 1991; Dunford, 1991). The iron atom in the heme unit of the resting state of peroxidases is in a ferric oxidation state, Fe(III). Binding of H2O2 to the resting state of the enzyme leads to the formation of compound I, a porphyrin'+Fe50% identity) leads to a successful prediction of structures (Maggiora et al., 1991). Even when sequence homology is low (20-30%), structural homology often persists for proteins in the same family (Sutcliffe et al., 1987). However, additional information must be used in the prediction of a structure when there is low homology. Since CCP is the only protein with a known crystal structure in the peroxidase family, we have used multiple sequence alignments to help identify the reliably aligned regions of LiP with CCP. We have also used inferred structural information available from other peroxidases to assist in the construction of the variable regions. There are two current approaches for the prediction of variable regions: those that are 'knowledge-based' (Greer, 1981; Blundell et al., 1987) and those that are 'energy-based' (Moult and James, 1986; Bruccoleri and Karplus, 1987; Shenkin et al., 1987; Dudek and Scheraga, 1990). Knowledge-based methods search for the most probable conformations of these regions by using a library of existing structures of proteins of the same family. Energy-based methods use geometric criteria to sample all possible conformations of the region and select the lowest energy minima by energy minimizations and molecular dynamics (MD) simulations of these sampled conformations. In this work, the initial conformations of the variable regions were constructed by taking account of both empirical and energetic considerations.
Empirical preference of conformation for a sequence segment was taken into account by secondary structure predictions, hydrophilic-hydrophobic (hydropathicity) analysis, and consideration of the particular protein environment in which these regions occur. Constrained energy minimization and MD simulations with the surrounding residues as steric constraints were carried out for these variable regions to relax bond lengths and angles and to eliminate unfavorable steric contacts in these regions. To assess the dynamic stability of the initial model, additional steps involving full protein MD simulations were taken after the construction of the initial model from sequence alignment and the four predicted disulfide bonds. Levitt (1983) has shown that the predicted structure with the lowest energy is the closest to the native crystal structure of a protein. MD simulations of short length on a crystal structure result in only a slightly deformed structure (van Gunsteren et al., 1983). By analogy, a reasonable model structure should be dynamically stable. In this work, full protein energy minimizations and MD simulations were performed during the model building for two purposes: (i) to identify dynamically unstable regions for further refinement; and (ii) as one criterion for judging the reliability of the model structure. Solvation effects were taken into account by explicitly simulating bound water molecules for the full protein.

In addition to dynamic stability, another criterion used to evaluate the plausibility of the model structure is consistency with known experimental information for LiP. Although little is currently known about its 3D structure, experiments have revealed a qualitative picture of many key features of this protein that can be used to evaluate the predicted structure. A summary of the known structural, mechanistic and functional features of LiP that have been used to evaluate the model is given below.

(i) The catalytic heme environment of LiP should be similar to that of CCP to allow the formation of compound I, which was detected in UV spectra (Andrawis et al., 1988; Marquez et al., 1988). Specifically: (a) on the distal side, the heme pocket should include LiP43R (de Ropp et al., 1991) and the heme pocket should be large enough to accommodate an H2O2 molecule and have easy access to the solvent; and (b) on the proximal side, the heme unit should be directly ligated to the Nε atom of LiP176H, as suggested by NMR (de Ropp et al., 1991) and resonance Raman (Kuila et al., 1985) studies. The Nδ atom of LiP176H may also be hydrogen bonded to a nearby amino acid.

(ii) The pH dependence of LiP activity near pH 1, extrapolated from Cl⁻ inhibition in the pH range 2.5-7.0 (Cai and Tien, 1991), should be explained by an anionic amino acid forming a salt bridge with a positively charged group such as the side chain of LiP43R.

(iii) LiP is N-glycosylated (Tien and Kirk, 1984; Paszczynski et al., 1986; Kuan and Tien, 1989) and an asparagine with the correct sequence pattern should be located on the surface.

(iv) Model aromatic substrate studies (Kersten et al., 1985; Hammel et al., 1986; Kersten et al., 1990; Sarkanen, 1991) show that LiP displays low selectivity in substrate binding. NMR (Depillis et al., 1990) and 18O labeling (Kersten et al., 1990; Renganathan et al., 1986) studies also suggest that the substrate bound to LiP does not interact with the heme directly. Thus a surface substrate binding site should be found consisting of residues that show a large preference for aromatic substrates, that adopt a flexible conformation, and that are distant from the heme unit.

In addition to agreement with known properties, new experiments are proposed to test the specific structural predictions of this model such as substrate binding sites, catalytic residues near the heme, and a surface glycosylation site. Theoretical studies of the binding of known substrates such as a series of PAHs and methoxybenzenes to the model structure are also planned. The resulting properties of the substrate-enzyme complex can be compared with known activities of these substrates, thereby further assessing the ability of the model to elucidate function. Finally, once the crystal structure becomes available, the predicted structure of the model can be directly evaluated. Although the model will most likely not agree with the crystal structure in every detail, the goal of this work will have been achieved if the model can be used to provide possible explanations for existing experimental data and to suggest new experiments for studies of substrate-enzyme interactions.
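The use of full protein MD simulations to identify dynamically unstable regions can be illustrated with a per-residue root-mean-square fluctuation calculation. The following is a minimal sketch, not the procedure used in this work: it assumes that the trajectory snapshots have already been superimposed, that one representative atom (for example Cα) is used per residue, and that a cutoff of 1.0 Å marks a flexible residue. The function names `rms_fluctuations` and `flexible_residues` and the cutoff are hypothetical.

```python
import math


def rms_fluctuations(frames):
    """Per-residue RMS fluctuation about the mean position.

    frames: list of snapshots; each snapshot is a list of (x, y, z)
    tuples, one per residue, in the same order in every snapshot.
    Assumes the snapshots are already superimposed on a common frame.
    """
    if not frames:
        raise ValueError("at least one snapshot is required")
    n_frames = len(frames)
    n_res = len(frames[0])
    result = []
    for i in range(n_res):
        # Mean position of residue i over the trajectory.
        mean = [sum(f[i][k] for f in frames) / n_frames for k in range(3)]
        # Mean squared deviation from that mean position.
        msd = sum(
            sum((f[i][k] - mean[k]) ** 2 for k in range(3)) for f in frames
        ) / n_frames
        result.append(math.sqrt(msd))
    return result


def flexible_residues(frames, threshold=1.0):
    """Indices of residues whose fluctuation exceeds the (assumed) cutoff in Å."""
    return [i for i, r in enumerate(rms_fluctuations(frames)) if r > threshold]
```

Residues returned by `flexible_residues` would be the candidates for further refinement; in practice the cutoff would have to be chosen relative to the fluctuations seen in the reliably aligned core.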


Sequence alignment and feasibility of modeling

To assess the feasibility of homology modeling of LiP using the CCP crystal structure, their sequences have been aligned, the results analyzed, and additional structural information considered. There are several cDNA sequences of LiP available in the literature (Tien and Tu, 1987; de Boer et al., 1987; Andrawis et al., 1989; Black and Reddy, 1991; Ritch et al., 1991). In the alignment, the sequence from a cDNA clone encoding LiP (cLiP2) from the white-rot fungus Phanerochaete chrysosporium (Ritch et al., 1991) was used. The LiP sequence contains 343 amino acids compared with 294 of CCP. Sequence alignments were performed with the Needleman and Wunsch algorithm (1970) using the program ALIGN (Orcutt et al., 1982). The Substitution Data Matrix (Summers, 1990; Summers and Karplus, 1992) resulted in conservation of the known catalytic residues, including CCP48R and CCP52H, and generally good alignment in the middle regions of the sequence alignment between LiP and CCP. This alignment is similar to that of a recent study on the plant peroxidase sequences (Welinder, 1991). Although the alignment using the Mutation Data Matrix (Dayhoff, 1978) gave a similar alignment score and percent identity, no LiP residues were found to correspond to the distal catalytic residues of CCP in that alignment. Therefore, the Substitution Data Matrix was used for all alignments. To examine the reliability of the alignment procedures, the sequences were aligned in both forward and reverse directions. The same result was obtained for both directions. An alignment score of 3.4 SD, with amino acid identity of 21.4%, was achieved between LiP and CCP, as shown in Table I and Figure 1. This is higher than the 3.0 SD that is needed for the alignment to be significant (Argos and McCaldon, 1988). The similarity between the sequences, based on conservative replacements, is 41.0%. Conservative replacements are defined as mutations within the following groups: (i) [A,V,L,I,M], (ii) [F,Y,W], (iii) [S,T], (iv) [K,R], and (v) [D,E,N,Q]. Replacement of residues H, P, G and C was not considered similar to any amino acid in the calculation of similarity. Additionally, after truncating

Table I. Alignment statistics between CCP, LiP and MnP

                                  LiP      LiP (1-300)c    MnP
Score
  CCP                             3.43     5.84            8.27
  LiP                                                      36.07
Identity (%)a
  CCP                             21.4     19.4            23.8
  LiP
Conservative replacementb
  CCP                             41.0     40.0
  LiP                                                      45.2

a Identity % are calculated from the number of identical residues in the alignment over the total number of residues of the shorter sequence.
b Conservative replacements are defined as mutations within the following groups: (i) [A,V,L,I,M], (ii) [F,Y,W], (iii) [S,T], (iv) [K,R], and (v) [D,E,N,Q]. Replacement of residues H, P, G and C was not considered similar to any amino acid in the calculation of similarity.
c The first 300 residues of LiP.
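The identity and similarity statistics defined above can be illustrated with a short calculation on a pair of already aligned sequences. This is a minimal sketch and not the ALIGN program: the function name `alignment_statistics` and the gap character are hypothetical, and it is an assumption that the similarity percentage counts identical residues together with conservative replacements.

```python
# Conservative replacement groups as defined in the text. H, P, G and C
# belong to no group, so they are similar only to themselves (identity).
CONSERVATIVE_GROUPS = ["AVLIM", "FYW", "ST", "KR", "DENQ"]


def alignment_statistics(aligned_a, aligned_b, gap="-"):
    """Return (identity %, similarity %) for two aligned sequences.

    Both percentages are taken over the length of the shorter ungapped
    sequence. Similarity counts identities plus conservative replacements
    (an assumption about how the reported similarity was computed).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    group_of = {
        res: index
        for index, group in enumerate(CONSERVATIVE_GROUPS)
        for res in group
    }
    identical = 0
    conservative = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # positions aligned to a gap contribute nothing
        if a == b:
            identical += 1
        elif a in group_of and group_of.get(b) == group_of[a]:
            conservative += 1
    shorter = min(
        len(aligned_a.replace(gap, "")), len(aligned_b.replace(gap, ""))
    )
    if shorter == 0:
        raise ValueError("sequences contain no residues")
    identity = 100.0 * identical / shorter
    similarity = 100.0 * (identical + conservative) / shorter
    return identity, similarity
```

For example, the toy alignment `AVK-C` against `ALRGC` has two identities (A and C) and two conservative replacements (V/L and K/R) over a shorter sequence of four residues.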


13 Å from the Fe atom. It is connected to a network of water molecules, as shown in Figure 5. Instead, LiP40E can be considered as an alternative. The pKa of either an aspartate or a glutamate can be substantially decreased by the formation of an ion pair with LiP43R. Based on our model structure, LiP40E does form a salt bridge with LiP43R, a residue thought to be important for compound I formation, and hence is a more likely candidate for the observed pH dependence. This salt bridge results in an altered environment of LiP43R and may be responsible for the disappearance of the NMR signal of this residue (de Ropp et al., 1991). Another candidate for the observed pH dependence is one of the propionate groups of the porphyrin. Single site-directed mutagenesis experiments transforming LiP40E and LiP48D, respectively, to a non-polar residue would test each hypothesis for the origin of this pH dependence.

Substrate binding sites

LiP binds not only lignin, but also PAHs (Hammel et al., 1986) and a broad range of methoxybenzene substrates (Kersten et al., 1985, 1990; Sarkanen, 1991). Aromatic radical cations have been detected with ESR during enzymatic reactions with the model substrates. These radical cation intermediates subsequently react with solvent water non-enzymatically to form quinone compounds as final products, similar to radical cation reactions observed in solution. It is also found that the activity of LiP toward the oxidation correlates with the substrate's ionization potential (IP), that is, compounds with an IP lower than a certain threshold are oxidized and those above it are not. Experiments carried out using H2 18O and 18O2 labeling found that the oxygen atom in the final quinone products formed after the LiP catalysis comes from solvent water, not from the ferryl oxygen of compound I.
Mechanistic studies of LiP reactions with an irreversible inhibitor, phenylhydrazine (Depillis et al., 1990), show that the heme unit is not accessible to the inhibitor, since no covalent adduct is formed at the meso carbon of the porphyrin, a product observed in the phenylhydrazine inhibition of HRP, in which a direct heme-substrate interaction is certain (Ortiz de Montellano, 1987). It has been proposed that the inhibition of LiP activity by phenylhydrazine was achieved through a reaction between the phenyl radical intermediate and an amino acid of the protein (Depillis et al., 1990).

These experimental studies taken together strongly indicate three important characteristics of the LiP catalytic mechanism. (i) The substrate binding site is aromatic (or non-polar) and flexible in order for it to recognize and accommodate aromatic substrates of various sizes. (ii) The oxidation of substrates by LiP proceeds by two sequential one-electron oxidations through a long-range electron transfer mechanism and is controlled mainly by the redox potential of the heme active center. There is no direct chemical interaction observed between the heme and the substrate. (iii) A substrate is bound to the surface of the enzyme to allow rapid reaction between the radical cation intermediate and solvent water.

In our 3D model of LiP, two regions with these characteristics have been found and are proposed as candidate substrate binding sites. These sites emerged during the analysis of the refined model and were not explicitly part of the model building procedure. The first candidate, found in the refined model only, is rich in aromatic residues partially exposed to the surface, and is quite flexible, allowing it to accommodate substrates of different sizes. This site is located to the side of and slightly below the heme unit. It consists of LiP171W, LiP301F, LiP303F and LiP304F (Figure 7). The three phenylalanines are located at the turn of a flexible loop and are largely exposed to the surface. LiP171W is more buried than the phenylalanines and is close to the proximal histidine LiP176H both in distance (14 Å) and in sequence.

Fig. 7. CPK model of the first candidate binding site suggested for LiP, with its partially exposed aromatic residues colored yellow. The proposed glycosylation site LiP257N is also shown and colored magenta.

Fig. 8. CPK model of the second candidate binding site suggested for LiP, with its partially exposed aromatic residues colored yellow.
The second candidate appears in both the initial and the refined models and is also located on the surface on the distal side of the heme. CPK models show that it forms an aromatic cleft that could bind an aromatic substrate (Figure 8). The cluster of aromatic residues consists of LiP17W, LiP18F, LiP97F, LiP111F and LiP129F. However, this area is less flexible, and the distance from LiP17W to the Fe atom is ~23 Å, longer than for the first candidate substrate binding site.

Conclusions

Both the initial and the refined models reveal common characteristic features predicted for the three-dimensional structure of LiP, including a common predicted substrate binding site in the N-terminal region. The results from the refined model of LiP illustrate an alternative conformation proposed for the unaligned C-terminal region. All of the structural properties of the three-dimensional model structure of LiP, predicted from its sequence homology with CCP, are consistent with the known experimental data. Further experimental studies are suggested to test the model, as summarized below.

(i) The α helix content of LiP is predicted to be nearly 50%, which can be estimated by circular dichroism experiments.

(ii) A glycosylation site, LiP257N, was predicted to be on the surface at a helix-turn-helix motif. This site can be identified by biochemical experiments.

(iii) Four disulfide bond linkages, LiP3C-LiP15C, LiP14C-LiP285C, LiP34C-LiP120C and LiP249C-LiP317C, were predicted and can be identified with protein digesting enzymes and sequence analyses.

(iv) On the proximal side of the heme, the axial ligand of Fe, LiP176H, is hydrogen bonded to LiP238D. The aspartate residues hydrogen bonded to the proximal histidine appear to be universal in peroxidases and may affect peroxidase activity. Mutations of LiP238D would test this hypothesis for LiP.

(v) Near LiP238D, LiP193F replaces CCP191W, providing an explanation for why compound I of LiP is stable to intramolecular electron transfer, and leading to the prediction that compound ES would form in the LiP193W mutant of LiP. UV spectroscopy on compound I of the mutant LiP is expected to show a Soret band with unreduced intensity, characteristic of the compound ES of CCP.

(vi) On the distal side, LiP43R and LiP47H are located in the heme cavity, consistent with known NMR experimental data. The importance of these residues in compound I formation can be probed via site-directed mutagenesis.

(vii) LiP40E, which is predicted to be hydrogen bonded to LiP43R, is proposed to be responsible for the pH dependence at pH ~1 of the LiP activity. This prediction can be directly probed by mutation experiments on this residue.

In addition, the model can be used to suggest experimental studies to identify residues that are predicted to be part of two candidate substrate binding sites. The suggested experiments are summarized below.

(viii) Based on the refined model, the first candidate for the substrate binding site is proposed to be on the surface, ~14 Å from LiP176H. This flexible site is composed of four aromatic residues, LiP171W, LiP301F, LiP303F and LiP304F. If this is the preferred binding site, mutation of these residues to polar amino acids is expected to reduce the binding affinity of LiP for aromatic substrates such as PAHs and methoxybenzenes.

(ix) The second possible substrate binding site appears in both the initial and the refined models and is also predicted to be on the surface and 23 Å from the heme. It includes LiP17W, LiP18F, LiP97F, LiP111F and LiP129F. This site is less flexible than the first. Site-directed mutagenesis of residues in this putative binding site together with substrate binding studies is also suggested.
In a parallel effort, theoretical studies of enzyme-substrate interactions are also planned to determine which, if either, of the two candidate substrate binding sites yields results that account for the known experimental activities of PAH and methoxybenzene model substrates. The dynamic properties of the enzyme-substrate complexes will be characterized and the preferred electron transfer pathways from substrates bound to the two substrate binding sites to the heme will be studied. Additionally, the possible use of LiP for oxidizing other aromatic environmental pollutants will be explored. The final verification of the model will come from the X-ray structural determination. The goal of this study is that the predicted structure be reliable enough to be useful in characterizing enzyme-substrate interactions and in addressing mechanistic questions such as preferred electron transfer pathways, rather than complete agreement of the model with the crystal structure. If this reliability is confirmed, the methods and procedures used here for model building can then be applied to other proteins for which no X-ray structures are available.

Acknowledgements

The authors thank Neena L. Summers for discussions on protein sequence alignment and for kindly providing us with the program ALIGN and the scoring matrices prior to its publication. We also thank Tom Poulos for suggesting the homology modeling of LiP. Financial support from NSF (grant no. DMB-9096181) and computer time from the Pittsburgh Supercomputing Center are gratefully acknowledged.

References

Amit,A.G. and Mariuzza,R.A. (1986) Science, 233, 747-753.
Andrawis,A., Johnson,K.A. and Tien,M. (1988) J. Biol. Chem., 263, 1195-1198.
Andrawis,A., Pease,E.A., Kuan,I., Holzbaur,E. and Tien,M. (1989) Biochem. Biophys. Res. Commun., 162, 673-680.
Araiso,T. and Dunford,H.B. (1980) Biochem. Biophys. Res. Commun., 94, 1177-1182.
Argos,S. and McCaldon,P. (1988) Genet. Engng, 10, 21-65.
Black,A.K. and Reddy,C.A. (1991) Biochem. Biophys. Res. Commun., 179, 428-435.
Blundell,T.L., Sibanda,B.L., Sternberg,M.J.E. and Thornton,J.M. (1987) Nature, 326, 347-352.
de Boer,H.A., Zhang,Y.Z., Collins,C. and Reddy,C.A. (1987) Gene, 60, 93-102.
Bosshard,H.R., Anni,H. and Yonetani,T. (1991) In Everse,J., Everse,K.E. and Grisham,M.B. (eds), Peroxidases in Chemistry and Biology. CRC Press, Boca Raton, FL, pp. 51-138.
Bruccoleri,R.E. and Karplus,M. (1987) Biopolymers, 26, 137-168.
Brooks,B.R., Bruccoleri,R.E., Olafson,B.E.R., States,D.J., Swaminathan,S. and Karplus,M. (1984) J. Comput. Chem., 4, 187; QUANTA 2.1 Program, Polygen Corp., Waltham, MA.
Burns,P.S., Williams,R.J.P. and Wright,P.E. (1975) J. Chem. Soc. Chem. Commun., 795.
Cai,D. and Tien,M. (1991) J. Biol. Chem., 266, 14464-14469.
Chance,B. (1949) Arch. Biochem. Biophys., 21, 416.
Chiche,L., Gaboriaud,C., Heitz,A., Mornon,J.-P., Castro,B. and Kollman,P. (1989) Proteins, 6, 405-417.
Chothia,C., Lesk,A.M., Levitt,M., Amit,A.G., Mariuzza,R.A., Phillips,S.E.V. and Poljak,R.J. (1986) Science, 233, 755-758.
Chou,P.Y. and Fasman,G.D. (1978) Annu. Rev. Biochem., 47, 251-276.
Collins,J.R. and Loew,G.H. (1992) Int. J. Quantum Chem., in press.
Crawford,R.L. (1981) Lignin Biodegradation and Transformation. John Wiley and Sons, New York.
Dayhoff,M.O. (1978) In Dayhoff,M.O. (ed.), Atlas of Protein Sequence and Structure, Vol. 5, suppl. 3. National Biomedical Research Foundation, Washington, DC.
Depillis,G.D., Wariishi,H., Gold,M.H. and Ortiz de Montellano,P.R. (1990) Arch. Biochem. Biophys., 280, 217-223.
Dudek,M.J. and Scheraga,H.A. (1990) J. Comp. Chem., 11, 121-151.
Dunford,H.B. and Araiso,T. (1979) Biochem. Biophys. Res. Commun., 89, 764-768.
Dunford,H.B. (1991) In Everse,J., Everse,K.E. and Grisham,M.B. (eds), Peroxidases in Chemistry and Biology. CRC Press, Boca Raton, FL, pp. 1-24.
Erman,J.E., Vitello,L.B., Mauro,J.M. and Kraut,J. (1989) Biochemistry, 28, 7992.
Ferrin,T.E., Huang,C.C., Jarvis,L.E. and Langridge,R. (1988) J. Mol. Graphics, 6, 13-27.
Finzel,B.C., Poulos,T.L. and Kraut,J. (1984) J. Biol. Chem., 259, 13027-13036.
Glenn,J.K., Morgan,M.H., Mayfield,M.B., Kuwahara,M. and Gold,M.H. (1983) Biochem. Biophys. Res. Commun., 114, 1077-1083.
Gold,M., Wariishi,H. and Valli,K. (1989) In Whitaker,J.R. (ed.), Biocatalysis in Agricultural Biotechnology. ACS Symposium Series 389, pp. 127-140.
Greer,J. (1981) J. Mol. Biol., 153, 1027-1042.
van Gunsteren,W.F., Berendsen,H.J.C., Hermans,J., Hol,W.G.J. and Postma,J.P.M. (1983) Proc. Natl Acad. Sci. USA, 80, 4315-4319.
Hammel,K., Kalyanaraman,B. and Kirk,T.K. (1986) J. Biol. Chem., 261, 16948-16952.
Henrissat,B., Saloheimo,M., Lavaitte,S. and Knowles,J.K.C. (1990) Proteins, 8, 251-257.
Kersten,P.J., Tien,M., Kalyanaraman,B. and Kirk,T.K. (1985) J. Biol. Chem., 260, 2609-2612.
Kersten,P.J., Kalyanaraman,B., Hammel,K.E., Reinhammar,B. and Kirk,T.K. (1990) Biochem. J., 268, 475-480.
Kuan,I. and Tien,M. (1989) J. Biol. Chem., 264, 20350-20355.
Kuila,D., Tien,M., Fee,J.A. and Ondrias,M.R. (1985) Biochemistry, 24, 3394-3397.
Kuwahara,M., Glenn,J.K., Morgan,M.A. and Gold,M.H. (1984) FEBS Lett., 169, 247-250.
Levitt,M. (1983) J. Mol. Biol., 170, 723-764.
Loew,G.H., Collins,J.R. and Axe,F.U. (1989) Int. J. Quantum Chem. Quantum Biol. Symp., 16, 199-209.
Loo,S. and Erman,L.E. (1975) Biochemistry, 14, 3467-3470.

Maggiora,G.M., Mao,B., Chou,K.C. and Narasimhan,S.L. (1991) In Suelter,C.H. (ed.), Methods of Biochemical Analysis, Vol. 35: Protein Structure Determination. John Wiley & Sons, Inc., New York, p. 57.
Margalit,H., Spouge,J., Cornette,J., Cease,K.B., Delisi,C. and Berzofsky,J.A. (1987) J. Immunol., 138, 2213-2229.
Marquez,L., Wariishi,H., Dunford,H.B. and Gold,M.H. (1988) J. Biol. Chem., 263, 10549.
Millis,C.D., Cai,D., Stankovich,M.T. and Tien,M. (1989) Biochemistry, 28, 8484-8489.
Morishima,I. and Ogawa,S. (1979) J. Biol. Chem., 254, 2814-2820.
Moult,J. and James,M.N.G. (1986) Proteins, 1, 146-163.
Needleman,S.B. and Wunsch,C.D. (1970) J. Mol. Biol., 48, 443-453.
Newberger,A., Gottschalk,A., Marshal,R.D. and Spiro,R.D. (1972) In Gottschalk,A. (ed.), The Glycoproteins: Their Composition, Structure, and Function. Elsevier/North Holland Biomedical Press, Amsterdam, pp. 450-490.
Orcutt,B.C., Dayhoff,M.O. and Barker,W.C. (1982) ALIGN, National Biomedical Foundation, Georgetown University Medical Center, Washington, DC.
Ortiz de Montellano,P.R. (1987) Acc. Chem. Res., 20, 289-294.
Paszczynski,A., Huynh,V.-B. and Crawford,R. (1986) Arch. Biochem. Biophys., 244, 750-765.
Ponder,J.W. and Richards,F.M. (1987) J. Mol. Biol., 193, 775-791.
Poulos,T.L. and Finzel,B.C. (1984) In Hearn,M.T.W. (ed.), Peptide and Protein Review, Vol. 4. Marcel Dekker, Inc., New York, pp. 115-171.
Poulos,T.L., Freer,S.T., Alden,R.A., Edwards,S.L., Skogland,U., Takio,K., Eriksson,B., Xuong,N., Yonetani,T. and Kraut,J. (1980) J. Biol. Chem., 255, 575-580.
Pribnow,D., Mayfield,M.B., Valerie,J.N., Brown,J.A. and Gold,M.H. (1989) J. Biol. Chem., 264, 5036-5040.
Renganathan,V., Miki,K. and Gold,M.H. (1986) Arch. Biochem. Biophys., 246, 155-161.
Ritch,T.G., Nipper,V.J., Akileswaren,L., Pribnow,D. and Gold,M.H. (1991) Gene, 107, 119-126.
de Ropp,J.S., La Mar,G.N., Wariishi,H. and Gold,M.H. (1991) J. Biol. Chem., 266, 15001-15008.
Sarkanen,S., Razal,R.A., Piccariello,T., Yamamoto,E. and Lewis,N.G. (1991) J. Biol. Chem., 266, 3636-3643.
Satterlee,J.D. and Erman,J.E. (1991) Biochemistry, 30, 4398-4405.
Schejter,A., Lanir,A. and Epstein,N. (1976) Arch. Biochem. Biophys., 174, 36-44.
Shenkin,P.S., Yarmush,D.L., Fine,R.M., Wang,H. and Levinthal,C. (1987) Biopolymers, 26, 2053-2085.
Singh,U.C., Weiner,P.K., Caldwell,J.W. and Kollman,P.A. (1986) AMBER UCSF Version 3.0a, Department of Pharmaceutical Chemistry, University of California, San Francisco.
Sivaraja,M., Goodin,D.B., Smith,M. and Hoffman,B.M. (1989) Science, 245, 738.
Summers,N.L. (1990) Monsanto Co. internal report MSL-10930, December 1990.
Summers,N.L. and Karplus,M. (1992) J. Mol. Biol., in press.
Sutcliffe,M.J., Haneef,I., Carney,D. and Blundell,T.L. (1987) Protein Engng, 1, 377-384.
Thanabal,V., La Mar,G.N. and de Ropp,J.S. (1988) Biochemistry, 27, 5400-5407.
Tien,M. (1987) CRC Critical Rev. Microbiol., 15, 141-168.
Tien,M. and Kirk,T.K. (1983) Science, 221, 661-662.
Tien,M. and Kirk,T.K. (1984) Proc. Natl Acad. Sci. USA, 81, 2280-2284.
Tien,M. and Tu,C.-P. (1987) Nature, 326, 520-523.
Uno,T., Nishimura,Y., Tsuboi,M. and Makino,R. (1987) J. Biol. Chem., 262, 4549.
Wade,R.C., Mazor,M.H., McCammon,J.A. and Quiocho,F.A. (1990) J. Am. Chem. Soc., 112, 7057-7059.
Weiner,S.J., Kollman,P.A., Case,D.A., Singh,U.C., Ghio,C., Alagona,G., Profeta,S.,Jr and Weiner,P. (1984) J. Am. Chem. Soc., 106, 765.
Welinder,K.G. (1976) FEBS Lett., 72, 19-23.
Welinder,K.G. (1985) Eur. J. Biochem., 151, 497-503.
Welinder,K.G. (1991) In Lobarzewski,J., Greppin,H., Penel,C. and Gasper,T. (eds), Biochemical, Molecular and Physiological Aspects of Plant Peroxidases. University of Geneva, pp. 3-13.

Received on May 15, 1992; revised on July 31, 1992; accepted on August 4, 1992

Appendix A1. Construction of the core-framework

The sequence alignment provides a basis for the initial construction of the core-framework resulting from the assignment of coordinates of reliably aligned residues in LiP, using the crystal structure of CCP. The reliably aligned residues in LiP are defined as those that align with CCP in exactly the same way as MnP does, as shown in the boxes in Figure 1. In addition to these residues, two more segments, LiP22D-LiP32G and LiP76T-LiP83P, with LiP and CCP aligned, were also included in the core-framework. Although these segments were constructed directly from the CCP coordinates, they were allowed to deviate from the CCP structure during the construction of insertions and deletions. Backbone coordinates of the residues in the core-framework model were taken directly from the CCP crystal structure. Residues that are in α helices are labeled with an 'a' in Figure 1. For side chain substitutions of equal or shorter length, for example Ala for Ile and Leu for Ile, the dihedral angles from the side chain conformation in the crystal structure were used for the new residue. If the side chain in the model was longer than the corresponding residue in the crystal structure, for example Leu for Ala, the dihedral angles from the crystal structure were taken for the corresponding ones in the new structure and the rest of the side chain was set to an extended conformation. If close contacts with other atoms occurred, the side chain was changed to one of the gauche conformations. Extended and gauche conformations are the most common side chain conformations found in proteins (Ponder and Richards, 1987). The resulting initial model of the core-framework and its coordinates were kept mostly unchanged during the subsequent construction of insertions and deletions.
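The side chain substitution rule described above can be sketched as a small function. This is an illustration only, under stated assumptions: side chain conformations are represented as lists of χ angles in degrees, the extended conformation is taken as 180° and the gauche conformations as ±60°, and the names `initial_chi_angles` and `gauche_alternatives` are hypothetical. Detection of close contacts is not modeled here.

```python
EXTENDED = 180.0          # assumed value for an extended (trans) dihedral
GAUCHE = (60.0, -60.0)    # assumed values for the two gauche conformations


def initial_chi_angles(template_chis, n_target_chis):
    """Initial chi angles for a substituted side chain.

    Dihedral angles present in the template residue are copied to the
    corresponding positions of the new residue; any remaining angles of
    a longer side chain are set to the extended conformation.
    """
    copied = list(template_chis[:n_target_chis])
    return copied + [EXTENDED] * (n_target_chis - len(copied))


def gauche_alternatives(chis, index):
    """Candidate conformations with one dihedral switched to gauche.

    Intended for use when the initial conformation produces close
    contacts; the contact test itself is outside this sketch.
    """
    alternatives = []
    for angle in GAUCHE:
        candidate = list(chis)
        candidate[index] = angle
        alternatives.append(candidate)
    return alternatives
```

For instance, replacing Ala (no χ angles) by Leu (two χ angles) gives `[180.0, 180.0]`, while replacing Ile by Leu copies both template angles unchanged.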

Appendix A2. Prediction of disulfide linkages

In contrast to CCP, which has only one cysteine residue, the cDNA sequence of LiP has eight cysteines, indicating that four disulfide bonds are possible, as experimentally determined for HRP (Welinder, 1976). From the core-framework model, one such cysteine, LiP249C, is predicted to be located in the proximal domain of LiP and six cysteines, LiP3C, LiP14C, LiP15C, LiP34C, LiP120C and LiP285C, are all predicted to be in the distal domain. The position of LiP317C, in the middle of the extended C-terminus of LiP that is not aligned with the sequence of CCP, was not initially predicted. Pairwise spatial relationships were found which allowed the formation of all four disulfide bonds, one in the proximal domain and three in the distal domain. The most obvious disulfide pair was LiP34C and LiP120C. LiP120C occurred at a conserved turn region of the core-framework in stage 1 and its position was completely determined by CCP119M. LiP34C, one of the inserted residues, occurs in a region that is spatially close to LiP120C (the Cα distance was ~7 Å). Backbone dihedral angles of the 10 residue segment (LiP32G-LiP41S) were adjusted so that the distance between the two S atoms was ~2.0 Å and the dihedral angle along the SS bond was near 90°. The last few residues in this segment were allowed to form a helical conformation that became part of the helix containing the distal catalytic residues LiP43R and LiP47H. The initial positions of the other four cysteines in the distal domain could only be estimated because three (LiP3C, LiP14C and LiP15C) appeared in the variable N-terminal region (LiP1A-LiP32G) and one (LiP285C) occurred in the beginning portion of the unaligned C-terminus (LiP277Q-LiP287D). An initial conformation of the N-terminus was built by using the corresponding residues of CCP (LiP227Q-LiP287D), and the first 11 residues of the C-terminus were constructed by using the C-terminus of CCP (CCP274I-CCP294L).
Detailed examination of the spatial relationship of these four cysteines in these initial conformations showed that when LiP285C is paired with LiP14C and LiP3C with LiP15C, a cross-over of the two disulfide bridges could be avoided. The connection of these two disulfide bridges was achieved by rotating the backbone dihedral angles of residues LiP3C-LiP7K and LiP277Q-LiP285C. Substantial adjustments of these LiP residues were performed in order to make these disulfide connections. The large deviations from the CCP template structure in these two regions are ambiguous since the LiP N-terminus (LiP1A-LiP32G) was identified as unreliable from the three-way alignment with MnP and the initial conformation of the LiP277Q-LiP287D segment was based on unaligned C-terminal residues of CCP. After the formation of the initial conformations of the above three disulfide bonds, constrained energy minimization and MD simulations of this region were performed to relax bond lengths and bond angles and to eliminate bad contacts between some atoms. During the refinement, the surrounding residues were fixed and used as constraints. The remaining pair of cysteines, LiP249C and LiP317C, were then used to form the fourth disulfide bond. Since the position of LiP249C was fully determined and that of LiP317C was unknown, LiP317C was placed near the LiP249C residue for the formation of this disulfide bond connection.
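The geometric targets used when building a disulfide bond (an S-S distance of about 2.0 Å and a dihedral angle along the SS bond near 90°) can be checked from atomic coordinates as sketched below. This is an illustration under stated assumptions, not the authors' procedure: the dihedral is taken over the four atoms Cβ-S-S-Cβ, and the tolerances of 0.2 Å and 30° as well as the function names are hypothetical.

```python
import math


def _sub(p, q):
    return (p[0] - q[0], p[1] - q[1], p[2] - q[2])


def _dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]


def _cross(u, v):
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def distance(p, q):
    """Euclidean distance between two points."""
    d = _sub(p, q)
    return math.sqrt(_dot(d, d))


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four points."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


def is_plausible_disulfide(cb_i, sg_i, sg_j, cb_j,
                           dist_target=2.0, dist_tol=0.2,
                           angle_target=90.0, angle_tol=30.0):
    """True if S-S distance and |CB-S-S-CB| dihedral are near the targets.

    The target values follow the text; the tolerances are assumptions.
    """
    ss = distance(sg_i, sg_j)
    chi_ss = abs(dihedral(cb_i, sg_i, sg_j, cb_j))
    return (abs(ss - dist_target) <= dist_tol
            and abs(chi_ss - angle_target) <= angle_tol)
```

A check of this kind could be applied to each candidate cysteine pair after the backbone dihedral angles have been adjusted, before constrained minimization is started.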

Appendix A3. Initial structure of insertions and deletions

All insertions and deletions involved one to five residues (Figure 1). One or two residue gaps include deletions of CCP80P, CCP81S, CCP101W, CCPBV and CCP132D, and insertions of LiP123A, LiP191L and LiP229M. A three residue deletion, CCP153Y-CCP155R, was found in the middle of a short helical segment. Three insertions with four to five residues, LiP54P-LiP58A, LiP72M-LiP75D and LiP161A-LiP164F, occurred in loops. Since no deletions or insertions were longer than five residues, secondary structure predictions were not used in the construction of the initial conformations of these regions. To build initial conformations, reorientation of several residues on either end of the region was required. Specifically, the backbone dihedral angles φ and ψ of these residues and adjacent ones were rotated within the range of commonly found values, i.e. -180
