Effects of knot type in the folding of topologically complex lattice proteins
Miguel A. Soler, Ana Nunes, and Patrícia F. N. Faísca
Citation: The Journal of Chemical Physics 141, 025101 (2014); doi: 10.1063/1.4886401
Published by AIP Publishing

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 130.239.20.174 On: Sat, 06 Sep 2014 02:51:27

THE JOURNAL OF CHEMICAL PHYSICS 141, 025101 (2014)

Effects of knot type in the folding of topologically complex lattice proteins

Miguel A. Soler, Ana Nunes, and Patrícia F. N. Faísca

Centro de Física da Matéria Condensada and Departamento de Física, Faculdade de Ciências, Universidade de Lisboa, Av. Prof. Gama Pinto 2, 1649-003 Lisboa, Portugal

(Received 3 May 2014; accepted 20 June 2014; published online 9 July 2014)

The folding properties of a protein whose native structure contains a 5_2 knot are investigated by means of extensive Monte Carlo simulations of a simple lattice model and compared with those of a 3_1 knot. A 5_2 knot embedded in the native structure enhances the kinetic stability of the carrier lattice protein in a way that is clearly more pronounced than in the case of the 3_1 knot. However, this happens at the expense of a severe loss in folding efficiency, an observation that is consistent with the relative abundance of 3_1 and 5_2 knots in the Protein Data Bank. The folding mechanism of the 5_2 knot shares with that of the 3_1 knot the occurrence of a threading movement of the chain terminus that lies closer to the knotted core. However, concomitant knotting and folding in the 5_2 knot occurs with negligible probability, in sharp contrast to what is observed for the 3_1 knot. The study of several single point mutations highlights the importance in the folding of knotted proteins of the so-called structural mutations (i.e., energetic perturbations of native interactions between residues that are critical for knotting but not for folding). On the other hand, the present study predicts that mutations that perturb the folding transition state may significantly enhance the kinetic stability of knotted proteins provided they involve residues located within the knotted core. © 2014 AIP Publishing LLC. [http://dx.doi.org/10.1063/1.4886401]

I. INTRODUCTION

Knotted proteins are proteins that contain a topological knot in their native structure. In 1994 Mansfield investigated for the very first time the existence of knotted proteins in the Protein Data Bank (PDB) [1]. Out of the 400 entries surveyed only one, that corresponding to protein carbonic anhydrase B (CAB, PDB ID: 2cab), was found to be knotted [1]. Indeed, the backbone of CAB is arranged in the form of a trefoil (or 3_1) knot, i.e., a knot for which there is no planar projection with fewer than three crossings. Mansfield also noticed that the trefoil knot in CAB is “incipient” or “loosely formed” because it suffices to remove a few residues from either terminus to untangle the protein. This seminal investigation raised some scepticism regarding the idea that there could be knots in proteins. However, after the discovery by Taylor of a deep knot in protein acetohydroxy acid isomeroreductase (PDB ID: 1yveI) [2] this view changed radically. This knot is classified as deep because it can only be removed if 70 residues are deleted from the carboxy terminus or 245 residues are deleted from the amino terminus. The most recent survey of the PDB found 398 knotted proteins amongst the 79 224 analysed entries [3]. Clearly, the most frequent knot type found in proteins is the 3_1 knot but there are also some proteins with figure-eight (or 4_1) knots, and a few examples of penta (or 5_2) knots [4, 5]. So far the most topologically complex knot found in the PDB is the stevedore (or 6_1) knot with six crossings on a planar projection [6]. It was detected in only one protein, α-haloacid dehalogenase I (PDB ID: 3bjx). While these findings indicate that knotted proteins do exist they are considered statistically rare. This is so because knots are orders of magnitude less frequent in proteins than in random polymers of comparable length, compactness, and flexibility [7]. Furthermore, in the PDB the relative frequency with respect to the trefoil of topologically complex knots (e.g., the knot 5_2) is also orders of magnitude smaller than that exhibited by random polymers of comparable length [7, 8], indicating that knotted proteins become more elusive as their topological “complexity” increases.

The recognition that knotted proteins exist sparked two immediate questions: (1) how do such topologically complex systems fold? and (2) what is the functional advantage of knots in proteins? To address these questions researchers have been developing a growing body of experimental and theoretical work; perhaps not surprisingly, these investigations have been focusing on knotted trefoils. Seminal studies by Jackson and co-workers on proteins YbeA and YibK [9–12] (α/β knotted methyltransferases from prokaryotes) recently culminated in a series of beautifully designed experiments which showed that newly translated polypeptide chains (which are necessarily unknotted) can actually tangle spontaneously and without forming misfolded states [13]. However, these investigations also made it rather clear that the folding of unfolded (and, most importantly, unknotted) chains is very slow (up to 1.5 orders of magnitude slower than that recorded in refolding experiments following chemical denaturation [13], which are likely to start from a denatured but still pre-knotted conformation [14]). They also showed that knotting occurs posttranslationally, is rate-limiting, and is significantly accelerated by the bacterial chaperonin GroEL-GroES. In line with these findings both simulation and experimental results indicate that the folding speed of knotted proteins is significantly lower than that of control systems (i.e., unknotted proteins that represent minimal structural modifications of their knotted counterparts) [15–18]. The kinetic performance of knotted proteins stems most likely from their peculiar folding mechanism. Indeed, microscopic insights gained from computer simulations based on a wide array of protein representations and sampling strategies [19–22] (ranging from Monte Carlo simulations of lattice models [23] to Molecular Dynamics (MD) in explicit solvent [24]) have put forward a remarkably choreographed picture of their folding process. The picture conveyed by simulations is that trefoil knotting results from the threading movement of one of the protein termini (typically the C-terminus, which frequently lies closer to the knotted core) through a native knotting loop formed by the remainder of the chain (like a thread through the eye of a needle). This threading event may occur via a slipknotted conformation [25] if the protein terminus is arranged into a hairpin-like conformation [20, 24]. Slow folding rates may thus be the natural outcome of a restricted, smaller number of productive folding pathways. Indeed, if folding does not start from the “right” conformation, or if it follows an incorrect order of events, it will be necessary to backtrack [16, 21] and start again until the protein enters the correct route, decreasing the folding speed.

In Ref. 7 Grosberg and co-workers proposed that knots in proteins are rare because their local geometrical features (e.g., subchain size and subchain interpenetration) do not favour tangling of the backbone. On the other hand, the slower folding rates of knotted proteins indicate that knotting is not good for folding. Why then are these topologically complex folds evolutionarily conserved [3]? The biologically oriented mind may think that the occurrence of knots in proteins results in some added functional advantage. In line with this hypothesis, recent simulation studies predicted that knots in proteins are a source of kinetic stability [16], and several proposals for the effects of knots (including that of an enhanced thermal [26] and structural stability [3, 4, 26]) have been put forward in the literature.
However, in the majority of cases, it is not possible to determine the biological advantage of knotted proteins [13]. Therefore, one cannot rule out the possibility that they do not carry any functional advantage at all. It may just be the case that knotted proteins have withstood evolutionary pressure because their folding process is assisted by chaperonins [13], which appear to have arisen very early during the evolution of densely crowded cells as a way to minimize protein aggregation [27]. Another relevant question concerns the frequency of knot types. Why is the trefoil the most frequent knot type found in the PDB? A likely explanation is that as the topological “complexity” of knots increases, folding becomes more and more intricate, less efficient, and consequently less prone to evolutionary selection and conservation. Unfortunately, with a few exceptions [6, 28], little is known about the folding process of proteins with knots more intricate than the trefoil.

The present study seeks to fill this gap by exploring in detail the folding properties of a lattice protein with a 5_2 knot. The 5_2 knot has five projected crossings and was originally found in protein human ubiquitin C-terminal hydrolase (UCH-L3) [4] (PDB ID: 1xd3). So far, it has not been found in any other protein family. Based on its function (deconjugation of ubiquitin) researchers proposed that the knot in UCH-L3 protects it from degradation by sterically precluding translocation through the proteasome pore [4]. Seminal studies by Jackson suggested that UCH-L3 folds reversibly, without entering any deep kinetic traps, through a folding mechanism that involves two weakly populated intermediate states forming from the denatured state [28]. However, since these studies were based on chemical denaturation the possibility that the backbone remained knotted in the denatured state should not be ruled out. Therefore, the reported insights on the effects of knotting in the folding properties of UCH-L3 should be interpreted with caution.

Here we consider a lattice model of a 5_2 knot and use Monte Carlo (MC) simulations to assess its folding behavior. MC simulations of lattice models have a long tradition in the study of protein folding [29–38] (and more recently protein aggregation [39–41]) because their computational tractability allows accurate estimates of thermodynamics and kinetics. In the context of lattice models it is straightforward to create an unknotted conformation by minimally modifying a knotted one [23]. Therefore, this simple protein representation provides an adequate framework to determine the effects of knotting on protein folding [16]. Here we investigate the effects of a 5_2 knot on protein folding properties and compare our results with previous and new findings on the 3_1 knot. In doing so, we seek to predict novel features of the folding of knotted proteins that might provide insight into their (lower) frequency in the PDB. We pay particular attention to the effects of single point mutations on the thermodynamics and kinetics of folding.

This article is organized as follows. In the section titled Model and Methods we provide a brief summary of the model and methods adopted in our study. Then we present and analyse the results, and in the Conclusions we draw some concluding remarks.

II. MODEL AND METHODS

A. The simple lattice Gō model and Monte Carlo folding simulation

We consider a simple three-dimensional lattice model of a protein molecule with chain length N. In the simple lattice representation the protein is reduced to its backbone structure: amino acids, modeled by beads of uniform size, are placed on the vertices of a regular three-dimensional lattice, and the peptide bond, which covalently links amino acids along the polypeptide chain, is represented by sticks of uniform (unit) length corresponding to the lattice spacing. In order to satisfy the excluded volume constraint only one bead is allowed per lattice site. Protein energetics is modeled with the Gō potential [42]. In the Gō potential the energy of a conformation, defined by the set of bead coordinates {r_i}, is given by the contact Hamiltonian

$$H(\{r_i\}) = \sum_{i>j} \varepsilon \, \Delta(r_i - r_j), \qquad (1)$$

where ε = −1 is the (uniform) interaction energy parameter and the contact function Δ(r_i − r_j) is unity only if beads i and j form a native contact and is zero otherwise. In order to mimic protein relaxation towards the native state we use the Metropolis Monte Carlo (MC) algorithm [43] together with a local move set that includes corner-flips and end-moves (which displace one bead at a time; the end-moves are exclusively used to move the chain’s termini and the corner-flip is used to displace all the other beads in the chain) and the crankshaft move (which involves the simultaneous displacement of two beads, excluding the termini beads). The adequacy of the adopted move set to study the dynamics of knotted polymers was established in Ref. 44. An MC simulation starts from a randomly generated unfolded conformation and folding progress is monitored through several properties (e.g., the fraction of established native contacts, Q). Further details on the adopted simulation algorithm can be found in Refs. 45–47.
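The contact Hamiltonian of Eq. (1) and the Metropolis acceptance rule can be illustrated with a minimal Python sketch. It is not the authors' code: it assumes a simple cubic lattice (so that two beads are in contact when they occupy nearest-neighbour sites), temperature in units where k_B = 1, and a native contact list supplied by the caller. The names `go_energy` and `metropolis_accept` are hypothetical.

```python
import math
import random


def go_energy(coords, native_contacts, eps=-1.0):
    """Go energy: eps for every native contact (i, j) that is currently formed.

    coords          -- list of (x, y, z) integer lattice sites, one per bead
    native_contacts -- iterable of (i, j) bead index pairs in contact in the native fold
    Assumption: simple cubic lattice, contact = Manhattan distance of 1.
    """
    energy = 0.0
    for i, j in native_contacts:
        dist = sum(abs(a - b) for a, b in zip(coords[i], coords[j]))
        if dist == 1:
            energy += eps
    return energy


def metropolis_accept(delta_e, temperature, rng=random):
    """Metropolis criterion: always accept downhill moves, uphill with exp(-dE/T)."""
    if delta_e <= 0.0:
        return True
    return rng.random() < math.exp(-delta_e / temperature)


# Toy example: a 4-bead chain whose only native contact is between beads 0 and 3.
folded = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
stretched = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)]
native = [(0, 3)]
delta = go_energy(stretched, native) - go_energy(folded, native)  # breaking the contact costs +1
```

In a full simulation the trial conformation would come from one of the local moves (corner-flip, end-move, crankshaft) and would first be checked against the excluded volume constraint; those steps are omitted here.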

B. Folding thermodynamics

In order to explore the thermodynamics of the folding transition and compute equilibrium properties we have conducted long replica-exchange (RE) MC simulations at 40 different temperatures [48]. Each MC trajectory consists of at least 10^9 MC steps per residue after equilibration. We attempt replica swaps every 10^6 MC steps, an interval about one order of magnitude larger than the largest auto-correlation time for the energy recorded in simulations at fixed temperature, which allows the replicas to equilibrate between consecutive RE attempts. The acceptance ratio for the RE is high (>80%) and each replica reliably and repeatedly visits all the temperatures in the grid with a cycle time of 40 RE moves. A single simulation comprises at least 25 full cycles. This guarantees good convergence of the data. The results reported here correspond to an average of three RE simulations. The heat capacity C_v is evaluated from the mean squared fluctuations in energy at each temperature considered in the RE simulations according to the definition C_v = (⟨E^2⟩ − ⟨E⟩^2)/T^2. The melting temperature T_m (also known as folding or transition temperature) is defined as the temperature at which the unfolded and native states are equally populated at equilibrium. Here, as well as in experiments in vitro, T_m is estimated as the temperature at which the heat capacity C_v attains its maximum value. The free energy as a function of selected reaction coordinates was evaluated with the weighted histogram analysis method (WHAM) [49].


C. Folding kinetics

To obtain kinetic properties such as the folding rate, we have carried out fixed temperature MC simulations at T_m. To get statistically significant kinetic measurements, we considered 2000 independent folding runs. The corresponding folding times (i.e., first passage times) allow evaluating the fraction of proteins that remain unfolded as a function of MC “time” (i.e., MC steps). The folding rate is given by the slope of the linear fit of this distribution (on a logarithmic scale) to a single-exponential decay [34].

D. Knot detection

The native structure of the lattice protein studied in this work contains a 5_2 knot. In order to determine whether a conformation sampled in the course of the folding simulation is knotted we use a version of the Koniaris-Muthukumar-Taylor (KMT) algorithm [2] adapted for lattice systems. Details on the adopted procedure can be found in Ref. 23.

E. Structural clustering

In order to determine the relevant conformational classes present in an ensemble of conformations we have used the hierarchical clustering algorithm jclust available in the MMTSB tool set [50]. Since we are using a lattice model, clustering is done based on contact map similarity. In this study structural clustering is performed over a starting ensemble containing 2000 uncorrelated conformations that were collected from many independent folding runs at T_m. For each identified cluster, the so-called cluster representative is the conformation closest to the cluster’s centroid.

III. RESULTS

A. Model systems

In this work we consider a lattice protein with chain length N = 52, which was designed by hand to have a 5_2 knot in its native structure (Fig. 1(a)). The knotted core, i.e., the minimal segment of the chain that contains the pentaknot, is located two residues away from one of the termini and 20 residues away from the other. Since it is enough to remove a few residues from one of the termini to untie the knot, this 5_2 knot is classified as shallow. This feature is shared with the 5_2 knot in chain A of protein UCH-L3 (PDB ID: 1xd3A), as well as with the 5_2 knot in the human protein ubiquitin hydrolase UCH-L1 (PDB ID: 2etl), for which the deletion of 11 and 7 residues, respectively, is enough to eliminate the knot. Interestingly, protein knots of type 5_2 identified so far are typically shallow knots [4, 5]. We note that upon removing the first 20 residues, the remaining protein backbone still contains a trefoil knot (see Fig. 1 of the supplementary material [56]). In Figure 1(b) we report the three-dimensional structure of the unknotted native structure that was constructed by appropriately tweaking the backbone of the 5_2 knot. The unknotted model protein can be used as a control system in this study because its native structure represents a minimal modification of the native structure of the 5_2 knot. This similarity is apparent from visual inspection of the three-dimensional structures and also from a comparison of the corresponding native contact maps (Figs. 1(c) and 1(d)).

FIG. 1. Lattice protein with a 5_2 knot and its unknotted counterpart. Three-dimensional representation of the 5_2 knot (a) and unknotted lattice protein (b) investigated in this study. The subscript 2 in 5_2 stands for the 2nd knot with five crossings according to standard knot tables. In (a) the knotted core, i.e., the minimal segment of the backbone that contains the 5_2 knot, is highlighted in blue and extends from residue 21 to residue 51. Panels (c) and (d) represent the contact maps of the knot and of the unknot, respectively. In each native structure there are 52 native contacts. The 25 dots in blue in the contact map of the knot (c) represent contacts established between the residues that make up the knotted core. In order to facilitate the comparison of the two contact maps we have replicated the contact map of the knotted system on the contact map of the unknotted one, and represented it with brown dots.

B. Folding thermodynamics

We start by exploring the impact of the 5_2 knot on several thermodynamic properties of the folding transition. In previous reports we systematically observed that the existence of increasingly deeper knots of type 3_1 in the native structure of lattice protein systems does not enhance the thermal stability of their carriers [16]. We took these results as an indication that knots of type 3_1 do not confer such a functional advantage on proteins. Since the 5_2 knot investigated here is topologically more complex, its occurrence in the native structure may cause an increase of thermal stability. In order to test this hypothesis we measured the heat capacity as a function of temperature for the 5_2 knot and for its unknotted homologue. The curves reported in Figure 2(a) show that the melting temperature T_m, a quantifier of a protein’s thermal stability, is the same for both lattice systems, suggesting that the topological “complexity” of knots does not play a role in modulating thermal stability. However, much like the 3_1 knot investigated in our previous studies [16], the 5_2 knot investigated here is also shallow, a feature that may smear out the effects of knot type on thermal stability. Therefore, we considered additional 5_2 knots with the knotted core placed deeper inside the protein backbone. These systems were built from the original one by adding up to six beads to the chain terminus that is closest to the knotted core. The results recorded for the deeper 5_2 knots are in line with those obtained for the shallow knot (see Fig. 2 of the supplementary material [56]), thus strengthening the idea that 5_2 knots do not enhance the protein’s thermal stability.

FIG. 2. Thermodynamics of the folding transition. (a) Heat capacity for the 5_2 knot and unknot as a function of temperature, with the melting temperature, i.e., the temperature at which the heat capacity peaks, being the same for both systems (T_m = 0.67). (b) Free energy profiles evaluated at T_m represent a one-dimensional projection of the free energy onto the reaction coordinate fraction of native contacts Q. In the free energy profiles the transition state region is located at 0.4 < Q < 0.6. Note that in the lattice representation the native conformation is unambiguously defined by the total number of native contacts, which results in a very narrow free energy minimum located at Q = 1. The free energy surfaces of the knot (c) and unknot (d) evaluated at T_m represent a projection of the free energy onto reaction coordinates Q and the radius of gyration, R_g.

Next we evaluated the free energy profiles (Fig. 2(b)) and surfaces (Figs. 2(c) and 2(d)) at T_m. The free energy profiles represent a one-dimensional projection of the free energy onto the fraction of native contacts Q, which is considered an appropriate reaction coordinate for model systems with energetics modeled by the Gō potential [51]. The curves for the knotted and unknotted protein are remarkably similar and the only noticeable (but marginal) difference is a slightly broader transition state region for the knotted system. On the other hand, the free energy surfaces (Figs. 2(c) and 2(d)), showing the projection of the free energy on the reaction coordinates Q and radius of gyration R_g, reveal a narrower and less extended native basin for the knot. This shrinking of the native basin was not observed previously for knotted trefoils and may be taken as an indication that topologically complex knots enhance the structural stability (or rigidity) of the native structure.
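A one-dimensional free energy profile along Q can be sketched as F(Q) = −T ln P(Q), where P(Q) is the equilibrium probability of observing Q at temperature T. The example below is a simplified stand-in for the WHAM analysis used in the paper: it assumes samples taken at a single temperature, k_B = 1, and conformations labelled by their integer number of native contacts. The function name is hypothetical.

```python
import math
from collections import Counter


def free_energy_profile(contact_counts, n_native, temperature):
    """Return {Q: F(Q)} with F(Q) = -T ln P(Q), shifted so that min F = 0.

    contact_counts -- number of native contacts formed in each sampled conformation
    n_native       -- total number of native contacts (52 for the model in the text)
    """
    histogram = Counter(contact_counts)
    total = len(contact_counts)
    profile = {
        count / n_native: -temperature * math.log(hits / total)
        for count, hits in histogram.items()
    }
    f_min = min(profile.values())
    return {q: f - f_min for q, f in sorted(profile.items())}


# Toy two-state sample: unfolded (Q = 0) and native (Q = 1) equally populated,
# with a less populated state in between acting as the barrier top.
samples = [0, 0, 52, 52, 26]
profile = free_energy_profile(samples, n_native=52, temperature=0.67)
```

At T_m the two basins of a two-state folder are expected to have comparable free energy, which is what the toy sample mimics; the barrier height here is T ln 2 only because of the made-up populations.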

C. Folding kinetics

In a recent study Mallam and Jackson used a cell-free translation system to show that methyltransferases YibK and YbeA can spontaneously self-tie without the assistance of chaperones, although with a folding rate that is up to 1.5 orders of magnitude slower than that registered in refolding experiments following chemical denaturation [13]. These studies also showed that the folding rate of these particular knotted trefoils is specifically and significantly accelerated by the GroEL-GroES chaperonin system. Furthermore, our previous studies with lattice knotted trefoils [16, 17, 23], as well as experimental [15] and computational studies undertaken by others [18], showed that deep knotted trefoils fold substantially slower than unknotted control systems. A slower folding rate thus appears to be a characteristic trait of deeply knotted trefoil proteins such as YibK and YbeA.


FIG. 3. Folding kinetics and probability of knot formation at T_m. Estimation of the folding rate (a) and unfolding rate (b) from data collected at fixed temperature T_m for the 5_2 knot and its unknot (indicated by u(5_2)). The (un)folding rate constant is given by the slope of the regression lines. For the sake of comparison we also show the curves for the 3_1 knot (and corresponding unknot, u(3_1)), studied in previous reports. (c) Knotting probability as a function of reaction coordinate Q where the transition state region is highlighted.
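The rate estimate behind the regression lines can be sketched as follows, assuming single-exponential kinetics: the fraction of runs still unfolded at time t decays as exp(−k t), so a least-squares line through ln(fraction unfolded) versus t has slope −k. The function names are hypothetical and the fit is an ordinary unweighted regression, which may differ in detail from the procedure of Ref. 34.

```python
import math


def unfolded_fraction(first_passage_times):
    """Sorted first passage times paired with the fraction of runs still unfolded."""
    times = sorted(first_passage_times)
    n = len(times)
    return [(t, (n - k) / n) for k, t in enumerate(times, start=1)]


def rate_constant(first_passage_times):
    """Rate k from a linear fit of ln(fraction unfolded) against MC time."""
    points = [
        (t, math.log(frac))
        for t, frac in unfolded_fraction(first_passage_times)
        if frac > 0.0  # the last run leaves nothing unfolded; ln(0) is undefined
    ]
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in points)
    return -sxy / sxx


# Synthetic check: times placed exactly on an exponential decay with k = 2e-6 per MC step.
k_true = 2e-6
n_runs = 100
toy_times = [-math.log((n_runs - k) / n_runs) / k_true for k in range(1, n_runs)]
toy_times.append(1e7)  # the slowest run; it is excluded from the fit
k_fit = rate_constant(toy_times)
```

Ratios such as r_f = k_f^knot / k_f^unknot then follow by applying the same fit to the knotted and unknotted systems.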


Intuitively one may be led to think that a more intricate knot like the pentaknot should have a more pronounced effect on the folding rate of the lattice protein that carries it. In order to explore this and other aspects of the kinetic behavior of the 5_2 knot we measured its folding (and unfolding) rates as well as those of its unknotted counterpart and compared the results with those previously obtained for a knotted trefoil [16]. Since the pentaknot studied here is shallow, for the sake of comparison we also considered a shallow trefoil knot. Results reported in Figure 3(a) are illuminating: the ratio between the folding rate constants of the knot (k_f^knot) and the unknot (k_f^unknot), r_f = k_f^knot / k_f^unknot, shows that, relative to its unknotted control, the pentaknot folds much more slowly (r_f = 0.61) than the knotted trefoil, which folds at a rate similar to that of its unknotted homologue (r_f = 0.95). This result is particularly interesting because it predicts that a shallow but topologically intricate knot is able to significantly impair the folding speed of its carrier. An important prediction from our previous study is that knotted trefoils, even shallow ones, may act as a source of kinetic stability [16]. Could the pentaknot have a more pronounced effect than the 3_1 knot as an enhancer of kinetic stability? In order to answer this question we measured the unfolding rate k_u (Fig. 3(b)), which is an indicator of kinetic stability. The ratio of the unfolding rate constants (r_u = k_u^knot / k_u^unknot) confirms indeed that the 5_2 knot (r_u = 0.59) is considerably more kinetically stable than the 3_1 knot (r_u = 0.85).

Another relevant kinetic property is the knotting probability, p_knot, i.e., the probability of knot occurrence in an ensemble of conformations. More precisely, the dependence of p_knot on a suitably chosen reaction coordinate such as the fraction of native contacts, Q, provides insight into the knotting mechanism. In order to evaluate p_knot vs. Q we constructed 12 ensembles of conformations (starting at Q = 0.08 and up to Q = 0.92), with each ensemble containing 2000 uncorrelated conformations that were extracted from many independent folding runs at T_m. Results reported in Figure 3(c) compare the knotting probability of the 5_2 knot with that of the 3_1 knot [17] and highlight the fact that knotting is a particularly late folding event for the pentaknot. Indeed, it occurs with non-negligible probability (p_knot = 0.12) only when Q ∼ 0.7 (note that for this value of Q p_knot is already 0.82 for the knotted trefoil). The peak of the free energy barrier signals the folding transition state (0.4 < Q < 0.6), and therefore the folding transition. The results reported in Figure 3(c) also show that the probability of knotting to occur concomitantly with folding is 8.2% in the case of the 5_2 knot, versus 39% in the case of the 3_1 knot. Knotting is expected to occur considerably later when the knot is of higher complexity because the knotted core region of the 5_2 knot (comprising 30 residues that represent 60% of the chain length) is larger than that of the 3_1 knot (comprising 20 residues that represent 24% of the chain length), and therefore for the 5_2 knot the threading of the chain terminus requires a larger fraction of residues to be in their native positions.

D. Knotting mechanism

Results reported in the section titled Folding kinetics show that knotting occurs particularly late during folding of the 5_2 knot and that the probability of knotting and folding to occur concomitantly is rather low. In order to gain further insight into the knotting mechanism, we investigated the structural changes that are likely to occur as the fraction of native contacts increases towards unity. In particular, we performed hierarchical clustering over ensembles of conformations with increasing Q and extracted the representative conformation from the dominant cluster. The succession of representative conformations thus obtained provides insight into the structural changes underlying the knotting mechanism at play during the folding of the 5_2 knot (Figure 4). The native structure starts developing around the chain segment that extends from residue 43 to residue 46. This segment plays the role of a guiding axis. In particular, when Q = 0.30, it pivots the formation of two loops, one that extends from residue 19 to 24, and another one extending between residues 26 and 32 (Fig. 4(b)). Subsequently, when Q = 0.46, the chain terminus containing the last residue (the one that will eventually thread the loop formed by residues 33–42; see Fig. 4(f)) starts to move towards its native position (Fig. 4(c)). The formation of the loop itself starts in more consolidated conformations (with fraction of native contacts Q = 0.61), wherein the threading terminus acquires a hairpin-like conformation (Fig. 4(d)), suggesting the possibility of knotting occurring via a slipknotted conformation [20, 21, 24, 25]. The representative conformation corresponding to Q = 0.76 (Fig. 4(e)) is a rather interesting one because, depending on the conformational changes it undergoes, the formation of the 5_2 knot (Fig. 4(f)) may be preceded by the formation of the trefoil knot embedded in the chain segment 30–52 (see Fig. 2 of the supplementary material [56]). We stress, however, that we have not observed the formation of trefoil knots in our simulations, presumably because these conformations are scarcely populated and, therefore, will not be detected via a conformational analysis based on structural clustering.

FIG. 4. Knotting mechanism. Succession of conformations with increasing Q that provide insight into the knotting mechanism. Each conformation with fraction of native contacts Q was obtained from structural clustering as the conformation closest to the cluster’s centroid (i.e., it is taken as the cluster’s representative). The residues in green have at least two of their native contacts formed. The knotted core extending from residue 21 to residue 51 is highlighted in dark grey. In panel (f), which represents the native conformation, we highlighted in orange the segment of the chain extending from residue 43 to residue 46 (which acts like a guiding axis around which the native structure develops) and the loop formed by residues 33–42. The latter is threaded by the chain terminus that is closer to the knotted core.
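The idea of a cluster representative, i.e., the member closest to the cluster's centroid in contact-map space, can be sketched as below. This is not the jclust implementation from the MMTSB tool set; it assumes each conformation is described by the set of contacts it forms, takes the centroid to be the per-contact formation frequency within the cluster, and measures closeness by squared Euclidean distance. The function name is hypothetical.

```python
def cluster_representative(contact_maps, contacts):
    """Index of the conformation whose contact map is closest to the cluster centroid.

    contact_maps -- list of sets of (i, j) pairs, one set per conformation in the cluster
    contacts     -- list of all (i, j) pairs used as coordinates of the contact-map space
    """
    n = len(contact_maps)
    # Centroid: fraction of cluster members in which each contact is formed.
    centroid = {c: sum(c in m for m in contact_maps) / n for c in contacts}

    def squared_distance(contact_map):
        return sum(
            ((1.0 if c in contact_map else 0.0) - centroid[c]) ** 2
            for c in contacts
        )

    return min(range(n), key=lambda k: squared_distance(contact_maps[k]))


# Toy cluster of three conformations described by three possible contacts.
toy_contacts = [(1, 4), (2, 5), (3, 6)]
toy_cluster = [{(1, 4), (2, 5)}, {(1, 4)}, {(1, 4), (3, 6)}]
rep = cluster_representative(toy_cluster, toy_contacts)
```

In the toy cluster the second conformation is selected because it contains only the contact shared by all members, which places it nearest to the average contact map.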

E. Mutational analysis

Here we investigate the effect of selected mutations on the thermodynamics and kinetics of the 5_2 and 3_1 knots. In the context of the present investigation, which uses a Gō potential to model protein energetics, performing a mutation is equivalent to turning off a selected intramolecular native interaction by ascribing it zero energy (i.e., upon mutation an attractive native interaction becomes a neutral one). Since the folding transition is two-state, we consider a first class of mutations that correspond to turning off native interactions that form in the transition state ensemble (TSE) with high probability, thus contributing to its enthalpic stabilization. We name these mutations thermodynamic mutations (TM). Furthermore, since knotting and folding may occur concomitantly we find it relevant to consider two different types of TMs: we designate by thermodynamic mutations of the first type (TM1) those that disrupt interactions forming with high probability in the TSE and for which the two interacting residues pertain to the knotted core. Alternatively, in a thermodynamic mutation of the second type (TM2), the perturbed native interaction is highly likely in the TSE but at least one of the two interacting residues is located outside the knotted core. The second class of mutation we consider is termed structural mutation (SM). An SM disrupts native interactions that are established between residues located on the threading terminus or within the knotting loop, and that form in the TSE with low probability (i.e., that do not play a role in the enthalpic stabilization of the transition state). What is the rationale behind studying SMs? The premature establishment of these particular interactions will lead to topological bottlenecks that must be resolved by means of backtracking (i.e., the breaking and re-establishment of specific native contacts) [20]. The expected outcome of this process is a delay in the formation of the native structure. SMs are expected to enhance the folding performance by increasing the folding rate because they should decrease the probability of occurrence of topological bottlenecks. Therefore, SMs allow one to assess the importance of backtracking in knotted proteins.

For each considered knotted topology we constructed an ensemble of conformations representative of the TSE by collecting 2000 equilibrated and uncorrelated conformations with fraction of native contacts 0.4 < Q < 0.6 from many independent MC runs at T_m. The probability maps show the frequency of formation of each native contact in the TSE of the 3_1 knot (see Fig. 3(a) of the supplementary material [56]) and 5_2 knot (see Fig. 3(b) of the supplementary material [56]). Since the considered knotted proteins fold via a two-state process, a potentially relevant question concerns the relation between the native interactions that nucleate folding (i.e., the folding nucleus) and those established between the residues of the knotted core.
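The mutation classes defined above can be made concrete with a small sketch that computes the TSE formation frequency of each native contact and then labels a contact as a TM1, TM2, or SM candidate. Several choices here are assumptions rather than statements from the paper: the low-probability cutoff (0.2), the high-probability cutoff (0.8), the reading of the SM definition as "both residues lie on the threading terminus or in the knotting loop", and the residue range used for the threading terminus in the toy example. All names are hypothetical.

```python
def contact_probabilities(tse_contact_sets, native_contacts):
    """Frequency with which each native contact is formed across the TSE ensemble."""
    n = len(tse_contact_sets)
    return {c: sum(c in s for s in tse_contact_sets) / n for c in native_contacts}


def classify_mutation(contact, probability, core, terminus, loop, high=0.8, low=0.2):
    """Label a native contact by the mutation class it would belong to, or None.

    core, terminus, loop -- sets of residue indices (knotted core, threading
    terminus, knotting loop). The thresholds high/low are illustrative assumptions.
    """
    i, j = contact
    if probability > high:
        # Thermodynamic mutations: TM1 if both residues are in the knotted core.
        return "TM1" if (i in core and j in core) else "TM2"
    if probability < low and all(r in terminus or r in loop for r in contact):
        return "SM"
    return None


# Toy residue sets loosely following the 5_2 model: core 21-51, knotting loop 33-42.
core = set(range(21, 52))
loop = set(range(33, 43))
terminus = set(range(47, 53))  # hypothetical extent of the threading terminus

probs = contact_probabilities([{(1, 4)}, {(1, 4), (2, 5)}], [(1, 4), (2, 5)])
```

Within the Gō model, applying a mutation then amounts to giving the selected contact zero energy, for instance by dropping it from the list of contacts that contribute ε to the Hamiltonian.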
We observe that for both knot types the native interactions forming in the TSE with high probability (p > 0.8) are within the knotted core (see Fig. 4 of the supplementary material [56]). However, in the case of the 3_1 knot all knotted core interactions have moderate to high probability (0.5 < p
