This article was downloaded by: [University of Newcastle (Australia)] On: 26 September 2014, At: 13:06 Publisher: Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

Journal of Biomolecular Structure and Dynamics Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/tbsd20

What is the shape of the distribution of protein conformations at equilibrium?

L. Cruzeiro (a) & L. Degrève (b)

(a) CCMAR and Physics, FCT, Universidade do Algarve, Campus de Gambelas, 8005-139 Faro, Portugal
(b) Grupo de Simulação Molecular, Departamento de Química, Faculdade de Filosofia, Ciências e Letras de Ribeirão Preto, Universidade de São Paulo, Av. Bandeirantes 3900, 14040-901, Ribeirão Preto, SP, Brazil

Accepted author version posted online: 17 Sep 2014.

To cite this article: L. Cruzeiro & L. Degrève (2014): What is the shape of the distribution of protein conformations at equilibrium?, Journal of Biomolecular Structure and Dynamics, DOI: 10.1080/07391102.2014.966148 To link to this article: http://dx.doi.org/10.1080/07391102.2014.966148

Disclaimer: This is a version of an unedited manuscript that has been accepted for publication. As a service to authors and researchers we are providing this version of the accepted manuscript (AM). Copyediting, typesetting, and review of the resulting proof will be undertaken on this manuscript before final publication of the Version of Record (VoR). During production and pre-press, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal relate to this version also.


Publisher: Taylor & Francis
Journal: Journal of Biomolecular Structure and Dynamics
DOI: http://dx.doi.org/10.1080/07391102.2014.966148

What is the shape of the distribution of protein conformations at equilibrium?

L. Cruzeiro (a, 1) and L. Degrève (b)

(a) CCMAR and Physics, FCT, Universidade do Algarve, Campus de Gambelas, 8005-139 Faro, Portugal
(b) Grupo de Simulação Molecular, Departamento de Química, Faculdade de Filosofia, Ciências e Letras de Ribeirão Preto, Universidade de São Paulo, Av. Bandeirantes 3900, 14040-901, Ribeirão Preto, SP, Brazil

(1) Corresponding author: L. Cruzeiro; email: [email protected]


Abstract

According to the thermodynamic hypothesis, the native state of a protein is the state in which the free energy of the system is at its lowest, so that, at normal temperature and pressure, proteins evolve to that state. We selected four proteins, each representative of one of the four CATH classes, and, for each protein, performed four simulations: one starting from the native structure and three starting from structures obtained by threading the sequence of that protein onto the native backbone fold of each of the other three proteins. Because of their large conformational distances from the native structure, the three alternative initial structures cannot be considered local minima within the native ensemble of the corresponding protein. As expected, the initial native states are preserved in the 0.5 microsecond simulations performed here, which validates the simulations. On the other hand, when the initial state is not native, an analysis of the trajectories does not reveal any evolution towards the native state during that time. These results indicate that the distribution of protein conformations is multi-peaked: apart from the peak corresponding to the native state, there are other peaks associated with average structures that are very different from the native one and that can last as long as the native state.

Keywords: protein folding, thermodynamic hypothesis, kinetic mechanism, molecular dynamics

1 Introduction

In order to perform their functions, proteins must first assume a well defined average three-dimensional structure known as the native structure. This is true even for many of the so-called intrinsically disordered proteins, whose native structure is only reached upon the binding of different substrates (Dyson & Wright, 2002). Given that even a small protein, with some 60 amino acids, can have more than 3000 degrees of freedom, how is it that proteins, in cells, most of the time assume the native structure? The current answer to this question is that the native state is that in which the free energy of the system

is at its lowest (a hypothesis known as Anfinsen's thermodynamic hypothesis (Anfinsen, 1973)) and that proteins can fold in a reasonable time, thus avoiding the so-called Levinthal paradox (Levinthal, 1968), because their free energy landscape is funnel shaped, a hypothesis put forward by several authors (Bryngelson, Onuchic, Socci & Wolynes, 1995; Onuchic, Luthey-Schulten & Wolynes, 1997; Dill & Chan, 1997), the native state being located at the bottom of the funnel.

As emphasized in (Yang, 2013), the native state is an ensemble of states, and the free energy landscape is a function of reaction coordinates that are themselves averages over different ensembles. This averaging, together with the number of degrees of freedom, makes the free energy landscape a difficult concept to tackle for a protein. In a recent publication (Ben-Naim, 2012), Ben-Naim proposes another description of the equilibrium ensemble, based on the distribution of the protein conformations. He points out that, in an equilibrium process, the free energy is minimized with respect to this distribution, which in principle "may have any form, with or without minima and maxima" (see p. 118 of Ben-Naim, 2012). Within this formalism, the thermodynamic hypothesis corresponds to assuming that, at normal temperature and pressure, the distribution of protein conformations has a single peak, centred at the average native structure of proteins.

The purpose of this study is to probe the distribution of conformations of four proteins. While most of the simulations reported in the literature start from initial conditions that are, or have been obtained from, the known native structures of proteins, here we also start from compact structures that are very different from the native one and thus cannot be interpreted as local minima within the native ensemble or in its vicinity. If, as the funnel hypothesis proposes, all pathways lead to the native state, we should expect these alternative structures to evolve to the native state. Although the four native structures are stable, after 0.5 microseconds none of the 12 alternative structures had evolved to the corresponding native ensemble. Globally, these results indicate that, apart from the peak corresponding to the native ensemble, the distributions of the conformations of the four proteins also have other peaks associated with average structures that are very different from the native one.

2 Molecular dynamics simulations

As β-sheets are generally more rigid than α-helices, in order to minimize structural bias, we have selected four proteins, each representative of one of the four CATH (Orengo, Michie, S. Jones, D.T. Jones, Swindells, & Thornton, 1997) classes: 1BDD (mainly-α) (Gouda, Torigoe, Saito, Sato, Arata, & Shimada, 1992), 1J08 (mainly-β) (Fazi, et al., 2002), 1IGD (α/β) (Gallagher, Alexander, Bryan, & Gillilan, 1994) and 1AAP (few secondary structures) (Hynes, Randal, Kennedy, Eigenbrot, & Kossiakoff, 1990). The protein 1BDD includes one histidine, which can be protonated or unprotonated. Here we report the results for the protonated protein.

For each of these four proteins, three alternative, non-native structures were built by threading the sequence of that protein onto the fold of each of the other three. It should be emphasized that a protein's class is its main structural identification, so that forcing a given protein to assume the fold of an alien class is one of the most artificial conformations we can impose upon it. Thus, if the native structure is indeed a well defined free energy minimum, these alternative structures should be very unstable.

Both the native and the non-native structures of the four proteins were energy minimized in the presence of an explicit SPC/E water bath (Berendsen, Postma, van Gunsteren & Hermans, 1981), using sodium ions as counter-ions to make the total charge of the system neutral. The initial structures of the four proteins are shown in figure 1. Inspection of the columns in figure 1 shows that, although the minimization procedure changes the packing of the secondary structures and can even distort them, the overall architecture of the non-native structures is still recognizable as that of the template structure.
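The set of sixteen starting structures described above (each sequence combined with each fold) can be enumerated with a short sketch. The dictionary `PROTEINS` and the function `build_design` are illustrative names introduced here, not part of the original protocol; the sketch only counts the sequence/fold pairings and does not perform any threading.

```python
from itertools import product

# PDB identifiers and CATH classes as quoted in the text.
PROTEINS = {
    "1BDD": "mainly-alpha",
    "1J08": "mainly-beta",
    "1IGD": "alpha/beta",
    "1AAP": "few secondary structures",
}

def build_design(proteins):
    """Return (sequence_id, fold_id, is_native) for every sequence/fold pairing."""
    return [(seq, fold, seq == fold) for seq, fold in product(proteins, repeat=2)]

design = build_design(PROTEINS)
native = [d for d in design if d[2]]          # sequence on its own fold
non_native = [d for d in design if not d[2]]  # sequence threaded on a foreign fold
print(len(design), len(native), len(non_native))
```

Each protein contributes one native and three non-native starting points, which gives the 4 native and 12 non-native trajectories discussed in the text.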
The thermodynamic stability of each initial structure was investigated by molecular dynamics (MD) simulations within the NPT ensemble, with T=300 K and p=1 atm, using GROMACS (Hess, Kutzner, van der Spoel, & Lindahl, 2008) and the Gromos96 43A1 force field (van Gunsteren, et al., 1996). To allow sufficient time for the non-native structures to evolve towards the native structures, all the simulations were run for 0.5 microseconds (µs). The time step for the integration was 2 femtoseconds (fs) and, for the analysis in figures 4 and 5 below, 250 snapshots were collected, one every 2 nanoseconds (ns). Figure 2 shows all the structures at the end of 0.5 µs.

The time to relax towards equilibrium is not the same for all physical variables. Since we are interested in the average structure at equilibrium, in order to assess the degree of convergence of each of the trajectories, the variation with time of the cumulative root mean square deviation per atom (RMSD) is displayed in figure 3 for all the trajectories. This figure shows that the degree of structural convergence is not the same in all cases, being very good for all the trajectories of the proteins 1IGD and 1AAP. The best convergence, however, is that of the native trajectory of the mainly-β protein 1J08, which is related to the fact that β-sheets are more rigid structures. An intermediate degree of convergence is observed for all the trajectories of protein 1BDD, including that which started from the native structure of this protein, as well as for the non-native trajectories of protein 1J08 which started from conformations with the native folds of proteins 1IGD and 1AAP. On the other hand, the trajectory of protein 1J08 which started from a conformation with the native fold of protein 1BDD has not converged after 500 nanoseconds.

A comparison between figures 1 and 2 shows that the native states (along the diagonal) have been well preserved at the end of their trajectories, as expected. However, this is also true for the non-native conformations of the α/β protein 1IGD (in yellow) and of the few-secondary-structures protein 1AAP (in green), which not only conserve the general initial fold that had been imposed on them but also conserve the secondary structures of each region. On the other hand, the non-native structures of the mainly-α 1BDD protein (cyan) and of the mainly-β 1J08 protein (red), although preserving the general fold, have deviated markedly from the initial condition.

The important point, however, is to determine whether the non-native structures show any tendency to evolve to the native state. A quantitative estimation of the conformational space spanned by each protein in its trajectory can be obtained from the RMSD of each conformation, i (i=1, ..., 250), with respect to every other conformation, j (j=1, ..., 250), in the same trajectory (RMSD(i,j)).
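The range of the snapshot indices i and j follows from the sampling settings quoted in this section (0.5 µs trajectories, a 2 fs time step, one snapshot every 2 ns). The following sketch is only a bookkeeping check of those numbers; `sampling_plan` is a hypothetical helper, not a GROMACS function.

```python
FS_PER_NS = 1_000_000  # femtoseconds in one nanosecond

def sampling_plan(total_ns, timestep_fs, stride_ns):
    """Return integration-step and snapshot counts for one trajectory."""
    total_fs = total_ns * FS_PER_NS
    stride_fs = stride_ns * FS_PER_NS
    # Require whole multiples so that the integer divisions below are exact.
    if total_fs % timestep_fs or stride_fs % timestep_fs or total_ns % stride_ns:
        raise ValueError("durations must be whole multiples of each other")
    return {
        "total_steps": total_fs // timestep_fs,
        "steps_per_snapshot": stride_fs // timestep_fs,
        "n_snapshots": total_ns // stride_ns,
    }

# 0.5 microseconds = 500 ns, 2 fs time step, one snapshot every 2 ns.
plan = sampling_plan(total_ns=500, timestep_fs=2, stride_ns=2)
print(plan)
```

With these settings each trajectory corresponds to 250 million integration steps, sampled once every million steps, which yields the 250 snapshots used in the analysis.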
The module ptraj of AMBER (Case, et al., 2005) was used in the calculation of the RMSD, including only the carbon, oxygen and nitrogen atoms of the backbone. Since the RMSD between two conformations of a protein with N amino acids can be expected to increase with N, and in spite of the small differences in the number of amino acids of the four proteins selected, all RMSD values were normalized to 100 amino acids to make them more comparable, using the formula statistically determined by Carugo & Pongor (2001):

$$\mathrm{RMSD}(i,j)_{100} = \frac{\mathrm{RMSD}(i,j)}{1 + \ln\sqrt{N/100}}, \qquad i, j = 1, \cdots, 250 \qquad (1)$$

It is the normalized $\mathrm{RMSD}(i,j)_{100}$ matrices that are portrayed in each of the 16 squares in figure 4. Each square is formed by 250 by 250 dots, one dot for each of the elements of the $\mathrm{RMSD}(i,j)_{100}$ matrices (one matrix for each of the 16 simulations). The value of each element is given by its colour, and the colour scale is on the right hand side of the figure. The bluer a spot is, the more similar the two structures being compared are, and the redder a spot is, the greater the differences between the two structures being compared. The first line in each plot is the deviation of each conformation in the trajectory with respect to the initial conformation; the second line is the deviation of each conformation with respect to the second conformation; and so forth. The matrix $\mathrm{RMSD}(i,j)_{100}$ defined above is of course symmetric, as an inspection of figure 4 can confirm (some slight asymmetries are due to the compression from the original size).

Inspection of the first lines in the 16 squares of figure 4 shows that large deviations from the initial structure occur more consistently for the non-native conformations of the mainly-β 1J08 protein, whose final conformations are up to 8.5 Å away from the initial one. However, the greatest deviation from the initial structure is found in the trajectory in which the initial structure was obtained by imposing the mainly-β fold of 1J08 onto the mainly-α 1BDD protein, whose final structures are 11 Å away from the starting one. Also, imposing the α/β fold of 1IGD onto the essentially disordered 1AAP protein led to a trajectory in which the final conformations deviated by a maximum of 6.5 Å from the initial structure. The remaining 6 non-native structures only deviated by a maximum of 5 Å from the initial structure, not far from the 3.5 Å degree of fluctuation that characterizes most of the native ensembles.
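The construction of the normalized RMSD matrices described above can be sketched as follows. This is a minimal NumPy illustration, not the ptraj procedure used in the study: it assumes that the backbone C, O and N coordinates of each snapshot have already been extracted into arrays of shape (n_atoms, 3), and it assumes a Kabsch-type optimal superposition before each RMSD is computed. The function names are hypothetical.

```python
import numpy as np

def kabsch_rmsd(a, b):
    """RMSD between two (n_atoms, 3) coordinate sets after optimal rigid superposition."""
    a = a - a.mean(axis=0)          # remove translation
    b = b - b.mean(axis=0)
    u, s, vt = np.linalg.svd(a.T @ b)
    # Sign correction so that the optimal transform is a proper rotation.
    d = np.sign(np.linalg.det(u @ vt))
    # Minimal summed squared deviation = |a|^2 + |b|^2 - 2 (s1 + s2 + d * s3).
    e0 = (a ** 2).sum() + (b ** 2).sum()
    msd = (e0 - 2.0 * (s[0] + s[1] + d * s[2])) / len(a)
    return float(np.sqrt(max(msd, 0.0)))

def rmsd_matrix(traj_a, traj_b):
    """Element (i, j) is the RMSD between snapshot i of traj_a and snapshot j of traj_b."""
    out = np.empty((len(traj_a), len(traj_b)))
    for i, xa in enumerate(traj_a):
        for j, xb in enumerate(traj_b):
            out[i, j] = kabsch_rmsd(xa, xb)
    return out

def normalize_rmsd100(rmsd, n_residues):
    """Equation (1): rescale an RMSD (scalar or array) to a 100-residue protein."""
    return rmsd / (1.0 + np.log(np.sqrt(n_residues / 100.0)))
```

Calling `rmsd_matrix(traj, traj)` on a single trajectory gives a symmetric matrix with a zero diagonal, while passing two different trajectories gives a matrix that is in general not symmetric. For proteins with fewer than 100 residues the denominator in equation (1) is smaller than one, so the normalized values are larger than the raw ones.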
The spots in the diagonal of each square of figure 4 are the RMSDs of each structure of that trajectory with respect to itself, which are of course zero; the spots in the first parallel to the diagonal are the RMSDs between "nearest neighbour" conformations along the trajectory (conformations that are separated by 2 ns in time); similarly, the second parallel to the diagonal represents the RMSDs between "second neighbours" in time, that is, conformations that are separated by 4 ns in time; and so on. Thus, the bluer a square in figure 4 is, the more conformationally limited the corresponding protein ensemble is, and the redder a plot is, the more conformationally expanded the corresponding protein ensemble is. Within the scale used, the native states generally cover the smallest conformational space, with the exception of 1BDD, which covers a greater conformational space than some of the non-native structures.

In a funnel free energy landscape, it might be expected that all non-native trajectories would deviate more from the initial structure than the corresponding native trajectories, but this is not what is observed in many cases. In fact, forcing the mainly-α 1BDD protein to assume the backbone fold of the α/β 1IGD (first row, third column, of figure 4) or of the essentially disordered 1AAP (first row, fourth column) leads to an ensemble that is less expanded than the native ensemble of 1BDD. Furthermore, imposing a mainly-α or mainly-β fold on the α/β 1IGD protein (third row, first and second columns, of figure 4) leads to structures that have a degree of similarity equivalent to that of the native state. On the other hand, the protein whose non-native structures deviate the most from the initial one is the mainly-β protein 1J08.

The squares in figure 4 allow for a visual identification of the conformational clusters in each trajectory, i.e., of the intervals of time in which the conformation of the protein did not change by more than 3.5 Å. Indeed, going along the diagonal, the dark blue regions correspond to periods in which the conformation essentially fluctuated around an average structure, as happens to the structures within the native ensemble. Thus, while the native structures span essentially one average conformation, a different picture is obtained from the square in the second line, fourth column: there we can identify six average conformations, corresponding to the six dark blue regions counted along the diagonal. The colours in the first line tell us that the first square is the set of conformations within 3.5 Å of the initial conformation, the second is the set of conformations within 5 Å of the initial one, the third and the fourth are within 6.5 Å of the initial one and the fifth and sixth are within 8.0 Å of the initial one.
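The visual identification of conformational clusters along the diagonal could, as one possible automation, be approximated by splitting the trajectory into consecutive time intervals in which every pair of snapshots stays within the 3.5 Å threshold. The greedy segmentation below is an assumption made for illustration; the clusters in the text were identified by eye from figure 4, and `contiguous_clusters` is a hypothetical helper.

```python
import numpy as np

def contiguous_clusters(rmsd, threshold=3.5):
    """Split snapshot indices into consecutive blocks whose pairwise RMSDs stay within threshold.

    rmsd is a square (n, n) matrix with n >= 1; returns a list of inclusive
    (first_index, last_index) pairs.
    """
    n = rmsd.shape[0]
    segments = []
    start = 0
    for k in range(1, n):
        # Close the current block as soon as snapshot k is too far from any member of it.
        if rmsd[start:k, k].max() > threshold:
            segments.append((start, k - 1))
            start = k
    segments.append((start, n - 1))
    return segments

# Toy example: one-dimensional "conformations" whose distance plays the role of the RMSD.
positions = np.array([0.0, 0.5, 1.0, 10.0, 10.5, 20.0])
toy_rmsd = np.abs(positions[:, None] - positions[None, :])
print(contiguous_clusters(toy_rmsd))
```

Because a new block is opened only when the threshold is exceeded, short excursions are counted as separate clusters; a minimum block length could be added if such excursions should be ignored.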
Similar analyses can be made of the other squares. For instance, the square in the fourth line, third column shows three clusters, the last of which is for conformations that are within 6.5 Å of the initial one; here, the first line shows that the conformations first evolved away from the initial one, reaching an RMSD value of 8.0 Å, but in the latter half of the trajectory returned to conformations structurally closer to the initial one.

Figure 4 shows that many of the non-native structures evolve away from the initial state and, in the case of the mainly-β 1J08 protein, by as much as 9.5 Å in RMSD. The question we are interested in is whether this evolution is towards the native state, something that cannot be clarified by figure 4. In order to monitor such a possible evolution, the off-diagonal squares in figure 5 display the RMSD of the structures in those trajectories with respect to the structures in the corresponding native trajectory, thus providing a quantitative estimation of the overlap of each non-native trajectory with the corresponding native trajectory. That is, the off-diagonal squares in figure 5 are now the matrices $\mathrm{RMSD}(i,j)^{\mathrm{cross}}_{100}$ (i, j = 1, ..., 250), where i stands for the ith snapshot of the non-native trajectory and j stands for the jth snapshot in the corresponding native trajectory. Note that the $\mathrm{RMSD}(i,j)^{\mathrm{cross}}_{100}$ matrices in the off-diagonal squares of figure 5 are not symmetric, because $\mathrm{RMSD}(j,i)^{\mathrm{cross}}_{100}$ is the RMSD between the jth snapshot in the non-native trajectory and the ith snapshot in the native trajectory. On the other hand, the diagonal squares in figure 5 are the same as in figure 4.

Inspection of figure 5 shows that, with one exception, the non-native trajectories cover conformations that are at least 9.5 Å away from the native ensemble. The exception is the trajectory of the non-native structure obtained by imposing the fold of the α/β 1IGD protein onto the backbone of the few-secondary-structures 1AAP protein. But figure 5 shows that this non-native structure already starts from a conformation that happens to be "only" 8.0 to 9.5 Å away from the native state, and figure 4 shows that it does not evolve much beyond that. Indeed, the non-native protein structure that evolves the most in the 0.5 µs interval of the simulations is that in which the fold of protein 1BDD has been imposed on protein 1J08, which moves as much as 9.5 Å away from the initial structure (second row, first column of figure 4). If this evolution were towards the native state of protein 1J08, the RMSD values in the second row, fourth column of figure 5 should be in the lighter blue or even dark blue, corresponding to RMSD values of 6.5 to 3.5 Å. Instead, figure 5 shows that this most changeable of the non-native structures simulated here remains 9.5 to 11 Å away from the native one throughout the whole 0.5 µs trajectory. That is, although this non-native structure evolves the furthest away from its initial structure, it is not evolving towards the native state. In fact, the general reddish tone of the off-diagonal squares in figure 5 means that none of the non-native structures tried here shows any tendency to evolve towards its native ensemble.

3 Discussion

We have made 0.5 µs long simulations, in the presence of explicit water molecules, of the native plus three non-native structures of four proteins, representative of the four CATH classes. Since the non-native structures were built by imposing a fold from a foreign class on each protein, and can therefore be considered some of the most artificial conformations that each of the four proteins can assume, the thermodynamic hypothesis implies that they should evolve towards the native state. The four native state trajectories only spanned structures within 3.5 Å of the initial one, largely preserving the average experimentally determined structure, as expected. On the other hand, although one of the 12 non-native trajectories evolved as much as 9.5 Å away from the initial structure, none of them showed any tendency to evolve to the native state. Thus, our answer to the question in the title is that our results challenge the thermodynamic hypothesis that the distribution of protein conformations has a unique well-defined maximum and favour instead a multi-peak distribution. The peaks corresponding to the non-native conformations are generally broader than those associated with the native one, i.e. the conformational space they explore is larger, as can be gauged from the generally more colourful off-diagonal plots in figure 4, and they do not overlap much with the native peak, as shown by the generally reddish tone of the off-diagonal plots in figure 5.

Using landscape theory terminology, a multi-peak distribution corresponds to saying that the free energy landscape of proteins is multi-funnel-shaped (Cruzeiro-Hansson, & Silva, 2001; Cruzeiro, 2008; Cruzeiro, & Lopes, 2009; Cruzeiro, 2010). Thus, we can also say that our results challenge the hypothesis that the free energy landscape of proteins is funnel-shaped and favour instead a multi-funnel-shaped landscape. In a multi-funnel landscape, there is not just the native funnel but also many other funnels that are associated with average conformations very different from the native one and yet as stable as the native state. The non-native states of 1IGD simulated here are examples of such non-native funnels for that protein.
On the other hand, within Ben-Naim's formalism (Ben-Naim, 2012) we can say that the non-native states of 1IGD are examples of non-native peaks in the equilibrium distribution of that protein. When the conformational distribution is multi-peaked, unconstrained free energy minimization can lead, with equal probability, to many different structures, and the thermodynamic hypothesis alone cannot explain why proteins fold to a well defined average structure. Indeed, in a multi-peak conformational distribution, the difficulty in determining the native structure from such a protocol is not just due to the size of the conformational space, the inaccuracies of the potential energy function or the lack of computer power, but, more essentially, to the fact that the native state is only one of many equivalent local population maxima, hence not all the proteins will fold to the native state. In a multi-peak conformational distribution, protein folding must include a non-equilibrium, kinetic step, in which the native peak is selected among all the other peaks. Once a polypeptide has been driven into the vicinity of the native peak, folding can proceed by free energy minimization. The suggestion that folding is a kinetic process was first made by Levinthal (Levinthal, 1968) and later by others (Creighton, 1993; Baker, & Agard, 1994; Cruzeiro-Hansson, & Silva, 2001), and a kinetic mechanism that can explain reproducible in vivo protein folding in a multi-peak conformational distribution has been described in detail (Cruzeiro, 2010).

4 Rebuttal to the referee comments

In this section we include the referee's comments on the second round of submission and our reply to those comments.

4.1 The referee's comments

The authors have probably not understood my criticism. So I will try to make my point clearer. The authors run MD simulations starting from misfolded compact structures (3 misfolded structures for each of the 4 proteins). In each of these simulations, some small part of the conformational space is explored, but no evolution towards the native structure is observed. If this result were taken seriously, it would mean that any (or at least many) compact structures are kinetic traps. This is difficult to believe, since we know that proteins fold into their native structure in a reasonable time, starting from almost any conformation. I would rather believe that this result is due to the fact that the simulation times are too short (or other parameters are given inadequate values). To check if I am right or wrong, the authors could perform MD simulations for each of the 4 proteins, starting from an unfolded (extended) conformation. If the final structure is close to the native structure after 0.5 microseconds, I could be wrong. In summary, I believe that the paper is not publishable without additional investigations.

4.2 Our reply to the referee's comments above

We did indeed not understand referee 2 (now referee 1). We thought the referee was questioning the degree of convergence of the trajectories in our study but instead, the referee simply does not like the results we have. The referee states that "we know that proteins fold into their native structure in a reasonable time, starting from almost any conformation". This is an opinion that, admittedly, is shared by many researchers in the field, but we beg to differ. In fact, we argue that there is experimental evidence to the contrary, as follows.

Unfolding experiments show that proteins denature irreversibly when they are heated beyond a given threshold. On the other hand, some proteins, when they are denatured by chemical means, do refold to the native state. The irreversibility of unfolding by heating is usually associated with aggregation, but it does demonstrate that the unfolded ensemble obtained by chemical denaturation is different from the unfolded ensemble obtained by heating and that proteins do not refold to their native states from any unfolded conformation. In fact, the proteins used in the refolding studies are a very special set of proteins, highly unrepresentative of the full proteome, and even so, reversible folding of those proteins is only obtained in specially selected conditions of denaturation (Braselmann, Chaney & Clark, 2013). This is further experimental evidence that, contrary to the opinion of the referee, proteins only fold to their native state from very particular unfolded conformations.

The referee suggests that we start from extended conformations and says that if the native state is reached, he/she may be wrong. We think that it is highly unlikely that the native state will be reached when starting from a fully extended conformation. If it were likely, then the prediction of the native state of proteins solely from their amino acid sequence would be a trivial result by now, which is very far from being the case (see the evaluation of the latest CASP exercise by Grishin in http://www.predictioncenter.org/casp10/doc/presentations/CASP10 Keynote NG.pdf). T. Lazaridis and M. Karplus have a paper in which they tried to refold a protein after heating, in a computer simulation, and what they find is that refolding from open conformations does not lead to the native state, but rather to other non-native compact structures (see note 22 of (Lazaridis & Karplus, 1997)). This is in agreement with the results reported in the manuscript and it is what we will most likely find when folding from extended structures (in fact, it is what we have already found in preliminary simulations). And, since we will not reach the native state, the referee will think he/she is right (in rejecting our manuscript?), either because we have a wrong force field (even if that same force field leads to the stability of all 4 native structures, as it should), or because the answers we have would be different if we ran the trajectories for longer times, no matter how reasonable the convergence of many of them already seems to be. This is a no-win situation for us and, as far as we are concerned, it is based on prejudice on the part of the referee rather than on an objective scientific analysis of our report.

Acknowledgements

The first author acknowledges partial support from the European Regional Development Fund (ERDF) through the COMPETE - Operational Competitiveness Programme and from national funds through FCT (Foundation for Science and Technology), Portugal, under the project PEst-C/MAR/LA0015/2011. The second author is thankful to the Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP), and to the Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq) for financial support.

References

Anfinsen, C.B. (1973). Principles that govern the folding of protein chains. Science 181, 223-233.
Ben-Naim, A. (2012). Levinthal's question revisited, and answered. Journal of Biomolecular Structure and Dynamics 30, 113-124.
Berendsen, H.J.C., Postma, J.P.M., van Gunsteren, W.F. & Hermans, J. (1981). Interaction models for water in relation to protein hydration. In Intermolecular Forces. Reidel, Dordrecht, the Netherlands, 331-342.
Braselmann, E., Chaney, J.L. & Clark, P.L. (2013). Folding the proteome. TIBS 38, 337-344.
Bryngelson, J.D., Onuchic, J.N., Socci, N.D. & Wolynes, P.G. (1995). Funnels, pathways, and the energy landscape of protein folding: A synthesis. Proteins 21, 167-195.
Carugo, O. & Pongor, S. (2001). A normalized root-mean-square distance for comparing protein three-dimensional structures. Protein Science 10, 1470-1473.
Case, D.A., Cheatham, III, T.E., Darden, T., Gohlke, H., Luo, R., Merz, Jr., K.M., Onufriev, A., Simmerling, C., Wang, B. & Woods, R. (2005). The Amber biomolecular simulation programs. Journal of Computational Chemistry 26, 1668-1688.
Creighton, T.E. (1993). Protein Structures and Molecular Properties. W.H. Freeman and Company, NY, 2nd Ed., p. 309.
Baker, D. & Agard, D.A. (1994). Kinetics versus Thermodynamics in Protein Folding. Biochemistry 33, 7505-7509.
Cruzeiro, L. (2008). Protein's multi-funnel energy landscape and misfolding diseases. Journal of Physical Organic Chemistry 21, 549-554.
Cruzeiro, L. & Lopes, P.A. (2009). Are the native states of proteins kinetic traps? Molecular Physics 107, 1485-1493.
Cruzeiro, L. (2010). Protein Folding. In M. Springborg (Ed.), Specialist Periodical Reports, Chemical Modelling, Applications and Theory, vol. 7, Royal Society of Chemistry, London, pp. 89-114.
Cruzeiro-Hansson, L. & Silva, P.A.S. (2001). Protein folding: thermodynamic versus kinetic control. Journal of Biological Physics 27, S6-S9.
Dill, K.A. & Chan, H.S. (1997). From Levinthal to pathways to funnels. Nature Struct. Biol. 4, 10-19.
Dyson, H.J. & Wright, P.E. (2002). Coupling of folding and binding for unstructured proteins. Current Opinion in Structural Biology 12, 54-60.
Humphrey, W., Dalke, A. & Schulten, K. (1996). VMD - Visual Molecular Dynamics. Journal of Molecular Graphics 14, 33-38.
Lazaridis, T. & Karplus, M. (1997). "New view" of protein folding reconciled with the old through multiple unfolding simulations. Science 278, 1928-1931.
Levinthal, C. (1968). Are there pathways for protein folding? Journal de Chimie Physique 65, 44-45.
Onuchic, J.N., Luthey-Schulten, Z. & Wolynes, P.G. (1997). Theory of protein folding: the energy landscape perspective. Annual Review of Physical Chemistry 48, 545-600.
Orengo, C.A., Michie, A.D., Jones, S., Jones, D.T., Swindells, M.B., & Thornton, J.M. (1997). CATH - A Hierarchic Classification of Protein Domain Structures. Structure 5, 1093-1108.
van der Spoel, D., Lindahl, E., Hess, B., van Buuren, A.R., Apol, E., Meulenhoff, P.J., Tieleman, D.P., Sijbers, A.L.T.M., Feenstra, K.A., van Drunen, R. & Berendsen, H.J.C. (2010). Gromacs User Manual version 4.5, www.gromacs.org. van Gunsteren, W. F., Billeter, S. R., Eising, A. A., Hnenberger, P. H., Krger, P., Mark, A. E., Scott, W. R. P., & Tironi, I. G. (1996). Biomolecular Simulation: The GROMOS 96 Manual and User Guide, Biomos, Groningen. 13

cr ip t us an M d ce pt e Ac

Downloaded by [University of Newcastle (Australia)] at 13:06 26 September 2014

Yang, L.Q., Ji, X.L., & Liu, S.Q. (2013). The free energy landscape of protein folding and dynamics: a global view. Journal of Biomolecular Structure and Dynamics 31, 982-992.


(Color online) Initial conformations of the proteins. All proteins in the same row (plotted in the same colour) have the amino acid sequence of the protein identified at the beginning of the row (i.e. they are different conformations of the same protein). The native structure of each protein is displayed along the diagonal. The first row displays the initial structures for the mainly-α 1BDD protein (cyan), the second row for the mainly-β 1J08 protein (red), the third row for the α/β 1IGD protein (yellow) and the fourth row for the essentially disordered 1AAP protein (green). Along each column, away from the diagonal, are the non-native structures obtained by imposing the backbone fold of the native structure in that column onto the sequence of the other proteins. The protein pictures were prepared with the software Visual Molecular Dynamics (VMD) (Humphrey, Dalke & Schulten, 1996).


(Color online) Conformations of the proteins at the end of 0.5 µs. The organization is as in figure 1. The protein pictures were prepared with VMD (Humphrey, Dalke & Schulten, 1996).


(Color online) Cumulative RMSD (in Å) as a function of time (the reference structure is that obtained after 0.48 nanoseconds of thermalisation). The trajectories are organized in the same order as in the previous figures. The labels on the ordinate of each plot identify the protein and the labels within each plot identify the initial structure. For instance, Str 1J08 in the top plot means that the initial structure of that trajectory was obtained by imposing the backbone fold of protein 1J08 on protein 1BDD. The curves for the native trajectories of the four proteins are in black.


(Color online) Each of the 16 squares represents the RMSD100(i,j) matrix (see text) of one of the 16 trajectories whose initial and final structures are displayed in figures 1 and 2 (the organization is the same as in those figures). The RMSD color scale is the same for all the trajectories and is shown on the right (in Å).
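For reference, the size-normalised RMSD of Carugo & Pongor (2001), which the notation RMSD100 presumably follows, rescales an RMSD measured on a protein of N residues to the value expected for a 100-residue protein. The expression below is that normalisation; that the matrix entries are this quantity evaluated between the structures at times i and j of one trajectory is an assumption here, since the precise definition is given in the text.

```latex
\mathrm{RMSD}_{100} = \frac{\mathrm{RMSD}}{1 + \ln\sqrt{N/100}}
```

For N = 100 the denominator equals 1 and the two measures coincide; for shorter chains the denominator is smaller than 1, so the normalised value is larger than the raw RMSD.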


(Color online) The off-diagonal squares are a representation of the RMSDcross100(i,j) matrix which provides a quantitative estimate of the overlap between the non-native and the native trajectories (see text). The organization and the squares in the diagonal are the same as in figure 3. The RMSD color scale is the same for all the trajectories and is shown on the right (in Å).
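A minimal sketch of how matrices of this kind could be computed, assuming each trajectory is stored as a NumPy array of shape (frames, residues, 3) holding one coordinate per residue (for example Cα positions), that every pair of frames is optimally superposed before the RMSD is taken, and that the normalisation is the one of Carugo & Pongor (2001). The function names `kabsch_rmsd`, `rmsd100` and `rmsd100_matrix` are hypothetical and are not taken from the authors' analysis; the superposition step in particular is an assumption.

```python
import numpy as np


def kabsch_rmsd(a, b):
    """RMSD between two (N, 3) coordinate sets after optimal superposition."""
    # Remove the translation by centring both structures on their centroids.
    a = a - a.mean(axis=0)
    b = b - b.mean(axis=0)
    # The singular values of the 3x3 covariance matrix give the best overlap.
    u, s, vt = np.linalg.svd(a.T @ b)
    # If the best orthogonal fit is a reflection, flip the smallest singular
    # value so that only proper rotations are allowed.
    if np.linalg.det(u @ vt) < 0:
        s[-1] = -s[-1]
    msd = (np.sum(a * a) + np.sum(b * b) - 2.0 * s.sum()) / len(a)
    # Guard against tiny negative values caused by rounding.
    return np.sqrt(max(msd, 0.0))


def rmsd100(rmsd, n_residues):
    """Size-normalised RMSD (Carugo & Pongor, 2001)."""
    return rmsd / (1.0 + np.log(np.sqrt(n_residues / 100.0)))


def rmsd100_matrix(traj_a, traj_b):
    """Entry (i, j) compares frame i of traj_a with frame j of traj_b."""
    n_residues = traj_a.shape[1]
    out = np.empty((len(traj_a), len(traj_b)))
    for i, frame_a in enumerate(traj_a):
        for j, frame_b in enumerate(traj_b):
            out[i, j] = rmsd100(kabsch_rmsd(frame_a, frame_b), n_residues)
    return out
```

Under these assumptions, calling `rmsd100_matrix(traj, traj)` on a single trajectory gives a matrix of the kind shown in the diagonal squares, while `rmsd100_matrix(non_native_traj, native_traj)` for two trajectories of the same protein gives a cross matrix of the kind shown in the off-diagonal squares.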
