PHYSICAL REVIEW E 90, 022719 (2014)

Prediction of allosteric sites on protein surface with an elastic-network-model-based thermodynamic method Ji Guo Su,1,* Li Sheng Qi,2 Chun Hua Li,3 Yan Ying Zhu,1 Hui Jing Du,1 Yan Xue Hou,1 Rui Hao,1 and Ji Hua Wang2,† 1

2

College of Science, Yanshan University, Qinhuangdao 066004, China Shandong Provincial Key Laboratory of Functional Macromolecular Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China 3 College of Life Science and Bioengineering, Beijing University of Technology, Beijing 100022, China (Received 7 January 2014; revised manuscript received 9 July 2014; published 27 August 2014) Allostery is a rapid and efficient way in many biological processes to regulate protein functions, where binding of an effector at the allosteric site alters the activity and function at a distant active site. Allosteric regulation of protein biological functions provides a promising strategy for novel drug design. However, how to effectively identify the allosteric sites remains one of the major challenges for allosteric drug design. In the present work, a thermodynamic method based on the elastic network model was proposed to predict the allosteric sites on the protein surface. In our method, the thermodynamic coupling between the allosteric and active sites was considered, and then the allosteric sites were identified as those where the binding of an effector molecule induces a large change in the binding free energy of the protein with its ligand. Using the proposed method, two proteins, i.e., the 70 kD heat shock protein (Hsp70) and GluA2 alpha-amino-3-hydroxy-5-methyl-4-isoxazole propionic acid (AMPA) receptor, were studied and the allosteric sites on the protein surface were successfully identified. The predicted results are consistent with the available experimental data, which indicates that our method is a simple yet effective approach for the identification of allosteric sites on proteins. DOI: 10.1103/PhysRevE.90.022719

PACS number(s): 87.15.hp, 87.16.Vy, 87.14.E−, 05.20.Gg

I. INTRODUCTION

Protein structure-based drug design has been more and more widely used in the discovery of new drugs since 1980s and now it has become an integral part of the industrial drug development process [1]. The conventional strategy for structure-based drug development is to seek or design molecules that bind to the active site with high affinity to stop the activity of proteins [2]. These kinds of drugs are called “orthosteric” drugs. However, active sites are usually evolutionarily conserved and thus homologous proteins often have similar active sites. Drugs that are designed for binding to one protein may also unexpectedly bind to another homologous protein. Therefore, orthosteric drugs usually have many side effects. By contrast, allosteric regulation of protein functions is considered to be a more promising strategy for structure-based drug design, in which drugs bind to the allosteric site on a protein surface distinct from the active site. Allosteric drugs show several potential advantages over the conventional orthosteric drugs. For example, they have fewer side effects and are more specific [2–5]. Allosteric drug design has attracted increasing interest from the theoretical and experimental researchers; however, there are still many challenges in this field. One of the major challenges is how to effectively identify the allosteric sites on protein surfaces. It is increasingly clear that protein structure and its intrinsic dynamics play an important role in allosteric regulation [6–9]. Based on the analysis of protein structural topology and structure-encoded dynamics, various computational methods have been proposed to investigate the mechanism of protein allostery and identify the key sites involved in allosteric

* †

[email protected] [email protected]

1539-3755/2014/90(2)/022719(10)

regulation. In the topology-based methods, protein tertiary structure is simplified as an amino acid network, in which residues are represented by nodes and inter-residue interactions are represented by edges, and the functionally important sites are identified through analyzing the topological properties of the network. Del Sol et al. found that the residues responsible for maintaining the shortest path length of the network correspond to the functional sites [10]. Chennubhotla and Bahar analyzed the Markov propagation of information in the amino acid network to predict the key sites mediating the transmission of allosteric signals across the protein structure [11]. In the dynamics-based methods, atomic molecular dynamics (MD) simulation technique and its modified methods were largely used to study protein allostery and identify the involved key sites. Ivetac and McCammon applied MD simulation to sample protein multiple conformations and then the conformational flexibility was taken into account to predict druggable binding sites on the protein surface [12]. Grant et al. combined MD simulation with bioinformatics information and the ensemble docking technique to identify the allosteric sites on Ras for lead generation [13]. McClendon et al. analyzed the crosscorrelation motions in proteins by using MD simulation and a mutual information metric to detect the potential allosteric sites [14]. Sharp and Skinner developed a pump-probe MD to explore the propagation pathway of allosteric signal across protein structure and identify the involved key sites [15]. Although the atomic MD simulation and the modified methods show a great success in predicting protein allosteric sites, these approaches are very time consuming, especially for large protein systems. A simple yet effective computational method for exploring protein intrinsic dynamics is normal mode analysis (NMA) of the coarse-grained elastic network model (ENM) [16–18]. Based on ENM, several theoretical approaches have been proposed to identify functionally key sites in proteins.

022719-1

©2014 American Physical Society

SU, QI, LI, ZHU, DU, HOU, HAO, AND WANG

PHYSICAL REVIEW E 90, 022719 (2014)

Bahar and co-workers investigated the collective dynamics of proteins by using ENM, and they found that the residues in the catalytic sites and metal-binding sites exhibit minimal fluctuations [19–21]. Haliloglu et al. introduced a thermodynamic quantity calculated by using ENM, which accounts for the mean-square fluctuations in spring length, to determine the functionally important residues displaying large energy exchange with other residues in proteins [22,23]. Similarly, based on ENM, Erman proposed a method to identify key residues involved in protein allosteric communication pathway through calculating the correlations between the fluctuations in spring length of pairwise residues [24]. Atilgan et al. developed a perturbation-response scanning method based on ENM to detect the key residues whose perturbation can reproduce the functional motions of proteins [25,26]. Zheng et al. proposed an ENM-based perturbation analysis method to locate the dynamically important residues, whose perturbation largely changes the functional motions of proteins or alters the structural fluctuations of the active sites [27,28]. We have proposed an ENM-based thermodynamic method to identify the functionally key residues whose perturbation largely changes the ligand-binding free energy of proteins or the free energy difference between the protein states before and after conformational transition [29,30]. Based on ENM, Ming and Wall developed a dynamics perturbation analysis (DPA) method to locate the functional sites of proteins. They perturbed different regions of protein surface and the functional sites are identified as the regions where ligand binding induces significant changes of protein conformational distributions [31,32]. Motivated by the works of Ming and Wall [31,32], in the present study, a thermodynamic method was proposed to perturb different regions on protein surface and then identify the allosteric sites. In the method of Ming and Wall, the allosteric communication between the allosteric and active sites was not considered, and thus the predicted functional sites are not necessarily the allosteric sites. In our proposed method, we focus on the thermodynamic coupling between the allosteric and active sites, and the allosteric sites on protein surface were identified as those where the binding of an allosteric effector induces a large change in the binding free energy of the protein with its ligand. In our method, the allosteric effector, which is represented by a ball in the adopted coarse-grained model, randomly binds to the surface of the studied protein to generate a large number of configurations. For each configuration, the ENM was used to calculate the protein-ligand-binding free energy for the two cases where the allosteric effector is bound and not bound to the protein. The allosteric sites are identified from the configuration with a large change in the protein-ligand-binding free energy. In this work, the proposed method was applied to the 70 kD heat shock protein (Hsp70) and GluA2 alpha-amino-3-hydroxy5-methyl-4-isoxazole propionic acid (AMPA) receptor as case studies. Hsp70 is a molecular chaperone that plays essential roles in assisting protein folding, and preventing aggregation and misfolding, whose activity is powered by ATP hydrolysis [33–35]. The binding of ATP induces the closure of the nucleotide-binding domain (NBD) of Hsp70, and then triggers a critical conformational change in the distant substrate-binding domain (SBD) that leads to the release of the

substrate. Experimental evidences have indicated that there exist allosteric couplings between different regions in this protein [34,35]. In this work, our proposed thermodynamic method was used to identify the allosteric sites on Hsp70 that are coupled with the binding of ATP. The AMPA receptor is an ionotropic glutamate receptor that mediates the fast excitatory synaptic transmissions in the central nervous system, which is related to many neurodegenerative disorders, such as Parkinson’s and Alzheimer’s disease [36–40]. Therefore, it is a potential target for drug design. It has been proved that the activation of GluA2 AMPA receptor is triggered by the binding of glutamate to the ligand-binding domain of the protein [38,40,41]. In the present study, the allosteric sites on protein surface that are thermodynamically coupled with the glutamate binding were identified by using our proposed method and then compared with the experimental observations. II. THEORY AND METHODS A. The thermodynamic method to identify the allosteric sites

In our method, the allosteric sites are identified as those to which the binding of the allosteric effector results in a large change in the protein-ligand-binding free energy. Without the binding of the effector, the protein-ligand-binding free energy is denoted as Gapo , and for the protein with the binding of the effector, the protein-ligand-binding free energy is represented as Gholo . We desire to calculate the change between Gapo and Gholo , which is written as G = Gholo − Gapo

(1)

In the present work, the free energy of different protein states is calculated based on ENM. The G with relative large value indicates that the binding of the ligand is largely coupled with the binding of the effector, and thus the corresponding sites are considered to be the allosteric sites. B. The Gaussian network model

The Gaussian network model (GNM) [16,42], one kind of ENM, describes the tertiary structure of a protein as an elastic network in which each residue is represented by its Cα atom ˚ is and the residue pairs within a certain cutoff distance (7.0 A adopted in this work) are connected by harmonic springs with a uniform force constant. The internal potential energy of the protein system can be expressed as H = 12 γ (R T R),

(2)

where γ is the force constant of the harmonic springs; {R} is the N -dimensional column vector of fluctuations of the Cα atoms, where N is the number of residues in the protein; the superscript T represents the transpose of the column vector; and  is the N × N Kirchhoff matrix whose elements can be written as ⎧ if i = j and Rij  Rc ⎪ ⎨−1 if i = j and Rij > Rc , (3) = 0 ⎪ ⎩−   if i = j i,i=j ij where Rij is the distance between the ith and j th Cα atoms and Rc is the cutoff distance.

022719-2

PREDICTION OF ALLOSTERIC SITES ON PROTEIN . . .

PHYSICAL REVIEW E 90, 022719 (2014)

Based on the GNM, we now calculate the value of Eq. (1). According to thermodynamic theory, free energy is composed of potential energy and entropy. Then, Eq. (1) can be written as G = Gholo − Gapo

=

N

 3(N − 1)kB 3kB − 1 − ln e − ln (2π kB T /γ )λ−1 i 2 2 i=2

=

N  3kB ln (2π kB T e/γ )λ−1 . i 2 i=2

(8)

= (Uholo − Uapo ) − T (Sholo − Sapo ), (4) where U and S are represented as the potential energy and entropy changes of the systems, respectively. In our method, we assume that the binding of the allosteric effector does not change the protein-ligand-binding mode. The same contacts between the ligand and the protein residues are formed for the two cases where the allosteric effector is bound and not bound to the protein. Therefore, the changes of the potential energy are the same for the two systems, i.e., Uholo = Uapo . Thus, G = −T (Sholo − Sapo ) = −T S

It should be pointed out that the total numbers of Cα atoms in Hsp70 and GluA2 AMPA receptor are 600 and 1558, and thus the total numbers of normal modes with nonzero eigenvalues for these two proteins are 599 and 1557, respectively. In our paper, all of these 599 and 1557 modes are considered to calculate the entropy for the corresponding systems. For the complex structures including protein-ligand, protein-effector, and protein-ligand-effector complex structures, the internal potential energy of the system can be written as

= −T [(S++ − S+− ) − (S−+ − S−− )] = −T (S++ + S−− − S+− − S−+ ),

(5)

where the first and second subscripts “+ (–)” refer to the protein states with (without) the binding of effector and ligand, respectively. The entropy of different protein states, i.e., S++ , S−− , S+− , and S−+ , can be calculated based on GNM. According to the theory of GNM, the vibrational entropy of the protein is given by [43] H −F 3(N − 1)kB = − kB ln ZGNM T 2 3kB 3(N − 1)kB − ln[(2π kB T /γ )N−1 det( −1 )], (6) = 2 2

S=

where kB is the Boltzmann constant; T is absolute temperature; ZGNM is the vibrational partition function expressed as ZGNM = ∫ exp(−H /kB T )d{R}; det denotes the determinant of the matrix. The inverse of the Kirchhoff matrix can be calculated by decomposition of  as  −1 = U −1 U T ,

(7)

where U is an orthogonal matrix whose columns are the eigenvectors of , and  is a diagonal matrix whose diagonal elements are the eigenvalues λi (1  i  N) of . It should be noted that one of the eigenvalues is zero and the corresponding eigenvector represents the translational motion of the protein. Because we are only interested in the internal motions of the protein, the zero eigenvalue and the corresponding eigenvector are removed. According to the decomposition of  −1 in Eq. (7), the  −1 determinant of  −1 can be computed as det( −1 ) = N i=2 λi . Thus, Eq. (6) can be written as [43] 

N 3kB 3(N − 1)kB −1 N−1 − ln (2π kB T /γ ) λi S= 2 2 i=2 =

N 3kB 3(N − 1)kB − ln[(2π kB T /γ )λ−1 i ] 2 2 i=2

Hcom =

1  γ RpT 2

RnT

  pp np

pn nn

  Rp , Rn

(9)

where Rp represents the degree of freedom of the protein, and Rn denotes the degree of freedom for the nonprotein parts of the complex, i.e., the ligand, effector, or both of them; the superscript T represents the transpose of the column vector. The Kirchhoff matrix of the complex is divided into four submatrices, i.e., pp , pn , np , and nn . The effective internal potential energy can be expressed as [31,44] eff = 12 γ RpT  eff Rp Hcom   −1 = 12 γ RpT pp − pn nn ne Rp ,

(10)

−1 where nn is the inverse of submatrix nn . Through the decomposition of  eff , we can get its eigenvalues and eigenvectors, and then can calculate the entropy of the system based on Eq. (8). According to the above method, the entropy of different protein states, i.e., S++ , S−− , S+− , and S−+ , can be computed by using Eqs. (10) and (8), which are given as

S++ =

N 3kB {ln[(2π kB T e/γ )(λ++ )−1 i ]}, 2 i=2

S+− =

N 3kB {ln[(2π kB T e/γ )(λ+− )−1 i ]}, 2 i=2

S−+ =

N 3kB {ln[(2π kB T e/γ )(λ−+ )−1 i ]}, 2 i=2

S−− =

N 3kB {ln[(2π kB T e/γ )(λ−− )−1 i ]}, 2 i=2

(11)

where the first and second subscripts “+ (–)” refer to the protein states with (without) the binding of effector and ligand, 022719-3

SU, QI, LI, ZHU, DU, HOU, HAO, AND WANG

PHYSICAL REVIEW E 90, 022719 (2014)

respectively. Then Eq. (5) can be calculated as  

 N −1 (λ++ )−1 3kB T i (λ−− )i G = − ln −1 2 i=2 (λ+− )−1 i (λ−+ )i  N   (λ+− )i (λ−+ )i 3kB T ln . =− 2 i=2 (λ++ )i (λ−− )i

(12)

The allosteric sites can be identified with the relative large G value. C. Generation of the complex configurations with the effector randomly binding to the protein surface

Similar to the strategy adopted by Ming and Wall [31,32], surface points of the protein were generated by using the ˚ program MSMS [45]. The probe sphere radius is set to 1.4 A ˚ 2 . The generated surface and the sphere density is 1 point per A points were used to construct the complex configurations, in which the effector locates at each of these points. It should be noted that the effectors that directly interact with the ligand (the distance between them is less than the cutoff value) certainly have significant effects on the binding of ligand, whereas in this work we only focus on the allosteric regulation of the effector. Therefore, these surface points were not selected in this work. For each configuration, the impact of the effector on the protein-ligand-binding free energy was evaluated by using the above introduced method. The allosteric sites were identified from the configurations where the effector binding results in a large change in the protein-ligand-binding free energy. III. RESULTS AND DISCUSSION A. The calculation results of Hsp70

Hsp70 is an ATP-regulated molecular chaperone crucial for cellular protein folding, which contains two domains: the NBD that is responsible for ATP binding and hydrolysis, and the SBD that binds the substrate polypeptides. The NBD consists of four subdomains (IA, IB, IIA, and IIB) wrapped around the ATP-binding pocket, where IA and IB form lobe I, and IIA and IIB form lobe II. The SBD is composed of two subdomains, SBDα and SBDβ, as shown in Fig. 1. The binding of ATP is expected to induce the closing movements of lobes I and II in NBD, and then the conformational changes are transmitted to SBD through the domain interface, which results in the substrate release [35]. Many experimental studies have indicated that the allosteric couplings between disparate regions in the protein are essential for the chaperone activity of Hsp70 [34,35]. In this work, the allosteric sites on Hsp70 surface that are coupled with ATP binding were identified by using our proposed method. The crystal structure of Hsp70 from Escherichia coli with an ATP molecule bound to the nucleotide-binding site (PDB code: 4JNE) was used as the protein-ligand structure [35]. Then, based on this structure, surface points of the protein were generated by using the MSMS program [45]. A total of 195 surface points were selected, as shown in Fig. 1(b). It should be noted that we only focus on the allosteric effect of the surface points; therefore, the surface points that directly

FIG. 1. (Color online) (a) The crystal structure of Hsp70 from Escherichia coli (PDB code: 4JNE), which is composed of two domains, i.e., NBD and SBD. The NBD consists of four subdomains: IA displayed as red (dark gray) solid ribbon, IB shown as yellow (light gray) solid ribbon, IIA displayed as yellow (light gray) flat ribbon, and IIB shown as red (dark gray) flat ribbon, where IA and IB form lobe I, and IIA and IIB form lobe II. The SBD contains two subdomains: SBDα displayed as purple (light gray) line ribbon and SBDβ shown as red (dark gray) tube. (b) The generated 195 surface points by using the MSMS program, which are shown as black balls in the figure. The ATP molecular binds to the nucleotide-binding site in NBD, which is shown in the black stick model.

interact with the ligand (the distance between them is less than the cutoff value) were not selected in this study. Then an effector was placed at each of the surface points to generate the protein-ligand-effector ternary complex structure. In this study, a coarse-grained model was adopted, in which the effector was simplified as a point whose position is represented by the coordinates of the surface point. Based on these selected surface points, a total of 195 protein-ligand-effector ternary complex structures were generated. For each generated protein-ligand-effector ternary complex structure, the ligand, effector, and both of them were respectively removed to generate the protein-effector, protein-ligand, and protein structures. Based on these four structures, the corresponding elastic network models were constructed. The cutoff Rc for the interactions between protein Cα atoms was ˚ and for the interactions between a test effector and set to 7.0 A, ˚ Then, according the protein, the cutoff Rc was set to 10.0 A. to Eq. (12) described in the Theory and Methods section, the value of G was calculated for each test surface point. The surface points with relatively large G value were identified as the allosteric sites, which are thermodynamically coupled with the binding of ATP to Hsp70. The calculation results are shown in Fig. 2. In this work, the surface points with the absolute value of G greater than 0.002 were considered to be the allosteric sites. According to this standard, there are 20 surface points that were identified as the allosteric sites. The exact G values for these 20 surface points are displayed in Table I. In order to display these allosteric sites more intuitively, they are mapped onto the protein structure, as shown in Fig. 3. Based on their locations on the protein surface, these allosteric sites can be grouped into five regions, which are marked by ellipses and denoted by the numbers 1–5 in Fig. 3.

022719-4

PREDICTION OF ALLOSTERIC SITES ON PROTEIN . . .

PHYSICAL REVIEW E 90, 022719 (2014)

FIG. 2. G values for all the perturbed surface points. The points with the absolute value of G larger than 0.002 were identified as the allosteric sites. The dashed line in the figure corresponds to the absolute value of G to be 0.002. In this figure, the unit of G is 32 kB T .

(i) Group 1 is located at the interface between the IB and IIB subdomains of NBD, which forms the mouth of the ATP-binding cleft. The interactions between the sites from the opposite sides of the mouth control the opening and closing of the binding pocket. Allosteric effectors binding to these sites can modulate the interactions between them and affect the open-closed conformational motion of the binding cleft, which then influences the ATP binding. Therefore, the allosteric sites in group 1 play important roles in the binding of ATP. Woo et al. [46] and Brehmer et al. [47] have showed that the mutations of the residues at the interface between IB and IIB destabilize the closed state of NBD and thus facilitate the dissociation of the nucleotides. Chang et al. [48] successfully identified flavonoid inhibitors by using a chemical screening method, which inhibits the activity of Hsp70 by up to 75%. They found that these inhibitors bind to the site at the interface between IB and IIB subdomains. (ii) Groups 2 and 3 are situated on the hinge region between the IA and IB subdomains, and the hinge region between the IIA and IIB subdomains of NBD, respectively. These hinge regions mediate the conformational changes between TABLE I. The exact G values of the predicted allosteric sites for Hsp70. Number of surface points

G values for these points (×10−2 )a

Group 1

10

Group 2 Group 3 Group 4 Group 5

1 3 2 4

−4.39, −1.12, −0.43, −0.33, −0.32, −0.27, −0.24, −0.23, −0.23, −0.21 −0.49 −3.77, −0.65, −0.50 −0.38, −0.26 −0.57, −0.24, −0.22, −0.21

Group number

a

The unit of G is 32 kB T .

FIG. 3. The locations of the allosteric sites identified by our method. The predicted allosteric sites are grouped into five groups, which are marked by black ellipses and denoted by the numbers 1–5 in the figure.

the connected subdomains, and play critical roles in the relative movements of these subdomains. Allosteric effectors binding to these hinge regions may hinder the functionally significant interdomain motions, and then affect the binding and hydrolysis of ATP. Ung et al. [49] have utilized molecular dynamics simulation combined with mutagenesis and enzymatic assays to explore the chaperone function of Hsp70 and revealed that the hinge residues between subdomains IIA and IIB are significant for the chaperone activity and allosteric communications in the protein. (iii) Group 4 is located at the global hinge region between lobes I and II of NBD. Upon ATP binding, NBD undergoes the conformational transition from open to closed states, in which lobes I and II move around this hinge region. Therefore, this region controls the collective functional motions of NBD and thus plays an important role in ATP binding. Vogel et al. [50] as well as Johnson and McKay [51] have found that the mutations of residues 147 and 175 at this region decrease the ATPase rate and impede the conformational transition of NBD. Wisen et al. [52] successfully discovered a class of inhibitors that bind to Hsp70 at the hinge region between lobes I and II, which can regulate the chaperone activity of the protein. (iv) Group 5 is located at the interface between NBD and SBDβ. Biochemical and cellular studies have indicated that the conformational changes in NBD triggered by ATP binding are transmitted to SBD through the NBD-SBD interface. This interface is crucial for the allosteric signaling transmission within Hsp70. Sequence conservation analysis by ConSurf has showed that the residues on NBD-SBD interface are highly conserved across different species [35]. Many mutational studies have confirmed the functional importance of the residues on the interface between NDB and SBD, including residues 148, 151, 152, 155, 394, 397, 418, 442, 507, and 515. Biochemical experiments have revealed that these residue mutations disrupt the ATP-induced allosteric coupling [35,50,53,54]. Zhuravleva and Gierasch [55] performed NMR analysis on the NBD of Hsp70 and found that the NBD-SBD linker (residues 389–392) plays critical roles for the collective motion of NBD and the allosteric communication between NBD and SBD.

022719-5

SU, QI, LI, ZHU, DU, HOU, HAO, AND WANG

PHYSICAL REVIEW E 90, 022719 (2014)

FIG. 4. (Color online) (a) The x-ray crystal structure of the fulllength rat GluA2 AMPA receptor in the resting state (PDB code: 3KG2), which is composed of four subunits, i.e., subunits A shown as red (dark gray) solid ribbon, B shown as yellow (light gray) solid ribbon, C displayed as yellow (light gray) tube, and D displayed as red (dark gray) tube. These four subunits are assembled as a dimer of dimers. Each subunit consists of four domains, which are NTD, LBD, TMD, and CTD. CTD is not resolved in the crystal structure. (b) The constructed dimeric complex structure of AMPA receptor where the ligand glutamate binds to subunit A. The glutamate is shown as a stick in black. Based on the constructed structure of AMPA receptor, surface points were generated by using the MSMS program and a total of 500 surface points were randomly selected, which are shown as black balls in the figure.

B. The calculation results of GluA2 AMPA receptor

GluA2 AMPA receptor is a tetrameric protein assembled as a dimer of dimers. Each subunit is composed of four domains, i.e., the N-terminal domain (NTD), the ligand-binding domain (LBD), the transmembrane domain (TMD), and the C-terminal domain (CTD) [36,38,40,41], as shown in Fig. 4(a). The NTD is responsible for receptor assembly, trafficking, and modulation. The LBD consists of two lobes, D1 and D2, which form a cleft for the binding of ligand glutamate. Glutamate binding induces the closure of the cleft and then the conformational motion of LBD is propagated to TMD, which triggers the opening of the ion channel and influx of cations. In this way, the GluA2 AMPA receptor is activated. The TMD consists of three transmembrane helices, M1, M3, and M4, along with a short reentrant loop M2, which form the ion channel pore. The CTD mainly contributes to the anchoring and trafficking of the AMPA receptor. The x-ray crystal structure of the full-length rat GluA2 AMPA receptor (the CTD is not resolved) in the resting state, in which a competitive antagonist is bound to LBD, has been determined (Fig. 4) (PDB code: 3KG2) [36]. However, for the glutamate-activated state, only the crystal structure of the isolated LBD dimer in complex with glutamate has been reported (PDB code: 1FTJ) [56], whereas the NTD and TMD were not resolved. Firstly, based on the coordinates of 3KG2 and 1FTJ, the dimer structure of the full-length AMPA receptor (except for the CTD) in complex with glutamate was constructed using the following strategy [30]: (1) The D1 domain of the LBD of 3KG2 was superposed onto the

FIG. 5. G values for all the 500 perturbed surface points. The surface points with the absolute value of G larger than 0.003 were identified as the allosteric sites. The dashed line in the figure corresponds to the absolute value of G to be 0.003. In this figure, the unit of G is 32 kB T .

LBD D1 domain of 1FTJ, and then the NTD was grafted from the former to the latter; (2) the TMD was grafted from 3KG2 to 1FTJ after the superposition of the LDB D2 domain of these two structures. It should be noted that 3KG2 contains four subunits, and only subunits A and D were used in constructing the full-length dimeric complex structure of the AMPA receptor with the ligand glutamate. Besides that, the ligand glutamate binds to both subunits A and D, and in this work we only focus on the binding of glutamate to subunit A. The constructed AMPA structure in complex with glutamate is shown in Fig. 4(b). Based on the constructed structure of the AMPA receptor in complex with glutamate, surface points of the protein were generated by using the MSMS program. A total of 500 points were selected to construct the receptor-ligand-effector ternary complex structures, as shown in Fig. 4(b). For each test surface point, the elastic network models of the receptor, receptor-ligand, receptor-effector, and receptorligand-effector were constructed, respectively. The values of cutoff Rc are the same as those adopted in Hsp70 protein. Then the G value for each test point was calculated based on Eq. (12) and the points with the absolute value of G greater than 0.003 were identified as the allosteric sites, as shown in Fig. 5. According to this standard, 24 surface points were identified as the allosteric sites. The exact G values for these 24 surface points are displayed in Table II. These predicted allosteric sites can be classified into five groups according to their location on the protein structure, as shown in Fig. 6. (i) Group 1 is located at the mouth of the glutamatebinding pocket in LBD. The allosteric sites in group 1 are located on the opposite sides of the ligand-binding cleft and they can form cross-cleft interactions. These interdomain interactions stabilize the closure conformation of the ligandbinding pocket and play an important role in the binding affinity of the AMPA receptor with its ligand glutamate. It has been demonstrated that agonist efficacy is correlated with

022719-6

PREDICTION OF ALLOSTERIC SITES ON PROTEIN . . .

PHYSICAL REVIEW E 90, 022719 (2014)

TABLE II. The exact G values of the predicted allosteric sites for GluA2 AMPA receptor. Group number

Number of surface points

G values for these points (×10−2 )a

Group 1

8

Group 2

9

Group 3 Group 4 Group 5

3 2 2

−4.00, −1.98, −1.82, −1.65, −1.16, −1.15, −0.81, −0.32 −3.97, −3.07, −2.24, −2.20, −1.91, −1.76, −0.79, −0.64, −0.58 −0.53, −0.53, −0.34 −0.47, −0.34 −0.63, −0.41

a

The unit of G is 32 kB T .

the degree of ligand-binding domain closure [56–60]. Several experiments have indicated that the mutations of residues in this region significantly reduced the binding affinity of agonist to the receptor. For example, the experiments of Robert et al. showed that the mutations of residues Glu402 and Thr686 in this region disrupt the cross-cleft interactions and decrease the binding affinity and efficacy of agonist [61]. (ii) Group 2 is located at the twofold axis of LBD dimer adjacent to the hinge region of ligand-binding clamshell. The two lobes of LBD are connected by a hinge region composed of two polypeptide strands, which form the glutamate-binding pocket. After glutamate binding, the two lobes undergo large-scale conformational motions around this hinge region. Therefore, the hinge region mediates the interdomain conformational rearrangement of LBD and controls the open and closure states of the ligand-binding cleft. Considering that the conformational motion of LBD is necessary for glutamate binding and dissociation, the hinge region plays a significant role in the activation and deactivation of the AMPA receptor [60]. For example, aniracetam, CX614, CX516, and Me-CX516 are small molecules that act as positive allosteric modulators of the AMPA receptor. Jin et al. [62] and Krintel et al. [63] found that these molecules bind to the AMPA receptor at the twofold axis of LBD dimer interface near the

FIG. 6. The locations of the allosteric sites identified by our method. The region of the structure marked by the rectangle is enlarged on the right. The predicted allosteric sites are grouped into five regions, which are marked by black ellipses and denoted by the numbers 1–5 in the figure.

hinge region of the glutamate-binding cleft, which can slow the deactivation of the receptor by stabilizing the ligand-binding cleft in its closed conformation. Our predicted results are consistent with the experimental observations. (iii) Group 3, as well as group 2, is located at the dimer interface of LBD. Structural and functional studies of the AMPA receptor have indicated that glutamate binding triggers the closure of the ligand-binding cleft in LBD, which then exerts tensions on the LDB dimer interface. This tensions can be relieved either by the opening of the ion channel or by the rearrangement of the dimer interface that induces the desensitization of the AMPA receptor [38,40]. Experimental studies have shown that the activation, deactivation, and desensitization of the AMPA receptor are largely determined by the stability of the dimer interface, and it is found that strengthening the dimer interface can largely reduce desensitization and deactivation of the receptor [38,40,41,64–67]. Crystallographic studies have shown that helix D of LBD in one subunit and helix J of LBD in the adjacent subunit form significant intermolecular contacts to stabilize the dimer interface, which play significant roles in the activation, deactivation, and desensitization of the receptor [65]. This is consistent with our predicted results. Most of our predicted allosteric sites in groups 3 and 2 are located on helix D and J of LBD. Many experiments have proved that the residue substitutions, which disrupt the intermolecular contacts between helices D and J in adjacent subunits, can significantly accelerate desensitization of AMPA receptor [64,65,68,69]. Many known AMPA receptor modulator compounds, such as benzothiadiazide derivatives (e.g., CTZ, IDRA-21, and S18986), pyrrolidinone and related piperidine compounds (e.g., aniracetam, CX614, CX516), and biarylpropylsulfonamide compounds (e.g., PEPA, LY404187), bind to the LBD dimer interface, which increases activation and reduces desensitization of the receptor [70,71]. (iv) Group 4 is located at the bottom of the D2 lobe of LBD near the linkers between LBD and TMD. TMD is the ion channel-forming domain composed of three transmembrane helices (M1, M3, and M4) and a connected loop M2. LBD connects with TMD through three interdomain linkers, i.e., LBD-M1, LBD-M3, and LBD-M4, in which the LBD-M3 linker mainly controls the opening and closure of the ion channel pore. Experimental evidences have indicated that the closure motion of the ligand-binding cleft in LBD triggered by glutamate binding puts strains on the interdomain linkers between LBD and TMD, which then induces the opening of the ion channel [72]. Our predicted allosteric sites in group 4 are located around the LBD-M3 and LBD-M1 linkers. This is consistent with experimental results. Experimental studies have implied that the LBD-M3 linker mainly contributes to the transition of conformational motions from LBD to TMD, and the residue mutations in this linker have great impact on the channel gating and desensitization of the GluA2 AMPA receptor [72,73]. (v) Group 5 is located at the interface between NTD and LBD. For the N-methyl-D-aspartic acid (NMDA)-type receptor, experimental studies have indicated that there is an allosteric communication between the NTD and LBD, which is mediated by the interdomain interface [74]. The analysis of structure-encoded dynamics supports an allosteric role of AMPA NTD on the activity of the receptor. It was suggested

022719-7

SU, QI, LI, ZHU, DU, HOU, HAO, AND WANG

PHYSICAL REVIEW E 90, 022719 (2014)

that the domain motion in NTD transmits to LBD through the NTD-LBD interface and then down to the ion channel pore, thereby modulating the activation of the receptor [75,76]. This is consistent with our predicted results. The allosteric sites in group 5 revealed by our method are located at the interface between NTD and LBD, which thermodynamically couple with the cleft motion of LBD upon glutamate binding and the activation of the receptor. C. Discussions

From the point of view of physics, the value of G reflects the dynamical coupling between the active site and the allosteric sites. As we know, the low-frequency normal modes correspond to the large-scale collective motions [43] and thus they are responsible for the long-range allosteric coupling between distant sites, whereas the high-frequency modes are strongly localized [43] and therefore they have little effect on the long-range dynamical coupling between the active sites and the distant allosteric sites. So, the value of G mainly depends on the slow normal modes. It should be noted that there are two adjustable parameters in the Gaussian network model, i.e., the spring constant and the cutoff value. Our calculation results show that the fitting of these two parameters may result in more realistic G values, however, this has little effect on the relative values of G for the perturbed surface points (data not shown). In our method, we only focus on the relative G values in searching for potential allosteric sites. Therefore, the adjustment of these two parameters is not required for our method to identify the potential allosteric sites on the protein surface. In our method, the allosterically important sites are mainly determined by their locations in the protein structure, which tend to be located at the hinge region, the interdomain interface, etc. Besides that, their distance to the ligand-binding site also influences the G value. The sites near to the ligand-binding site tend to have large G values. Therefore, in our method, the allosteric sites were identified as those with relatively larger G value compared with their surrounding sites, instead of their exact values. Our method was developed based on the coarse-grained elastic network model, which possesses the advantage of high calculation speed. The computational times needed for Hsp70 and GluA2 AMPA receptor are, respectively, about 5 and 38 min running on a personal computer with 3.2 GHz dual core processor.

potential targets for structure-based novel drug design. In the present work, an ENM-based thermodynamic method was proposed to identify the allosteric sites that regulate the activity of proteins. In our method, different regions of protein surface were perturbed through the binding of an effector molecule, and then the allosteric sites were determined as those where the binding of an effector induces a large change in the protein-ligand free energy. The calculation results show that our method is a simple yet effective approach to predict allosteric sites on protein surface. In this work, our method was applied to Hsp70 and GluA2 AMPA receptor as case studies. The predicted allosteric sites coupled with ATP binding in Hsp70 are mainly situated on the following regions: (1) The interface between IB and IIB subdomains in NBD, which controls the opening and closing of the ATP-binding pocket. (2) The hinge region between IA and IB subdomains, as well as the hinge region between IIA and IIB subdomains in NBD. These hinges mediate the interdomain conformational motions. (3) The global hinges between lobes I and II in NBD, which are crucial for the open-closed functional motions of NBD, and thus important for ATP binding. (4) The NBD-SBD interface. This region is responsible for the allosteric communications between NBD and SBD. For GluA2 AMPA receptor, it is found that the allosteric sites are mainly located at the following regions: (1) The mouth region of glutamate-binding cleft: These sites stabilize the closure conformation of the ligand-binding cleft and thus greatly affect the binding affinity of ligand glutamate to the AMPA receptor. (2) The twofold axis of LBD dimer adjacent to the hinge region of ligand-binding clamshell: These sites mediate the interdomain motion of the ligand-binding cleft, thereby playing crucial roles on the binding of ligand glutamate. (3) The LBD dimer interface: These sites are coupled with the opening-close motion of glutamate-binding pocket and subjected to extensions induced by the motion of ligand-binding pocket. (4) The regions around the linkers between the LBD and TMD: These sites contribute to the transition of conformational motions from LBD to TMD. (5) The NTD-LBD interface: These sites mediate the allosteric communications between the NTD and LBD. The allosteric sites predicted by our method agree well with the available experimental results. ACKNOWLEDGMENTS

Allosteric regulation is an efficient mechanism to control protein function in many biological processes, such as cell signaling, metabolism, gene regulation, and so on. One of the challenges in the study of protein allostery is how to effectively predict the allosteric sites on the protein surface. Identification of allosteric sites is not only helpful for our understanding of the mechanism of many biological processes, but also provides

This work was supported in part by grants from the National Natural Science Foundation of China (Grants No. 11204267, No. 61271378, and No. 31171267), the Natural Science Foundation of Hebei Province (Grant No. A2014203126), the Natural Science Foundation of Shandong Province (Grant No. ZR2010CQ033), the Beijing Municipal Education Commission (Grant No. KM201310005030), the Program for the Outstanding Young Talents of Hebei Province, and the funding from Shandong Provincial Key Laboratory of Functional Macromolecular Biophysics (Grant No. sdbp002).

[1] A. C. Anderson, Chem. Biol. 10, 787 (2003). [2] R. Nussinov and C.-J. Tsai, Curr. Pharm. Des. 18, 1311 (2012).

[3] R. Nussinov, C.-J. Tsai, and P. Csermely, Trends Pharmacol. Sci. 32, 686 (2011).

IV. CONCLUSION

022719-8

PREDICTION OF ALLOSTERIC SITES ON PROTEIN . . .

PHYSICAL REVIEW E 90, 022719 (2014)

[4] P. J. Conn, A. Christopoulos, and C. W. Lindsley, Nat. Rev. Drug Discovery 8, 41 (2009). [5] T. Kenakin and L. J. Miller, Pharmacol. Rev. 62, 265 (2010). [6] S.-R. Tzeng and C. G. Kalodimos, Nature 462, 368 (2009). [7] E. J. Fuentes, C. J. Der, and A. L. Lee, J. Mol. Biol. 335, 1105 (2004). [8] I. R. Kleckner, P. Gollnick, and M. P. Foster, J. Mol. Biol. 415, 372 (2012). [9] B. F. Volkman, D. Lipson, D. E. Wemmer, and D. Kern, Science 291, 2429 (2001). [10] A. del Sol, H. Fujihashi, D. Amoros, and R. Nussinov, Mol. Syst. Biol. 2, 2006.0019 (2006). [11] C. Chennubhotla and I. Bahar, Mol. Syst. Biol. 2, 36 (2006). [12] A. Ivetac and J. A. McCammon, Methods Mol. Biol. 819, 3 (2012). [13] B. J. Grant, S. Lukman, H. J. Hocker, J. Sayyah, J. H. Brown, J. A. McCammon, and A. A. Gorfe, PLoS One 6, e25711 (2011). [14] C. L. McClendon, G. Friedland, D. L. Mobley, H. Amirkhani, and M. P. Jacobson, J. Chem. Theory Comput. 5, 2486 (2009). [15] K. Sharp and J. J. Skinner, Proteins 65, 347 (2006). [16] T. Haliloglu, I. Bahar, and B. Erman, Phys. Rev. Lett. 79, 3090 (1997). [17] A. R. Atilgan, S. R. Durell, R. L. Jernigan, M. C. Demirel, O. Keskin, and I. Bahar, Biophys. J. 80, 505 (2001). [18] B. Isin, K. C. Tirupula, Z. N. Oltvai, J. Klein-Seetharaman, and I. Bahar, Methods Mol. Biol. 914, 285 (2012). [19] L.-W. Yang and I. Bahar, Structure 13, 893 (2005). [20] A. Dutta and I. Bahar, Structure 18, 1140 (2010). [21] Y. Liu and I. Bahar, Mol. Biol. Evol. 29, 2253 (2012). [22] T. Haliloglu and B. Erman, Phys. Rev. Lett. 102, 088103 (2009). [23] T. Haliloglu, A. Gul, and B. Erman, PLoS Comput. Biol. 6, e1000845 (2010). [24] B. Erman, Proteins 81, 1097 (2013). [25] C. Atilgan and A. R. Atilgan, PLoS Comput. Biol. 5, e1000544 (2009). [26] C. Atilgan, Z. N. Gerek, S. B. Ozkan, and A. R. Atilgan, Biophys. J. 99, 933 (2010). [27] W. Zheng, J.-C. Liao, B. R. Brooks, and S. Doniach, Proteins 67, 886 (2007). [28] W. Zheng and M. Tekpinar, BMC Struct. Biol. 9, 45 (2009). [29] J. G. Su, X. J. Xu, C. H. Li, W. Z. Chen, and C. X. Wang, J. Chem. Phys. 135, 174101 (2011). [30] J. G. Su, H. J. Du, R. Hao, X. J. Xu, C. H. Li, W. Z. Chen, and C. X. Wang, J. Phys. Chem. B 117, 8689 (2013). [31] D. Ming and M. E. Wall, Phys. Rev. Lett. 95, 198103 (2005). [32] D. Ming and M. E. Wall, Proteins 59, 697 (2005). [33] M. P. Mayer, Trends Biochem. Sci. 38, 507 (2013). [34] E. R. P. Zuiderweg, E. B. Bertelsen, A. Rousaki, M. P. Mayer, J. E. Gestwicki, and A. Ahmad, Top. Curr. Chem. 328, 99 (2013). [35] R. Qi, E. B. Sarbeng, Q. Liu, K. Q. Le, X. Xu, H. Xu, J. Yang, J. L. Wong, C. Vorvis, W. A. Hendrickson, L. Zhou, and Q. Liu, Nat. Struct. Mol. Biol. 20, 900 (2013). [36] A. I. Sobolevsky, M. P. Rosconi, E. Gouaux, Nature 462, 745 (2009). [37] S. Nakanishi, Science 258, 597 (1992). [38] S. F. Traynelis, L. P. Wollmuth, C. J. McBain, F. S. Menniti, K. M. Vance, K. K. Ogden, K. B. Hansen, H. Yuan, S. J. Myers, and R. Dingledine. Pharmacol. Rev. 62, 405 (2010). [39] D. Bowie, CNS Neurol. Disord.: Drug Targets 7, 129 (2008). [40] J. Kumar and M. L. Mayer, Annu. Rev. Physiol. 75, 313 (2013).

[41] K. B. Hansen, H. Yuan, and S. F. Traynelis, Curr. Opin. Neurobiol. 17, 281 (2007). [42] L. W. Yang, A. J. Rader, X. Liu, C. J. Jursa, S. C. Chen, H. A. Karimi, and I. Bahar, Nucleic Acids Res. 34, W24 (2006). [43] I. Bahar, A. R. Atilgan, M. C. Demirel, and B. Erman, Phys. Rev. Lett. 80, 2733 (1998). [44] W. Zheng and B. R. Brooks, Biophys. J. 89, 167 (2005). [45] M. F. Sanner, A. J. Olson, and J. C. Spehner, Biopolymers 38, 305 (1996). [46] H. J. Woo, J. Jiang, E. M. Lafer, and R. Sousa, Biochemistry 48, 11470 (2009). [47] D. Brehmer, S. R¨udiger, C. S. G¨assler, D. Klostermeier, L. Packschies, J. Reinstein, M. P. Mayer, and B. Bukau, Nat. Struct. Biol. 8, 427 (2001). [48] L. Chang, Y. Miyata, P. M. U. Ung, E. B. Bertelsen, T. J. McQuade, H. A. Carlson, E. R. P. Zuiderweg, and J. E. Gestwicki, Chem. Biol. 18, 210 (2011). [49] P. M. Ung, A. D. Thompson, L. Chang, J. E. Gestwicki, and H. A. Carlson, PLoS Comput. Biol. 9, e1003279 (2013). [50] M. Vogel, B. Bukau, and M. P. Mayer, Mol. Cell. 21, 359 (2006). [51] E. R. Johnson and D. B. McKay, Biochemistry 38, 10823 (1999). [52] S. Wisen, E. B. Bertelsen, A. D. Thompson, S. Patury, P. Ung, L. Chang, C. G. Evans, G. M. Walter, P. Wipf, H. A. Carlson, J. L. Brodsky, E. R. P. Zuiderweg, and J. E. Gestwicki, ACS Chem. Biol. 5, 611 (2010). [53] Q. Liu and W. A. Hendrickson, Cell 131, 106 (2007). [54] M. Vogel, M. P. Mayer, and B. Bukau, J. Biol. Chem. 281, 38705 (2006). [55] A. Zhuravleva and L. M. Gierasch, Proc. Natl. Acad. Sci. USA 108, 6987 (2011). [56] N. Armstrong and E. Gouaux, Neuron 28, 165 (2000). [57] A. Hogner, J. S. Kastrup, R. Jin, T. Liljefors, M. L. Mayer, J. Egebjerg, I. K. Larsen, and E. Gouaux, J. Mol. Biol. 322, 93 (2002). [58] M. L. Mayer and N. Armstrong, Annu. Rev. Physiol. 66, 161 (2004). [59] R. Jin, T. G. Banke, M. L. Mayer, S. F. Traynelis, and E. Gouaux, Nat. Neurosci. 6, 803 (2003). [60] W. Zhang, Y. Cho, E. Lolis, and J. R. Howe, J. Neurosci. 28, 932 (2008). [61] A. Robert, N. Armstrong, J. E. Gouaux, and J. R. Howe, J. Neurosci. 25, 3752 (2005). [62] R. Jin, S. Clark, A. M. Weeks, J. T. Dudman, E. Gouaux, and K. M. Partin, J. Neurosci. 25, 9027 (2005). [63] C. Krintel, K. Harpsøe, L. G. Zachariassen, D. Peters, K. Frydenvang, D. S. Pickering, M. Gajhede, and J. S. Kastrup, Acta Crystallogr. D 69, 1645 (2013). [64] Y. Sun, R. Olson, M. Horning, N. Armstrong, M. Mayer, and E. Gouaux, Nature 417, 245 (2002). [65] M. S. Horning and M. L. Mayer, Neuron 41, 379 (2004). [66] D. E. Timm, M. Benveniste, A. M. Weeks, E. S. Nisenbaum, and K. M. Partin, Mol. Pharmacol. 80, 267 (2011). [67] S. E. Ward, B. D. Bax, and M. Harries, British J. Pharmacol. 160, 181 (2010). [68] Y. Stern-Bach, B. Bettler, M. Hartley, P. O. Sheppard, P. J. O’Hara, and S. F. Heinemann, Neuron 13, 1345 (1994). [69] K. M. Partin, M. W. Fleck, and M. L. Mayer, J. Neurosci. 16, 6634 (1996). [70] C. P. Ptak, A. H. Ahmed, and R. E. Oswald, Biochemistry 48, 8594 (2009).

022719-9

SU, QI, LI, ZHU, DU, HOU, HAO, AND WANG

PHYSICAL REVIEW E 90, 022719 (2014)

[71] B. Padmanabhan, Theor. Biol. Med. Modell. 10, 46 (2013). [72] S. M. Schmid, C. K¨orber, S. Herrmann, M. Werner, and M. Hollmann, J. Neurosci. 27, 12230 (2007). [73] M. V. Yelshansky, A. I. Sobolevsky, C. Jatzke, and L. P. Wollmuth, J. Neurosci. 24, 4728 (2004).

[74] F. Zheng, K. Erreger, C. M. Low, T. Banke, C. J. Lee, P. J. Conn, and S. F. Traynelis, Nat. Neurosci. 4, 894 (2001). [75] M. Sukumaran, M. Rossmann, I. Shrivastava, A. Dutta, I. Bahar, and I. H. Greger, EMBO J. 30, 972 (2011). [76] A. Dutta, I. H. Shrivastava, M. Sukumaran, I. H. Greger, and I. Bahar, Structure 20, 1838 (2012).

022719-10

Prediction of allosteric sites on protein surfaces with an elastic-network-model-based thermodynamic method.

Allostery is a rapid and efficient way in many biological processes to regulate protein functions, where binding of an effector at the allosteric site...
1006KB Sizes 0 Downloads 7 Views