DOI: 10.1002/cbic.201500285

Full Papers

A de Novo-Designed Monomeric, Compact Three-Helix-Bundle Protein on a Carbohydrate Template

Leila Malik,[a] Jesper Nygaard,[b] Niels J. Cristensen,[a] Charlotte S. Madsen,[a] Heike I. Rösner,[c] Birthe B. Kragelund,[c] Rasmus Hoiberg-Nielsen,[b] Werner W. Streicher,[d] Lise Arleth,[b] Peter W. Thulstrup,[a] and Knud J. Jensen*[a]

De novo design and chemical synthesis of proteins and of other artificial structures that mimic them is a central strategy for understanding protein folding and for accessing proteins with new functions. We have previously described carbohydrates that act as templates for the assembly of artificial proteins, so-called carboproteins. The hypothesis is that the template preorganizes the secondary structure elements and directs the formation of a tertiary structure, thus achieving structural economy in the combination of peptide, linker, and template. We speculate that the structural information from the template could facilitate protein folding. Here we report the design and synthesis of three-helix-bundle carboproteins on deoxyhexopyranosides. The carboproteins were analyzed by CD, analytical ultracentrifugation (AUC), small-angle X-ray scattering (SAXS), and NMR spectroscopy, and this revealed the formation of the first compact and folded monomeric carboprotein, distinctly different from a molten globule. En route to this carboprotein we observed a clear effect originating from the template on protein folding.

[a] Dr. L. Malik, Dr. N. J. Cristensen, Dr. C. S. Madsen, Prof. P. W. Thulstrup, Prof. K. J. Jensen
Department of Chemistry, Faculty of Science, University of Copenhagen
Thorvaldsensvej 40, 1871 Frederiksberg (Denmark)
E-mail: [email protected]

[b] J. Nygaard, Dr. R. Hoiberg-Nielsen, Prof. L. Arleth
Niels Bohr Institute, Faculty of Science, University of Copenhagen
Blegdamsvej 17, 2100 Copenhagen (Denmark)

[c] Dr. H. I. Rösner, Prof. B. B. Kragelund
Department of Biology, SBin Laboratory, Faculty of Science, University of Copenhagen
Ole Maaløes Vej 5, 2200 Copenhagen (Denmark)

[d] Dr. W. W. Streicher
Novozymes A/S
Krogshoejvej 36, 2880 Bagsvaerd (Denmark)

Supporting information for this article is available on the WWW under http://dx.doi.org/10.1002/cbic.201500285.

ChemBioChem 2015, 16, 1905 – 1918

© 2015 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim

Introduction

De novo design and total chemical synthesis of artificial protein mimics is a powerful approach for testing and advancing our understanding of protein folding at the atomic level. By designing protein mimics with reduced complexity, it is possible to obtain specific sequence-to-folding information. This is difficult to address in more complex systems. The combination of de novo design and total chemical synthesis of proteins allows the preparation of new complex systems. Radical de novo design of proteins would start from a nonnatural peptide sequence and possibly auxiliaries to provide a desired three-dimensional structure.[1] One way to overcome the complexity of protein folding, as initially suggested by Mutter et al., is the concept of template-assembled synthetic proteins (TASPs).[2] This entropy-saving approach uses secondary peptide elements that are covalently attached to a template that cooperatively allows the peptide elements to assemble into a well-defined three-dimensional structure. This approach has been explored by several groups, with investigation of a diverse set of templates.[3]

One of the difficulties with TASPs relates to their structural characterization. In previous studies by our group we used carbohydrates as templates for constructing proteins with specific tertiary structures.[4] This first generation of artificial proteins, termed carboproteins, proved highly thermostable. Furthermore, in a later study our group reported the first low-resolution structure of a TASP, in this case a carboprotein, determined by small-angle X-ray scattering (SAXS) combined with ab initio data analysis; it revealed the structure of a three-α-helix bundle with a separate fourth helix.[5] The Sherman group reported the first high-resolution (1.4 Å) crystal structure of a TASP, the cavitein Q4. This had been designed to form a monomer but crystallized to form an asymmetric eight-helix dimer.[6]

Recently, we reported a SAXS and analytical ultracentrifugation (AUC) study of the self-assembly of coiled-coil peptides derived from CoilSer[7a] and CoilVaLd.[7b, 8] The original CoilSer peptide forms an antiparallel triple-stranded coiled-coil assembly, whereas CoilVaLd forms a parallel triple-stranded coiled coil. In our previous study, Lys residues had been preferred over His for synthetic reasons, and the study revealed that a His-to-Lys mutation in CoilVaLd maintained a three-helix formation. Truncation of an N-terminal heptad led to disruption of the three-stranded coiled coil. Instead, multiple aggregation states were observed, because there was an equilibrium between two-stranded and three-stranded coiled coils.

Inspired by these previous results, our goal was to create compact three-helix carboproteins and to study the structural elements that drive carboprotein folding. The design included comparison of three- versus four-heptad repeat sequences, examination of the role of a spacer between the peptides and the template (C1 vs. C6 spacer), and the role of the distance and geometry in the template. As the result of the design and synthesis of nine carboproteins, as well as of their biophysical characterization by SAXS, AUC, NMR and CD spectroscopy, here we present a comprehensive comparative structural study of an ensemble of three-helix-bundle carboproteins. This led to the design of a monomeric, folded three-helix carboprotein.

Results

Design and synthesis

Our starting point was the low-resolution structure of a putative four-helix carboprotein elucidated by SAXS and ab initio data analysis, which had revealed a three-helix folding topology with a separate fourth helix in solution.[5] However, these data could not indicate whether one specific helix was not aligned with the remaining three helices or whether the apparent “3+1” helix structure represented an average over time. On the basis of these findings, we decided to design a compact three-helix carboprotein. We aimed to explore two different placements of the three helices and designed three deoxyhexopyranoside templates for the formation of carboproteins. The O-3,-4,-6 or O-2,-3,-4 linkage sites were intended to serve as anchor points for attachment of the helices. In addition, we evaluated the role of the O-4 epimers: that is, GlcNAc versus GalNAc.

Design and synthesis of templates

The 2-deoxyhexopyranosides 2-acetamido-2-deoxy-D-glucosamine (GlcNAc) and 2-acetamido-2-deoxy-D-galactosamine (GalNAc) were selected as carbohydrate scaffolds. GlcNAc and GalNAc were converted into the corresponding methyl α-glycosides by Fischer glycosylation[9] in good yields of 75 % (1) and 65 % (2), respectively. Per-O-acylation with N,N′-bis-Boc-aminooxyacetic acid (Aoa) gave the protected templates 3 and 4 in decent yields of 70 and 53 %, respectively.[10] After purification by preparative HPLC, the two templates were characterized by MS and NMR. The removal of all six Boc groups with TFA/CH2Cl2 (1:1) afforded two new trifunctionalized carbohydrate templates: methyl 3,4,6-tri-O-Aoa-α-D-GlcNAc (5, Scheme 1) and methyl 3,4,6-tri-O-Aoa-α-D-GalNAc (6). These deprotected templates were somewhat labile in solution and were in general lyophilized immediately to provide fine, hygroscopic powders.

After the assembly and biophysical studies of eight 2-deoxy-GlcNAc- and -GalNAc-based carboproteins (vide infra), a 6-deoxycarbohydrate with only secondary hydroxy groups (i.e., no primary hydroxy groups on flexible methylene units) was used as an additional three-α-helical bundle template. It was synthesized from methyl 6-deoxy-α-D-glucopyranoside as starting material. Upon per-O-acylation with N,N′-bis-Boc-aminooxyacetic acid (Boc2-Aoa-OH) this gave the protected carbohydrate template 7 in 28 % yield. After removal of all six Boc groups with TFA/CH2Cl2 (1:1) over 30 min, the corresponding trifunctionalized carbohydrate template 8 was obtained (Scheme 2).

www.chembiochem.org

Scheme 1. Synthesis of methyl 3,4,6-tri-O-Aoa-α-D-GlcNAc (5) and methyl 3,4,6-tri-O-Aoa-α-D-GalNAc (6). a) Amberlite IR 120(H), MeOH; b) DIPCDI, DMAP, Pyr/CH2Cl2 (1:1); c) CH2Cl2/TFA (1:1), 30 min.

Design and synthesis of C-terminal peptide aldehydes

Coupling of the peptides to the carbohydrate templates was achieved by oxime formation. This necessitated that the peptides contained C-terminal aldehyde (i.e., glycinal, Gly-H) units. Initially, the short and structurally minimized peptide sequence 9 (Table 1) from previous work[5] was evaluated as a reference. The sequence was derived from two-heptad repeats from a four-α-helix bundle prepared by the Sherman group.[3a, 5, 6, 11] Next, one aim was to explore whether short, three-heptad sequences derived from CoilVaLd, which in the form of the peptide amides had formed multiple aggregational coiled coils,


Table 1. C-terminal peptide aldehydes.

ID    C-terminal peptide aldehyde sequence
9     Ac-YEELLKK LEELLKK AG-H
10    Ac-EWALEKK LAALESK LQALEKK LEALEKG G-H
11    Ac-YE VEALEKK VAALESK VQALEKK VEALEKG G-H
12    Ac-Y VAALESK VQALEKK VEALEKG G-H
13    Ac-Y VAALESK VQALEKK VEALEKG GG-H
14    Ac-Y VAALESK VQALEKK VEALEK-Ahx G-H
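The sequence lengths implied by Table 1 can be checked with a short script. This is only an illustrative sketch: the helper name `residues` is hypothetical, the C-terminal glycinal is counted as one residue, and Ahx is counted as a single unit, which is an assumption made here for bookkeeping.

```python
# Peptide aldehyde sequences copied from Table 1 (heptads separated by spaces).
PEPTIDES = {
    9: "Ac-YEELLKK LEELLKK AG-H",
    10: "Ac-EWALEKK LAALESK LQALEKK LEALEKG G-H",
    11: "Ac-YE VEALEKK VAALESK VQALEKK VEALEKG G-H",
    12: "Ac-Y VAALESK VQALEKK VEALEKG G-H",
    13: "Ac-Y VAALESK VQALEKK VEALEKG GG-H",
    14: "Ac-Y VAALESK VQALEKK VEALEK-Ahx G-H",
}


def residues(entry):
    """Return the list of residue units between the N-terminal acetyl cap
    ("Ac-") and the aldehyde marker ("-H")."""
    core = entry[len("Ac-"):-len("-H")]
    units = []
    for block in core.split():
        for piece in block.split("-"):
            if piece == "Ahx":
                # Assumption: the aminohexanoyl linker counts as one unit.
                units.append("Ahx")
            else:
                # One-letter amino acid codes, one residue per character.
                units.extend(piece)
    return units


LENGTHS = {pid: len(residues(seq)) for pid, seq in PEPTIDES.items()}

if __name__ == "__main__":
    for pid, n in LENGTHS.items():
        print(f"peptide {pid}: {n} residues")
```

Under these counting rules peptide 9 comes out as the 16-mer mentioned in the text, and peptide 12 has 23 residues, in line with the Gly23 numbering used in the NMR section.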


Scheme 2. Synthesis of the methyl 6-deoxy-2,3,4-tri-O-Aoa-α-D-glucopyranoside (8) template. a) DIPCDI, DMAP, Pyr/CH2Cl2 (1:1); b) CH2Cl2/TFA (1:1), 30 min.

could form stable three-helix bundles when covalently linked to an optimized template.

In this study we designed six C-terminal peptide aldehydes (9–14, Table 1, Table S3 in the Supporting Information). Peptide 10 was a four-heptad peptide derived from CoilSer, whereas the subsequent peptides were based on CoilVaLd sequences. The C-terminal aldehyde components enabled us to link the peptides to the carbohydrate template through oxime formation. Peptide 11 was a four-heptad peptide, whereas peptides 12–14 were three-heptad structures. We also explored the effect of linkers between the peptide and the carbohydrate template. Peptide 13 contained an additional Gly residue next to the C-terminal glycinal unit, whereas in peptide 14 this Gly was substituted with an aminohexanoyl (Ahx) moiety (Table 1).

The C-terminal peptide aldehydes 9–14 were synthesized by the BAL linker strategy.[12] It commenced with anchoring of o-PALdehyde [5-(2-formyl-3,5-dimethoxyphenoxy)pentanoic acid] to an amino-terminated resin (Tentagel S-NH2, loading 0.26 mmol). Standard coupling conditions were used with HBTU (N-[(1H-benzotriazole-1-yl)(dimethylamino)methylene]-N-methylmethanaminium hexafluorophosphate N-oxide), HOBt (1-hydroxy-1H-benzotriazole), and DIPEA (N,N-diisopropylethylamine) under microwave irradiation conditions at 60 °C for 10 min. The C-terminal glycinal unit was introduced by reductive amination with aminoacetaldehyde dimethyl acetal and NaBH3CN under weakly acidic conditions (1 % AcOH in methanol) and use of microwave irradiation at 60 °C for 10 min.[13] Acylation with the symmetrical anhydrides of Fmoc-Gly-OH or Fmoc-Ahx-OH afforded the protected, resin-bound dipeptides. Further elongation was performed with a Biotage Syro II peptide synthesizer and use of standard solid-phase peptide synthesis (SPPS) with the 9-fluorenylmethyloxycarbonyl (Fmoc) system for protection of Nα-amino groups. After N-acetylation with Ac2O/CH2Cl2 (1:3), peptide release was performed with TFA/H2O (95:5) for 2 h at RT. The TFA solutions were concentrated by use of nitrogen flow, and the six peptide aldehydes 9–14 were precipitated with diethyl ether and purified by preparative RP-HPLC. In general, the yields, calculated from Fmoc-quantified loading, were in the 9 to 16 % range, except for the short 16-mer peptide aldehyde, which was obtained in 30 % yield (Table 1). The C-terminal glycinal unit enabled oxime ligation with no risk of racemization at the aldehyde stage. The sequences also each contained an additional Tyr residue (except for 10, which had a Trp) at the N terminus, to allow concentration to be determined from UV absorption.[14]

Synthesis of three-helix-bundle carboproteins

Firstly, carboproteins 15–22 (Scheme 3), based on GlcNAc and GalNAc, were synthesized and studied biophysically. Subsequent to SAXS studies (vide infra) of carboproteins 15–22, carboprotein 23 was also assembled on the 6-deoxy template 8. The chemoselective oxime coupling of C-terminal aldehydes to the aminooxy-functionalized templates was performed in 100 mM acetate buffer pH 4.76 as previously described.[10, 15] To prevent aggregation of some of the peptide aldehydes, organic cosolvents (CH3CN or DMF) were added when required. The template D-GlcNAc 5 was coupled with all six peptides 9–14 (Scheme 3), whereas D-GalNAc 6 was coupled with peptide aldehydes 9 and 13, and 6-deoxy template 8 only with peptide aldehyde 12. This afforded a total of nine carboproteins (Table 2). All carboproteins were obtained in 6–48 % yield and in ≥ 95 % purity after preparative RP-HPLC, and each gave an LC-MS spectrum consistent with the expected molecular mass (Table S4).

Scheme 3. Synthesis of three-helix-bundle carboproteins. a) H2O/CH3CN, pH 4.76.
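The N-terminal Tyr (or Trp) residue mentioned above allows concentrations to be estimated from the absorbance at 280 nm through the Beer–Lambert law. The sketch below is illustrative only: the extinction coefficients are commonly used literature values for Trp and Tyr, not values taken from this paper, and the function names and the absorbance reading are hypothetical.

```python
# Approximate molar extinction coefficients at 280 nm (M^-1 cm^-1).
# Assumption: commonly used literature values, not taken from the paper.
EPS280 = {"W": 5500.0, "Y": 1490.0}


def extinction_280(sequence, copies=1):
    """Sum the Trp/Tyr contributions for `copies` identical chains."""
    return copies * sum(EPS280.get(aa, 0.0) for aa in sequence)


def concentration_molar(a280, sequence, copies=1, path_cm=1.0):
    """Beer-Lambert law: c = A / (epsilon * l)."""
    eps = extinction_280(sequence, copies)
    if eps == 0.0:
        raise ValueError("sequence has no Trp or Tyr chromophore")
    return a280 / (eps * path_cm)


# Three-heptad chain of carboprotein 17; the carboprotein carries three copies.
CHAIN_17 = "YVAALESKVQALEKKVEALEKGG"

if __name__ == "__main__":
    # Hypothetical absorbance reading in a 1 cm cuvette.
    c = concentration_molar(0.2235, CHAIN_17, copies=3)
    print(f"estimated concentration: {c * 1e6:.1f} uM")
```

Because a carboprotein carries three identical chains, its extinction coefficient is taken here as three times that of a single chain; any contribution from the oxime linkages or the template is ignored in this sketch.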

Table 2. Synthesized carboproteins and their degrees of α-helicity determined from their CD spectra at 222 nm. All carbohydrates were methyl pyranosides.

ID    Carboprotein (C-terminal oxime to carbohydrate)           α-Helicity [%]
15    (Ac-YEELLKKLEELLKKAG)3-GlcNAc                             62
16    (Ac-YEELLKKLEELLKKAG)3-GalNAc                             69
17    (Ac-YVAALESKVQALEKKVEALEKGG)3-GlcNAc                      81
18    (Ac-YVAALESKVQALEKKVEALEKGG)3-GalNAc                      51
19    (Ac-YVAALESKVQALEKKVEALEKGGG)3-GlcNAc                     55
20    (Ac-YVAALESKVQALEKKVEALEK-Ahx-G)3-GlcNAc                  42
21    (Ac-YEVEALEKKVAALESKVQALEKKVEALEKGG)3-GlcNAc              74
22    (Ac-EWALEKKLAALESKLQALEKKLEALEKGG)3-GlcNAc                43
23    (Ac-YVAALESKVQALEKKVEALEKGG)3-6-deoxy-Glc                 84
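Helicity values of the kind listed in Table 2 are typically derived from the mean residue ellipticity at 222 nm. The following sketch shows one widely used form of the calculation; the reference value of −39 500 deg cm² dmol⁻¹ and the chain-length correction constant 2.57 are assumptions taken from the general CD literature, and the exact equation applied in this study is not stated in the text. The function names and the example reading are hypothetical.

```python
def mean_residue_ellipticity(theta_mdeg, conc_molar, path_cm, n_residues):
    """Convert an observed ellipticity (mdeg) into mean residue ellipticity
    (deg cm^2 dmol^-1): [theta] = theta_obs / (10 * c * l * n)."""
    return theta_mdeg / (10.0 * conc_molar * path_cm * n_residues)


def helix_fraction(mre_222, chain_length, theta_inf=-39500.0, k=2.57):
    """Fractional helicity from the mean residue ellipticity at 222 nm.

    Assumption: the reference ellipticity of a fully helical chain is
    theta_inf * (1 - k / chain_length), with literature constants.
    """
    theta_full_helix = theta_inf * (1.0 - k / chain_length)
    return mre_222 / theta_full_helix


if __name__ == "__main__":
    # Hypothetical reading: -50 mdeg, 50 uM carboprotein, 1 mm cell,
    # 3 chains x 23 residues = 69 residues in total.
    mre = mean_residue_ellipticity(-50.0, 50e-6, 0.1, 69)
    print(f"MRE = {mre:.0f} deg cm^2 dmol^-1")
    print(f"helicity = {100 * helix_fraction(mre, 23):.0f} %")
```

Note the two different residue counts: the total number of residues normalizes the signal, whereas the length of a single helix enters the end-effect correction. Whether the authors made the same choice is not specified.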

Biophysical studies

CD spectroscopy was used to study the secondary structure content in the carboproteins. All CD spectra were recorded at 25 °C in 50 mM acetate buffer at pH 5.5 with a carboprotein concentration of 50 µM. The pH of 5.5 was chosen to allow comparison with previous studies on carboproteins. All the carboproteins showed some degree of α-helicity, with minima at 222 and 208 nm and a maximum near 195 nm (Figure 1). The helical content was in each case calculated on the basis of mean residue ellipticity according to Chen et al. (Table 2).[16] The α-helical contents of carboproteins 15 and 16, which had short two-heptad helices, were 62 and 69 %, respectively. The α-helicities of the three-helix structures are comparable to those observed in four-helix bundle carboproteins based on Glcp (66 %) and Galp (64 %) templates with use of the same peptide sequences.[17]

Previous results had shown that, for some carboproteins, the degree of α-helicity depended on the template, and thus there was a controlling effect of the template for secondary structure formation.[17, 18] Comparison of the α-helicities of carboproteins 17, 18, and 23, which have the same three-heptad peptide helices but three different templates, showed values of 81, 51, and 84 %, respectively. Thus, the GalNAc template induced significantly less α-helicity than either of the two other templates. This again emphasizes a clear template effect.

The sequence also has an influence on the α-helicity, as becomes clear on comparison of carboproteins with the same template (GlcNAc) but different peptide sequences, in terms of, for example, length and linker. This is clear when the α-helicities of carboproteins 15, 17, 19, 21, and 22 are compared, with the α-helical content varying from 42 to 81 %. Comparison of carboprotein 17 with carboprotein 19, which has the same sequence but with an added Gly residue, showed a significant lowering of the helicity from 81 to 55 %, and a further lowering was observed when one Gly residue was substituted with a longer Ahx linker (42 %). This is remarkable, because it clearly shows an effect of the linker between the peptide and the template.

In general, the α-helicities were higher for carboproteins than for the corresponding peptide amides.[8] The exception was carboprotein 22, the peptide sequence of which was based on the CoilSer sequence. In its original structure this prefers antiparallel coiled coils[7] (not possible in this three-helix design). Thus, the 50 % reduction in α-helicity relative to the peptide amide likely reflects the inherent tendency to form antiparallel rather than parallel structures. Comparison of 19 and 21 also indicated that four-heptad sequences, at least in some cases, gave higher degrees of α-helicity than the corresponding carboproteins with three-heptad sequences.

Figure 1. Top: Far-UV CD spectra of all nine carboproteins. Bottom: GdnCl denaturation of carboprotein 17. Solid black line represents the fit to a two-state reaction.

A denaturation experiment with GdnCl was performed on carboprotein 17, demonstrating high stability and cooperative unfolding. The analysis was performed by use of the linear extrapolation method (LEM),[19] which gave the denaturation fit.
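The two-state model behind the linear extrapolation method can be sketched in a few lines. The example uses the parameters reported in this study for carboprotein 17 (a folding free energy of −4.8 kcal mol⁻¹, entered here as an unfolding free energy of +4.8 kcal mol⁻¹, and m = 0.551 kcal mol⁻¹ M⁻¹). The sign convention, the temperature of 25 °C, and the function names are assumptions of this sketch, not the authors' fitting code.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def unfolding_free_energy(denaturant_molar, dg_h2o, m_value):
    """LEM assumption: the unfolding free energy decreases linearly with
    denaturant concentration, dG([D]) = dG(H2O) - m * [D]."""
    return dg_h2o - m_value * denaturant_molar


def fraction_unfolded(denaturant_molar, dg_h2o, m_value, temp_k=298.15):
    """Two-state equilibrium: K = exp(-dG / RT), f_U = K / (1 + K)."""
    dg = unfolding_free_energy(denaturant_molar, dg_h2o, m_value)
    k_unfold = math.exp(-dg / (R_KCAL * temp_k))
    return k_unfold / (1.0 + k_unfold)


def midpoint(dg_h2o, m_value):
    """Denaturant concentration at which half of the molecules are unfolded."""
    return dg_h2o / m_value


if __name__ == "__main__":
    dg_h2o, m_value = 4.8, 0.551  # values reported for carboprotein 17
    print(f"midpoint = {midpoint(dg_h2o, m_value):.2f} M GdnCl")
    for conc in (0.0, 2.0, 4.0, 6.0, 8.0):
        print(conc, round(fraction_unfolded(conc, dg_h2o, m_value), 4))
```

In an actual analysis the two parameters would be obtained by fitting this curve, together with sloping folded and unfolded baselines, to the measured ellipticity; the baselines are omitted here for brevity.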


The folding free energy was extrapolated to ΔG_H2O = −4.8 kcal mol⁻¹ and the slope m = 0.551 kcal mol⁻¹ M⁻¹ (Figure 1, bottom). This stability was lower than that reported for four-α-helical bundles with peptide 9 on a carbohydrate template (ΔG_H2O = −8.8 for Galp, −9.1 for Glcp, and −8.0 for Altp).[17]

A CD spectroscopy temperature scan at 222 nm from 0–110 °C for carboproteins 17–20 and 23 revealed no transition to a fully unfolded state up to 110 °C (data not shown). This indicated that these carboproteins were very resistant towards thermal denaturation and that complete unfolding required > 110 °C. The reversibility of the thermal denaturation was tested for carboproteins 17, 18, and 23 (from 110 back to 20 °C); this showed that 75, 68, and 78 %, respectively, of the original α-helicity remained.

AUC velocity experiments were performed on selected carboproteins (17, 18, 22, and 23) to determine the distributions of the different oligomers present in solution. Carboproteins 17, 18, and 23 each showed a single peak (Figure 2), thus suggesting that there was only one predominant species present in the investigated concentration range (50–400 µM). Carboprotein 22 displayed two peaks, thus indicating the presence of two species (Figure 2). Sedimentation equilibrium AUC was used to determine the molecular weights of the species present (Table 3, in parentheses). The ratio of the molecular weight obtained by AUC to the calculated weight corresponds to monomeric species for all four carboproteins. Further investigations of the oligomeric states and other properties were performed for all nine carboproteins by SAXS, as described in the following section.

SAXS analysis was first performed on carboproteins 15–22 at two different concentrations (5 and 10 mg mL⁻¹) in 50 mM NaOAc, pH 5.5. The maximum diameters (Dmax) of the molecular assemblies, the radii of gyration (Rg), and the molecular weights were determined by indirect Fourier transformation for all carboproteins (Table 3). Carboprotein 23 was designed after the SAXS studies of 15–22.
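Once indirect Fourier transformation has produced a pair distance distribution function p(r), the radius of gyration follows from its second moment, Rg² = ∫ r² p(r) dr / (2 ∫ p(r) dr), and Dmax is the distance at which p(r) falls to zero. The sketch below illustrates this relation on the analytical p(r) of a homogeneous sphere, for which Rg = √(3/5)·R is known; it is a check of the formula, not a reconstruction of the authors' analysis, and all names and the 5 nm diameter are hypothetical.

```python
import math


def trapezoid(xs, ys):
    """Plain trapezoidal integration on a (possibly non-uniform) grid."""
    total = 0.0
    for i in range(1, len(xs)):
        total += 0.5 * (ys[i] + ys[i - 1]) * (xs[i] - xs[i - 1])
    return total


def rg_from_pr(r, p):
    """Radius of gyration from p(r): Rg^2 = int r^2 p dr / (2 int p dr)."""
    second_moment = trapezoid(r, [ri * ri * pi for ri, pi in zip(r, p)])
    norm = trapezoid(r, p)
    return math.sqrt(second_moment / (2.0 * norm))


def sphere_pr(r, dmax):
    """Analytical p(r) of a homogeneous sphere of diameter dmax
    (arbitrary normalization)."""
    x = r / dmax
    if x < 0.0 or x > 1.0:
        return 0.0
    return x * x * (2.0 - 3.0 * x + x ** 3)


if __name__ == "__main__":
    dmax = 5.0  # nm, hypothetical
    n = 2000
    r = [dmax * i / n for i in range(n + 1)]
    p = [sphere_pr(ri, dmax) for ri in r]
    print(f"Rg = {rg_from_pr(r, p):.3f} nm")
```

A bell-shaped, symmetric p(r) such as this one is the signature of a compact, roughly globular particle, whereas a tail extending to large r indicates elongated or aggregated species.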

Figure 2. Distributions of sedimentation coefficients of carboproteins 17, 18, 22, and 23.


Table 3. Results from indirect Fourier transformation of SAXS data. Ratio indicates oligomeric state.

ID    Dmax [nm]   Rg [nm]   Mw,calcd [kDa]   Mw,SAXS (Mw,AUC) [kDa]   Ratio
15    8.07        2.33      6.2              11.7                     1.89
16    7.67        2.19      6.2              11.5                     1.86
17    5.42        1.52      7.9              8.1 (8.9)                1.03
18    n.d.        n.d.      7.9              n.d. (8.3)               n.d.
19    12          4.58      8.0              7.8                      0.92
20    54          19.05     8.0              n.d.                     n.d.
21    8.95        2.08      10.6             11.5                     1.08
22    11.71       2.93      10.1             9.9 (9.6)                0.97
23    5.5         1.41      7.8              8.2 (8.4)                1.05

Concentration … 5 mg mL⁻¹, in 50 mM acetate buffer pH 5.5.
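The oligomeric state in Table 3 is read from the ratio of the SAXS-derived molecular weight to the calculated monomer weight. A minimal sketch of that bookkeeping, using the rounded values from the table, is given below; the dictionary and function names are hypothetical, and ratios recomputed from the rounded masses can differ slightly in the second decimal place from the tabulated ones.

```python
# (Mw calculated, Mw from SAXS) in kDa, copied from Table 3.
# None marks entries reported as "n.d.".
TABLE3 = {
    15: (6.2, 11.7),
    16: (6.2, 11.5),
    17: (7.9, 8.1),
    18: (7.9, None),
    19: (8.0, 7.8),
    20: (8.0, None),
    21: (10.6, 11.5),
    22: (10.1, 9.9),
    23: (7.8, 8.2),
}


def oligomer_ratio(carboprotein_id):
    """Return Mw(SAXS) / Mw(calcd), or None if no SAXS mass was determined."""
    mw_calcd, mw_saxs = TABLE3[carboprotein_id]
    if mw_saxs is None:
        return None
    return mw_saxs / mw_calcd


def oligomeric_state(carboprotein_id):
    """Nearest integer number of monomers, or None if undetermined."""
    ratio = oligomer_ratio(carboprotein_id)
    return None if ratio is None else round(ratio)


if __name__ == "__main__":
    for cid in TABLE3:
        ratio = oligomer_ratio(cid)
        label = "n.d." if ratio is None else f"{ratio:.2f}"
        print(cid, label, oligomeric_state(cid))
```

With these numbers the two-heptad carboproteins 15 and 16 come out close to dimers, and all carboproteins with longer peptides for which a mass could be determined come out as monomers.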

The scattering curves and the indirect Fourier-transformed data looked very similar for all carboproteins, so only the recorded scattering profiles and the indirect Fourier-transformed data for carboproteins 15, 16, 17, and 23 are shown (Figure 3). The small differences between the carboproteins can easily be seen in the Dmax values (Table 3). The scattering data for two-heptad carboproteins 15 and 16 differed slightly from each other, and carboprotein 15 formed larger assemblies than 16 (Figure 3). The differences could also be seen in the indirect Fourier-transformed data (Figure 3), in which the obtained p(r)-functions of 15 and 16 differed slightly from each other, with Dmax values of 8.1 and 7.7 nm, respectively. In both 15 and 16, a molecular weight close to that required for a dimer (Table 3) was observed.

Figure 3. Top right: SAXS data for carboproteins 15 and 16. Top left: The corresponding p(r) functions. Bottom right: SAXS data for carboproteins 17 and 23. Bottom left: The corresponding p(r) functions.

The directing effects of the three templates were investigated by SAXS by a comparison of carboproteins 17, 18, and 23 (Table 3). Carboprotein 18, with the GalNAc template, unfortunately gave a scattering curve suggestive of a very high degree of aggregation; further dilution to 1.5 mg mL⁻¹ did not prevent this. The scattering data for carboproteins 17 and 23 were obtained and gave almost identical curves (Figure 3). They both had roughly bell-shaped p(r) functions, with Dmax values of 5.4 and 5.5 nm, respectively, whereas the corresponding non-templated peptide amide had a Dmax of 4.9 nm.[8] Furthermore, they both had apparent molecular weights expected for the monomers (Table 3).

The Rg and Dmax values for carboprotein analogues 19, with one additional Gly residue, and 20, with the Ahx linker, indicated that they both had considerably larger dimensions than carboprotein 17. The Dmax values for the two non-templated coiled-coil peptide amides corresponding to 19 and 20 were both … 4.6 nm;[8] this is significantly lower than the Dmax values of 12 and 54 nm for 19 and 20, respectively. The molecular weight of carboprotein 19 was close to that required for a monomer, whereas no molecular weight for carboprotein 20 could be determined, because no Guinier range was identified (see Table 3).

Finally, four-heptad carboproteins 21 (a CoilVaLd analogue) and 22 (a CoilSer analogue) both had molecular weights corresponding to the monomers. Carboprotein 22 was more elongated, with higher Dmax (11.7 nm) and Rg (2.9 nm) values than carboprotein 21 (Dmax = 8.9 nm and Rg = 2.0 nm). The coiled-coil peptide amides corresponding to carboprotein 22 (CoilSer-G-desE) and carboprotein 21 (CoilVaLd-YG) had Dmax values of 8.8 and 6.5 nm, respectively. The larger size of 22 was expected, because the CoilSer sequence has a preference for an antiparallel arrangement and not the parallel one that it would have in a bundle, through the C-terminal anchoring on the carbohydrate template. In contrast, 21 was a compact three-helix bundle.

Ab initio and rigid-body modeling of the SAXS data was performed. Ab initio shape models were obtained with the computer program DAMMIF.[20] Each model represents an average of 15 consecutive runs, and averaged models were computed by use of the programs Damaver and Supcomb.[21] For the three carboproteins 17, 21, and 22, the fifteen models included in the average were very similar, with low NSD values (< 1) as computed by Supcomb. The SAXS-reconstructed molecular envelopes of carboproteins 21 and 22 each appeared to be constituted of two parts. The molecular envelope of carboprotein 17 appeared to be more compact (and with no elongated part) than that of 22, which contained a spherical part and a thinner elongated part, whereas carboprotein 21 consisted of two elongated parts (Figure 4).

Figure 4. Reconstructed ab initio shape models of carboproteins 17, 21, and 22 viewed from different angles.

To analyze the data set for the relatively compact carboprotein 17 methodically, further analysis was conducted by modeling the chosen data set as various combinations of rigid parts and chain segments with use of the program Bunch.[22] This program combines rigid body modeling with a Monte-Carlo-based bead modeling approach and, therefore, allows for models consisting of joint chains and rigid body parts. The template portion of the carboprotein was accounted for by including an extra Asp residue at the C terminus in each of the three template-assembled peptides. Thus, by joining three extra Asp residues through a 6 Å contact constraint, a virtual bead model template was formed. For each model, multiple identical computations were performed, and the best-fitting models were selected as the final model result (Figure 5). On going from the top model to the bottom one (Figure 5), more degrees of freedom have been introduced; that is, rigid chain segments have been replaced by flexible parts, and this significantly improves the fit with the experimental data. Moreover, as can be seen from the χ² values on the right-hand side of the picture, the data could not be satisfactorily described by a model with three rigid helices. An improved fit could be achieved if one of the peptides was represented as a flexible chain, but by far the best fit was obtained if all three helices were allowed to arrange themselves freely, as was the case with the model fit 3. This model is noticeably more extended and less dense than any of the others, thus suggesting


Figure 6. Structures of carboproteins 17 and 23 (side views: A and B, top views: C and D) after molecular dynamics simulations. Corresponding initial structures are shown in Figure S1. Val and Leu are color coded red and green, respectively. a-Helices and random coils are shown in cylinder and tube representation, respectively, whereas the carbohydrate template and linkers are shown in licorice representation.

Figure 5. A) Rigid-body model fits. B) Combined chain and rigid-body models of carboprotein 17 computed with the program Bunch. Color codes. Blue: rigid helix. Magenta: flexible chain. Yellow: interconnected template beads. 1) Three rigid helices template-linked through flexible chains. 2) Two rigid helices template-linked through flexible chains plus one whole flexible peptide. 3) Three fully flexible peptides interconnected through template beads and constrained into threefold rotational symmetry. For this particular protein construct, all models are compatible with a three-helix bundle.

that carboprotein 17 was less compact than what would be expected for a native three-helix bundle.

Molecular dynamics (MD) simulations

MD simulations were carried out on models of 17 and 23 to support the interpretation of experimental results and the understanding of template effects on the dynamics of peptide chains. The MD simulations were based on carboprotein structures with all oxime moieties in the E configuration. The chain names O4, O6, O3, and O2 denote the three identical peptide chains linked to the carbohydrate template at oxygen atoms O-4, O-6, and O-3 in 17 and O-2, O-3, and O-4 in 23. Structures of 17 and 23 after the simulations are shown on the left and right of Figure 6, respectively. The MD structure of the non-templated peptide trimer from a previous study[8] is included in Figure S2 A for reference. The simulations indicate that both the non-templated peptide and 23 conserve the C-terminal structure to a greater

extent than 17. In the last case, displacement of the C-terminal part of the B-chain (attached to the C6 position of the carbohydrate template) involves Leu19, which is exposed to solvent at the end of the simulation (Figure 6). In contrast, Leu19 in the non-templated peptide and 23 remain hydrophobically associated. The looser association of the three helices at the C termini of 17 is reflected in the simulation-averaged interchain distance between α-carbons of Leu19 residues: for 17 this distance is 11.0 Å, in comparison with 8.6 Å for the non-templated peptide and 7.9 Å for 23. These in silico indications of stronger helical association in 23 motivated us to commence the synthesis of this carboprotein. From the MD simulations, the mean radii of gyration and maximum dimensions were measured for carboprotein 17 [Rg = (12.7 ± 0.1) Å, Dmax = (44.9 ± 1.4) Å] and carboprotein 23 [Rg = (12.9 ± 0.1) Å, Dmax = (46.2 ± 1.6) Å]. Carboprotein 17 thus exhibited a slightly shorter average maximum dimension (Dmax) than 23, consistent with the larger degree of C-terminal deformation in 17. The SAXS analysis gave significantly higher Dmax and Rg values than the MD simulations, one possible explanation being the addition of the solvation shell around the protein in the SAXS calculation.[23]

NMR spectroscopy

SAXS had revealed a compact, monomeric structure for carboprotein 23, so it was further studied by natural-abundance 1H and 13C NMR. Rewardingly, the good peak dispersion and relatively sharp peaks in the 1H spectrum indicated that carboprotein 23 was not a classical molten globule. Because of the degenerate structure of the carboprotein, with three identical peptide chains, the full 3D structure was not likely to be elucidated. However, the first 16 residues of carboprotein 23 could be unambiguously assigned (Figure 7 A). Only one set of signals was observed for all three peptides in the carboprotein, thus indicating a high degree of symmetry within the system. We were unable to assign the C termini of the peptide components, which were attached to the carbohydrate template. For residues Glu17–Gly23 we observed weak and relatively broad peaks in the TOCSY and NOESY spectra, indicating dynamics in the intermediate exchange regime. In addition, the weak peaks of the E and Z oxime protons at δ = 7.5–7.6 ppm and 6.9–7.0 ppm in the 1D 1H NMR spectrum, recorded with a fully deuterated sample, were also highly broadened. The signals for the oxime protons showed fine-splitting, indicating that several species were in chemical exchange (Figure 7 C), with an E/Z ratio of approximately 75:25.

Secondary chemical Cα and Hα shifts of the assigned residues consistently confirmed a high degree of helicity (Figure 7 A, B).[24] Interestingly, the Hα chemical shifts showed a regular pattern of larger and smaller secondary chemical shifts (Figure 7 C). In general, all Val and Glu residues had comparatively large secondary shifts, within a range of δ = −0.35 to −0.5 ppm [average secondary chemical shift was δ = (−0.26 ± 0.17) ppm], whereas Ala3, Ala4, Ser7, Gln10, Ala11, and Leu12 had secondary shifts in the δ = 0.0 to −0.2 ppm range. Finally, 1H and 13C NMR spectra were recorded at pH 7.0 (data not shown). The chemical shifts did not change significantly; this implied that the degree of burial did not change for Val, Leu, and Glu. However, additional NOEs were observed, thus indicating a slight stabilization of the structure with increased pH.

Figure 7. NMR analyses of carboprotein 23. A) Assignment panel. The first 16 residues of the peptides were assigned with the aid of a combination of H,H-NOESY and H,H-TOCSY spectra. A total of 25 short-range (i,i+1) and five medium-range (i,i+3 and i,i+4) NOEs could be unambiguously assigned and are summarized in the strip plots below the sequence. The bottom bar plot indicates the type of secondary structure according to the Cα secondary chemical shifts assigned from a natural-abundance 13C HSQC; a value of +1 corresponds to α-helix and a value of −1 to β-strand. B) Hα secondary chemical shifts. Hα secondary chemical shifts were in a range from δ = −0.2 ppm to −0.5 ppm. A negative value in general indicates α-helix structure, whereas a value closer to −0.2 ppm indicates a large degree of solvent exposure, and a value closer to δ = −0.5 ppm indicates burial of the residue. C) 1D 1H NMR spectrum in 100 % D2O, 50 mM NaOAc, pH 5.5 showing a zoom on the oxime protons. D) Buried residues according to the secondary chemical shifts. The observed degree of burial versus the expected degree of burial is indicated, following the nomenclature of the coiled-coil connectivity.

Discussion

The three-α-helix-bundle carboproteins were designed on the basis of our previous studies on carboproteins and coiled-coil peptides. The minimal three-α-helix-bundle carboproteins 15 and 16, which each contained three peptides based on a two-heptad sequence, revealed relatively high helical content by CD spectroscopy (62 and 69 %, respectively). This was later confirmed by SAXS, in which the molecular mass of each carboprotein corresponded to a dimer and the Dmax was higher than would be expected for a monomer. Higher oligomeric states were also observed by Sherman and coworkers in their three-helix-bundle TASP with a cyclotribenzylene (CTB) template, which by AUC indicated a monomer–dimer equilibrium.[25] Their results clearly demonstrate that, depending on the template and peptide sequences, TASP structures might have a tendency to form oligomers.[26]

Next, on the basis of the well-known de novo-designed sequence CoilVaLd, carboproteins with more complex three- and four-heptad sequences were studied, with use of the CoilSer sequence as a control. The original CoilSer peptide assembled into an antiparallel three-stranded coiled coil,[8] which would not be possible when anchored covalently at the C terminus. We thus expected it to form an unstable carboprotein. The nine carboproteins allowed us to compare the following: 1) effect of peptide length [three- vs. four-heptad (17, 21)], 2) effect of sequence propensity to form parallel versus antiparallel structures (21, 22), 3) effect of linker [Gly vs. Ahx (17, 19, 20)], and 4) effect of the template [GlcNAc vs. GalNAc vs. 6-deoxy-Glc (17, 18, 23)].

Rewardingly, SAXS analysis revealed the first carboprotein to be associated with a bell-shaped p(r)-function, thus indicating that carboprotein 17 was spherically shaped and without any elongated parts. This suggested a packed three-stranded bundle carboprotein (Figure 5). This initial SAXS data analysis led to further analysis of carboprotein 17 by CD spectroscopy. GdnCl denaturation of carboprotein 17 showed stability and cooperative unfolding (Figure 1, bottom).

Effect of peptide length: three- versus four-heptad (17, 21)

In the study of the effect of the peptide segment length, AUC indicated the presence of two species in the case of carboprotein 22, whereas 17, 18, and 23 were monomeric (Figure 2), and the SAXS molecular envelopes of carboproteins 17 and 23 were compact, whereas carboprotein 21 consisted of two elongated parts. Thus, in this case a three-heptad peptide segment in a carboprotein gave a more compact structure than the corresponding four-heptad carboprotein.

Effect of sequence propensity

The effect of the propensity of helices to form either parallel or antiparallel structures was elucidated by comparison of carboproteins 17, 21, and 22. Carboprotein 17 was highly α-helical (81 %) in relation to the corresponding non-templated peptide amide (41 %), and the helicity of carboprotein 21 (74 %) was almost the same as for the non-templated peptide amide (67 %). In sharp contrast, the α-helicity of carboprotein 22 was approximately half (43 %) that of the corresponding non-templated peptide amide (84 %),[8] likely because the CoilSer-derived sequences in 22 cannot adopt an antiparallel arrangement. SAXS data for carboproteins 17, 21, and 22 suggested that they were monomers by molecular mass determination, and this was confirmed by AUC for carboproteins 17 and 22.
Effect of linker

The effect of attaching a C-terminal spacer was studied. Carboprotein 19 incorporated an additional Gly residue in the sequence, in relation to 17, whereas 20 contained a longer six-carbon Ahx linker. Whereas carboprotein 19 had Rg (4.58 nm) and Dmax (12 nm) values higher than those of carboprotein 17 (Rg = 1.5 nm and Dmax = 4.58 nm), the longer linker on carboprotein 20 afforded higher-order self-assembled structures with very large Rg (19 nm) and Dmax (54 nm) values (Table 3). These data indicated that carboprotein 17, with no added spacer, was more compact. The compact structure was also evident from MD simulations, which suggested that additional structural flexibility afforded by a longer linker tended to destabilize the folding of the carboprotein. This clear demonstration of an effect from the spacer was remarkable and provided further validation of the carboprotein concept.
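The Rg and Dmax values compared above are both read from the pair distance distribution function p(r). As a hedged illustration of that relationship (not the authors' BioXTAS RAW workflow), the following Python sketch applies the standard identity Rg² = ∫r²p(r)dr / (2∫p(r)dr) to the analytical p(r) of a homogeneous sphere; the function names, the 2.0 nm test radius, and the integration grid are assumptions chosen only for illustration.

```python
import math


def sphere_pr(r, radius):
    """Unnormalized p(r) of a homogeneous sphere; zero outside 0 <= r <= 2*radius."""
    d_max = 2.0 * radius
    if r < 0.0 or r > d_max:
        return 0.0
    x = r / d_max
    return x * x * (2.0 - 3.0 * x + x ** 3)


def trapezoid(xs, ys):
    """Trapezoidal integration of tabulated ys over xs."""
    return sum(0.5 * (ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i])
               for i in range(len(xs) - 1))


def rg_from_pr(rs, prs):
    """Radius of gyration from p(r): Rg^2 = int r^2 p(r) dr / (2 int p(r) dr)."""
    numerator = trapezoid(rs, [r * r * p for r, p in zip(rs, prs)])
    denominator = trapezoid(rs, prs)
    return math.sqrt(numerator / (2.0 * denominator))


def dmax_from_pr(rs, prs, tol=1e-12):
    """First grid point after the last r at which p(r) is still above tol."""
    last = max(i for i, p in enumerate(prs) if p > tol)
    return rs[min(last + 1, len(rs) - 1)]


# Hypothetical test case: a 2.0 nm sphere sampled on a 0-5 nm grid.
grid = [i * 5.0 / 2000 for i in range(2001)]
p_values = [sphere_pr(r, 2.0) for r in grid]
print(rg_from_pr(grid, p_values), dmax_from_pr(grid, p_values))
```

For a sphere the result can be checked against the closed form Rg = √(3/5)·R, which is why a sphere was chosen here; a bell-shaped p(r) of this kind is the signature of a compact, globular particle.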


Effect of the template

The effect of the distance and geometry (here equatorial vs. axial) of hydroxy groups in the carbohydrate was studied by comparison of the epimers at the C-4 position (GlcNAc vs. GalNAc, 17, 18). Peptide aldehyde 12 was coupled to GalNAc template 5 to provide carboprotein 18. Carboprotein 18 had a rather low helical content (51 %). Unfortunately, it was not possible to obtain good quality scattering data for carboprotein 18, due to aggregation of the sample. However, AUC of carboprotein 18 afforded a molecular mass corresponding to a monomer, and no larger aggregation was seen in this sample (Figure 2).

A more comprehensive SAXS data analysis of carboprotein 17 by ab initio and rigid-body modeling suggested that carboprotein 17 formed a three-helix bundle with a somewhat floppy C terminus. This was also confirmed by an MD simulation showing that the C-6 peptide chain was very flexible at the C-terminal end and that the Leu19 position seemed to fluctuate in and out of the hydrophobic core. Encouraged by these observations we designed the new 6-deoxycarbohydrate template 7, allowing peptide conjugation at positions 2, 3, and 4 instead of positions 3, 4, and 6. Peptide aldehyde 12 was attached to this 6-deoxy Glcp template 7 through oxime coupling, affording carboprotein 23. CD spectroscopy indicated that carboprotein 23 had a high α-helical content (84 %). Furthermore, carboprotein 23 was studied by SAXS and AUC, and molecular weight calculations suggested a monomeric structure (Table 3).

Evaluation of all the biophysical data on carboproteins 17 and 23 showed that each had a high α-helical content, with 23 (84 %) slightly higher than 17 (81 %), and that they were both monomeric by SAXS and AUC calculations of molecular masses. The indirect Fourier-transformed SAXS data gave relatively small Rg values and Dmax values suggestive of more compact structures than any of the other carboproteins.
Furthermore, MD simulation of both systems supported three-helix-bundle structures, in which carboprotein 23 was more tightly packed than 17, especially at the C-terminal end of chain O6. Carboprotein 17 was less compact than expected for a native three-helix bundle. This illustrates that by combining design and synthesis with several biophysical techniques and molecular modeling it is possible to optimize a de novo-designed carboprotein. Again, in the comparison of 17 and 23 we observed a clear directing effect from the carbohydrate template.

Carboprotein 23 was finally studied by natural-abundance 1H and 13C NMR spectroscopy. Rewardingly, the first conclusion was that it was not a classical molten globule. The three peptide segments had identical chemical shifts, and the peptide part of the structure was thus symmetric. The value of the secondary chemical shift relates to the degree of solvent exposure of the residues investigated.[27] The three Val residues were clearly buried, as expected. In contrast, the two Leu residues in the assigned Tyr1–Val16 segment exhibited 1Hα secondary chemical shift values of only δ ≈ −0.2 ppm, indicating solvent exposure. The two Glu residues had 1Hα secondary chemical shift values of δ ≈ −0.4 to −0.5 ppm. However, the analyses of secondary chemical shifts reported by Avbelj et al. indicated that for Glu residues there was apparently no difference between buried and solvent-exposed residues, with an average value of δ = −0.3 ppm for the 1Hα secondary chemical shift.[27] Still, the large negative values of both Glu residues, and in particular for Glu5, suggest that instead of the expected burial of Val and Leu in the triple coiled-coil structure, the Val residues in the a positions were well buried, whereas the two Leu residues in the d positions were solvent-exposed (Figure 7 D). We speculate that there is a dynamic equilibrium between burial of the Leu side chains and the hydrophobic ethylene moieties of the neighboring Glu, which might lead to instability in the ordering of the three helices. One might suspect that at pH 5.5 the Glu side chain carboxylate group would be at least partially protonated, making it more hydrophobic. However, data acquired at pH 7 pointed only to small changes in the structure.

Several heptad sequences have hydrophobic residues not only in the a and d positions but also in the e positions: for example, GCN4(-p1), and some de novo sequences.[3a] This would be expected to contribute to some instability in the structure. Carboprotein 23 has Glu in the e positions; however, the ethylene moiety in Glu could allow for hydrophobic contacts. When carboprotein 23 was dissolved in 100 % D2O, its amide groups all exchanged within one hour (Figure 7 B), which pointed to some instability of the structure. Comparison with MD simulations for 17 and 23 is relevant; these indicated that 23 was more tightly associated, as seen by the distances between the Leu19 α-carbons. Thus, the use of templates with axial/equatorial orientations (Table S2) together with comparatively short linkers had a directing effect.
The template in 17, while stabilizing the three-helix, prevented optimal folding, whereas the template in 23 instead allowed it, especially for the N-terminal part of the structure. The cavitein Q4, which featured adjoining Leu in the heptad, crystallized as an asymmetric eight-helix dimer, but was monomeric in solution at concentrations of up to at least 100 µM, and existed in a monomer–dimer equilibrium at ≈2.7 mM.[6] In comparison, carboproteins 17, 18, and 23 reported here were monomeric at concentrations of 50–400 µM.

Conclusion

The three-helix carboproteins designed and reported here led to the identification of the first, relatively compact, monomeric carboprotein according to SAXS. The first 16 residues of the peptides were assigned in the NMR spectra, which revealed a symmetric structure with a partially buried core. Rewardingly, the good chemical shift dispersity showed that it was not a molten globule. We demonstrated: 1) an effect of peptide length, because the three-heptad peptide segment in a carboprotein gave a more compact structure than the corresponding four-heptad carboprotein, 2) an effect of sequence propensity, because a peptide segment with an inherent propensity to form antiparallel bundles led to a destabilized carboprotein, 3) an effect of the linker, because a flexible linker led to a less stable and more flexible carboprotein, 4) an effect of the carbohydrate template, in the formation of a three-helix structure with short three-heptad sequences, and 5) a subtle effect of the carbohydrate template, evident through the distance and geometry of the 6-deoxy- versus 2-deoxycarbohydrate template. The C termini of the peptide segments and the linker remained somewhat flexible, thus indicating that the carboprotein structure can be further optimized.

Experimental Section

General procedures: Organic solvents and Nα-9-fluorenylmethyloxycarbonyl (Fmoc) amino acids were obtained from Iris Biotech (Marktredwitz, Germany). Preparative and analytical HPLC was performed with a Dionex Ultimate 3000 instrument and Chromeleon 6.80SP3 software. Peptide aldehydes were purified by preparative RP-HPLC on a FEF 300 Å C4 column (5 µm, 20 × 250 mm) at a flow of 10.0 mL min−1 with a linear gradient of increasing buffer B (10 % B to 100 % B) over 37 min [buffer A: TFA (0.1 %) in H2O; buffer B: TFA (0.1 %) in CH3CN]. Carboproteins were purified by preparative RP-HPLC on a FEF 300 Å C4 column (5 µm, 20 × 250 mm) at a flow of 10.0 mL min−1 with a linear gradient of increasing buffer B (10 % B to 100 % B; buffers as for peptide aldehydes) over 48 min. Mass analysis was performed with an ESI-MS mass spectrometer (MSQ Plus, Thermo). Peptide aldehydes and carboproteins were analyzed with a Phenomenex Gemini 110 Å C18 column (3 µm, 4.6 × 50 mm) or a Phenomenex Gemini 110 Å C4 column (3 µm, 4.6 × 50 mm) at a flow of 1.0 mL min−1 with a linear gradient of increasing buffer B over 10–20 min [buffer A: formic acid (0.1 %) in H2O; buffer B: formic acid (0.1 %) in CH3CN]. 1H (300 MHz), 13C (75 MHz), and COSY NMR spectra were recorded with a Bruker 300 NMR spectrometer and a BBO probe. The chemical shifts are referenced to the residual solvent signals. Assignments were aided by H,H COSY experiments.

Synthesis of templates

Methyl 2-acetamido-2-deoxy-D-glucopyranoside (1): Amberlite (14 g, prewashed with methanol) was added to a solution of 2-acetamido-2-deoxy-D-glucopyranose (4.5 g, 20.34 mmol) in MeOH (135 mL), and the mixture was heated at reflux for 16 h. After having cooled, the suspension was filtered, and the Amberlite was washed with MeOH (100 mL). Concentration to dryness and drying in vacuo gave a yield of 3.6 g (75 %).
1H NMR (300 MHz, CD3OD, α/β ratio 9:1): δ = 4.65 (d, J = 3.5 Hz, 1 H; H-1α), 4.31 (d, J = 8.4 Hz, 1 H; H-1β) (the rest of the spectrum only assigned for α), 3.90 (dd, J = 3.5, 10.6 Hz, 1 H; H-2), 3.83 (dd, J = 2.3, 11.9 Hz, 1 H; H-6), 3.72, 3.69 (dd, J = 5.6, 11.8 Hz, 1 H; H-6′), 3.63 (dd, J = 8.7, 10.6 Hz, 1 H; H-3), 3.56–3.51 (m, 1 H; H-5), 3.37 (s, 3 H; OCH3), 3.31 (m, 1 H; H-4), 1.9 ppm (s, 3 H; NHCOCH3); 13C NMR (75 MHz, CD3OD): δ = 173.7, 99.8, 73.6, 72.9, 72.3, 62.7, 55.5, 55.3, 22.6 ppm.

Methyl 2-acetamido-2-deoxy-D-galactopyranoside (2): Amberlite (1.4 g, prewashed with methanol) was added to a solution of 2-acetamido-2-deoxy-D-galactopyranose (450 mg, 2.03 mmol) in MeOH (15 mL), and the mixture was heated at reflux for 16 h. After having cooled, the suspension was filtered and the Amberlite was washed with MeOH (10 mL). Concentration to dryness and drying in vacuo gave a yield of 307 mg (65 %). 1H NMR (300 MHz, CD3OD): δ = 4.69 (d, J = 3.7 Hz, 1 H; H-1), 4.27 (dd, J = 3.7, 10.9 Hz, 1 H; H-2), 3.88 (d, J = 3.2 Hz, 1 H; H-5), 3.77–3.70 (m, 4 H; H-3, -4, -6, -6′), 3.37 (s, 3 H; OCH3), 1.98 ppm (s, 3 H; NHCOCH3).

Methyl 2-acetamido-2-deoxy-3,4,6-tri-O-(Boc2-Aoa)-α-D-glucopyranoside (3): Methyl 2-acetamido-2-deoxy-D-glucopyranoside (1, 68.2 mg, 0.29 mmol) and N,N-(Boc)2-Aoa-OH (507 mg, 1.74 mmol) were dissolved in pyridine/CH2Cl2 (1:1, 5 mL), and the mixture was stirred for 1 h with molecular sieves (3 Å). Then 4-dimethylaminopyridine (DMAP, 21 mg, 0.17 mmol) and N,N′-diisopropylcarbodiimide (DIC, 266 µL, 1.72 mmol) were added. After 30 min, LC-MS showed several products, and additional DIC (266 µL) was added. After 1 h, LC-MS showed formation of only one product. The suspension was concentrated to dryness and taken up in CH3CN (10 mL), filtered, and purified by prep. HPLC. Yield 213 mg (70 %). 1H NMR (300 MHz, CDCl3): δ = 5.96 (d, J = 9.4 Hz, 1 H; NH), 5.31 (dd, J = 9.4, 10.6 Hz, 1 H; H-5), 5.15 (t, J = 9.7 Hz, 1 H; H-4), 4.75 (d, J = 3.5 Hz, 1 H; H-1), 4.57–4.56 (m, 4 H; CH2), 4.42–4.28 (m, 5 H; CH2, H-2, H-6, -6′), 4.03–3.97 (m, 1 H; H-3), 3.34 (s, 3 H; OCH3), 2.02 (s, 3 H; NHCOCH3), 1.51 ppm (s, 54 H; Boc); 13C NMR (75 MHz, CDCl3): δ = 171.3, 167.7, 166.8, 166.1, 150.1, 150.0, 97.9, 84.7, 84.6, 84.5, 77.4, 77.2, 77.0, 76.6, 72.4, 72.2, 72.0, 71.8, 68.7, 67.2, 62.5, 55.5, 52.0, 28.2, 28.0, 22.9 ppm. ESI-MS: m/z calcd for C45H74N4O24: 1054.47; found: 1055.2 [M+H]+, 855.2 [M−2 Boc+3 H]+, 755.2 [M−3 Boc+4 H]+, 655.2 [M−4 Boc+5 H]+, 455.2 [M−6 Boc+7 H]+.

Methyl 2-acetamido-2-deoxy-3,4,6-tri-O-(Boc2-Aoa)-α-D-galactopyranoside (4): The procedure used was the same as for methyl 2-acetamido-2-deoxy-3,4,6-tri-O-(Boc2-Aoa)-α-D-glucopyranoside; yield 163 mg (53 %). 1H NMR (300 MHz, CDCl3): δ = 6.04 (d, J = 8.0 Hz, 1 H; NH), 5.45 (d, J = 3.2 Hz, 1 H), 5.26 (dd, J = 3.1, 11.1 Hz, 1 H), 4.84 (d, J = 3.5 Hz, 1 H; H-1), 4.62–4.42 (m, 6 H), 4.27–4.14 (m, 4 H), 3.39 (s, 3 H; OCH3), 2.02 (s, 3 H; NHCOCH3), 1.52 ppm (s, 54 H; Boc); MS: m/z calcd for C45H74N4O24: 1054.47; found: 1072.3 [M+NH4]+, 855.2 [M−2 Boc+3 H]+, 755.1 [M−3 Boc+4 H]+, 655.0 [M−4 Boc+5 H]+.
Methyl 2-acetamido-2-deoxy-3,4,6-tri-O-(Aoa)-α-D-glucopyranoside (5): Methyl 2-acetamido-2-deoxy-3,4,6-tri-O-(Boc2-Aoa)-α-D-glucopyranoside (43 mg, 41 µmol) was dissolved in TFA/CH2Cl2 (1:1, 3 mL), and the mixture was stirred at RT for 30 min. The solution was then concentrated to dryness, redissolved in H2O (2 mL), and lyophilized. The template was obtained as a fine, highly hygroscopic powder. Yield 30 mg (≈100 %, incl. 3 × TFA). Used immediately.

Methyl 2-acetamido-2-deoxy-3,4,6-tri-O-(Aoa)-α-D-galactopyranoside (6): The procedure used was the same as for methyl 2-acetamido-2-deoxy-3,4,6-tri-O-(Aoa)-α-D-glucopyranoside (5). Used immediately.

Methyl 6-deoxy-2,3,4-tri-O-(Boc2-Aoa)-α-D-glucopyranoside (7): Methyl 6-deoxy-α-D-glucopyranoside (25 mg, 0.140 mmol) and N,N-(Boc)2-Aoa-OH (163 mg, 0.561 mmol) were dissolved in pyridine/CH2Cl2 (1:1, 3 mL), and the mixture was stirred for 1 h with molecular sieves (3 Å). Then DMAP (7 mg, 0.056 mmol) and DIC (87 µL, 0.561 mmol) were added. After 30 min, LC-MS showed several products, so additional DIC (97 µL) was added. After 1 h, LC-MS showed formation of only one product. The suspension was concentrated to dryness and taken up in CH3CN (10 mL), filtered, and purified twice by prep. HPLC. Yield 39.7 mg (28 %). 1H NMR (300 MHz, CDCl3): δ = 5.44 (dd, J = 9.6, J = 8.9, 1 H; H-3), 4.90 (d, J = 3.6 Hz, 1 H; H-1), 4.84–4.78 (m, 1 H; H-4), 4.46 (s, 2 H; CH2), 4.42 (s, 2 H; CH2), 4.38 (s, 2 H; CH2), 3.87–3.82 (m, 1 H; H-5), 3.31 (s, 3 H; OCH3), 1.5 (s, 54 H; Boc), 1.15 ppm (d, J = 6.2 Hz, 3 H; CH3); 13C NMR (75 MHz, CDCl3): δ = 166.7, 166.4, 166.4, 150.2, 150.1, 150.0, 96.1, 84.6, 84.6, 84.5, 74.3, 72.5, 72.2, 72.0, 71.7, 70.5, 64.7, 55.3, 28.1, 27.9, 17.2 ppm; ESI-MS: m/z calcd for C43H71N3O23: 997.45; found: 1015.3 [M+NH4]+, 698.3 [M−3 Boc+4 H]+, 598. [M−4 Boc+5 H]+.
Methyl 6-deoxy-2,3,4-tri-O-(Aoa)-α-D-glucopyranoside (8): The procedure used was the same as for methyl 2-acetamido-2-deoxy-3,4,6-tri-O-(Aoa)-α-D-glucopyranoside. Used immediately.
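The ESI-MS assignments reported for the Boc-protected templates (protonated ions, ammonium adducts, and ions that have lost Boc groups) can be cross-checked with simple arithmetic. The Python sketch below is an illustrative aid, not part of the reported procedure; the function names are hypothetical, and the constants are standard values for the proton mass, the ammonium ion, and the net loss of C5H8O2 when one Boc group is replaced by hydrogen.

```python
PROTON = 1.007276      # mass of H+ in Da
AMMONIUM = 18.033826   # mass of NH4+ in Da
BOC_NET_LOSS = 100.052429  # C5H8O2: a Boc group (C5H9O2) replaced by H


def mz_protonated(neutral_mass, charge):
    """m/z of [M + zH]z+ for a neutral mass M and charge z."""
    return (neutral_mass + charge * PROTON) / charge


def mz_ammonium_adduct(neutral_mass):
    """m/z of the singly charged [M + NH4]+ adduct."""
    return neutral_mass + AMMONIUM


def mz_after_boc_loss(neutral_mass, n_boc):
    """m/z of the singly charged ion [M - n Boc + (n+1) H]+."""
    return neutral_mass - n_boc * BOC_NET_LOSS + PROTON


# Calculated masses taken from the text: 1054.47 (C45H74N4O24), 997.45 (C43H71N3O23).
for n in (2, 3, 4, 6):
    print(n, round(mz_after_boc_loss(1054.47, n), 2))
print(round(mz_ammonium_adduct(997.45), 2))
```

The same `mz_protonated` helper applies to the multiply charged peptide and carboprotein ions, where the quoted calculated masses appear to be average rather than monoisotopic values; at the reported precision the difference in the proton term is negligible.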


General procedure for the synthesis of C-terminal peptide aldehydes: TentaGel S-NH2 resin (0.26 mmol g−1, 385 mg, 0.1 mmol) was washed with N-methylpyrrolidone (NMP, 3 ×), CH2Cl2 (3 ×), and NMP (3 ×). o-PALdehyde (107 mg, 4 equiv) was dissolved in NMP (4 mL) and activated with HBTU (148 mg, 3.9 equiv), HOBt (54 mg, 4 equiv), and DIPEA (124 µL, 7.2 equiv). After 5 min the red solution was transferred to the resin and irradiated (microwave) at 60 °C (10 min). Then the resin was washed with NMP (3 ×), CH2Cl2 (3 ×), and NMP (3 ×), and the above step was repeated once more. Unreacted amino groups were capped by acylation with Ac2O/CH2Cl2 (1:3, 4 mL, 30 min) and then the resin was washed successively with NMP (3 ×) and CH2Cl2 (3 ×). The resin was washed with MeOH (3 ×) and with AcOH (1 %) in MeOH (3 ×). Aminoacetaldehyde dimethyl acetal (105 mg, 10 equiv) and NaBH3CN (63 mg, 10 equiv) were dissolved in AcOH (1 %) in MeOH (4 mL) and added to the PALdehyde-functionalized resin. The mixture was irradiated (microwave) at 60 °C (10 min), and the resin was then washed with MeOH (3 ×), NMP (3 ×), and CH2Cl2 (3 ×). Fmoc-Gly-OH (297 mg, 10 equiv) was dissolved in CH2Cl2/DMF (9:1, 4 mL), and DIC (63 mg, 5 equiv) was added to form the symmetrical anhydride. After 10 min the solution was transferred to the resin and irradiated (microwave) at 60 °C (10 min), followed by washing with NMP (3 × 10 mL) and CH2Cl2 (3 × 10 mL). Unreacted amines were capped with Ac2O/CH2Cl2 (1:3, 4 mL, 30 min), and the resin was washed with CH2Cl2 (5 × 10 mL) and dried in vacuo. By Fmoc quantification of a resin sample, the loading was calculated to be 0.15 mmol g−1. Further chain elongation proceeded as described above for peptides. Peptide release was performed with TFA/H2O (95:5) for 2 h at RT. The TFA solutions were concentrated by use of nitrogen flow, and the peptides were precipitated with diethyl ether to yield the crude compounds as white powders. Purities are based on HPLC traces at 215 nm.
Ac-YEELLKKLEELLKKAG-H (9): Analytical RP-HPLC: purity 98 %. ESI-MS: m/z calcd: 1929.3; found: 957.0 [M−H2O+2 H]2+, 644.5 [M+3 H]3+, 483.6 [M+4 H]4+.

Ac-EWALEKKLAALESKLQALEKKLEALEKGG-H (10): Analytical RP-HPLC: purity 95 %. ESI-MS: m/z calcd: 3250.7; found: 1617.0 [M−H2O+2 H]2+, 1078.5 [M+3 H]3+, 809.2 [M−H2O+4 H]4+.

Ac-YEVEALEKKVAALESKVQALEKKVEALEKGG-H (11): Analytical RP-HPLC: purity 98 %. ESI-MS: m/z calcd: 3411.9; found: 1133.0 [M−H2O+3 H]3+, 850.0 [M−H2O+4 H]4+, 683.3.6 [M+5 H]5+.

Ac-YVAALESKVQALEKKVEALEKGG-H (12): Analytical RP-HPLC: purity 97 %. ESI-MS: m/z calcd: 2486.8; found: 1234.9 [M−H2O+2 H]2+, 823.9 [M−H2O+3 H]3+, 622.3 [M+4 H]4+.

Ac-YVAALESKVQALEKKVEALEKGGG-H (13): Analytical RP-HPLC: purity 98 %. ESI-MS: m/z calcd: 2544.1; found: 1273.1 [M+2 H]2+, 849.0 [M+3 H]3+, 636.9 [M+4 H]4+.

Ac-YVAALESKVQALEKKVEALEKGG-Ahx-G-H (14): Analytical RP-HPLC: purity 98 %. ESI-MS: m/z calcd: 2542.7; found: 1272.2 [M+2 H]2+, 848.8 [M+3 H]3+, 636.7 [M+4 H]4+.

Synthesis of carboproteins

General synthesis of three-stranded carboprotein

Methyl [Ac-YEELLKKLEELLKKAG-(Aoa)]3GlcNAc (15): The lyophilized template methyl 2-acetamido-2-deoxy-3,4,6-tri-O-(Aoa)-α-D-glucopyranoside (2.2 mg, 3.0 µmol) was dissolved in NaOAc buffer (pH 4.76, 0.1 M, 1 mL), and the hexadecapeptide aldehyde 9 (23 mg, 12 µmol) dissolved in acetonitrile (1 mL) was added. The reaction mixture was stirred at RT for 2 h. LC-MS showed the major observed peak as the desired product. The reaction was worked up by prep. HPLC, which afforded carboprotein 15 as a white powder. Yield 6.0 mg (≈32 %, incl. 12 TFA). Analytical RP-HPLC: purity 100 %. ESI-MS calcd for C285H476N64O87: 6187.1; found: 1239.0 [M+5 H]5+, 1032.8 [M+6 H]6+, 885.3 [M+7 H]7+, 774.7 [M+8 H]8+, 688.8 [M+9 H]9+.

Methyl [Ac-YEELLKKLEELLKKAG-(Aoa)]3GalNAc (16): Yield 4.0 mg (17 %). Analytical RP-HPLC: purity 100 %. ESI-MS: m/z calcd for C285H476N64O87: 6187.1; found: 1548.7 [M+4 H]4+, 1239.1 [M+5 H]5+, 1032.8 [M+6 H]6+.

Methyl [Ac-YVAALESKVQALEKKVEALEKGG-(Aoa)]3GlcNAc (17): Yield 7.6 mg (48 %). Analytical RP-HPLC: purity 100 %. ESI-MS: m/z calcd: 7855.4; found: 1310.6 [M+6 H]6+, 1123.9 [M+7 H]7+, 983.5 [M+8 H]8+.

Methyl [Ac-YVAALESKVQALEKKVEALEKGG-(Aoa)]3GalNAc (18): Yield 2.2 mg (≈10 %). Analytical RP-HPLC: purity 95 %. ESI-MS: m/z calcd: 7855.4; found: 1310.5 [M+6 H]6+, 1123.6 [M+7 H]7+, 983.5 [M+8 H]8+.

Methyl [Ac-YVAALESKVQALEKKVEALEKGGG-(Aoa)]3GlcNAc (19): Yield 5.2 mg (30 %). Analytical RP-HPLC: purity 96 %. ESI-MS: m/z calcd: 8026.1; found: 1004.9 [M+8 H]8+, 893.6 [M+9 H]9+, 804.3 [M+10 H]10+.

Methyl [Ac-YVAALESKVQALEKKVEALEK-Ahx-G-(Aoa)]3GlcNAc (20): Yield 3.1 mg (18 %). Analytical RP-HPLC: purity 95 %. ESI-MS: m/z calcd: 8023.3; found: 1148.8 [M+7 H]7+, 1104.8 [M+8 H]8+, 893. [M+9 H]9+.

Methyl [Ac-YEVEALEKKVAALESKVQALEKKVEALEKGG-(Aoa)]3GlcNAc (21): Yield 1.3 mg (6 %). Analytical RP-HPLC: purity 100 %. ESI-MS: m/z calcd: 10 642.1; found: 1774.7 [M+6 H]6+, 1521.7 [M+7 H]7+, 1331.4 [M+8 H]8+.

Methyl [Ac-EWALEKKLAALESKLQALEKKLEALEKGG-(Aoa)]3GlcNAc (22): Yield 5.7 mg (42 %). Analytical RP-HPLC: purity 97 %. ESI-MS: m/z calcd: 10 150.3; found: 1129.0 [M+8 H]8+, 1016.2 [M+9 H]9+, 924.0 [M+10 H]10+.
Methyl 6-deoxy [Ac-YVAALESKVQALEKKVEALEKGG-(Aoa)]3glucopyranoside (23): Yield 1.5 mg (9.6 %). Analytical RP-HPLC: purity 97 %. ESI-MS calcd: 7798.3; found: 1116.1 [M+7 H]7+, 976.6 [M+8 H]8+, 867.6 [M+9 H]9+.

Circular dichroism spectroscopy (Figure 1): All carboproteins were dissolved in acetate buffer (pH 5.5, 50 mM). Carboprotein concentrations were determined from UV absorbance with use of a molar extinction coefficient (ε280) of 1490 M−1 cm−1 for tyrosine. Far-UV CD data were recorded with a JASCO J815 instrument calibrated with ammonium d-camphor-10-sulfonate (ACS). All spectra were recorded at room temperature with use of a 0.01 cm cell path length and a peptide concentration of about 1 mg mL−1. Each resulting Δε value is based on the molar concentration of peptide bond.

Analytical ultracentrifugation (Figure 2): Sedimentation velocity and equilibrium experiments were performed with a Beckman XL-I analytical centrifuge at 25 °C, in which the sedimentation was measured at an absorbance wavelength of 280 nm. The sedimentation velocity experiments were performed at 50 000 rpm on an AN50 Ti rotor, and the data were analyzed by use of a c(S) model implemented by SEDFIT.[28] The sedimentation equilibrium experiments were performed at 25 000, 30 000, and 35 000 rpm and the data were fitted to Equation (1):


$$C_r = C_{r_0}\,\exp\!\left[\frac{M_w\,(1-\bar{\nu}\rho)\,\omega^{2}\,(r^{2}-r_0^{2})}{2RT}\right] + E \qquad (1)$$

where $C_r$ is the peptide concentration at radius $r$, $C_{r_0}$ is the concentration of monomeric peptide at $r_0$, $\omega$ is the angular velocity, $R$ is the gas constant equal to 8.314 × 10^7 erg mol−1 K−1, $T$ is the temperature in Kelvin, $M_w$ is the molecular mass, $\bar{\nu}$ is the partial specific volume of the peptide calculated by use of the additivity scheme described in Makhatadze et al.,[29] $\rho$ is the density of the solvent, and $E$ is the baseline offset. All fits were done with use of nonlinear regression.

SAXS measurements and data processing (Figures 3–5)

Sample preparation and initial characterization: Samples (5 and 10 mg mL−1) were prepared by dissolving lyophilized sample (1 mg) in buffer [pH 5.5, 100 µL, NaOAc (50 mM)]. The concentrations were controlled by measuring absorption at 280 nm with a Nanodrop spectrophotometer (NanoDrop Technologies). SAXS measurements were performed with the X33 small-angle X-ray scattering beamline of the European Molecular Biology Laboratory (EMBL) at the storage ring DORIS III of the Deutsches Elektronen Synchrotron (DESY) with use of standard procedures. Data were collected with a MAR345 image plate detector covering a range of 0.01 < q < 0.496 Å−1 (q = 4π sinθ/λ), where 2θ is the scattering angle and λ is the X-ray wavelength (λ = 1.1 Å in the present experiment). All SAXS measurements were performed at 24 ± 1 °C. The scattering intensities of buffer backgrounds were measured both before and after the sample, and the averaged background scattering was subtracted from the scattering of the sample according to standard procedures. Reference solutions of bovine serum albumin (BSA) of known concentration (≈5 mg mL−1) were used for absolute calibration (cm2 cm−3). As an independent check, absolute calibration was performed with water; the deviation of the two methods was found to be within 2 %.
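Equation (1) of the sedimentation equilibrium analysis can be evaluated directly, which is a convenient sanity check on fitted parameters. The Python sketch below is a minimal illustration and not the authors' fitting code; the function names are hypothetical, CGS units are assumed throughout (radii in cm, R in erg mol−1 K−1), and the partial specific volume (0.73 mL g−1), solvent density (1.0 g mL−1), and radial positions in the example call are placeholder assumptions rather than values from this study.

```python
import math

R_GAS = 8.314e7  # gas constant in erg mol^-1 K^-1 (CGS units)


def omega_from_rpm(rpm):
    """Angular velocity in rad s^-1 from rotor speed in revolutions per minute."""
    return rpm * 2.0 * math.pi / 60.0


def equilibrium_concentration(r, r0, c_r0, m_w, v_bar, rho, rpm, temp_k, baseline=0.0):
    """Concentration at radius r (cm) for a single ideal species, Equation (1).

    c_r0: concentration at the reference radius r0 (cm)
    m_w: molecular mass (g mol^-1); v_bar: partial specific volume (mL g^-1)
    rho: solvent density (g mL^-1); baseline: offset E
    """
    omega = omega_from_rpm(rpm)
    exponent = (m_w * (1.0 - v_bar * rho) * omega ** 2 * (r * r - r0 * r0)
                / (2.0 * R_GAS * temp_k))
    return c_r0 * math.exp(exponent) + baseline


# Illustrative call: calculated mass of carboprotein 17 (7855.4) at 30 000 rpm, 25 degC.
# v_bar, rho, and the radii are assumed placeholder values.
print(equilibrium_concentration(r=7.1, r0=6.9, c_r0=1.0, m_w=7855.4,
                                v_bar=0.73, rho=1.0, rpm=30000, temp_k=298.15))
```

In an actual fit, $M_w$, $C_{r_0}$, and $E$ would be the free parameters of a nonlinear regression against the measured absorbance profile, typically fitted globally across the three rotor speeds.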
With the aid of the BioXTAS RAW[30] software the 2D CCD images were transformed into the 1D I(q) representation.[31] In the process of preliminary data analysis, the obtained scattering intensities I(q) were converted into a direct space representation in terms of the pair distance distribution function p(r) by means of indirect Fourier transformation by use of the IFT method based on Bayesian statistics built into BioXTAS RAW. For proteins, which have almost uniform electron density, the p(r) function can be considered a histogram of interparticle distances. Thus, both the I(q) and p(r) plots contain information about the particle shape and size.

Monte Carlo shape determination: The 3D low-resolution structures of the carboproteins 15, 16, and 17 were recovered by use of Monte-Carlo-based bead modeling as originally developed by Chacon et al.[32] In this work, however, the program Dammif was employed instead. This program is basically a parallelized fast version of the popular Monte Carlo modeling program Dammin.[33] In both programs, the dummy atom models are allowed to evolve according to the SAXS data by choosing a dummy atom at random and changing its phase assignment from protein to solvent or vice versa. Further, for the purpose of escaping local minima, a simulated annealing protocol is utilized in both programs. The ab initio models were obtained by use of a maximum of 400 annealing steps with a maximum of 10^5 successful (phase shifts improving the fit) iterations and 10^6 iterations in total in each step. Lower limits of 50 successful iterations in each annealing step were also imposed. Fifteen computations were carried out for each data set, and averaged models were computed by use of the programs Damaver and Supcomb.[22] For the three carboproteins 15, 16, and 17, the fifteen models included in the average were very similar, and this was also reflected in the NSD values as computed by Supcomb, which in all cases were below one.

NMR spectroscopy (Figure 7): 1D 1H NMR spectra were recorded at 25 °C in NaOAc buffer (pH 5.5, 50 mM) in H2O/D2O 90:10 or in 100 % D2O, each containing DSS (2,2-dimethyl-2-silanepentane-5-sulfonic acid, 12.5 mm) for referencing. For the assignment, a protein solution (≈500 µM) was prepared in a NaOAc buffer (pH 5.5, 50 mM) containing D2O (10 %, v/v) and DSS (12.5 mm). The HN and Hα protons were assigned by use of the inter- and intraresidue peaks in the TOCSY and NOESY spectra recorded at 25 °C with a Varian 800 MHz spectrometer and use of standard pulse programs from the Varian BioPack. For the assignment of the Cα chemical shifts, a natural-abundance 13C HSQC was recorded in a fully deuterated NaOAc buffer (pD 5.1, 50 mM) containing DSS (12.5 mm) at 25 °C. The spectra were acquired by use of the following parameters: 1D 1H NMR spectra: spectral width = 13 020.8 Hz (1H); number of transients was 2048. TOCSY: 256 increments in the t1 dimension. Spectral width = 13 020.8 Hz (1H); number of transients was 92. NOESY: 256 increments in the t1 dimension. Spectral width = 13 020.8 Hz (1H); number of transients was 72. 13C HSQC: 188 increments in the t1 dimension. Spectral widths = 12 001.2 Hz (1H) and 15 088.8 Hz (13C); number of transients was 188.

NMR data processing: The X-carrier frequency was determined by referencing to internal DSS. The DSS frequency was obtained from a 1D 1H experiment recorded immediately before the remaining experiments. Indirect referencing was used in the 13C dimension by use of conversion factors from Wishart et al.[34] The spectra were processed with nmrDraw/nmrPipe.[35] Spectrometer frequencies and carrier frequencies in ppm were inserted with four decimals. Zero-filling to the nearest power of 2 was used.
The processed spectra were analyzed with CCPN Analysis.[36] Cα chemical shifts (δobs) were obtained from the natural-abundance 13C HSQC spectrum. Hα chemical shifts (δobs) were obtained from the TOCSY spectrum. To derive the secondary chemical shifts, the predicted random coil chemical shifts (δref) taken from[37] were subtracted from the observed chemical shifts according to Equation (2):

$$\Delta\delta = \delta_{\mathrm{obs}} - \delta_{\mathrm{ref}} \qquad (2)$$
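Equation (2) and the interpretation of Hα secondary shifts given in the caption of Figure 7 (negative values indicate helix; values near −0.2 ppm suggest solvent exposure and values near −0.5 ppm suggest burial) can be sketched in a few lines of Python. This is an illustrative reading aid only: the function names, the −0.35 ppm midpoint used to separate the two categories, and the example shifts are assumptions, not values or criteria taken from the analysis itself.

```python
def secondary_shift(delta_obs, delta_ref):
    """Secondary chemical shift, Equation (2): observed minus random-coil reference (ppm)."""
    return delta_obs - delta_ref


def classify_h_alpha(delta_delta, midpoint=-0.35):
    """Rough label for an H-alpha secondary shift (ppm).

    The midpoint between -0.2 ppm (exposed) and -0.5 ppm (buried) is an
    assumed cutoff chosen for illustration only.
    """
    if delta_delta >= 0.0:
        return "not helical"
    if delta_delta <= midpoint:
        return "helical, buried"
    return "helical, solvent-exposed"


# Hypothetical observed and random-coil H-alpha shifts (ppm) for two residues.
for name, obs, ref in [("residue A", 3.95, 4.35), ("residue B", 4.15, 4.32)]:
    dd = secondary_shift(obs, ref)
    print(name, round(dd, 2), classify_h_alpha(dd))
```

As noted in the Discussion, such a cutoff is residue-dependent (Glu, for example, shows little difference between buried and exposed states), so any single threshold should be treated as a coarse guide.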

Molecular dynamics simulations (Figure 6)

Initial 3D structures of carboproteins 17 and 23 were generated by merging separately generated structures for the peptide part and the carbohydrate templates. The 3D structure of the common triple-helical peptide part of carboproteins 17 and 23 was built with MODELLER9v7[38] and use of the crystal structure of CoilVaLd (PDB ID: 1COI) as template.[7a] Because the C-terminal part of CoilVaLd deviates from α-helicity, the sequence alignment was performed by starting from the N terminus. Symmetry restraints were used in the modeling procedure to ensure C3 symmetry of the triple-helical peptides. The carbohydrate templates including linkers of carboproteins 17 and 23 were constructed in Maestro.[39] The templates were subjected to conformational search by the Monte Carlo multiple minimum (MCMM)[40] method with the carbohydrate ring fixed and variations allowed in all dihedrals in the exocyclic groups. A conservative choice of 10 000 steps was made for the MCMM search to produce ≈3000 conformers within an energy window of 25 kJ mol−1.

Merging protein and template models to yield the carboprotein models requires near-perfect overlap of the terminal atoms in both substructures. To facilitate this overlap, conformers of the template that were geometrically compatible with attachment to the triple-helical part of the carboprotein were first selected. Next, the three exposed Cα atoms of the template linker were brought into register with the appropriate three Cα atoms in the peptide part by use of restricted geometry optimization with overlapping target positions of the complementary Cα atoms. The template and protein structures were then merged within Maestro.[40] Because strained geometry occurred at the sites of merging, a geometry optimization scheme was employed to relax the structure locally: first, the carbohydrate part was optimized while the atoms in the peptide were restrained with a 500 kJ mol⁻¹ restraining potential; then the carbohydrate was restrained and the protein was optimized; finally, 200 steps of unrestrained geometry optimization were carried out on the entire carboprotein. The resulting model was used for molecular dynamics simulations.

Molecular dynamics simulations were performed with Desmond 3.0.[41] With the aid of the Desmond System Builder, carboproteins 17 and 23 were each immersed in a cubic box of TIP3P water providing a minimum layer of 15 Å of water on each side of the macromolecule. Prior to molecular dynamics simulations, a steepest descent minimization to a gradient of 1 kcal mol⁻¹ Å⁻¹ was carried out. This was followed by the default pre-simulation protocol of Desmond, consisting of 1) minimization with restraints on solute, 2) unrestrained minimization, 3) Berendsen[42] NVT simulation, T = 10 K, small time steps, restraints on heavy solute atoms, 4) Berendsen NPT simulation, T = 10 K, restraints on solute heavy atoms, 5) Berendsen NPT simulation with restraints on heavy solute atoms, and 6) unrestrained Berendsen NPT simulation. Following the relaxation protocol, a 20 ns NPT simulation was carried out for each of carboproteins 17 and 23. The temperature was regulated with the Nosé–Hoover chain thermostat[43] with a relaxation time of 1.0 ps. The pressure was regulated with the Martyna–Tobias–Klein barostat[44] with isotropic coupling and a relaxation time of 2.0 ps. The RESPA integrator[45] was employed with bonded, near, and far time steps of 2.0, 2.0, and 6.0 fs, respectively. MD trajectories were saved at 20 ps intervals. The OPLS_2005 force field[46] was used for minimizations and simulations. A 9 Å cutoff was used for nonbonded interactions, and the smooth particle mesh Ewald[47] method with a tolerance of 10⁻⁹ was used for long-range Coulomb interactions. MD trajectories were analyzed with VMD.[48] Mean radii of gyration and intraprotein maximum atom–atom distances were calculated by use of 100 equally spaced snapshots from the MD trajectories. Molecular images were generated with VMD. For additional computational details, see Section S1 in the Supporting Information.

Acknowledgements

A NABIIT grant from DSF (K.J.J.) is gratefully acknowledged. This research was supported by the BioNEC center funded by Villum Fonden (K.J.J.). Beam time at the X33 beamline at the European Molecular Biology Laboratory (EMBL) at the storage ring DORIS III is gratefully acknowledged. In this context we thank Alexey Kikhney for the support received at the beamline. This work was supported in part by grants from the Danish Research Councils (BBK 12–128803).

ChemBioChem 2015, 16, 1905 – 1918

www.chembiochem.org


© 2015 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim

Keywords: bioorganic chemistry · carbohydrates · oximes · peptides · proteins

[1] a) D. Grell, J. S. Richardson, M. Mutter, J. Pept. Sci. 2001, 7, 146–151; b) W. F. DeGrado, Z. R. Wasserman, J. D. Lear, Science 1989, 243, 622–628; c) W. F. DeGrado, Science 1997, 278, 80–81; d) B. I. Dahiyat, S. L. Mayo, Science 1997, 278, 82–87; e) G. V. Nikiforovich, G. R. Marshall in Peptide and Protein Design For Biopharmaceutical Applications, Vol. 1 (Ed.: K. J. Jensen), Wiley, Chichester, UK, 2009, pp. 5–48.
[2] a) M. Mutter, S. Vuilleumier, Angew. Chem. Int. Ed. Engl. 1989, 28, 535–554; Angew. Chem. 1989, 101, 551–571; b) M. Mutter, E. Altmann, K.-H. Altmann, R. Hersperger, P. Koziej, K. Nebel, G. Tuchscherer, S. Vuilleumier, H.-U. Gremlich, Helv. Chim. Acta 1988, 71, 835–847.
[3] a) A. R. Mezo, J. C. Sherman, J. Am. Chem. Soc. 1999, 121, 8983–8994; b) T. Sasaki, E. T. Kaiser, Biopolymers 1990, 29, 79–88; c) T. Sasaki, E. T. Kaiser, J. Am. Chem. Soc. 1989, 111, 380–381; d) P. E. Dawson, S. B. H. Kent, J. Am. Chem. Soc. 1993, 115, 7263–7266; e) A. K. Wong, M.-P. Jacobsen, D. J. Winzor, D. P. Fairlie, J. Am. Chem. Soc. 1998, 120, 3836–3841; f) J. Kwak, A. D. Capua, E. Locardi, M. Goodman, J. Am. Chem. Soc. 2002, 124, 14085–14091.
[4] K. J. Jensen, J. Brask, Cell. Mol. Life Sci. 2002, 59, 859–869.
[5] R. Høiberg-Nielsen, A. P. Tofteng, K. K. Sørensen, M. Roessle, D. I. Svergun, P. W. Thulstrup, K. J. Jensen, L. Arleth, ChemBioChem 2008, 9, 9–11.
[6] J. O. Freeman, W. C. Lee, M. E. P. Murphy, J. C. Sherman, J. Am. Chem. Soc. 2009, 131, 7421–7429.
[7] a) N. L. Ogihara, M. S. Weiss, W. F. DeGrado, D. Eisenberg, Protein Sci. 1997, 6, 80–88; b) B. Lovejoy, S. Choe, D. Cascio, D. K. McRorie, W. F. DeGrado, D. Eisenberg, Science 1993, 259, 1288–1293.
[8] L. Malik, J. Nygaard, N. J. Christensen, W. Streicher, P. W. Thulstrup, L. Arleth, K. J. Jensen, J. Pept. Sci. 2013, 19, 283–292.
[9] a) E. Fischer, Ber. Dtsch. Chem. Ges. 1893, 26, 2400–2412; b) U. Zehavi, N. Sharon, J. Org. Chem. 1972, 37, 2141–2145.
[10] J. Brask, K. J. Jensen, J. Pept. Sci. 2000, 6, 290–299.
[11] B. C. Gibb, A. R. Mezo, J. C. Sherman, Tetrahedron Lett. 1995, 36, 7587–7590.
[12] a) K. J. Jensen, J. Alsina, M. F. Songster, J. Vagner, F. Albericio, G. Barany, J. Am. Chem. Soc. 1998, 120, 5441–5452; b) U. Boas, J. Brask, J. B. Christensen, K. J. Jensen, J. Comb. Chem. 2002, 4, 223–228; c) F. Guillaumie, J. C. Kappel, K. M. Kelly, G. Barany, K. J. Jensen, Tetrahedron Lett. 2000, 41, 6131–6135.
[13] a) S. L. Pedersen, K. K. Sørensen, K. J. Jensen, Pept. Sci. 2010, 94, 206–212; b) M. Brandt, S. Gammeltoft, K. J. Jensen, Int. J. Pept. Res. Ther. 2006, 12, 349–357.
[14] J. F. Brandts, L. J. Kaplan, Biochemistry 1973, 12, 2011–2024.
[15] J. Brask, K. J. Jensen, Bioorg. Med. Chem. Lett. 2001, 11, 697–700.
[16] Y.-H. Chen, J. T. Yang, K. H. Chau, Biochemistry 1974, 13.
[17] J. Brask, J. M. Dideriksen, J. Nielsen, K. J. Jensen, Org. Biomol. Chem. 2003, 1, 2247–2252.
[18] A. O. Tofteng, T. H. Hansen, J. Brask, J. Nielsen, P. W. Thulstrup, K. J. Jensen, Org. Biomol. Chem. 2007, 5, 2225–2233.
[19] a) M. M. Santoro, D. W. Bolen, Biochemistry 1992, 31, 4901; b) M. M. Santoro, D. W. Bolen, Biochemistry 1988, 27, 8063.
[20] D. Franke, D. I. Svergun, J. Appl. Crystallogr. 2009, 42, 342–346.
[21] V. V. Volkov, D. I. Svergun, J. Appl. Crystallogr. 2003, 36, 860–864.


[22] M. V. Petoukhov, D. I. Svergun, Biophys. J. 2005, 89, 1237–1250.
[23] S. Ebbinghaus, S. J. Kim, M. Heyden, X. Yu, U. Heugen, M. Gruebele, D. M. Leitner, M. Havenith, Proc. Natl. Acad. Sci. USA 2007, 104, 20749–20752.
[24] D. S. Wishart, B. D. Sykes, F. M. Richards, Biochemistry 1992, 31, 1647–1651.
[25] A. S. Causton, J. C. Sherman, J. Pept. Sci. 2002, 8, 275–282.
[26] J. O. Freeman, J. C. Sherman, Chem. Eur. J. 2011, 17, 14120–14128.
[27] F. Avbelj, D. Kocjan, R. L. Baldwin, Proc. Natl. Acad. Sci. USA 2004, 101, 17394–17397.
[28] P. Schuck, Biophys. J. 2000, 78, 1606–1619.
[29] G. I. Makhatadze, V. N. Medvedkin, P. L. Privalov, Biopolymers 1990, 30, 1001–1010.
[30] S. S. Nielsen, K. N. Toft, D. Snakenborg, M. G. Jeppesen, J. K. Jakobsen, B. Vestergaard, J. P. Kutter, L. Arleth, J. Appl. Crystallogr. 2009, 42, 959–964.
[31] O. Glatter, O. Kratky, Small Angle X-ray Scattering, Academic Press, London, 1982.
[32] P. Chacón, F. Moran, J. F. Diaz, E. Pantos, J. M. Andreu, Biophys. J. 1998, 74, 2760–2775.
[33] D. I. Svergun, Biophys. J. 1999, 76, 2879–2886.
[34] D. S. Wishart, C. G. Bigam, J. Yao, F. Abildgaard, H. J. Dyson, E. Oldfield, J. L. Markley, B. D. Sykes, J. Biomol. NMR 1995, 6, 135–140.
[35] F. Delaglio, S. Grzesiek, G. W. Vuister, G. Zhu, J. Pfeifer, A. Bax, J. Biomol. NMR 1995, 6, 277–293.
[36] W. F. Vranken, W. Boucher, T. J. Stevens, R. H. Fogh, A. Pajon, M. Llinas, E. L. Ulrich, J. L. Markley, J. Ionides, E. D. Laue, Proteins 2005, 59, 687–696.
[37] M. Kjaergaard, S. Brander, F. Poulsen, J. Biomol. NMR 2011, 49, 139–149.
[38] A. Šali, L. Potterton, F. Yuan, H. van Vlijmen, M. Karplus, Proteins Struct. Funct. Bioinform. 1995, 23, 318–326.
[39] L. Schrödinger, Maestro, 9.2, New York, 2011.
[40] L. Schrödinger, MacroModel, 9.9, New York, 2011.
[41] Desmond Molecular Dynamics System, D. E. Shaw Research, New York, NY, 2011.
[42] H. J. C. Berendsen, J. P. M. Postma, W. F. van Gunsteren, A. DiNola, J. R. A. Haak, J. Chem. Phys. 1984, 81, 3684–3690.
[43] S. Nosé, J. Chem. Phys. 1984, 81, 511–519.
[44] G. J. Martyna, D. J. Tobias, M. L. Klein, J. Chem. Phys. 1994, 101, 4177–4189.
[45] M. E. Tuckerman, B. J. Berne, G. J. Martyna, J. Chem. Phys. 1992, 97, 1990–2001.
[46] J. L. Banks, H. S. Beard, Y. Cao, A. E. Cho, W. Damm, R. Farid, A. K. Felts, T. A. Halgren, D. T. Mainz, J. R. Maple, R. Murphy, D. M. Philipp, M. P. Repasky, L. Y. Zhang, B. J. Berne, R. A. Friesner, E. Gallicchio, R. M. Levy, J. Comput. Chem. 2005, 26, 1752–1780.
[47] U. Essmann, L. Perera, M. L. Berkowitz, T. Darden, H. Lee, L. G. Pedersen, J. Chem. Phys. 1995, 103, 8577–8593.
[48] W. Humphrey, A. Dalke, K. Schulten, J. Mol. Graphics 1996, 14, 33–38.

Manuscript received: June 5, 2015 Accepted article published: July 6, 2015 Final article published: August 14, 2015

