RESOLUTION OF PROTEIN STRUCTURE BY MASS SPECTROMETRY

Elien Vandermarliere,1,2 Elisabeth Stes,1,2 Kris Gevaert,1,2 and Lennart Martens1,2*

1 Department of Medical Protein Research, VIB, B-9000, Ghent, Belgium
2 Department of Biochemistry, Ghent University, B-9000, Ghent, Belgium

Received 18 June 2014; accepted 14 October 2014 Published online in Wiley Online Library (wileyonlinelibrary.com). DOI 10.1002/mas.21450

Typically, mass spectrometry is used to identify the peptides present in a complex peptide mixture and subsequently the precursor proteins. As such, mass spectrometry focuses mainly on the primary structure, the (modified) amino acid sequence of peptides and proteins. In contrast, the three-dimensional structure of a protein is typically determined with protein X-ray crystallography or NMR. Despite the close relationship between these two aspects of protein studies (sequence and structure), mass spectrometry and structure determination are not frequently combined. Nevertheless, this combination of approaches, dubbed conformational proteomics, can offer insight into the function, working mechanism, and conformational status of a protein. In this review, we will discuss the developments at the intersection of mass spectrometry-based proteomics and protein structure determination, starting from a brief overview of the classic approaches to determine protein structure along with their advantages and disadvantages. We will subsequently discuss the ability of mass spectrometry to overcome some of the hurdles of these classic methods. Finally, we will provide an outlook on the interplay of mass spectrometry and protein structure determination, and highlight several recent experiments in which mass spectrometry was successfully used to either aid or complement structure elucidation.

© 2014 Wiley Periodicals, Inc. Mass Spec Rev 9999:1–13, 2014

Keywords: conformational proteomics; mass spectrometry; protein dynamics; protein structures

I. INTRODUCTION

Ever since the first protein structure was determined in the late 1950s (Kendrew et al., 1958), these 'static' images of a protein were assumed to carry all the information on the function of that protein. However, it quickly became clear that proteins are dynamic molecules that can undergo (sometimes dramatic) conformational changes that are essential for their biological function (Tompa, 2005). Hence, knowledge of the dynamic properties of a protein structure is of utmost importance to fully understand the function of a protein.

Contract grant sponsor: Belgian government agency for Innovation by Science and Technology; Contract grant number: IWT 110431; Contract grant sponsor: Scientific Research (FWO)-Flanders.
*Correspondence to: Lennart Martens, Department of Medical Protein Research, Universiteit Gent-VIB A, Baertsoenkaai 3, B-9000, Gent, Belgium. E-mail: [email protected]

Mass Spectrometry Reviews, 2014, 9999, 1–13 © 2014 by Wiley Periodicals, Inc.

The classic approach to determine protein structure is X-ray crystallography; however, this method provides essentially static images with little information on protein flexibility, because molecules are frozen in fixed positions within the crystal lattice (McPherson, 2003). Information on flexibility can, however, be retrieved with nuclear magnetic resonance (NMR). Here, the structure is determined in solution, so flexibility is retained and can be measured (Ishima & Torchia, 2000). NMR can, however, only be routinely applied to proteins up to 30 kDa [with notable exceptions (Rudiger et al., 2002; Horst et al., 2005; Bertini et al., 2008)], which excludes many multi-domain proteins from analysis. Yet it is especially in these multi-domain proteins that conformational plasticity is very important: signal transfer within multi-domain proteins frequently occurs through alterations in the structure. Therefore, knowledge of conformational dynamics is essential to fully understand the function and mode of action of a protein.

Mass spectrometry (MS), on the other hand, focuses on the primary structure of proteins (amino acid sequence), as it is typically used to identify peptides present in complex peptide mixtures and subsequently precursor proteins, in so-called bottom-up approaches. Despite the general knowledge that sequence guides structure, mass spectrometry analysis is only infrequently linked with structure determination. However, the combination of protein structure with mass spectrometry in the field of conformational proteomics can provide additional insight into the function, mechanism, and conformational status of a protein. Conformational proteomics might, therefore, allow us to unravel the structural plasticity of proteins, and in particular of multi-domain proteins and protein complexes for which this information is currently very difficult to obtain via classic structure determination methods.
In this review, we will start with a short overview of the classic approaches to identify protein structure, and will discuss advantages and disadvantages associated with these techniques. Subsequently, we will discuss the power of mass spectrometry to overcome some of the specific drawbacks and limitations of these classic methods. Table 1 provides a summary of the advantages and limitations of each of the methods described. Finally, we will provide an outlook based on several experiments in which mass spectrometry was successfully used to either aid or complement structure determination.

A. Classic Approaches to Protein Structure Determination

In this part of the review, we provide a description of the standard techniques to elucidate the structure of a protein: protein X-ray crystallography, NMR, and electron microscopy.


VANDERMARLIERE ET AL.

TABLE 1. Advantages and limitations of the different methods

The classic approaches

Protein X-ray crystallography
  Advantages: High resolution can be obtained, with information on each atom within the protein.
  Limitations: Large amounts of very pure protein sample are needed. A protein crystal is needed. Not appropriate for large complexes, membrane proteins, and highly flexible proteins.

Nuclear magnetic resonance
  Advantages: Flexibility within a protein can be visualized. Information about each atom within the structure.
  Limitations: Protein size is limited to 30 kDa. High concentration of pure protein is needed.

Electron microscopy
  Advantages: Allows analysis of large complexes.
  Limitations: Only visualization of the shape of proteins; no individual atoms are resolved.

Spectroscopic techniques
  Advantages: The proteins remain in solution.
  Limitations: Only information on the overall conformation of the protein.

Mass spectrometry-based methods

Methods: limited proteolysis; hydrogen-deuterium exchange; mass spectrometry-based cross-linking; native mass spectrometry; ion mobility mass spectrometry; mass spectrometry-based footprinting.
Advantages: Moderate amounts of sample are needed. Solvent accessible regions can be delineated. Multiple cross-linkers can be used within one experiment. Very important to unravel the architecture of large multi-protein complexes. Allows a broad range of solution conditions and proteases. Allows analysis of endogenously expressed complexes. Analysis of disordered proteins.

For each technique, a brief overview of its advantages and disadvantages is given.

TABLE 1 (continued). Limitations of the mass spectrometry-based methods: Proteolysis conditions might disturb the native conformation. High-resolution information can only be obtained if the data are combined with NMR or protein crystallographic data. Back-exchange of deuterium might occur. Weak or transient interactions can get lost during copurification. Unnatural interactions might be identified. High doses of radicals might cause protein unfolding. Weak or transient interactions can get lost during copurification. Analysis of membrane proteins is not possible. Large protein-ligand complexes cannot be analyzed.

1. X-ray Crystallography

The first protein structure, that of sperm whale myoglobin (Kendrew et al., 1958), was determined in the late 1950s with protein X-ray crystallography. Ever since, protein crystallography has been the archetypal approach to determine the structure of a protein at the atomic level. To produce the image of a protein, which is composed of atoms separated by an average bond length of 1.5 Å, electromagnetic waves in the X-ray range of the spectrum are needed. When a protein is irradiated with such X-rays, these rays are scattered by the electrons in the protein molecule. However, unlike electromagnetic waves in the visible light spectrum, which can be focused with a lens to produce an image, no lens exists to gather scattered X-rays and focus them into an image. Instead, the scattered waves are recorded as a diffraction pattern. Fourier synthesis is applied to this diffraction pattern to produce a meaningful image of the protein in the form of an electron density map. The structure of the protein is subsequently built into this electron density map. It is, however, important to point out that the diffraction pattern of a single protein molecule is too weak to be detected; thus, signal amplification is required. This amplification is achieved through the use of a protein crystal, which is a three-dimensionally ordered array of proteins (Drenth, 1999; Blow, 2002; McPherson, 2003).

When protein X-ray crystallography was still in its infancy, only one or two structures were published per year, and this only after several years of intensive work for each structure. In the last decade, however, many structures have been solved every year. In 2013, for instance, 8,872 structures were deposited into the Protein Data Bank (PDB) (Berman et al., 2000), whereas in 1993 and in 1983, only 622 and 36 structures were deposited, respectively (Fig. 1). This recent large increase in deposited structures is due to several important evolutions that made protein X-ray crystallography much easier to use. The processing of diffraction patterns has greatly benefitted from the availability of high-speed computers, whereas experimental acquisition of patterns has become much faster thanks to the use of higher energy imaging beams. The introduction of synchrotron radiation as X-ray source allowed experiments to be shortened from a few days to a few hours, and third-generation synchrotrons can now reduce this time to only a few seconds. Moreover, the use of synchrotron radiation made it possible to obtain atomic resolution (defined as 1.2 Å) and ultrahigh-resolution diffraction data from protein crystals despite smaller crystal sizes (Dauter, Jaskolski, & Wlodawer, 2010). This introduction of synchrotron radiation as X-ray source is coupled to the introduction of cryocrystallography. During data collection, crystals are cooled to 100 K to reduce the heat load generated by the intense X-ray beam and to decrease radiation-induced decay of the crystal (Garman, 2003). Another evolution is the implementation of Se-Met multiple- and single-wavelength anomalous dispersion (MAD and SAD) to solve the phase problem inherent to protein crystallography. Moreover, the use of MAD and SAD is enhanced by the use of recombinant methods for protein production (Dauter, 2002). The very latest advances in protein crystallography, which are still in their infancy, allow one to map macromolecular dynamics. 'Structure photographs' are taken by recording a complete diffraction pattern in about 1 ns (Jaskolski, 2010).
FIGURE 1. Annual number of structures released by the PDB per experimental method. The graph gives the number of structures per structure determination method released by the PDB (Berman et al., 2000) per year. Most structures released by the PDB have been determined by X-ray crystallography, with NMR and EM lagging far behind. Furthermore, the last decade shows an explosive growth in the number of structures determined by X-ray crystallography, due to advances made in the technology over this time period.

Eventually, X-ray beams many billions of times brighter will allow the diffraction pattern of a single molecule to be collected, which will bypass the need for a crystal altogether (Hajdu, 2000). Despite this bright future for protein X-ray crystallography, the need for a protein crystal currently remains an important limitation to the high-throughput application of this approach. Indeed, certain proteins, including integral membrane proteins, large macromolecular complexes, disordered proteins, and multi-domain proteins, are nearly impossible to crystallize. In addition, the crystallization step is often time-consuming: many hundreds of crystallization conditions need to be tested, and it might take up to one year for a crystal to grow in a specific crystallization condition. Moreover, the current approach allows visualization of only one conformation of the protein, and this conformation might not even be the natural conformation of the protein in solution. Contacts within the crystal often force loop regions into unnatural conformations, which results in a distorted picture of the protein structure.
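The Fourier synthesis step described in this section can be illustrated with a one-dimensional toy model. The sketch below is not a crystallographic implementation: it assumes point-like atoms in a hypothetical unit cell and, importantly, assumes that both amplitudes and phases of the structure factors are known, whereas a real diffraction experiment records intensities only (the phase problem mentioned above). The atom positions, electron counts, and function names are illustrative assumptions.

```python
import cmath
import math

def structure_factors(atoms, h_max):
    """Return {h: F(h)} for point atoms given as (fractional_x, electrons)."""
    return {
        h: sum(f * cmath.exp(2j * math.pi * h * x) for x, f in atoms)
        for h in range(-h_max, h_max + 1)
    }

def fourier_synthesis(factors, n_grid):
    """Sum the structure factors back into a density sampled on n_grid points."""
    density = []
    for k in range(n_grid):
        x = k / n_grid
        rho = sum(F * cmath.exp(-2j * math.pi * h * x) for h, F in factors.items())
        density.append(rho.real)  # imaginary parts cancel for a real density
    return density

# Hypothetical 1D "unit cell" with two atoms (6 and 8 electrons).
atoms = [(0.20, 6.0), (0.55, 8.0)]
factors = structure_factors(atoms, h_max=10)
density = fourier_synthesis(factors, n_grid=100)

# The highest density peak should sit on the heavier atom.
peak = max(range(100), key=lambda k: density[k])
print("highest peak at x =", peak / 100)
```

Truncating the sum at a smaller `h_max` broadens the peaks, which is the toy analogue of collecting diffraction data to a lower resolution.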

2. Nuclear Magnetic Resonance

In 1984, NMR was applied for the first time to determine a protein structure, that of the proteinase inhibitor IIA from bull seminal plasma (Williamson, Havel, & Wüthrich, 1985). After a short period of skepticism (it was thought that this structure had instead been modeled after the crystal structure of a homologous protein; Wüthrich, 1995), NMR quickly became one of the classic methods to determine protein structure. However, NMR still lags behind X-ray crystallography in popularity, among other reasons because of its very high running, labeling, and installation costs, and the lack of automated assignment software. This low popularity is reflected in the much lower number of submissions to the PDB of protein structures determined with NMR compared to those determined with X-ray crystallography (Fig. 1).

In a typical NMR experiment, a protein in solution is placed into a strong magnetic field. Radio frequency signals are sent through the protein sample, where the nuclei of the individual atoms absorb different frequencies. The absorption of these signals is subsequently measured for the resonance of 1H (or 13C) as the chemical shift. The difference in absorbance depends on the local environment of the atoms within the protein (each atom within a protein is subjected to a unique environment) and is used to reconstruct the chemical links between the atoms (Wilson & Walker, 2000).

Even though NMR had already been frequently used to solve the structure of small molecules, several advances had to be combined before the technique became accessible for protein structure determination. Advances in computational tools and increased processing power enabled interpretation of the large amounts of data generated from a protein NMR experiment, and sequence-specific assignment of the thousands of acquired peaks. Furthermore, the nuclear Overhauser effect (NOE) was implemented as an experimentally accessible parameter to yield information instrumental for fold determination, and multidimensional NMR techniques allowed efficient data collection (Wüthrich, 2001). Moreover, high polarizing magnetic fields became available for high-resolution NMR, which allow improved intrinsic sensitivity and peak separation (Wider & Wüthrich, 1999).

The main advantage of a protein structure determined with NMR over X-ray crystallography is that the protein is analyzed in solution, and thus retains its flexibility, which ultimately can be measured. NMR can, therefore, be used to obtain dynamic information and, hence, is a powerful technique to unravel the dynamics intimately related to the function of a protein (Ishima & Torchia, 2000). Moreover, NMR can be applied to partially unfolded proteins, which are unsuited for X-ray crystallography. Unfortunately, NMR is limited to a protein size of 15 kDa. Larger structures (up to 30 kDa) can be characterized with the aid of labels, but there are currently only a few structures above 25 kDa solved with NMR. Techniques such as TROSY and CRINEPT that allow structure solution up to 100 kDa are available (Wider & Wüthrich, 1999), but were implemented in only a limited number of studies (Rudiger et al., 2002; Horst et al., 2005; Bertini et al., 2008). Another disadvantage of NMR spectroscopy is the high concentration and purity of protein required: ideally, a 2–5 mM solution of a protein that is as pure as possible. Unfortunately, many biological macromolecules aggregate at these concentrations (Wüthrich, 1995).
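To make the concentration requirement more tangible, the molar range quoted above can be converted into a mass concentration. The short sketch below is only a unit conversion; the function name is hypothetical, and the 15 kDa protein is taken from the size limit mentioned in the text.

```python
def mass_concentration_mg_per_ml(molar_mm, mass_kda):
    """Convert a molar concentration (mM) into a mass concentration (mg/mL)."""
    # 1 mM of a 1 kDa molecule is 1 mmol/L x 1 g/mmol = 1 g/L = 1 mg/mL.
    return molar_mm * mass_kda

# The 2-5 mM range quoted in the text, for a 15 kDa protein.
for molar_mm in (2.0, 5.0):
    print(molar_mm, "mM ->", mass_concentration_mg_per_ml(molar_mm, 15.0), "mg/mL")
```

For a 15 kDa protein, the 2–5 mM range thus corresponds to 30–75 mg/mL, which helps explain why aggregation is a frequent problem at these concentrations.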

3. Electron Microscopy

X-ray crystallography and NMR are the methods of choice to obtain information at high resolution on single proteins or small complexes. However, knowledge of the three-dimensional spatial arrangement of proteins within larger complexes, organelles, and even complete cells is equally important. Visualization of these superstructures is achieved with the aid of electron microscopy (EM). EM allows analysis of purified, isolated complexes as well as individual objects such as organelles, cells, and tissue sections in the 7–30 Å resolution range. In analogy with a conventional light microscope, EM uses a radiation source (an electron source) to irradiate the sample. The scattered electron beams subsequently pass through a series of lenses and the resulting image is recorded (Orlova & Saibil, 2011). This image is, however, a two-dimensional projection of the sample. To obtain a three-dimensional image, the sample is viewed from different angles, and the final image is synthesized from these multiple views (Radon, 1917; Baumeister, 2005). The introduction of cryogenic methods in the EM field allows vitrification of the sample with rapid freezing. Vitrification ensures a close to native, hydrated preservation of the molecular and cellular structures, and as such, allows snapshots to be taken of dynamic events within the cell (Baumeister, 2005).

Typically, an EM map has a resolution that ranges from 7 to 30 Å. Hence, individual amino acids cannot be differentiated from each other with EM and, consequently, it is not possible to obtain information about the structural properties of a single residue. In the 20–30 Å (low) resolution range, large individual domains might be recognizable by their shapes. In the high resolution range (6–9 Å), alpha-helices can be resolved, whereas beta-strands become visible at a resolution beyond 4.5 Å (Orlova & Saibil, 2011).

Over the full resolution range of EM, map interpretation can be facilitated by the availability of the atomic structure of the components. These atomic structures can be used for fitting based on density correlation. In a first step, a target density map at the same resolution as the EM map is calculated from the atomic structure. Next, a cross-correlation search is used to align the two densities (Rossmann, 2000; Chacón & Wriggers, 2002). Hence, upon availability of the individual atomic structures (either determined experimentally or obtained by modeling) and with the aid of fitting, a large polypeptide complex can be reconstructed at the near-atomic level.
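The two fitting steps described above (calculating a target density at the resolution of the EM map, followed by a cross-correlation search) can be sketched in one dimension. This is a deliberately simplified illustration under stated assumptions: a real fit searches three translations and three rotations in a 3D map, whereas the sketch only scans cyclic shifts of a hypothetical 1D density profile. All names and values are illustrative and do not come from any fitting package.

```python
import math

def gaussian_blur(profile, sigma):
    """Low-pass a 1D density profile to mimic a lower-resolution target map."""
    radius = int(3 * sigma)
    kernel = [math.exp(-0.5 * (i / sigma) ** 2) for i in range(-radius, radius + 1)]
    norm = sum(kernel)
    n = len(profile)
    return [
        sum(kernel[j + radius] * profile[(i + j) % n] for j in range(-radius, radius + 1)) / norm
        for i in range(n)
    ]

def correlation(a, b):
    """Pearson correlation coefficient between two equally long profiles."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def best_shift(em_map, target):
    """Exhaustive search for the cyclic shift of target that best matches em_map."""
    n = len(em_map)
    scores = [correlation(em_map, target[-s:] + target[:-s]) for s in range(n)]
    best = max(range(n), key=lambda s: scores[s])
    return best, scores[best]

# Hypothetical atomic model: three point densities on a 40-point grid.
atomic = [0.0] * 40
for position, weight in ((5, 6.0), (8, 8.0), (12, 7.0)):
    atomic[position] = weight

target = gaussian_blur(atomic, sigma=2.0)   # step 1: target map at map resolution
em_map = target[-17:] + target[:-17]        # stand-in "EM map": same density, displaced
shift, score = best_shift(em_map, target)   # step 2: cross-correlation search
print("best shift:", shift, "correlation:", round(score, 3))
```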

4. Other Methods

Besides protein crystallography, NMR, and EM, several other methods are available. Similar to NMR, information on the flexibility of a protein can be retrieved with small-angle X-ray scattering (SAXS). This approach relies on the interaction of radiation with particles in solution, which cause small, measurable deviations to the direction of incident radiation (Koch et al., 2003). Nowadays, SAXS is frequently used to study unfolded or partially unfolded proteins either as a function of time or as a function of different solution conditions (Lipfert & Doniach, 2007).

Several spectroscopic techniques are also frequently used to obtain overall structural information, although all of these techniques suffer from very low resolving power and are mainly useful to assess broad structural characteristics of a protein or a complex. The most commonly used method is circular dichroism (CD). CD is based on the principle that left-handed and right-handed circularly polarized light is not equally absorbed upon passage through a solution of chiral molecules such as proteins. The resulting spectrum is a plot of ellipticity versus wavelength (Wilson & Walker, 2000). CD is ideally suited for secondary structure determination and is, therefore, frequently used to verify the conformation of recombinant proteins or to study the effect of a mutation on protein structure (Martin & Schilstra, 2008; Greenfield, 2009). Other fields in which CD can play an important role are ligand- and drug-binding studies (Siligardi et al., 2014), protein folding studies, and analysis of protein stability (Kelly & Price, 2000). CD measurements in the far-UV spectrum (far-UV CD) allow evaluation of overall features of the secondary structure and quantification of relative proportions of alpha-helices, beta-sheets, and random coils, whereas near-UV CD allows one to monitor tertiary structure and the micro-environment of aromatic chromophores in a protein.
Tryptophan fluorescence spectroscopy is a specific application of fluorescence spectroscopy that makes use of the intrinsic fluorescence of aromatic residues, with tryptophan as the main contributor. Tryptophan fluorescence depends on the polarity of the local environment and is, therefore, ideal to analyze conformational differences like the folding and unfolding dynamics of a protein or the effect of ligand binding (Vivian & Callis, 2001). Additionally, infrared (Demirdöven et al., 2004) and Raman spectroscopy (Balakrishnan et al., 2008) can be used to estimate conformational features and transitions of a protein.

B. Mass Spectrometric Approaches to Determine Protein Structure

At the time of writing, the number of known protein structures [91,796 available in the PDB as of April 2014, www.pdb.org (Berman et al., 2000)] remains small compared to the number of annotated protein sequences [542,782 sequences in UniProtKB (release 2014_3) (The Uniprot Consortium, 2014)]. In order to bridge this gap between sequence and structure, alternative, high-throughput methods to explore protein structure are needed. Although these higher-throughput approaches typically preclude comprehensive, high-detail structural information, their value lies in providing either a limited yet detailed view on a part of the structure, or a sensitive yet coarse overall picture of structural dynamics. Several mass spectrometry-based methods serve as alternative tools to unravel protein structure and hence deviate from the typical focus of mass spectrometry on identification and quantification of these macromolecules. In this respect, native mass spectrometry, which is often referred to as structural mass spectrometry, is the most obvious approach. Here, an intact protein or protein complex is analyzed to obtain information on its stoichiometry (Heck, 2008; Lanucara et al., 2014). However, the applications of choice to obtain more detailed structural information are limited proteolysis, hydrogen-deuterium exchange, MS footprinting, and cross-linking, all of which exploit the surface accessibility of amino acids within a protein structure. The properties of these approaches are discussed in detail below.

1. Limited Proteolysis

Proteolysis is a key mechanism in much of biology that has important roles in biochemical processes such as digestion, catabolism, protein maturation or activation, and signal propagation. The more directed forms of proteolysis rely on cleavage of a particular peptide bond in a target protein with a very specific protease. This cleavage subsequently modifies the biological activity of the protein (Kazanov et al., 2012). Proteases are also a frequently used tool in biochemistry, although more promiscuous proteases are typically preferred for these applications. Typically, such a protease is added to a mixture of denatured proteins to obtain a near-complete digest of these sample proteins, ideally cleaving every peptide bond that satisfies the protease specificity. For example, in a classic mass spectrometry-based proteomics experiment, the protease trypsin, which cleaves C-terminally from Arg and Lys residues, is added to a mixture of denatured proteins and overnight digestion results in full digestion (Vandermarliere, Mueller, & Martens, 2013).

In contrast to full digestion, where the main application is the identification of proteins and conformational insight is not provided (the digested proteins are denatured), limited proteolysis can provide structural information on the analyzed proteins. Limitation of proteolysis can be achieved in several ways, such as removing the protease from the protein sample after only a short incubation period or adding a covalent protease inhibitor (Liu, Kihara, & Park, 2011; Lomenick et al., 2011). Other strategies to confine proteolysis are related to suboptimal reaction conditions: low temperature, nonoptimal pH or ionic strength, or an enzyme:substrate ratio between 1:50 and 1:1,000 (Hubbard, 1998). The main difference between the classical, total proteolysis approach and the limited proteolysis method is that, in the latter experiment, proteins in the sample remain in their native conformation (Fig. 2A). Preservation of the native protein structure also implies that special care must be taken when pH or ionic strength manipulations are used for partial proteolysis.

Because of the tightly folded domains within native proteins subjected to limited proteolysis, cleavage sites are usually located in disordered regions of these domains or in regions with enhanced backbone flexibility, for example, loops that connect secondary structure elements or domains and termini. These restraints follow from the conformational plasticity required for proper binding of the substrate in the protease's active site (Fontana et al., 2004). Limited proteolysis is, therefore, frequently used to isolate protein domains, that is, protein fragments that fold autonomously and are linked to each other by flexible regions (Fig. 2B). This information hence defines structural domains and commonly guides subsequent genetic engineering of these domains for expression as independent polypeptides (Coburger et al., 2013; Sato et al., 2014).

Because sites that are cleaved during a limited proteolysis experiment are usually characterized by enhanced backbone flexibility, this approach is often used to analyze the folded, unfolded, and partly folded (molten globule) state of proteins (Fontana et al., 2004). For example, apomyoglobin (Musi et al., 2004) and alpha-lactalbumin (Polverino de Laureto et al., 2002) are two proteins for which the molten globule state was studied via limited proteolysis to gain insight into their folding mechanisms.

In contrast to the classic methods to obtain structural information on a protein, the limited proteolysis approach is moderate in the amount of sample needed and in the experimental effort. However, the outcome does not provide high-resolution information (Fontana et al., 1997). High-resolution information can be obtained by combining the outcome of a limited proteolysis experiment with, for example, the static picture of the protein obtained with protein X-ray crystallography. The combination of partial digestion and other structure-determination methods has allowed conformational transitions to be monitored upon substrate, cofactor, or inhibitor binding (McCulloch & Fitzpatrick, 1999; Kang, Wilson, & Kermode, 2008; Crain & Broderick, 2013; Diestel et al., 2013), and during protein-protein interactions (Dhungana et al., 2009; Grote et al., 2010; Hennig et al., 2012; O'Neill et al., 2012; Feng et al., 2014).

FIGURE 2. Schematic representation of the main difference in peptide generation between complete and limited proteolysis. (A) During a complete proteolysis experiment, the proteins in the sample are denatured prior to digestion. Therefore, each cleavage site is accessible to the protease. In contrast, in a limited proteolysis experiment proteins remain in their native conformation. Hence, only solvent accessible cleavage sites can be processed, which results in fewer and longer peptide fragments. (B) Representation of the usage of limited proteolysis to isolate protein domains.
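The difference between complete and limited proteolysis can be mimicked with a small in silico digestion. The sketch below assumes the simple trypsin rule given above (cleavage C-terminally of Lys and Arg) and ignores refinements such as missed cleavages or residue-specific exceptions. The sequence and the set of accessible residues are hypothetical; in a real experiment, accessibility follows from the native fold rather than from a user-supplied set.

```python
def digest(sequence, accessible=None):
    """Cleave C-terminally of Lys (K) and Arg (R).

    accessible: optional set of 0-based residue indices whose C-terminal
    peptide bond the protease can reach. None models a denatured protein,
    in which every cleavage site is reachable.
    """
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        is_site = residue in "KR" and i < len(sequence) - 1
        reachable = accessible is None or i in accessible
        if is_site and reachable:
            peptides.append(sequence[start:i + 1])
            start = i + 1
    peptides.append(sequence[start:])
    return peptides

sequence = "MKTAYIAKQRQISFVKSHFSRQ"            # hypothetical protein sequence
complete = digest(sequence)                   # denatured: all sites cleaved
limited = digest(sequence, accessible={9})    # native: only one exposed loop site

print(complete)
print(limited)
```

As in Figure 2A, the limited digest yields fewer and longer fragments than the complete digest of the same sequence.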

2. Hydrogen-Deuterium Exchange Mass Spectrometry

A hydrogen-deuterium exchange (HDX) experiment relies on the continuous exchange of the hydrogen atoms in the O–H, N–H, and S–H groups of a protein with the surrounding solvent. Consequently, exposure of proteins to D2O results in the replacement of these hydrogens by deuterium. Because deuterium contains a neutron as well as a proton in its nucleus, it is about twice as heavy as hydrogen, which causes an increase of protein or peptide mass upon exchange (Konermann, Pan, & Liu, 2011). The HDX method was introduced as early as 1955 by Linderstrom-Lang and coworkers (Hvidt & Linderstrom-Lang, 1955).

Within a protein, the hydrogens of the amide groups (mainly the hydrogens from the protein backbone) are most frequently involved in HDX; the rate of exchange depends predominantly on solvent accessibility and hydrogen-bond status of the amide group (Yan et al., 2004). An amide group which is involved in an intramolecular hydrogen bond, frequently due to secondary structure, will exchange slowly. Amides buried inside a protein but not involved in hydrogen bonds also have very slow exchange rates. On the other hand, an amide located at the surface of a protein and devoid of hydrogen bonds (except for those with the surrounding solvent) will exchange rapidly. Unfortunately, it is not possible to differentiate between the contribution of solvent accessibility and hydrogen-bond status to the exchange rate (Wales & Engen, 2006). Exchange between hydrogen and deuterium is thus dependent on the structure of the protein, and the level of exchange reflects the relative openness, solvent accessibility, and hydrogen bond status within a protein (Konermann, Pan, & Liu, 2011). However, back-exchange of deuterium might limit precise measurements (Kaltashov, Bobst, & Abzalimov, 2009).

Introduction of mass spectrometry into the HDX field allowed measurement of the additional mass introduced by the deuterium atoms, and as such provided information on the number of hydrogens that have been exchanged from the protein (Fig. 3). Katta and Chait (1991) introduced this approach to probe conformational changes in bovine ubiquitin induced by addition of methanol to aqueous acidic solutions of the protein. Ever since, HDX coupled to mass spectrometry is frequently used to monitor conformational changes of proteins (Mehmood et al., 2012; Vadas et al., 2013; West et al., 2013; Zhang et al., 2013), protein folding and unfolding dynamics (Khanal et al., 2012; Engen et al., 2013), intrinsically disordered proteins (Goswami et al., 2013), and protein–protein interactions (Bereszczak et al., 2013).
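The mass readout described above can be made concrete with a short calculation. Each hydrogen replaced by deuterium adds about 1.006 Da (the difference between the monoisotopic masses of 2H and 1H), so the number of exchanged hydrogens follows from the mass shift of a peptide. The sketch below is a minimal illustration with hypothetical m/z values. The second function applies a commonly used correction for back-exchange that relies on a fully deuterated control sample; that correction is an assumption added here and is not described in this review.

```python
# Monoisotopic mass difference between deuterium and hydrogen, in Da.
MASS_SHIFT_PER_DEUTERON = 2.014101778 - 1.007825032

def deuterium_uptake(mz_unlabeled, mz_labeled, charge):
    """Number of hydrogens replaced by deuterium, from centroid m/z values."""
    return (mz_labeled - mz_unlabeled) * charge / MASS_SHIFT_PER_DEUTERON

def corrected_uptake(mz_unlabeled, mz_labeled, mz_fully_labeled, charge, n_exchangeable):
    """Scale the uptake by a fully deuterated control to compensate for back-exchange.

    Assumption: the control loses the same fraction of label as the sample.
    """
    observed = deuterium_uptake(mz_unlabeled, mz_labeled, charge)
    maximum = deuterium_uptake(mz_unlabeled, mz_fully_labeled, charge)
    return n_exchangeable * observed / maximum

# Hypothetical doubly charged peptide with 10 exchangeable amide hydrogens.
mz_0 = 600.30
mz_t = mz_0 + 4 * MASS_SHIFT_PER_DEUTERON / 2      # labeled sample
mz_full = mz_0 + 5 * MASS_SHIFT_PER_DEUTERON / 2   # fully deuterated control

print(round(deuterium_uptake(mz_0, mz_t, charge=2), 2))
print(round(corrected_uptake(mz_0, mz_t, mz_full, charge=2, n_exchangeable=10), 2))
```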
The experimental setup for HDX depends on the structural question to resolve. When information on the different conformations within a protein population is required, continuous labeling is the method of choice (Miranker et al., 1993; Wales & Engen, 2006). The protein population is exposed to D2O while the different conformations are in equilibrium. Upon transition among conformations, the protein becomes labeled with deuterium and hence its mass increases. Next, the family of conformations is digested with a protease. Subsequent analysis of the ratio of deuterated versus non-deuterated forms of the resulting peptides allows one to monitor the ratio of the different

FIGURE 3. Cartoon representation of the workflow of mass spectrometry-based HDX. The first step of a HDX experiment is the exchange between hydrogens from the protein structure and deuterium atoms from within the solvent. Next the labelled protein is proteolytically digested and the resultant peptides are subsequently analysed; the difference between labeled and non-labeled masses is monitored. An MS footprinting experiment follows an analogous workflow but D2O is replaced by for example hydroxyl radical.


conformational states of the protein. Pulse labeling, in which very short pulses (‘bursts’) of deuterium are added to the sample, is instead used to determine the influence of an agent, such as an inhibitor, a substrate, or an agonist, on the conformation of a protein. In this approach, the protein is exposed to D2O for only a brief moment and subsequently digested by a protease, usually pepsin. The deuterium level of the resulting peptides indicates changes in conformation that result from the addition of the agent (Wales & Engen, 2006). Additionally, incorporation of deuterium can be monitored as a function of exposure time to map conformational dynamics in response to external stimuli such as ligand binding (Choi et al., 2010). This approach was successfully used to screen candidate vitamin D receptor ligands for osteoporosis treatment. Vitamin D3 is regularly used clinically for the treatment of osteoporosis, yet when complexed with the vitamin D receptor, it triggers a wide range of physiological cellular responses. One such response is the undesired side effect of hypermineralization of pathological tissue. Hence, an alternative is needed. Carson and coworkers (Carson et al., 2014) evaluated candidate vitamin D receptor ligands with HDX to analyze their influence on the structure of the vitamin D receptor.
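The mass-based readout of HDX described above can be made concrete with a short calculation. The following Python sketch is an illustration only and is not taken from any of the cited studies: it converts the centroid mass of a peptide measured after labeling into an approximate number of incorporated deuterons, optionally normalizing against a fully deuterated control to compensate for back-exchange. The function names, the example peptide and masses, and the choice to count only backbone amides (excluding the first residue and proline) are assumptions introduced here.

```python
# Mass difference between a deuterium and a hydrogen atom (Da).
DEUTERON_MASS_SHIFT = 2.014102 - 1.007825


def exchangeable_amides(sequence: str) -> int:
    """Count backbone amide hydrogens in a peptide.

    Assumption for this sketch: the first residue carries a free amino
    terminus rather than a backbone amide, and proline has no amide hydrogen.
    """
    return sum(1 for residue in sequence[1:] if residue != "P")


def deuterium_uptake(m_t, m_0, m_full=None, n_amides=None):
    """Estimate the number of deuterons incorporated into a peptide.

    m_t       centroid mass after labeling (Da)
    m_0       centroid mass of the non-deuterated peptide (Da)
    m_full    centroid mass of a fully deuterated control (Da), optional
    n_amides  number of exchangeable amides, required together with m_full
    """
    if m_full is None:
        # No control available: express the raw mass shift in deuterons.
        return (m_t - m_0) / DEUTERON_MASS_SHIFT
    if n_amides is None:
        raise ValueError("n_amides is required when m_full is given")
    if m_full <= m_0:
        raise ValueError("fully deuterated control must be heavier than m_0")
    # Scale by the control so that deuterium lost to back-exchange
    # during digestion and analysis is compensated.
    return (m_t - m_0) / (m_full - m_0) * n_amides


# Hypothetical peptide and masses, chosen only to illustrate the arithmetic.
peptide = "APGLK"
n = exchangeable_amides(peptide)
corrected = deuterium_uptake(m_t=1003.0, m_0=1000.0, m_full=1005.0, n_amides=n)
print(n, round(corrected, 2))
```

Repeating this calculation for each labeling time, or for samples with and without ligand, gives the uptake curves and difference values that are typically interpreted in terms of conformational change. Side-chain exchange and isotope-envelope fitting are deliberately ignored in this sketch.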

3. Mass Spectrometric Analysis of Cross-Linked Peptides

The classic proteomics approach to obtain insight into protein–protein interactions is co-precipitation or affinity purification MS (AP-MS). Typically, a whole-cell extract is prepared under non-denaturing conditions. Upon immunoprecipitation or affinity purification of the (tagged) bait protein, the interaction partners can be identified from the protein extracts, and the stoichiometry of the complex can be unraveled (Elion, 2006). Nevertheless, when the interaction among proteins is transient or of very low affinity, it can be very difficult to co-purify interaction partners (Tabb, 2012). Moreover, because co-precipitation is performed under non-physiological conditions, unnatural interactions might be identified (Mellacheruvu et al., 2013). However, some of these limitations can be overcome by the integration of cross-linking techniques. Within a chemical cross-linking experiment, neighboring residues within a protein or between polypeptides are covalently linked with the aid of cross-linking agents. Homobifunctional cross-linkers have identical reactive groups at both ends and therefore link identical functional groups in the protein or protein complex. In contrast, heterobifunctional cross-linkers are composed of two different reactive groups that target different functional groups within the protein or protein complex. An even more versatile group of cross-linkers, the trifunctional cross-linkers, is composed of three different reactive groups. The additional group is either able to link a third specific protein or can be used for affinity purification of the cross-linked products (Sinz, 2003). Reactive groups are separated from each other by a linker: the spacer chain. The length of this spacer chain is an important feature of the cross-linker: because it constrains the maximum distance between the linked residues, it provides information on the interaction distance between these residues (Kruppa, Schoeniger, & Young, 2003; Walzthoeni et al., 2013). Moreover, the spacer gives the cross-linker significant rotational flexibility and as such allows cross-linking within a cone defined by the spacer. The use of a combination of cross-linkers provides an additional level of information. This approach is nicely demonstrated by the elucidation of the alpha-crystallin structure. Peterson and coworkers (Peterson, Young, & Takemoto, 2004) used two cross-linkers: DTSSP, which is hydrophilic and can be used both to decorate residues exposed to a hydrophilic environment and to cross-link residues involved in nearest neighbor interactions, and diol formaldehyde, which is a very short cross-linker. Undecorated peptides are completely buried or inaccessible, while peptides modified exclusively by diol formaldehyde have intermediate solvent accessibility.
Decoration with both cross-linkers but without cross-linking indicates that the region is highly solvent exposed, while decoration together with cross-linking indicates the presence of specific intermolecular subunit interactions. The first step of the procedure is the cross-linking reaction itself. In the next step, which is optional, the cross-linked products are isolated. The mixture is subsequently digested with a protease (Fig. 4). The resulting cross-linked peptides can, again optionally, be separated (Petrotchenko & Borchers, 2010). This enrichment of cross-linked peptides can be independent of

FIGURE 4. Cartoon representation of the workflow of a cross-linking experiment. The workflow shown is the one used to elucidate the alpha-crystallin structure with the aid of cross-linkers, as described by Peterson and coworkers (Peterson et al., 2004). In the first step, the polypeptides are cross-linked (cross-linker 1 represents DTSSP, cross-linker 2 represents diol formaldehyde). Next, the protein complex is proteolyzed and the resulting peptides are analyzed with the mass spectrometer. Optionally, cross-linked peptides can be enriched.


the nature of the cross-linker: because cross-linked peptides have an increased charge, strong cation exchange resins can be used; moreover, because of the increased size of cross-linked peptides, size exclusion chromatography can also be applied as an enrichment step (Leitner et al., 2012). Enrichment can also exploit the properties of the cross-linker itself. Examples are covalent capture on a thiol-reactive resin and affinity chromatography, which can be combined with the use of a cleavable cross-linker (Buncherd et al., 2012). The cross-linker can react with residues within a single protein or with residues within two different proteins. To identify these possibilities, all peptides and peptide–peptide combinations in the search database have to be considered, which results in a very large search space. This problem can be overcome with cleavable cross-linkers (for example, disulfide-containing linkers), which reduce the search space because it can again be restricted to single peptides (Back et al., 2003). Moreover, several dedicated algorithms, such as pLink (Yang et al., 2012) and xQuest (Rinner et al., 2008) with its extension xProphet (Walzthoeni et al., 2012), have been developed to tackle the large search space of possible peptide combinations. One application of cross-linking is the reconstruction of large multiprotein complexes for which the structures of the individual subunits are available (Back et al., 2003). This strategy was, among others, successfully applied to RNA polymerase I (Jennebach et al., 2012) and II (Chen et al., 2010), the prokaryotic ribosome (Lauber & Reilly, 2011), and the yeast 19S proteasome (Kao et al., 2012). For large protein complexes that are extremely difficult to crystallize, cross-linking experiments are frequently able to bridge the gap between an EM experiment and structure determination of the single subunits.
This approach has already been used successfully to solve the structure of the Nup84 complex of Chaetomium thermophilum (Thierbach et al., 2013), the translation initiation complex of yeast (Erzberger et al., 2014), and the 26S proteasome (Lasker et al., 2012). As can be deduced from the examples above, cross-linking can contribute considerably to the elucidation of large multiprotein complexes such as the proteasome, ribosome, polymerase, and several viral capsids. Indeed, it is often a key approach to unravel the organization of these complexes because the cross-links allow for targeted molecular docking of the single subunits. Cross-linking experiments can also be applied in the opposite direction: they can be used to validate docking experiments (Van Dijk, Boelens, & Bonvin, 2005). Overall, however, cross-linking remains limited in its applications because of the data analysis challenges posed by the complex mass spectra of cross-linked peptides.
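The size of the search space that makes cross-link identification difficult can be estimated with simple counting. The Python sketch below is a back-of-the-envelope illustration rather than a description of how pLink, xQuest, or any other tool works: for a database of n candidate peptides it compares the number of single-peptide candidates with the number of unordered peptide–peptide pairs. The function name and the example database sizes are assumptions made here, and the count ignores cross-linker specificity (for example, a restriction to peptides that contain a reactive residue), which lowers the numbers in practice.

```python
def crosslink_search_space(n_peptides: int) -> dict:
    """Count candidate identifications for a database of n_peptides.

    linear  each peptide considered on its own, as is possible again
            after cleavage of a cleavable cross-linker
    pairs   unordered peptide-peptide combinations, including a peptide
            linked to a second copy of itself (as in a homodimer)
    """
    if n_peptides < 0:
        raise ValueError("n_peptides must be non-negative")
    linear = n_peptides
    pairs = n_peptides * (n_peptides + 1) // 2
    return {"linear": linear, "pairs": pairs}


# Hypothetical database sizes, from a small complex to a proteome-scale search.
for size in (100, 10_000, 1_000_000):
    counts = crosslink_search_space(size)
    print(f"{size:>9} peptides: {counts['linear']:>9} linear, {counts['pairs']:>15} pairs")
```

The quadratic growth of the pair count is the reason why restricting the search to single peptides, or using algorithms designed for cross-linked spectra, matters most for large databases.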

4. Mass Spectrometric Footprinting

In a mass spectrometric footprinting experiment, proteins in their native conformation are covalently and irreversibly labeled. This labeling allows one to probe the solvent-accessible surface of the protein. As such, MS footprinting can be used to map conformational changes due to, for example, ligand binding or complex formation. It is also frequently used to map the interfaces between different domains of a macromolecule (Wang & Chance, 2011). An MS footprinting experiment usually consists of three steps. The first step brings about the chemical modification of

the solvent-exposed protein regions, most frequently via oxidation with hydroxyl radicals. This is followed by proteolysis of the modified proteins and, subsequently, a tandem mass spectrometric analysis to annotate the modifications (Guan & Chance, 2005). The hydroxyl radical is the most popular reagent because it has a van der Waals radius similar to that of water, and one can therefore obtain very high-resolution information. Moreover, hydroxyl radicals react with most protein functional groups. This reactivity depends not only on the solvent accessibility of the group, but also on the specific chemistry of the functional group, with Cys being the most reactive residue (Wang & Chance, 2011). There are several ways to generate hydroxyl radicals. Fenton’s reagent is the most popular chemical method (Tullius & Dombroski, 1986), but radiolysis of water (Hayes, Kam, & Tullius, 1990) and photolysis of hydrogen peroxide (Sharp, Becker, & Hettich, 2004) are also frequently applied. MS footprinting is similar to HDX in that both label solvent-accessible regions. In HDX, however, the accessible backbone is labeled with deuterium atoms, while in a footprinting experiment hydroxyl radicals label the accessible side chains. Moreover, in contrast to HDX, a broader range of solution conditions and proteases can be examined because of the lack of back-exchange. However, high doses of radicals might cause protein unfolding (Guan & Chance, 2005), and modification of amino acid side chains might interfere with protease recognition and cleavage. One of the main applications of MS footprinting is to unravel ligand-induced conformational changes. For this, MS footprinting and existing structural data are coupled. Another application is the verification of homology models of a structure: a molecular docking experiment can be followed by MS footprinting to verify the interaction interface (Guan & Chance, 2005).
The reverse can also be done: MS footprinting can be performed to identify the interaction interface, which can then guide the docking experiment (Sharp et al., 2005). MS footprinting has, for example, been successfully used to decipher the differences in conformation of apolipoprotein E isoforms (Gau et al., 2011). Apolipoprotein E has an important role in the regulation of lipid metabolism as it controls lipid distribution in tissues and cells. MS footprinting revealed that the two sequence positions at which the three apolipoprotein E isoforms differ influence their structure and oligomerization status.
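The comparison that underlies a footprinting experiment on ligand binding can be sketched in a few lines. The Python example below is a simplified illustration and not the analysis used in the cited studies: for each peptide it computes the fraction of signal that carries an oxidative modification, and it reports peptides whose modification level drops when a ligand is bound, which would be consistent with protection of that region. The function names, the intensity values, and the 1.5-fold threshold are hypothetical choices made for this example.

```python
import math


def fraction_modified(unmodified: float, modified: float) -> float:
    """Fraction of the total peptide signal that carries a modification."""
    total = unmodified + modified
    if total <= 0:
        raise ValueError("peptide has no signal")
    return modified / total


def protected_peptides(free, bound, min_ratio=1.5):
    """Return peptides that are less modified in the bound state.

    free, bound  dicts mapping peptide -> (unmodified, modified) intensities
    min_ratio    assumed fold-decrease required to call a peptide protected
    """
    protected = {}
    for peptide in sorted(free.keys() & bound.keys()):
        f_free = fraction_modified(*free[peptide])
        f_bound = fraction_modified(*bound[peptide])
        if f_bound == 0:
            ratio = math.inf if f_free > 0 else 1.0
        else:
            ratio = f_free / f_bound
        if ratio >= min_ratio:
            protected[peptide] = ratio
    return protected


# Hypothetical intensities for two peptides, without and with ligand.
free_state = {"PEPTIDEA": (80.0, 20.0), "PEPTIDEB": (70.0, 30.0)}
bound_state = {"PEPTIDEA": (95.0, 5.0), "PEPTIDEB": (72.0, 28.0)}
print(protected_peptides(free_state, bound_state))
```

In practice the modification level is usually followed over several radical doses or exposure times, and residue-level assignment relies on the tandem mass spectra; this sketch only captures the peptide-level comparison.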

5. Native Mass Spectrometry and Ion Mobility Mass Spectrometry

Native mass spectrometry involves the analysis of native proteins or protein complexes, and allows information about the stoichiometry and topology of complexes to be obtained. Hence, native MS allows the study of subunit exchange within complexes and the assembly of complexes from individual subunits (Heck, 2008). In contrast to the analysis of peptides, which is carried out with mass spectrometers with a low m/z range capability, native MS requires instruments with an extended m/z range. Hence, time-of-flight (TOF) mass spectrometers are the instruments of choice for this application. These TOF mass spectrometers are usually coupled to a quadrupole, and this tandem setup allows for the identification of the components within the complex as the protein complexes are dissociated via collisional activation (van den Heuvel &


Heck, 2004). Moreover, because protein complexes follow a similar decomposition pathway (small subunits located at the periphery of the assembly dissociate preferentially), additional information about the location of the subunit within the complex can be gained (van Duijn, 2010). Hence, the first step in a native MS experiment is the measurement of the mass of the complex. Subsequently, the masses of the individual components are determined. The combination of these measurements allows for the determination of the stoichiometry of the components within the complex. Due to the small amount of sample needed, native MS has the sensitivity to analyze endogenously expressed protein complexes, which is actually preferred because it reflects the naturally occurring quaternary structures and post-translational modifications. Purification of such native cellular complexes is highly challenging, however. Moreover, transient and weak interactions can get lost during this purification. Another limitation is that native MS works in aqueous solution, which hampers the study of membrane proteins (Heck, 2008). A variation on native MS is ion mobility MS, in which ions of proteins or protein complexes are separated based on their ability to traverse a chamber filled with a buffer gas under the influence of an electric field. Prior to the MS analysis, the complexes undergo electrospray ionization in which ions of the complex are formed. The result is a droplet which contains one molecule of the protein or protein complex (Konijnenberg, Butterer, & Sobott, 2013). During the subsequent mobility separation, ions with a larger collisional cross-section undergo a greater number of collisions with the buffer gas that fills the chamber and therefore traverse it more slowly than more compact ions of similar mass and charge (Zhong, Hyung, & Ruotolo, 2012).
Because the charge state of a protein or protein complex ion is strongly dependent on its structure, ion mobility MS allows the separation of conformational forms that differ only subtly (Niu, Rabuck, & Ruotolo, 2013). Therefore, ion mobility MS is a valuable tool in the study of disordered structures; it can help to define the degree of disorder (Konijnenberg, Butterer, & Sobott, 2013).
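The stoichiometry determination described for native MS, in which the mass of the intact complex is combined with the masses of its components, amounts to a small search problem. The Python sketch below is one possible illustration under stated assumptions and does not reproduce any published software: it enumerates copy numbers of each subunit and keeps the combinations whose summed mass matches the measured complex mass within a relative tolerance. The subunit names and masses, the 0.1% tolerance, and the upper limit of 12 copies per subunit are hypothetical.

```python
from itertools import product


def candidate_stoichiometries(complex_mass, subunit_masses,
                              tolerance=0.001, max_copies=12):
    """List subunit copy numbers that explain a measured complex mass.

    complex_mass    measured mass of the intact complex (Da)
    subunit_masses  dict mapping subunit name -> measured subunit mass (Da)
    tolerance       accepted relative mass deviation (assumed, 0.1%)
    max_copies      assumed upper limit on copies of any one subunit
    """
    names = list(subunit_masses)
    hits = []
    for counts in product(range(max_copies + 1), repeat=len(names)):
        if not any(counts):
            continue  # skip the empty complex
        total = sum(c * subunit_masses[name] for c, name in zip(counts, names))
        if abs(total - complex_mass) <= tolerance * complex_mass:
            hits.append(dict(zip(names, counts)))
    return hits


# Hypothetical two-subunit complex measured at about 154 kDa.
subunits = {"A": 30_000.0, "B": 17_000.0}
print(candidate_stoichiometries(154_050.0, subunits))
```

Exhaustive enumeration is adequate for a handful of subunit types; when several combinations fall within the tolerance, the dissociation products observed after collisional activation can help to discriminate between them.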

C. Interplay Between Mass Spectrometry and Protein Structure Determination

1. Protein Structure as a Guide for Mass Spectrometry

As described above, mass spectrometry can guide protein structure determination; however, the opposite also applies: mass spectrometry can benefit from protein structure. The latter point of view is, however, less well established. Post-translational modifications are often important actors in protein regulation and recognition. Both processes require visibility and accessibility to other proteins. For example, protein phosphorylation plays a key role in many signal transduction cascades. An important requirement of the phosphorylated position is that it has to be accessible to the kinase or phosphatase, and that the phosphate group is visible to other proteins downstream in the cascade (Blom et al., 2004). The phosphorylated position, therefore, ideally has to be located on the surface of the protein. Nowadays, phosphorylation sites are most commonly determined with MS-based high-throughput methods (Trost & Kusalik, 2011); however, identification and


exact localization of the phosphorylated residue within the measured peptides can be doubtful. Older approaches such as MALDI-TOF MS followed by peptide mass fingerprinting do not yield direct sequence information and are therefore highly unlikely to identify the exact phosphorylated residue (Mann et al., 2002). However, fragmentation-based methods such as CID also often make it impossible to unambiguously localize the exact phosphorylated residue when the peptide contains several possible phosphorylation sites, not only because the required signals in the acquired mass spectrum might be missing, but also because the phosphate group can actually transfer to unmodified hydroxyl-containing residues inside the instrument (Palumbo & Reid, 2008; Frese et al., 2013). To gain an overview of the degree of wrongly annotated phosphorylation events, Vandermarliere and Martens (2013) analyzed phosphorylated peptides of human proteins stored in PRIDE (Martens et al., 2005) in light of their structural properties. As expected, most phosphorylation sites were found in loop regions and are solvent accessible. However, phosphorylation sites were also found in solvent-inaccessible regions that were most likely unable to undergo conformational changes without destruction of the whole protein structure. In-depth structural analysis revealed that several phosphorylation sites are most likely wrongly assigned. Indeed, an identified phosphorylation site was reported in a buried part of the peptide, whereas several other possible phosphorylation sites in that same peptide were located on the surface of the protein. It is thus likely that the phospho-site was simply wrongly assigned to the buried residue and instead occurred on one of the exposed residues, which nicely illustrates how phosphosite localization with mass spectrometry can benefit from structural information on the modified protein in question.
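The reasoning used above to spot doubtful phosphosite assignments can be expressed as a simple rule. The Python sketch below is an illustration of that rule and not the procedure of Vandermarliere and Martens (2013): given a peptide, the reported phosphorylated position, and a relative solvent accessibility value per residue (assumed to have been derived beforehand from a structure), it flags the assignment as suspect when the reported residue is buried while other Ser, Thr, or Tyr residues in the same peptide are exposed. The function name, the example peptide, the accessibility values, and the 0.2 burial threshold are all hypothetical.

```python
PHOSPHO_ACCEPTORS = {"S", "T", "Y"}


def review_phosphosite(peptide, reported_index, rsa, buried_below=0.2):
    """Check a reported phosphosite against per-residue solvent accessibility.

    peptide         amino acid sequence of the identified peptide
    reported_index  0-based position of the reported phosphorylated residue
    rsa             relative solvent accessibility (0 to 1), one value per residue
    buried_below    assumed threshold below which a residue counts as buried
    """
    if len(rsa) != len(peptide):
        raise ValueError("one accessibility value per residue is required")
    if peptide[reported_index] not in PHOSPHO_ACCEPTORS:
        raise ValueError("reported position is not Ser, Thr, or Tyr")
    reported_buried = rsa[reported_index] < buried_below
    exposed_alternatives = [
        i for i, residue in enumerate(peptide)
        if residue in PHOSPHO_ACCEPTORS
        and i != reported_index
        and rsa[i] >= buried_below
    ]
    return {
        "reported_buried": reported_buried,
        "exposed_alternatives": exposed_alternatives,
        "suspect": reported_buried and bool(exposed_alternatives),
    }


# Hypothetical peptide with a buried Ser reported as the phosphorylated residue.
peptide = "AVSLTPGYK"
accessibility = [0.50, 0.10, 0.05, 0.10, 0.45, 0.60, 0.70, 0.55, 0.80]
print(review_phosphosite(peptide, reported_index=2, rsa=accessibility))
```

A flagged site is only a candidate for re-inspection of the spectrum; a buried residue can still be phosphorylated if the region becomes exposed through a conformational change.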

D. Examples of the Use of Mass Spectrometry in Protein Structure Determination

1. Limited Proteolysis and the Collagen-Binding Domain

Where in the past Clostridium was only known as the pathogen that causes gas gangrene (Matsushita et al., 1999), it is nowadays equally important as a source of therapeutics, especially its collagenase. This collagenase is, among others, used as an alternative to surgical fasciectomy in Dupuytren’s disease (Badalamente & Hurst, 2000). Collagenase is composed of several domains with, at its carboxy-terminus, one or more collagen-binding domains (CBDs). The amino-terminus of the CBDs functions as a Ca2+ sensor: it is a flexible arm when no Ca2+ is bound, whereas Ca2+ binding induces rigidity (Philominathan et al., 2009). The crystal structures of the apo and Ca2+-bound forms of the CBD are available (Wilson et al., 2003). Sides and coworkers (Sides et al., 2012) applied limited proteolysis to analyze the dynamics and stability of the CBD of Clostridium collagenase upon Ca2+ binding, which nicely illustrates the occurrence of differences between the structure in solution and the structure determined with X-ray crystallography. Their data from limited proteolysis experiments furthermore confirm the dynamic nature of the flexible arm in the apo-structure, whereas these dynamics could not be deduced from the corresponding crystal structure, where the flexible arm is involved in crystal contacts and, therefore, loses its flexibility.


2. HDX and Mammalian Glutathione Transferase

Glutathione transferases are involved in the detoxification of endogenous and xenobiotic electrophiles, occur as dimers, and are structurally well characterized (Armstrong, 1997). Mutations of Phe-56, which is important in the hydrophobic dimer interface of class mu glutathione transferases, impair the catalytic activity and have an influence on the dimer interface (Hornby et al., 2002). Despite the well-characterized wild-type structure of glutathione transferases, little structural information on the mutants is available because these mutants are difficult to crystallize. Codreanu and coworkers (Codreanu et al., 2005) used HDX to explore the influence of these mutations on the structure and function of glutathione transferase. They incubated the protein with D2O for various lengths of time. The samples were subsequently digested with pepsin, and the resulting peptides were analyzed with ESI-MS. They found that the Phe-56-Ser and Phe-56-Arg mutants display enhanced H/D exchange at the dimer interface, which indicates lower dimer stability. In the Phe-56-Glu mutant, on the other hand, the H/D exchange rate is decreased; the dimer interface is less solvent accessible. These data indicate that the interface adopts an alternative conformation. This alternative conformation is supported by the finding that the GSH binding site is disrupted in the Phe-56-Glu mutant [it has been shown that there is a link between the GSH binding site and the dimer interface (Hornby et al., 2002)].

II. CONCLUSIONS AND OUTLOOK

Due to the large gap between the number of structures available in the PDB, on the one hand, and the much larger number of available protein sequences on the other hand, there is a need for proteome-scale approaches to gain further insight into protein structures and their associated plasticity. Moreover, it is generally understood that a static picture of a protein is not always suited to explore its function and mechanism of action. Usually, such analyses require deeper knowledge of the conformational flexibility of the protein. Classic approaches for protein structure determination such as protein crystallography and NMR, however, each suffer from distinct limitations in delivering such information. As a result, mass spectrometry-based methods are becoming important tools to study protein structure and dynamics because they provide information complementary to the classic structural approaches. Moreover, mass spectrometry-based methods consume only a limited amount of sample, are fast, offer high sensitivity, and provide a wide range of conditions under which proteins can be examined. Improvements can, however, still be made to these mass spectrometry-based approaches. Cross-linking experiments are currently primarily limited by the difficulty of reliably interpreting the obtained data, which indicates that improved algorithms are a key element needed to allow such approaches to become much more widely used. The two core challenges for such algorithms are the combinatorial complexity that results from intra-protein, and especially inter-protein, cross-linking, and the much more complex fragmentation spectra obtained from cross-linked peptides. Although current algorithms primarily attempt to extend or adapt existing search


algorithms, it will be worthwhile to attempt the creation of novel identification algorithms that are especially aimed at cross-linked peptides. In general, further refinement of existing mass spectrometry-based methods that are able to map dynamic features back onto protein structures, in native or otherwise desired environments, is of utmost importance, because such methods can be used to connect structure and function and provide us with a deeper mechanistic understanding of the underlying principles involved. Recent advances already allow whole yeast cell analysis of conformational changes upon, for example, administration of different nutrients (Feng et al., 2014). In conclusion, structural analyses via mass spectrometry complement the conventional methods, thereby serving as a bridge between structural and functional biology. Great depth of analysis can be obtained when the data generated by mass spectrometry-based methods are directly linked to structural data obtained from X-ray, NMR, or EM studies, to solve the puzzle of a protein’s structure and dynamics.

ABBREVIATIONS

AP-MS     affinity purification mass spectrometry
CBD       collagen-binding domain
CD        circular dichroism
DTSSP     3,3'-dithiobis[sulfosuccinimidyl propionate]
EM        electron microscopy
ESI-MS    electrospray ionization mass spectrometry
GSH       glutathione
HDX       hydrogen deuterium exchange
MAD       multiple-wavelength anomalous dispersion
MALDI     matrix-assisted laser desorption/ionization
MS        mass spectrometry
NMR       nuclear magnetic resonance
NOE       nuclear Overhauser effect
PDB       protein data bank
SAD       single-wavelength anomalous dispersion
SAXS      small-angle X-ray scattering
TOF       time-of-flight

ACKNOWLEDGMENTS

E.V. is supported by a research project grant (IWT 110431) by the Belgian government agency for Innovation by Science and Technology (IWT). E.S. is a Postdoctoral Research Fellow of the Fund for Scientific Research (FWO)-Flanders (Belgium).

REFERENCES

Armstrong RN. 1997. Structure, catalytic mechanism, and evolution of the glutathione transferases. Chem Res Toxicol 10:2–18.
Back JW, de Jong L, Muijsers AO, de Koster CG. 2003. Chemical crosslinking and mass spectrometry for protein structural modeling. J Mol Biol 331:303–313.
Badalamente M, Hurst L. 2000. Enzyme injection as nonsurgical treatment of Dupuytren’s disease. J Hand Surg Am 25:629–636.
Balakrishnan G, Weeks C, Ibrahim M, Soldatova A, Spiro T. 2008. Protein dynamics from time resolved UV Raman spectroscopy. Curr Opin Struct Biol 18:623–629.
Baumeister W. 2005. From proteomic inventory to architecture. FEBS Lett 579:933–937.


Bereszczak J, Rose R, van Duijn E, Watts N, Wingfield P, Steven A, Heck AJR. 2013. Epitope-distal effects accompany the binding of two distinct antibodies to hepatitis B virus capsids. J Am Chem Soc 135:6504–6512.
Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE. 2000. The protein data bank. Nucleic Acids Res 28:235–242.
Bertini I, Jiménez B, Pierattelli R, Wedd A, Xiao Z. 2008. Protonless 13C direct detection NMR: Characterization of the 37kDa trimeric protein CutA1. Proteins 70:1196–1205.
Blom N, Sicheritz-Pontén T, Gupta R, Gammeltoft S, Brunak S. 2004. Prediction of post-translational glycosylation and phosphorylation of proteins from the amino acid sequence. Proteomics 4:1633–1649.
Blow D. 2002. Outline of crystallography for biologists. Oxford: Oxford University Press.
Buncherd H, Nessen M, Nouse N, Stelder S, Roseboom W, Dekker H, Arents J, Smeenk L, Wanner M, van Maarseveen J, Yang X, Lewis P, de Koning L, de Koster C, de Jong L. 2012. Selective enrichment and identification of cross-linked peptides to study 3D structures of protein complexes by mass spectrometry. J Proteomics 75:2205–2215.
Carson M, Zhang J, Chalmers M, Bocchinfuso W, Holifield K, Masquelin T, Stites R, Stayrook K, Griffin P, Dodge J. 2014. HDX reveals unique fragment ligands for the vitamin D receptor. Bioorg Med Chem Lett 24:3459–3463.
Chacón P, Wriggers W. 2002. Multi-resolution contour-based fitting of macromolecular structures. J Mol Biol 317:375–384.
Chen Z, Jawhari A, Fischer L, Buchen C, Tahir S, Kamenski T, Rasmussen M, Lariviere L, Bukowski-Wills J, Nilges M. 2010. Architecture of the RNA polymerase II-TFIIF complex revealed by cross-linking and mass spectrometry. EMBO J 29:717–726.
Choi JH, Banks AS, Estall JL, Kajimura S, Boström P, Laznik D, Ruas JL, Chalmers MJ, Kamenecka TM, Blüher M, Griffin PR, Spiegelman BM. 2010. Anti-diabetic drugs inhibit obesity-linked phosphorylation of PPARgamma by Cdk5. Nature 466:451–456.
Coburger I, Dahms S, Roeser D, Gührs K, Hortschansky P, Than M. 2013. Analysis of the overall structure of the multi-domain amyloid precursor protein (APP). PLoS ONE 8:e81926.
Codreanu SG, Thompson LC, Hachey DL, Dirr HW, Armstrong RN. 2005. Influence of the dimer interface on glutathione transferase structure and dynamics revealed by amide H/D exchange mass spectrometry. Biochemistry 44:10605–10612.
Crain A, Broderick J. 2013. Flavodoxin cofactor binding induces structural changes that are required for protein-protein interactions with NADP(+) oxidoreductase and pyruvate formate-lyase activating enzyme. Biochim Biophys Acta 1834:2512–2519.
Dauter Z. 2002. New approaches to high-throughput phasing. Curr Opin Struct Biol 12:674–678.
Dauter Z, Jaskolski M, Wlodawer A. 2010. Impact of synchrotron radiation on macromolecular crystallography: A personal view. J Synchrotron Radiat 17:433–444.
Demirdöven N, Cheatum C, Chung H, Khalil M, Knoester J, Tokmakoff A. 2004. Two-dimensional infrared spectroscopy of antiparallel beta-sheet secondary structure. J Am Chem Soc 126:7981–7990.
Dhungana S, Williams J, Fessler M, Tomer K. 2009. Epitope mapping by proteolysis of antigen-antibody complexes. Methods Mol Biol 524:87–101.
Diestel U, Resch M, Meinhardt K, Weiler S, Hellmann T, Mueller T, Nickel I, Eichler J, Muller Y. 2013. Identification of a novel TGF-beta-binding site in the zona pellucida C-terminal (ZP-C) domain of TGF-beta-receptor-3 (TGFR-3). PLoS ONE 8:e67214.
Drenth J. 1999. Principles of protein X-ray crystallography. New York: Springer-Verlag.
Elion E. 2006. Detection of protein–protein interactions by coprecipitation. Curr Protoc Mol Biol 20:1–10.


Engen JR, Wales TE, Chen S, Marzluff E, Hassell K, Weis D, Smithgall T. 2013. Partial cooperative unfolding in proteins as observed by hydrogen exchange mass spectrometry. Int Rev Phys Chem 32:96–127.
Erzberger J, Stengel F, Pellarin R, Zhang S, Schaefer T, Aylett C, Cimermancic P, Boehringer D, Sali A, Aebersold R, Ban N. 2014. Molecular architecture of the 40S-eIF1-eIF3 translation initiation complex. Cell 158:1123–1135.
Feng Y, De Franceschi G, Kahraman A, Soste M, Melnik A, Boersema PJ, de Laureto PP, Nikolaev Y, Oliveira AP, Picotti P. 2014. Global analysis of protein structural changes in complex proteomes. Nat Biotechnol 32:1036–1044.
Fontana A, de Laureto PP, Spolaore B, Frare E, Picotti P, Zambonin M. 2004. Probing protein structure by limited proteolysis. Acta Biochim Pol 51:299–321.
Fontana A, Polverino de Laureto P, De Filippis V, Scaramella E, Zambonin M. 1997. Probing the partly folded states of proteins by limited proteolysis. Fold Des 2:R17–R26.
Frese C, Zhou H, Taus T, Altelaar A, Mechtler K, Heck AJR, Mohammed S. 2013. Unambiguous phosphosite localization using electron-transfer/higher-energy collision dissociation (EThcD). J Proteome Res 12:1520–1525.
Garman E. 2003. “Cool” crystals: Macromolecular cryocrystallography and radiation damage. Curr Opin Struct Biol 13:545–551.
Gau B, Garai K, Frieden C, Gross M. 2011. Mass spectrometry-based protein footprinting characterizes the structures of oligomeric apolipoprotein E2, E3, and E4. Biochemistry 50:8117–8126.
Goswami D, Devarakonda S, Chalmers M, Pascal B, Spiegelman BM, Griffin PR. 2013. Time window expansion for HDX analysis of an intrinsically disordered protein. J Am Soc Mass Spectrom 24:1584–1592.
Grote M, Wolf E, Will C, Lemm I, Agafonov D, Schomburg A, Fischle W, Urlaub H, Lührmann R. 2010. Molecular architecture of the human Prp19/CDC5L complex. Mol Cell Biol 30:2105–2119.
Guan J, Chance M. 2005. Structural proteomics of macromolecular assemblies using oxidative footprinting and mass spectrometry. Trends Biochem Sci 30:583–592.
Hajdu J. 2000. Single-molecule X-ray diffraction. Curr Opin Struct Biol 10:569–573.
Hayes J, Kam L, Tullius T. 1990. Footprinting protein-DNA complexes with gamma-rays. Methods Enzymol 186:545–549.
Heck AJR. 2008. Native mass spectrometry: A bridge between interactomics and structural biology. Nat Methods 5:927–933.
Hennig J, de Vries S, Hennig K, Randles L, Walters K, Sunnerhagen M, Bonvin A. 2012. MTMDAT-HADDOCK: High-throughput, protein complex structure modeling based on limited proteolysis and mass spectrometry. BMC Struct Biol 12:29.
Hornby J, Codreanu SG, Armstrong RN, Dirr HW. 2002. Molecular recognition at the dimer interface of a class mu glutathione transferase: Role of a hydrophobic interaction motif in dimer stability and protein function. Biochemistry 41:14238–14247.
Horst R, Bertelsen E, Fiaux J, Wider G, Horwich A, Wüthrich K. 2005. Direct NMR observation of a substrate protein bound to the chaperonin GroEL. Proc Natl Acad Sci USA 102:12748–12753.
Hubbard SJ. 1998. The structural aspects of limited proteolysis of native proteins. Biochim Biophys Acta 1382:191–206.
Hvidt A, Linderstrom-Lang K. 1955. The kinetics of the deuterium exchange of insulin with D2O: An amendment. Biochim Biophys Acta 16:168–169.
Ishima R, Torchia D. 2000. Protein dynamics from NMR. Nat Struct Biol 7:740–743.
Jaskolski M. 2010. Personal remarks on the future of protein crystallography and structural biology. Acta Biochim Pol 57:261–264.
Jennebach S, Herzog F, Aebersold R, Cramer P. 2012. Crosslinking-MS analysis reveals RNA polymerase I domain architecture and basis of rRNA cleavage. Nucleic Acids Res 40:5591–5601.

Kaltashov I, Bobst C, Abzalimov R. 2009. H/D exchange and mass spectrometry in the studies of protein conformation and dynamics: Is there a need for a top-down approach? Anal Chem 81:7892–7899.
Kang M, Wilson L, Kermode J. 2008. Evidence from limited proteolysis of a ristocetin-induced conformational change in human von Willebrand factor that promotes its binding to platelet glycoprotein Ib-IX-V. Blood Cells Mol Dis 40:433–443.
Kao A, Randall A, Yang Y, Patel V, Kandur W, Guan S, Rychnovsky S, Baldi P. 2012. Mapping the structural topology of the yeast 19S proteasomal regulatory particle using chemical cross-linking and probabilistic modeling. Mol Cell Proteomics 11:1566–1577.
Katta V, Chait B. 1991. Conformational changes in proteins probed by hydrogen-exchange electrospray-ionization mass spectrometry. Rapid Commun Mass Spectrom 5:214–217.
Kazanov MD, Igarashi Y, Eroshkin AM, Cieplak P, Zhang Y, Li Z, Godzik A, Osterman AL, Jeffrey W. 2012. Structural determinants of limited proteolysis. J Proteome Res 10:3642–3651.
Kelly S, Price N. 2000. The use of circular dichroism in the investigation of protein structure and function. Curr Protein Pept Sci 1:349–384.
Kendrew JC, Bodo G, Dintzis HM, Parrish RG, Wyckoff H, Phillips DC. 1958. A three-dimensional model of the myoglobin molecule obtained by X-ray analysis. Nature 181:662–666.
Khanal A, Pan Y, Brown L, Konermann L. 2012. Pulsed hydrogen/deuterium exchange mass spectrometry for time-resolved membrane protein folding studies. J Mass Spectrom 47:1620–1626.
Koch M, Vachette P, Svergun D. 2003. Small-angle scattering: A view on the properties, structures and structural changes of biological macromolecules in solution. Q Rev Biophys 36:147–227.
Konermann L, Pan J, Liu Y-H. 2011. Hydrogen exchange mass spectrometry for studying protein structure and dynamics. Chem Soc Rev 40:1224–1234.
Konijnenberg A, Butterer A, Sobott F. 2013. Native ion mobility-mass spectrometry and related methods in structural biology. Biochim Biophys Acta 1834:1239–1256.
Kruppa GH, Schoeniger J, Young MM. 2003. A top down approach to protein structural studies using chemical cross-linking and Fourier transform mass spectrometry. Rapid Commun Mass Spectrom 17:155–162.
Lanucara F, Holman S, Gray C, Eyers C. 2014. The power of ion mobility-mass spectrometry for structural characterization and the study of conformational dynamics. Nat Chem 6:281–294.
Lasker K, Forster F, Bohn S, Walzthoeni T, Villa E, Unverdorben P, Beck F, Aebersold R, Sali A, Baumeister W. 2012. Molecular architecture of the 26S proteasome holocomplex determined by an integrative approach. Proc Natl Acad Sci USA 109:1380–1387.
Lauber M, Reilly J. 2011. Structural analysis of a prokaryotic ribosome using a novel amidinating cross-linker and mass spectrometry. J Proteome Res 10:3604–3616.
Leitner A, Reischl R, Walzthoeni T, Herzog F, Bohn S, Forster F, Aebersold R. 2012. Expanding the chemical cross-linking toolbox by the use of multiple proteases and enrichment by size exclusion chromatography. Mol Cell Proteomics 11:M111.014126.
Lipfert J, Doniach S. 2007. Small-angle X-ray scattering from RNA, proteins and protein complexes. Annu Rev Biophys Biomol Struct 36:307–327.
Liu P, Kihara D, Park C. 2011. Energetics-based discovery of protein-ligand interactions on a proteomics scale. J Mol Biol 408:147–162.
Lomenick B, Jung G, Wohlschlegel J, Huang J. 2011. Target identification using drug affinity responsive target stability (DARTS). Curr Protoc Chem Biol 3:163–180.
Mann M, Ong S-E, Gronborg M, Steen H, Jensen O, Pandey A. 2002. Analysis of protein phosphorylation using mass spectrometry: Deciphering the phosphoproteome. Trends Biotechnol 20:261–268.
Martens L, Hermjakob H, Jones P, Adamski M, Taylor C, States D, Gevaert K, Vandekerckhove J, Apweiler R. 2005. PRIDE: The proteomics identifications database. Proteomics 5:3537–3545.

Martin S, Schilstra M. 2008. Circular dichroism and its application to the study of biomolecules. Methods Cell Biol 84:263–293.
Matsushita O, Jung C, Katayama S, Minami J, Takahashi Y, Okabe A. 1999. Gene duplication and multiplicity of collagenases in Clostridium histolyticum. J Bacteriol 181:923–933.
McCulloch R, Fitzpatrick PD. 1999. Limited proteolysis of tyrosine hydroxylase identifies residues 33–50 as conformationally sensitive to phosphorylation state and dopamine binding. Arch Biochem Biophys 367:143–145.
McPherson A. 2003. Introduction to macromolecular crystallography. Hoboken, New Jersey, USA: John Wiley & Sons.
Mehmood S, Domene C, Forest E, Jault J. 2012. Dynamics of a bacterial multidrug ABC transporter in the inward- and outward-facing conformations. Proc Natl Acad Sci USA 109:10832–10836.
Mellacheruvu D, Wright Z, Couzens AL, Lambert J, St-Denis N, Li T, Miteva YV, Hauri S, Sardiu ME, Yew T, Halim VA, Bagshaw RD, Hubner NC, Bouchard A, Faubert D, Fermin D, Dunham WH, Heck AJR, Choi H, Gstaiger M. 2013. The CRAPome: A contaminant repository for affinity purification mass spectrometry data. Nat Methods 10:730–736.
Milanovic M, Kracht M, Schmitz M. 2014. The cytokine-induced conformational switch of nuclear factor κB p65 is mediated by p65 phosphorylation. Biochem J 457:401–413.
Miranker A, Robinson CV, Radford S, Aplin R, Dobson C. 1993. Detection of transient protein folding populations by mass spectrometry. Science 262:896–900.
Musi V, Spolaore B, Picotti P, Zambonin M, De Filippis V, Fontana A. 2004. Nicked apomyoglobin: A noncovalent complex of two polypeptide fragments comprising the entire protein chain. Biochemistry 43:6230–6240.
Niu S, Rabuck J, Ruotolo B. 2013. Ion mobility-mass spectrometry of intact protein–ligand complexes for pharmaceutical drug discovery and development. Curr Opin Chem Biol 17:809–817.
O’Neill M, Bhakta M, Fleming K, Wilks A. 2012. Induced fit on heme binding to the Pseudomonas aeruginosa cytoplasmic protein (PhuS) drives interaction with heme oxygenase (HemO). Proc Natl Acad Sci USA 109:5639–5644.
Orlova E, Saibil V. 2011. Structural analysis of macromolecular assemblies by electron microscopy. Chem Rev 111:7710–7748.
Palumbo A, Reid G. 2008. Evaluation of gas-phase rearrangement and competing fragmentation reactions on protein phosphorylation site assignment using collision induced dissociation-MS/MS and MS3. Anal Chem 80:9735–9747.
Peterson J, Young MM, Takemoto L. 2004. Probing alpha-crystallin structure using chemical cross-linkers and mass spectrometry. Mol Vis 10:857–866.
Petrotchenko E, Borchers V. 2010. Crosslinking combined with mass spectrometry for structural proteomics. Mass Spectrom Rev 29:862–876.
Philominathan S, Matsushita O, Gensure R, Sakon J. 2009. Ca2+-induced linker transformation leads to a compact and rigid collagen-binding domain of Clostridium histolyticum collagenase. FEBS J 276:3589–3601.
Polverino de Laureto P, Frare E, Gottardo R, Fontana A. 2002. Molten globule of bovine alpha-lactalbumin at neutral pH induced by heat, trifluoroethanol, and oleic acid: A comparative analysis by circular dichroism spectroscopy and limited proteolysis. Proteins 49:385–397.
Radon J. 1917. Über die Bestimmung von Funktionen durch ihre Integralwerte längs gewisser Mannigfaltigkeiten. Math Phys Klasse 69:262–277.
Rinner O, Seebacher J, Walzthoeni T, Mueller L, Beck M, Schmidt A, Mueller M, Aebersold R. 2008. Identification of cross-linked peptides from large sequence databases. Nat Methods 5:315–318.
Rossmann MG. 2000. Fitting atomic models into electron-microscopy maps. Acta Crystallogr D Biol Crystallogr 56:1341–1349.

Rudiger S, Freund S, Veprintsev D, Fersht A. 2002. CRINEPT-TROSY NMR reveals p53 core domain bound in an unfolded form to the chaperone Hsp90. Proc Natl Acad Sci USA 99:11085–11090.
Sato T, Miyanoiri Y, Takeda M, Naoe Y, Mitani R, Hirano K, Takehara S, Kainosho M, Matsuoka M, Ueguchi-Tanaka M, Kato H. 2014. Expression and purification of a GRAS domain of SLR1, the rice DELLA protein. Protein Expr Purif 95:248–258.
Sharp J, Becker J, Hettich R. 2004. Analysis of protein solvent accessible surfaces by photochemical oxidation and mass spectrometry. Anal Chem 76:672–683.
Sharp J, Guo J, Uchiki T, Xu Y, Dealwis C, Hettich R. 2005. Photochemical surface mapping of C14S-Sml1p for constrained computational modeling of protein structure. Anal Biochem 340:201–212.
Sides CR, Liyanage R, Lay JO, Philominathan STL, Matsushita O, Sakon J. 2012. Probing the 3-D structure, dynamics, and stability of bacterial collagenase collagen binding domain (apo- versus holo-) by limited proteolysis MALDI-TOF MS. J Am Soc Mass Spectrom 23:505–519.
Siligardi G, Hussain R, Patching SG, Phillips-Jones MK. 2014. Ligand- and drug-binding studies of membrane proteins revealed through circular dichroism spectroscopy. Biochim Biophys Acta 1838:34–42.
Sinz A. 2003. Chemical cross-linking and mass spectrometry for mapping three-dimensional structures of proteins and protein complexes. J Mass Spectrom 38:1225–1237.
Tabb DL. 2012. Evaluating protein interactions through cross-linking mass spectrometry. Nat Methods 9:879–881.
The UniProt Consortium. 2014. Activities at the Universal Protein Resource (UniProt). Nucleic Acids Res 42:D191–D198.
Thierbach K, van Appen A, Thomas M, Beck M, Flemming D, Hurt E. 2013. Protein interfaces of the conserved Nup84 complex from Chaetomium thermophilum shown by crosslinking mass spectrometry and electron microscopy. Structure 21:1672–1682.
Tompa P. 2005. The interplay between structure and function in intrinsically unstructured proteins. FEBS Lett 579:3346–3354.
Trost B, Kusalik A. 2011. Computational prediction of eukaryotic phosphorylation sites. Bioinformatics 27:2927–2935.
Tullius T, Dombroski B. 1986. Hydroxyl radical “footprinting”: High-resolution information about DNA-protein contacts and application to lambda repressor and Cro protein. Proc Natl Acad Sci USA 83:5469–5473.
Vadas O, Dbouk H, Shymanets A, Perisic O, Burke J, Abi Saab, Khalil W, Harteneck B, Bresnick C, Nürnberg A, Backer B, Williams J. 2013. Molecular determinants of PI3Kgamma-mediated activation downstream of G-protein-coupled receptors (GPCRs). Proc Natl Acad Sci USA 110:18862–18867.
Van Dijk A, Boelens R, Bonvin A. 2005. Data-driven docking for the study of biomolecular complexes. FEBS J 272:293–312.
Van den Heuvel R, Heck AJR. 2004. Native protein mass spectrometry: From intact oligomers to functional machineries. Curr Opin Chem Biol 8:519–526.
Vandermarliere E, Martens L. 2013. Protein structure as a means to triage proposed PTM sites. Proteomics 13:1028–1035.

Vandermarliere E, Mueller M, Martens L. 2013. Getting intimate with trypsin, the leading protease in proteomics. Mass Spectrom Rev 32:453–465.
Vivian J, Callis P. 2001. Mechanisms of tryptophan fluorescence shifts in proteins. Biophys J 80:2093–2109.
Wales TE, Engen JR. 2006. Hydrogen exchange mass spectrometry for the analysis of protein dynamics. Mass Spectrom Rev 25:158–170.
Walzthoeni T, Claassen M, Leitner A, Herzog F, Bohn S, Förster F, Beck M, Aebersold R. 2012. False discovery rate estimation for crosslinked peptides identified by mass spectrometry. Nat Methods 9:901–903.
Walzthoeni T, Leitner A, Stengel F, Aebersold R. 2013. Mass spectrometry supported determination of protein complex structure. Curr Opin Struct Biol 23:252–260.
Wang L, Chance M. 2011. Structural mass spectrometry of proteins using hydroxyl radical based protein footprinting. Anal Chem 83:7234–7241.
West G, Pascal B, Ng L, Soon F, Melcher K, Xu H, Chalmers M, Griffin P. 2013. Protein conformation ensembles monitored by HDX reveal a structural rationale for abscisic acid signaling protein affinities and activities. Structure 21:229–235.
Wider G, Wüthrich K. 1999. NMR spectroscopy of large molecules and multimolecular assemblies in solution. Curr Opin Struct Biol 9:594–601.
Williamson M, Havel T, Wüthrich K. 1985. Solution conformation of proteinase inhibitor IIA from bull seminal plasma by 1H nuclear magnetic resonance and distance geometry. J Mol Biol 182:295–315.
Wilson J, Matsushita O, Okabe A, Sakon J. 2003. A bacterial collagen-binding domain with novel calcium-binding motif controls domain orientation. EMBO J 22:1743–1752.
Wilson K, Walker J. 2000. Principles and techniques of practical biochemistry. Cambridge: Cambridge University Press.
Wüthrich K. 1995. NMR–This other method for protein and nucleic acid structure determination. Acta Crystallogr D Biol Crystallogr 51:249–270.
Wüthrich K. 2001. The way to NMR structures of proteins. Nat Struct Biol 8:923–925.
Yan X, Watson J, Ho PS, Deinzer ML. 2004. Mass spectrometric approaches using electrospray ionization charge states and hydrogen-deuterium exchange for determining protein structures and their conformational changes. Mol Cell Proteomics 3:10–23.
Yang B, Wu Y, Zhu M, Fan S, Lin J, Zhang K, Li S, Chi H, Li Y, Chen H, Luo S, Ding Y, Wang L, Hao Z, Xiu L, Chen S, Ye K, He S, Dong M. 2012. Identification of cross-linked peptides from complex samples. Nat Methods 9:904–906.
Zhang Q, Chen J, Kuwajima K, Zhang H, Xian F, Young N, Marshall A. 2013. Nucleotide-induced conformational changes of tetradecameric GroEL mapped by H/D exchange monitored by FT-ICR mass spectrometry. Sci Rep 3:1247.
Zhong Y, Hyung S, Ruotolo B. 2012. Ion mobility-mass spectrometry for structural proteomics. Expert Rev Proteomics 9:47–58.
