Analytical methods in biotechnology


Mass Spectrometry of Peptides, Proteins and Glycoproteins

Fiona M. Greer
M-Scan Ltd, Silwood Park, Sunninghill, Ascot, Berkshire SL5 7PZ, UK

Introduction

The determination of the primary structure of a protein has always been an important yet formidable task for protein chemists. Two general methods are employed, but both suffer from limitations. The first, stepwise sequencing by Edman degradation, relies on a free primary or secondary amino group at the amino terminus and thus cannot deal with blocked or certain modified amino acids, or indeed with any non-protein modification. The second, deducing the sequence indirectly from the DNA base sequence of the cloned gene, cannot predict any post-translational processing of the final recombinant protein product.

In recent years, mass spectrometry has become a powerful alternative tool for solving biopolymer structural problems, complementing these traditional methods. Extensive research has been carried out on the utility of various desorption or 'soft' ionisation mass spectrometry techniques for biological research. These include fast atom bombardment, field desorption, plasma desorption, laser desorption, liquid secondary ion, thermospray and electrospray ionisation. At present, fast atom bombardment mass spectrometry (FAB-MS) is the most widely used of these techniques [2].

FAB-MS has now become the method of choice for confirming the cDNA-derived amino acid sequences of large proteins. It can detect changes in a protein caused by, e.g., insertion, deletion or modification of single amino acids. Indeed, any structural alteration that changes the mass of the molecule, such as a blocking group at the amino terminus, can be not only detected but also identified. Also, in contrast to Edman sequencing, the C-terminal and N-terminal regions of the protein are equally likely to be observed. When dealing with recombinant proteins, FAB-MS can identify post-translational modifications such as phosphorylation, glycosylation and sulphation, which cannot be predicted from the gene sequence.
Most importantly, FAB-MS techniques are amenable to mixtures of peptides, a difficult proposition for classical methods. This eliminates the need for purification steps that are costly in time and material. This article illustrates some aspects of recombinant protein structure problem solving and quality control by FAB-MS.

SCI Biotechnology Group Meeting

Principles of FAB-MS

In the early 1980s, two major advances in mass spectrometry combined to allow its more widespread application to protein analysis. First, the introduction of the FAB technique in a liquid matrix by Barber et al. in 1981 enabled previously intractable thermally labile and highly polar compounds to be analysed with minimal sample preparation and no chemical derivatisation. This in turn prompted the development of high field magnet instrumentation to analyse the large molecules that were ionised by FAB. Modern two-sector double focusing mass spectrometers have high field magnets that enable masses of up to 10-15 kDa to be measured on samples in the 10 pmol-1 nmol range, a prerequisite for the study of large biomolecules, and can determine the molecular weight of peptides with an accuracy of better than one mass unit.

The principle of FAB-MS is that the sample, in solution in a liquid matrix (e.g. glycerol), is bombarded by a stream of high energy atoms or ions (Xe or, in the case of liquid SIMS, Cs). The resulting gas phase sample ions are then separated according to their mass-to-charge ratio using a high field magnet to produce a mass spectrum. The signals observed in the spectrum correspond to the mass of the protonated molecule, the quasi-molecular ion [M + H]+, together with any sequence ions of lower intensity resulting from fragmentation of the parent ion. Peptides fragment across the peptide bond between amino acid residues to give a series of 'sequence ions' from both the N- and the C-terminus, from which the sequence can easily be deduced by examining the mass differences between the signals in the spectrum. Any N- or C-terminal blocking groups (e.g. formyl, acetyl or amide) can be detected by the mass difference. FAB-MS is thus a simple and effective means of checking the integrity of peptide sequences.

Figure 1 shows the complex positive ion spectrum obtained from the peptide melittin, which fragments at every peptide bond to give a series of both N- and C-terminal sequence ions. A quasi-molecular ion, [M + H]+, is observed at m/z 2845. A second molecular ion signal is also present 28 mass units higher than the anticipated [M + H]+. This represents an additional component in the sample that carries a formyl group at the N-terminus, and demonstrates the effectiveness of FAB-MS in analysing and identifying blocked peptides.
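The sequence-ion arithmetic described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the procedure used in the original work: it assumes nominal (integer) residue masses, takes the melittin sequence from Fig. 1b, and uses hypothetical helper names (`n_terminal_ions`, `c_terminal_ions`, `protonated_mass`).

```python
# Nominal (integer) residue masses: an assumption chosen to match
# unit-mass peak labels, not accurate monoisotopic values.
RESIDUE_MASS = {
    "Gly": 57, "Ala": 71, "Ser": 87, "Pro": 97, "Val": 99, "Thr": 101,
    "Leu": 113, "Ile": 113, "Lys": 128, "Gln": 128, "Arg": 156, "Trp": 186,
}

# Melittin, C-terminally amidated (sequence as in Fig. 1b).
MELITTIN = (
    "Gly-Ile-Gly-Ala-Val-Leu-Lys-Val-Leu-Thr-Thr-Gly-Leu-"
    "Pro-Ala-Leu-Ile-Ser-Trp-Ile-Lys-Arg-Lys-Arg-Gln-Gln"
).split("-")


def n_terminal_ions(residues, n_block=1):
    """Running sums from the N-terminus.

    n_block is the mass of the N-terminal group: 1 for H, 29 for formyl.
    """
    ions, total = [], n_block
    for res in residues[:-1]:
        total += RESIDUE_MASS[res]
        ions.append(total)
    return ions


def c_terminal_ions(residues, c_group=16):
    """Running sums from the C-terminus, including two added hydrogens.

    c_group is the mass of the C-terminal group: 16 for amide NH2,
    17 for a free acid OH.
    """
    ions, total = [], c_group + 2
    for res in reversed(residues[1:]):
        total += RESIDUE_MASS[res]
        ions.append(total)
    return ions


def protonated_mass(residues, n_block=1, c_group=16):
    """Nominal [M + H]+ of the intact peptide."""
    return n_block + sum(RESIDUE_MASS[r] for r in residues) + c_group + 1


print(n_terminal_ions(MELITTIN)[:5])
print(c_terminal_ions(MELITTIN)[:5])
print(protonated_mass(MELITTIN), protonated_mass(MELITTIN, n_block=29))
```

With nominal masses the computed [M + H]+ is 2844; the observed value of 2845 quoted above is consistent with the fractional mass excess that integer residue masses ignore. Passing `n_block=29` (formyl in place of hydrogen) raises the molecular ion and every N-terminal ion by 28 units, which is the shift used in the text to identify the blocked component.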


Fig. 1a. FAB-MS spectrum of melittin showing sequence ions (m/z axis from 200 to 2900).

FAB Mapping

The ability to study mixtures of peptides by FAB-MS led to the development of an extremely rapid and effective method of fingerprinting protein structures: FAB mapping [4,5]. The concept is to cleave the protein using a specific enzymic or chemical method, then analyse the mixture of peptides by direct FAB-MS to produce a map. In contrast to the spectrum of a single peptide, sequence ions are not observed in the spectrum of a mixture. Given that the nucleotide or amino acid sequence is known, the peaks (quasi-molecular ions) in the spectrum can be assigned to portions of the molecule and offer a clear diagnostic for the presence (or otherwise) of the corresponding structure. An example of an early biotechnology application of FAB mapping is shown in Fig. 2. The signals observed following tryptic digestion of the carboxymethylated urogastrone fusion product (epidermal cell growth factor) provide an elegant confirmation of the anticipated sequence. Simple biochemical, enzymic or chemical procedures (e.g. steps of manual Edman degradation) can be used to verify the assignments in the spectrum.

When additional or unexpected signals appear in the FAB map, these can also be fully investigated. They may arise from errors of translation, insertion, deletion, point mutation or post-translational modification. For larger proteins, by combining data from two or more simple FAB maps, over 90% of the sequence can be rapidly confirmed [2]. Thus this simple procedure can be used in the process or quality control analysis of recombinant proteins.

However, as well as being a rapid screening method, FAB-MS is also an important research tool. FAB mapping techniques have been extended to address problems that were previously difficult, if not impossible, to solve using conventional techniques. These range from identification of glycosylation and determination of the carbohydrate structure to definitive assignment of disulphide bridges. The following examples illustrate some of the strategies involved.
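The mapping idea (predict the peptide masses from the known sequence, then assign observed peaks) can be sketched as follows. This is a hedged illustration under stated assumptions: an idealised trypsin rule (cleave after Lys or Arg unless the next residue is Pro), standard monoisotopic residue masses, a carboxymethyl increment of 58.00548 per cysteine, and only the part of the fusion-product sequence that is legible in Fig. 2. The function names are hypothetical, and `re.split` on a zero-width pattern requires Python 3.7 or later.

```python
import re

# Monoisotopic residue masses (Da).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728
CARBOXYMETHYL = 58.00548  # added per Cys after carboxymethylation

# Legible portion of the fusion-product sequence in Fig. 2 (assumption:
# the garbled C-terminal residues are left out).
FUSION_FRAGMENT = (
    "MQTQKPTSSSKLKK"
    "NSDSECPLSHDGYCLHDGVCMYIEALDK"
    "YACNCVVGYIGER"
    "CQYR"
    "DLK"
)


def tryptic_digest(sequence):
    """Split after K or R, except when the following residue is P."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]


def protonated_mass(peptide, cys_mod=0.0):
    """[M + H]+ of a linear peptide, with an optional per-Cys increment."""
    residues = sum(MONO[aa] for aa in peptide)
    return residues + cys_mod * peptide.count("C") + WATER + PROTON


def predicted_map(sequence, cys_mod=0.0):
    """Predicted FAB map: peptide -> [M + H]+ rounded to 0.01."""
    return {p: round(protonated_mass(p, cys_mod), 2)
            for p in tryptic_digest(sequence)}


def assign(observed_mz, predicted, tol=0.5):
    """For each observed m/z, list the predicted peptides within tol."""
    return {mz: [p for p, m in predicted.items() if abs(m - mz) <= tol]
            for mz in observed_mz}


fab_map = predicted_map(FUSION_FRAGMENT, cys_mod=CARBOXYMETHYL)
for peptide, mz in fab_map.items():
    print(f"{mz:9.2f}  {peptide}")
```

Complete cleavage under this rule yields LK and K separately, whereas Fig. 2 marks LKK as a single stretch; real digests often contain such partial-cleavage products, so a practical version would also predict peptides with missed cleavages. An observed signal with an empty assignment list is the kind of unexpected peak the text says should be investigated further.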


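Fig. 1b. The amino acid sequence of the peptide melittin, with the masses of the sequence ions.

Sequence: Gly-Ile-Gly-Ala-Val-Leu-Lys-Val-Leu-Thr-Thr-Gly-Leu-Pro-Ala-Leu-Ile-Ser-Trp-Ile-Lys-Arg-Lys-Arg-Gln-Gln-NH2

N-terminal sequence ions (m/z, reading from the N-terminus): 58, 171, 228, 299, 398, 511, 639, 738, 851, 952, 1053, 1110, 1223, 1320, 1391, 1504, 1617, 1704, 1890, 2003, 2131, 2287, 2415, 2571, 2699

C-terminal sequence ions (m/z, reading from the N-terminus): 2707, 2674, 2617, 2546, 2447, 2334, 2206, 2107, 1994, 1893, 1792, 1735, 1622, 1525, 1454, 1341, 1226, 1141, 955, 842, 714, 558, 430, 274, 146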

Fig. 2. Tryptic map of a recombinant fusion product (epidermal cell growth factor).

N-terminal MQTQKPTSSSK-LKK-NSDSECPLSHDGYCLHDGVCMYIEALDK-YACNCVVGYIGER-CQYR-DLK-WWEL C-terminal

Residue ranges marked in the figure: 1-11, 15-42, 56-59, 63-66.

C-Terminal ragged-ends

Occasionally, impurities in a recombinant protein may be caused by 'processing' that leads to 'ragged ends' or micro-heterogeneity, i.e. mixtures of whole and truncated molecules at either the C- or N-terminus. If this occurs at the C-terminus, it is extremely difficult to detect by Edman sequencing. In a novel method developed to address this problem, FAB-MS is used to screen semi-purified mixtures of digest peptides to detect the intact C-terminal peptide and/or truncated versions of it. For example, with recombinant gamma-interferon (Fig. 3), CNBr peptides were separated by gel chromatography and screened by FAB-MS. The signal at m/z 1090 in the spectrum represents the intact C-terminus (Leu138-Phe-Arg-Gly-Arg-Arg-Ala-Ser-Gln146) and that at m/z 648 the truncated version, Leu138-Arg142.

Fig. 3a. FAB mass spectrum of fraction 20 from gel chromatography (m/z axis from 600 to 1100).

Fig. 3b. C-Terminus of sequence of gamma-interferon.

C-terminus of sequence (CNBr cleavage after Met): ...Lys-Arg-Ser-Gln-Met-Leu-Phe-Arg-Gly-Arg-Arg-Ala-Ser-Gln146
Intact C-terminus: Leu-Phe-Arg-Gly-Arg-Arg-Ala-Ser-Gln146, m/z 1090
'Ragged ends': Leu-Phe-Arg-Gly-Arg142, m/z 648

Glycosylation

Powerful strategies have also been developed using high mass FAB-MS for the analysis of both O- and N-linked glycosylated structures [6]. With large structures, it is often useful to remove the carbohydrate moiety from the glycopeptide by, e.g., enzymic or chemical means. From the molecular mass, the number of sugar constituents in terms of deoxy-hexose, hexose, hexosamine, etc. can be calculated. With polysaccharide structures, preparation of permethyl or peracetyl derivatives yields extensive fragmentation information (mainly across glycosidic bonds) in the resulting spectrum. FAB mapping procedures can identify the location of the glycosylation site and determine the identity of the terminal non-reducing ends and the type, branching and identity of the oligosaccharides.
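The statement that the sugar composition can be calculated from the molecular mass amounts to a small search over unit counts. The sketch below is an assumption-laden illustration rather than the published method: it treats an underivatised, released oligosaccharide with a free reducing end, uses monoisotopic residue masses for deoxyhexose, hexose and N-acetylhexosamine (taken here as the form of the 'hexosamine' unit), and the function name `candidate_compositions` is hypothetical.

```python
from itertools import product

# Monoisotopic residue (anhydro) masses in Da; HexNAc is assumed to be
# the form in which hexosamine occurs.
UNIT_MASS = {"dHex": 146.0579, "Hex": 162.0528, "HexNAc": 203.0794}
WATER = 18.0106  # added once for the free reducing end


def candidate_compositions(glycan_mass, tol=0.01, max_units=12):
    """Return every unit-count combination matching the neutral mass."""
    target = glycan_mass - WATER
    hits = []
    for counts in product(range(max_units + 1), repeat=len(UNIT_MASS)):
        total = sum(n * m for n, m in zip(counts, UNIT_MASS.values()))
        if abs(total - target) <= tol:
            hits.append(dict(zip(UNIT_MASS, counts)))
    return hits


# Example: a neutral mass of 910.3278 Da (three hexoses and two HexNAc).
print(candidate_compositions(910.3278))
print(candidate_compositions(910.3278, tol=0.05))
```

At a tolerance of 0.01 Da the example mass has a single solution, but relaxing the tolerance to 0.05 Da admits a second candidate (five deoxyhexoses and one hexose). Such ties are one reason the fragmentation data from permethyl or peracetyl derivatives mentioned above are needed alongside the molecular mass.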

S-S Bridge assignment

The final task of primary protein structure determination, assignment of disulphide bridges, is generally the most difficult, particularly when several disulphide links occur in a single polypeptide chain. Traditional lengthy procedures raise the possibility of disulphide reshuffling due to reduction and re-oxidation of the S-S bridges. A new FAB mapping strategy minimises this risk [7]. The protein is cleaved at points between the potentially bridged cysteine residues (Fig. 4) and a FAB map spectrum is produced under non-reducing conditions. Disulphide bridged peptides are then characterised by their unique masses. The interpretation is confirmed by reduction in, e.g., thioglycerol, followed by re-running the spectrum. Peptides joined by an inter-chain S-S bridge collapse to give two single peaks at lower masses corresponding to the individual peptide components, and an intra-chain S-S bridge can be detected by a shift of two mass units as the disulphide is converted to a dithiol.

Fig. 4. Disulphide bridge identification strategy: disulphide bridged protein; enzymic/chemical digestion; mixture of peptides; identification by FAB-MS followed by reduction and further FAB-MS, etc.
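The mass bookkeeping behind this strategy is simple enough to sketch. The example below is illustrative only: the two short peptides (ACK and GCR) are hypothetical and do not come from any protein discussed here, the function names are invented for the sketch, and monoisotopic masses are assumed.

```python
# Monoisotopic residue masses (Da) for the residues used below.
MONO = {"G": 57.02146, "A": 71.03711, "C": 103.00919,
        "K": 128.09496, "R": 156.10111}
WATER = 18.01056
PROTON = 1.00728
H_ATOM = 1.00783  # each Cys loses one H when the S-S bond forms


def peptide_mass(seq):
    """Neutral mass of the fully reduced (free thiol) peptide."""
    return sum(MONO[aa] for aa in seq) + WATER


def bridged_pair_mz(pep_a, pep_b):
    """[M + H]+ of two peptides joined by one inter-chain S-S bridge."""
    return peptide_mass(pep_a) + peptide_mass(pep_b) - 2 * H_ATOM + PROTON


def reduced_pair_mz(pep_a, pep_b):
    """[M + H]+ of the two separate peptides seen after reduction."""
    return peptide_mass(pep_a) + PROTON, peptide_mass(pep_b) + PROTON


def intra_bridge_mz(seq, reduced=False):
    """[M + H]+ of one peptide with an intra-chain bridge, or its dithiol."""
    mass = peptide_mass(seq)
    return mass + PROTON if reduced else mass - 2 * H_ATOM + PROTON


# Hypothetical peptides, for illustration only.
print(round(bridged_pair_mz("ACK", "GCR"), 2))
print([round(mz, 2) for mz in reduced_pair_mz("ACK", "GCR")])
print(round(intra_bridge_mz("CGAC", reduced=True) - intra_bridge_mz("CGAC"), 2))
```

In a map run under non-reducing conditions, the inter-chain pair appears as one peak at the combined mass minus two hydrogens; after reduction that peak is replaced by the two lower-mass peaks. The intra-chain case stays as a single peak that moves up by about two mass units.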

New developments

The sensitivity and mass range of modern mass spectrometers are continuously being improved. Recent advances in magnetic sector instrumentation, such as post-acceleration array detection of ions, will allow the measurement of biopolymers at the fmol level. For large intact protein molecules, time-of-flight analysers can detect molecules up to approximately 40 kDa using plasma desorption or approximately 200 kDa using laser desorption. However, the molecular weight accuracy (0.1-0.2%), sensitivity and resolution of these techniques are generally poor. An exciting new development for the analysis of high molecular weight proteins is electrospray mass spectrometry using an atmospheric pressure ionisation source [8]. Molecular weights of up to at least 60 000 Da can be determined using low pmol to fmol quantities. The technique can also be coupled to capillary zone electrophoresis (CZE) or high pressure liquid chromatography (HPLC) for on-line analysis of biological mixtures. Accuracy of mass measurement is in the region of 04-0.02% for proteins of 25-40 kDa [9].
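To put the quoted percentage accuracies in context, they can be converted to absolute mass uncertainties. The short sketch below does only that arithmetic; the function name is hypothetical, and the inputs are the 0.1-0.2% figure quoted for the time-of-flight techniques and the 0.02% figure quoted for electrospray.

```python
def mass_uncertainty(molecular_weight_da, accuracy_percent):
    """Absolute uncertainty (Da) implied by a percentage mass accuracy."""
    return molecular_weight_da * accuracy_percent / 100.0


# Time-of-flight figures quoted above: 0.1-0.2% at 40 kDa.
print(mass_uncertainty(40_000, 0.1), mass_uncertainty(40_000, 0.2))

# Electrospray figure quoted above: 0.02% at 25 and 40 kDa.
print(mass_uncertainty(25_000, 0.02), mass_uncertainty(40_000, 0.02))
```

At 40 kDa an accuracy of 0.1-0.2% corresponds to roughly 40-80 Da, so a 28-unit change such as the formyl group discussed earlier would lie within the error. At 0.02% the uncertainty falls to about 5-8 Da over the 25-40 kDa range.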


References

1. Biemann, K. & Martin, S. A., Mass spectrometric determination of the amino acid sequence of peptides and proteins. Mass Spectrom. Rev., 6 (1987) 1-76.
2. Morris, H. R. & Greer, F. M., Mass spectrometry of natural and recombinant proteins and glycoproteins. Trends Biotechnol., 6 (1988) 140-7.
3. Barber, M., Bordoli, R. S., Sedgwick, R. D. & Tyler, A. N., Fast atom bombardment of solids (FAB): a new ion source for mass spectrometry. J. Chem. Soc. Chem. Commun., (1981) 325-7.
4. Morris, H. R., Panico, M., Barber, M. et al., Fast atom bombardment: a new mass spectrometric method for peptide sequence analysis. Biochem. Biophys. Res. Commun., 101 (1981) 623-31.
5. Morris, H. R., Panico, M. & Taylor, G. W., FAB-mapping of recombinant DNA derived protein products. Biochem. Biophys. Res. Commun., 117 (1983) 299-309.
6. Dell, A., FAB mass spectrometry of carbohydrates. Adv. Carb. Chem. Biochem., 45 (1987) 19-72.
7. Morris, H. R. & Pucci, P., A new method for rapid assignment of S-S bridges in proteins. Biochem. Biophys. Res. Commun., 126 (1985) 1122-7.
8. Meng, C. K., Mann, M. & Fenn, J. B., Electrospray ionization of some polypeptides and small proteins. Proceedings of the 36th ASMS Annual Conference on Mass Spectrometry and Allied Topics, 1988, p. 771.
9. Covey, T. R., Bonner, R. F., Shushan, B. I. & Henion, J., The determination of protein, oligonucleotide and peptide molecular weights by ion-spray mass spectrometry. Rapid Commun. Mass Spectrom., 2 (1989) 249.

Sequence Analysis of N-Linked Oligosaccharides Derived from Glycoproteins and Glycopeptides

S. W. Homans
Department of Biochemistry, University of Dundee, Dundee DD1 4HN, UK

Introduction

The analysis of the primary sequences of glycoprotein- and glycopeptide-derived N-linked oligosaccharides is presently an important problem. There are many instances where it would be desirable to determine with high precision the primary structures of the manifold of oligosaccharides generally found at a given glycosylation site. Notable examples include: (1) those instances where the oligosaccharide is required for biological activity, as in the glycohormones; (2) situations where a recombinant product is of potential pharmaceutical value but has undesirable properties, such as an increased serum clearance rate, by virtue of 'incorrect' glycosylation by the expression vector; (3) the possible requirement, of relevance to the pharmaceutical industry, for characterisation of the oligosaccharide moieties of recombinant products intended for drug use in humans. Unfortunately, oligosaccharide sequence analysis is presently both costly and time consuming. Approaches which might be useful in overcoming both of these difficulties are described here.
