Immunology Today, vol. 7, Nos. 7&8, 1986


Complement system proteins which interact with C3b or C4b
A superfamily of structurally related proteins

K.B.M. Reid¹, D.R. Bentley¹, R.D. Campbell¹, L.P. Chung¹, R.B. Sim¹, T. Kristensen² and B.F. Tack²

¹MRC Immunochemistry Unit, Department of Biochemistry, University of Oxford, Oxford OX1 3QU, UK; and ²Scripps Clinic and Research Foundation, Department of Immunology, La Jolla, California 92037, USA

Recent cDNA sequencing data have allowed the prediction of the entire amino acid sequences of complement components factor B and C2, the complement control proteins factor H and C4b-binding protein, and a partial sequence for the C3b/C4b receptor CR1. These proteins all contain internal repeating units of approximately 60 amino acids, each repeating unit having a characteristic framework of highly conserved residues. The N-terminal Ba and C2b portions of factor B and C2 both contain 3 repeating units, and the chains of C4b-binding protein and factor H contain 8 and 20 repeating units, respectively, while the precise number of units in CR1 is not yet known. These structurally homologous complement proteins are also functionally related, as they all interact with C3b or C4b during activation of the cascade. The repeating units also occur in the functionally unrelated proteins subcomponent C1r, β2-glycoprotein I, blood clotting factor XIII and interleukin-2 receptor. In this review Ken Reid and his colleagues propose that this could be a general feature of a superfamily of structurally related proteins.

The complement system is composed of at least 20 plasma glycoproteins. This number includes seven control proteins in addition to the 13 components of the classical and alternative pathways, which are the well characterized routes by which activation of the system takes place[1-3] (Fig. 1).

Many of the biological effects of the complement system (which can involve the induction of inflammatory responses, engulfment, killing and lysis of bacteria and viruses) are mediated by a variety of cellular receptors[4] such as the C3b/C4b receptor (CR1), which can bind to the activated components (principally C3b/C4b) or to the fragments generated on limited proteolysis of these activated components by control proteins such as the enzyme factor I and its cofactors factor H (H), CR1 and C4b-binding protein (C4BP). A feature of these three groups of proteins associated with the complement system, i.e. components, control proteins and receptors, is the presence in each group of two, or more, proteins which interact with C3b, or C4b (Fig. 1). A summary of the biochemistry and genetics of the control proteins C4BP and H and of the membrane-bound proteins CR1, decay accelerating factor (DAF) and glycoprotein 45-70 (gp 45-70), which pointed out the functional similarity between these proteins, was recently given in Immunology Today by Holers et al.[5]. However, an interesting feature which has emerged from detailed structural studies is that, in addition to showing functional homology, the control proteins C4BP and H and the receptor CR1 show an unusual type of structural homology that is shared with the components C2, factor B and subcomponent C1r, and with at least three non-complement proteins: β2-glycoprotein I, interleukin-2 receptor and factor XIII.

Function
The manner in which the proenzyme forms of the components C2 and factor B interact with C4b and C3b to yield, after activation, the C3 convertases of the classical and alternative pathways is well documented. In the classical pathway the proenzyme C2 (Mr 102 000) associates, probably via its N-terminal C2b domain, with C4b in a Mg2+-dependent interaction[6,7]. The C2 is split by C1s to yield the non-catalytic chain C2b (Mr 30 000) and the C-terminal catalytic chain C2a (Mr 70 000), the C2b not being required for the C3 convertase activity of the C4b2a complex. In the alternative pathway, the proenzyme factor B (Mr 90 000) associates, probably via its N-terminal Ba domain[8-10], with C3b (or a 'C3b-like' form of C3 in the initial events of alternative pathway activation) in a Mg2+-dependent interaction. The factor B is split by factor D to yield the non-catalytic chain Ba (Mr 30 000) and the C-terminal catalytic chain Bb (Mr 60 000). Thus in the formation of the C3 convertases of both the classical and alternative pathways, it is clear that the binding of C3b, or C4b, is dependent upon the N-terminal portion of the proenzyme, which contains three repeating homology units (as outlined below). The C3 convertases can then act as C5 convertases by the association of C5 with surface-bound C3b, thus allowing the splitting of C5 and the assembly of the C5b-9 lytic complex[11] (Fig. 1). The C3 convertases 'decay' by release of C2a, or Bb, from the C4b, or C3b, and this 'decay' is accelerated by interaction with the control proteins C4BP and factor H, which then act as cofactors in the proteolytic splitting of C4b, or C3b, by factor I (reviewed by Holers et al.[5]).
Various membrane-associated molecules such as CR1 and DAF can serve as accelerators of the decay of C4b2a and C3bBb and, in the case of CR1 (and also gp 45-70), as cofactors in the proteolytic splitting of C4b, or C3b, by factor I (reviewed by Holers et al.[5]) (see Fig. 1).

Three non-complement proteins, interleukin-2 (IL-2) receptor, β2-glycoprotein I (β2I) and factor XIII, display a structural similarity, at the primary amino acid sequence level, to regions of C2, factor B, C4BP, factor H and CR1. The IL-2 receptor is a 55 000 Mr glycoprotein found on the membranes of antigen- or mitogen-stimulated T cells and thus, in the presence of IL-2, is involved in the expansion of T-cell populations. The function of β2I (Mr 50 000) is not known, although it has been completely sequenced at the amino acid level[12] and crystals of the protein have been obtained. It has been suggested that it is an activator of lipoprotein lipase, but in addition to showing an association with lipoproteins, it has been found to bind to platelets and to interact with heparin. Factor XIII is involved in the cross-linking of fibrin.

Fig. 1. Main steps in the activation of the classical and alternative pathways of complement. 'C3' denotes C3b-like form of C3. Proteins which interact with C3b, or C4b, are enclosed in boxes. Enzymically active components are denoted by an asterisk. [Diagram not reproduced; legible labels: classical pathway activation; alternative pathway activation; accelerated decay of complex, release of C2a and degradation; accelerated decay of complex, release of Bb and degradation of C3b.]

© 1986, Elsevier Science Publishers B.V., Amsterdam 0167-4919/86/$02.00

Structural studies at protein and DNA levels
Extensive protein sequence analysis of human C2[13], factor B[13-15] and C4BP[16] has been performed, while limited protein sequence data are available for human factor H[17-19] and human CR1[20,21]. The cDNA coding for all, or part, of each of these proteins has been cloned and thus complete amino acid sequences can be predicted for human C2[22], C4BP[16], factor B[23,24], C1r and mouse H[25]. Partial sequences have also been predicted from cDNA cloning for human H[17-19] and CR1[20,21]. With respect to the non-complement proteins, the complete amino acid sequence of β2I has been determined[12] and predicted amino acid sequences are available for both human[26,27] and mouse[28,29] IL-2 receptor and for factor XIII. On examination of all these sequences (see Table 1) it is apparent that each protein contains repeating homology units of approximately 60 amino acids which conform to a consensus sequence having a framework of highly

Table 1. Proteins for which there is structural evidence for repeating homology units

Complement proteins

Factor B. Three contiguous units starting at N-terminus of 92 000 Mr chain. Interacts with: C3b.
C2. Three contiguous units starting at N-terminus of 108 000 Mr chain. Interacts with: C4b.
C4b-binding protein (composed of 7 identical chains). Eight contiguous units starting at N-terminus of 70 000 Mr chain. Interacts with: C4b.
Factor H (mouse). Twenty contiguous units throughout entire 160 000 Mr chain. Interacts with: C3b.
C3b/C4b receptor (CR1). Number not yet established but at least eight units, or perhaps up to thirty-six, throughout 160-250 000 Mr chain. Interacts with: C3b/C4b.
Subcomponent C1r. Two units located near C-terminus of non-catalytic 56 000 Mr chain. Interacts with: not known.

Non-complement proteins

β2-glycoprotein I. Five contiguous units starting at N-terminus of 50 000 Mr chain. Interacts with: not known.
IL-2 receptor. Two units starting at N-terminus of 55 000 Mr chain (the two repeating units are separated by a domain of 37 amino acids). Interacts with: not known.
β subunit of clotting factor XIII. Ten contiguous units starting at N-terminus of 80 000 Mr chain. Interacts with: not known.

General consensus sequence of approximately 60 amino acids* seen in most repeating homology units (position: conserved residue):

4: Cys
7: Pro
30: Tyr/Phe
32: Cys
35: Gly
46: Cys
50: Gly
52: Trp
57: Pro (Ala)
59: Cys

*Certain gaps and deletions have to be made in order to align every repeat in C2[22], B[23], C4BP[16], H[25], C1r[43], β2I[12], IL-2 receptor[26,27] and factor XIII[42] in such a way that they conform to the general consensus sequence and numbering shown above. Although only limited sequence data are available for CR1[21], it is clear that this protein also contains repeating homology units nearly identical to the consensus sequence found in H[20].
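The consensus framework in Table 1 can be read as a set of position-specific residue constraints. The following is a minimal sketch, assuming a repeat has already been aligned (with gaps where needed) to the 1-based numbering of the table; the function names, the one-letter encoding, the toy sequence and the treatment of position 57 as Pro or Ala are illustrative assumptions, not part of the original analysis.

```python
from typing import Dict

# Framework positions (1-based) taken from the consensus in Table 1.
# Values list the one-letter residues accepted at that position.
# Accepting both P and A at position 57 is an assumption of this sketch.
CONSENSUS: Dict[int, str] = {
    4: "C", 7: "P", 30: "YF", 32: "C", 35: "G",
    46: "C", 50: "G", 52: "W", 57: "PA", 59: "C",
}


def framework_matches(aligned_repeat: str) -> Dict[int, bool]:
    """Report, for each framework position, whether the residue is accepted.

    Gap characters ('-') and positions beyond the end of the string
    are counted as mismatches.
    """
    result = {}
    for pos, allowed in CONSENSUS.items():
        residue = aligned_repeat[pos - 1] if pos <= len(aligned_repeat) else "-"
        result[pos] = residue in allowed
    return result


def framework_score(aligned_repeat: str) -> float:
    """Fraction of framework positions that carry an accepted residue."""
    matches = framework_matches(aligned_repeat)
    return sum(matches.values()) / len(matches)


def make_toy_repeat() -> str:
    """Build a synthetic 60-residue string (not a real protein sequence)."""
    residues = ["S"] * 60
    for pos, allowed in CONSENSUS.items():
        residues[pos - 1] = allowed[0]
    return "".join(residues)


if __name__ == "__main__":
    toy = make_toy_repeat()
    print(framework_score(toy))
    # Replace the tryptophan at position 52 and score again.
    mutated = toy[:51] + "A" + toy[52:]
    print(framework_score(mutated), framework_matches(mutated)[52])
```

A score of this kind only checks the conserved framework; it says nothing about the homology outside the framework that distinguishes, for example, the Ba repeats from those of C4BP or H.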

Fig. 2. Schematic model of C4BP complexed to C4b (adapted from figure by Dahlbäck et al.[33]) showing a possible arrangement of the eight internal homology regions, found in the N-terminal 491 residues of each of the seven chains of C4BP, to give a 'tentacle' of C4BP as outlined in the text. The dimensions indicated are from the electron microscopy studies of Dahlbäck et al.[33]. Each circle represents one repeating unit, taken to be 42.5 Å long.

conserved residues consisting of one tryptophan, two proline and four cysteine residues. Two other positions show conserved glycine residues, while at other positions a bulky hydrophobic amino acid such as tyrosine or phenylalanine is often found (Table 1). The repeating units are contiguous and start at the N-terminal end of the processed form of each protein, except in the case of C1r and also the IL-2 receptor, where the two repeats in the molecule are separated by a region of 37 amino acids which does not show any homology with the repeating unit.

On alignment and analysis of all the available sequences containing the internal homology units it is apparent that certain of the repeats, e.g. the second and third repeats of Ba, show extensive homology (almost 50% identity) outside the general framework of highly conserved residues[23], and that the C2b units are more similar to those in Ba than to those in C4BP or H. The Ba and C2b show greatest homology over repeats two, three and four of C4BP (approximately 25% identity over 180 residues). The mouse H sequence yields the longest sequence of repeating units (twenty) reported at present; it is therefore of interest to look at its optimal alignment with the other proteins. In general, the N-terminal portions of C2, factor B and C4BP show a stronger homology with the N-terminal repeats one to four of mouse H than with any other area of the H sequence. On the other hand, β2I shows greatest homology over repeats five to eight of H (24% identity over 305 residues).

Although there is a strong body of evidence supporting the view that the N-terminal, C2b and Ba, portions of C2[6,7] and factor B[8-10] and the N-terminal regions of C4BP[30,31] and H[32] all contain C4b or C3b binding domains, the correlation of the presence of these internal repeating homology units with functional binding activity requires further evaluation, especially in view of the presence of similar repeating units in the IL-2 receptor, β2I and factor XIII. It could be implied that the presence of repeating units in these proteins is an indication that they might display some ability to interact with C3b or C4b, but this remains to be tested. In the case of the IL-2 receptor a role for the second repeat in binding interleukin 2 has been proposed.
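Figures such as "approximately 25% identity over 180 residues" summarize a pairwise alignment as the share of aligned positions carrying the same residue. The sketch below shows one common way to compute such a value; the review does not state how its percentages were derived, so the choice of denominator (columns where neither sequence has a gap), the function name and the toy strings are all assumptions made for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of compared alignment columns with identical residues.

    Both inputs must be rows of the same alignment (equal length).
    Columns in which either sequence has a gap are skipped; this
    denominator is an assumption, other conventions exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


if __name__ == "__main__":
    # Synthetic rows, not real protein sequences: 45 identical
    # columns out of 180 compared columns.
    row_a = "A" * 180
    row_b = "A" * 45 + "G" * 135
    print(percent_identity(row_a, row_b))
```

With this convention, 45 identities over 180 compared residues gives 25%, which is the order of similarity reported between the Ba or C2b repeats and repeats two to four of C4BP.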

Structure generated by repeating units
Electron microscopic studies of factor B, C2, H and C4BP have provided data concerning the overall dimensions of these molecules. However, no clear structural models have been proposed despite the availability of the primary sequences. The C4BP molecule (which contains eight contiguous repeating units running from the N-terminal end of each of its seven identical disulphide-bonded chains) gives the clearest picture of the type of structure the repeating units impart to proteins which contain them. The C4BP appears in the electron microscope as a spider-like structure with seven flexible 'tentacles' (each 30 Å by 330 Å) joined at one end to a small central 'core'[33] (Fig. 2). Each 'tentacle' and one seventh of the 'core' probably represents one of the seven disulphide-linked chains of human C4BP. The C-terminal 58 amino acids of C4BP do not fall into the same pattern of internal homology (as seen for the N-terminal 491 amino acids) and in structure prediction studies the C-terminal region of 58 amino acids in each of the C4BP chains appears likely to form a stable α-helical structure[16]. Thus it is probable that the C-terminal helical regions from all seven chains will form the central 'core'.

Structure prediction studies indicate that the N-terminal 491 residues of each of the C4BP chains will form predominantly β-sheets and random coils, and this fact, along with the possible intrachain disulphide bond pattern, has to be taken into account in the formation of any models. If C4BP has a similar intrachain disulphide bond pattern to that found in β2I (β2I is the only protein containing these repeats for which the disulphide bonds have been assigned[12]), then the eight homology units may form eight similar structural domains (each of 42.5 Å) which are arranged in a linear and tandem fashion (8 × 42.5 Å = 340 Å) along the 'tentacle' structure (estimated to be 330 Å long by electron microscopic studies) with the N-terminal homology unit at the extremity (Fig. 2). Factor H and β2I, containing twenty and five repeating units respectively, also display elongated structures, indicating that this may be a feature of all proteins containing large numbers of these repeats. In the electron microscope factor B and C2 show no elongated structure, both being 80-85 Å in diameter and appearing to be composed of three domains of approximately equal size[34]. In both cases, one domain corresponds to the N-terminal Ba or C2b region.

Fig. 3. Homology of C2, factor B and C1r to two different protein families, the classical serine proteinase family and the 60 amino acid repeat family. The individual repeats are boxed in C4BP, H, β2I, IL-2 receptor, β subunit of factor XIII and also the presumed ancestral structure. Repeats in C2 and factor B are labelled I, II and III and the two in the IL-2 receptor are labelled A and B. SP = serine proteinase domain; C = unrelated C-terminal domain in C4BP and β2I; the solid block in the IL-2 receptor = the transmembrane region. [Diagram not reproduced; legible labels: ancestral 60 amino acid structure; serine protease domain; C4b-binding protein; C2 and factor B; classical serine proteases; IL-2 receptor; factor H; C1r; β2-glycoprotein I; factor XIII.]

Fig. 4. Comparison of the intron/exon structures of the IL-2 receptor gene[26,27] and factor B gene[24]. Exons are shown boxed; the numbers refer to the amino acids encoded by each exon. The exons encoding the three homologous regions in Ba are labelled I, II and III; the exons encoding the two homologous regions in the IL-2 receptor are labelled A and B. [Diagram not reproduced; panels: interleukin-2 receptor gene; Ba portion of factor B gene; scale bar 100 bp.]

Gene structure and chromosome location
The C2 and factor B genes, along with the C4 and 21-hydroxylase genes, form a tight cluster of class III genes located in the major histocompatibility complex in man on the short arm of chromosome 6[35,36]. In the mouse the same genes are also found to be closely clustered in the H-2 system on chromosome 17[37]. Linkage analysis of allotypes of human C4BP, H and CR1 indicates that they are also closely clustered[38], and by in situ hybridization studies the CR1 gene has been mapped to the long arm of chromosome 1 in man[20].
Consistent with the previous observation are studies using cDNA probes for C4BP and H in the analysis of human-mouse somatic cell hybrids, which indicate that human C4BP and H are also located on chromosome 1 (E. Solomon, unpublished). Mouse H has been mapped to chromosome 1[39], but there is some uncertainty over the chromosome assignment for mouse C4BP[40] and the location of mouse CR1 has not yet been investigated. The human IL-2 receptor has been mapped to chromosome 10.

The largest polypeptide chains described so far which contain these repeating homology units are those of the CR1 molecule. The extent of the repeating units in this molecule is not yet known, but up to approximately 36 could be present if, as in mouse factor H, they are found throughout the entire length of the chain. It has been found that there are four allotypic variants of CR1 which differ in molecular size by increments of 30 000 Mr, i.e. giving allotypes of approximately 160 000, 190 000, 220 000 and 250 000 Mr. It has been suggested[20] that duplication or deletion events involving these repeat units may explain the structural polymorphism seen in CR1. This is an attractive suggestion since analysis of the genes coding for factor B[24] and the IL-2 receptor[26,27] has shown that the repeating homology units in each protein are each completely encoded by a discrete exon (Figs 3 and 4). Thus it may be that all the other proteins displaying these repeating units at the protein level will have them encoded in separate exons at the gene level, and initial studies on the human C4BP and C2 genes indicate that this is the case.

In the C2 and factor B genes the exons encoding the repeating units are spliced to exons encoding the catalytic residues which form typical serine esterase active sites (Fig. 3), and this is also probably the case for C1r. It is of interest that human haptoglobin Hp2 precursor, which shows considerable homology to the serine proteases, also shows homology with H (24% identity over 100 residues); however, its homology with H is different from that of C2, B etc. in that it lacks one of the highly conserved cysteine residues. Further, haptoglobin differs from other members of the sixty amino acid repeat family at the genetic level, as the sixty-residue stretch of homology in haptoglobin is encoded by two exons[41].

Conclusion
C4BP, factor H and β2I are composed almost entirely of repeats (depicted in Fig. 3). In C2 and factor B the exons encoding the repeats are combined with coding regions which include exons that are related to the classical serine protease genes. In the case of the IL-2 receptor, exons encoding the two repeats are separated by an unrelated exon and are also combined with other coding regions, including an exon encoding a transmembrane domain (Figs 3 and 4). Recently, workers in E.W. Davie's laboratory[42,43] have described the presence of ten of these sixty amino acid repeats, extending from the N-terminal end of the β subunit of factor XIII (of the human blood clotting system). They have also described two of the repeats within the C-terminal region of the non-catalytic A chain of the subcomponent C1r of the human complement system. These observations illustrate the probable widespread nature of this structural feature. All the above findings indicate that tandem duplication of individual exons and of complete genes, and also exon shuffling[44], have all been important features in the divergence of members of this 60 amino acid repeat family of proteins.
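The suggestion that duplication or deletion of repeat units underlies the CR1 size polymorphism can be explored with simple arithmetic on the figures quoted in the text. The sketch below is a rough back-of-envelope estimate that does not appear in the review: it assumes, purely for illustration, that the largest allotype (250 000 Mr) consists entirely of about 36 repeats of equal mass, and it ignores glycosylation and any non-repeat regions. The constant and function names are hypothetical.

```python
# Approximate CR1 allotype sizes (Mr) quoted in the text.
ALLOTYPE_MR = [160_000, 190_000, 220_000, 250_000]

# "Up to approximately 36" repeats; assigning this count to the
# largest allotype is an assumption of this sketch.
MAX_REPEATS_LARGEST = 36


def size_steps(sizes):
    """Differences in Mr between successive allotypes."""
    return [b - a for a, b in zip(sizes, sizes[1:])]


def repeats_per_step(sizes, max_repeats):
    """Crude number of repeat units corresponding to each size step.

    Assumes the largest allotype is made up only of `max_repeats`
    repeats of equal mass (no carbohydrate, no non-repeat regions).
    """
    mr_per_repeat = sizes[-1] / max_repeats
    return [step / mr_per_repeat for step in size_steps(sizes)]


if __name__ == "__main__":
    print(size_steps(ALLOTYPE_MR))
    print(repeats_per_step(ALLOTYPE_MR, MAX_REPEATS_LARGEST))
```

Under these assumptions each 30 000 Mr step would correspond to a block of roughly four repeat units, which is in line with the idea that the allotypes differ by duplication or deletion of groups of exons rather than of single residues; the estimate should not be read as a measured value.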

References

1 Lachmann, P.J. and Hughes-Jones, N.C. (1984) Springer Semin. Immunopathol. 7, 143-162
2 Pangburn, M.K. and Müller-Eberhard, H.J. (1984) Springer Semin. Immunopathol. 7, 185-214
3 Reid, K.B.M. and Porter, R.R. (1981) Annu. Rev. Biochem. 50, 433-464
4 Fearon, D.T. and Wong, W.W. (1983) Annu. Rev. Immunol. 1, 243-271
5 Holers, V.M., Cole, J.L., Lublin, D.M. et al. (1985) Immunol. Today 6, 188-192
6 Nagasawa, S. and Stroud, R.M. (1977) Proc. Natl Acad. Sci. USA 74, 2998-3001
7 Kerr, M.A. (1980) Biochem. J. 189, 173-181
8 Götze, O. and Müller-Eberhard, H.J. (1971) J. Exp. Med. 134, 90s-108s
9 Hunsicker, L.G., Ruddy, S. and Austen, K.F. (1973) J. Immunol. 110, 128-138
10 Ueda, A., Kearney, J.F., Roux, K.H. et al. (1985) Complement 2, 80
11 Müller-Eberhard, H.J. (1984) Springer Semin. Immunopathol. 7, 93-141
12 Lozier, J., Takahashi, N. and Putnam, F.W. (1984) Proc. Natl Acad. Sci. USA 81, 3640-3644
13 Gagnon, J. (1984) Philos. Trans. R. Soc. London Ser. B Biol. Sci. 306, 301-309
14 Christie, D.L. and Gagnon, J. (1983) Biochem. J. 209, 61-70
15 Mole, J.E., Anderson, J.K., Davison, E.A. et al. (1984) J. Biol. Chem. 259, 3407-3412
16 Chung, L.P., Bentley, D.R. and Reid, K.B.M. (1985) Biochem. J. 230, 133-141
17 Sim, R.B., Malhotra, V., Ripoche, J. et al. (1985) Biochem. Soc. Symp. 51, 83-96
18 Ripoche, J., Day, A.J., Willis, A.C. et al. (1986) Biosci. Rep. 6, 65-72
19 Kristensen, T., Wetsel, R.A. and Tack, B.F. (1986) J. Immunol. 136, 3407-3411
20 Klickstein, L.B., Wong, W.W., Smith, J.A. et al. (1985) Complement 2, 44
21 Wong, W.W., Klickstein, L.B., Smith, J.A. et al. (1985) Proc. Natl Acad. Sci. USA 82, 7711-7715
22 Bentley, D.R. Biochem. J. (in press)
23 Morley, B.J. and Campbell, R.D. (1984) EMBO J. 3, 153-157
24 Campbell, R.D., Bentley, D.R. and Morley, B.J. (1984) Philos. Trans. R. Soc. London Ser. B Biol. Sci. 306, 367-378
25 Kristensen, T. and Tack, B.F. Proc. Natl Acad. Sci. USA (in press)
26 Leonard, W.J., Depper, J.M., Kanehisa, M. et al. (1985) Science 230, 633-639
27 Ishida, N., Kanamori, H., Noma, T. et al. (1985) Nucleic Acids Res. 13, 7579-7589
28 Miller, J., Malek, T.R., Leonard, W.J. et al. (1985) J. Immunol. 134, 4212-4217
29 Shimizu, A., Kondo, S., Takeda, S. et al. (1985) Nucleic Acids Res. 13, 1505-1516
30 Fujita, T., Kamato, T. and Tamura, N. (1985) J. Immunol. 134, 3320-3324
31 Chung, L.P. and Reid, K.B.M. (1985) Biosci. Rep. 5, 855-865
32 Alsenz, J., Schulz, T.F., Lambris, J.D. et al. (1985) Biochem. J. 232, 841-850
33 Dahlbäck, B., Smith, C.A. and Müller-Eberhard, H.J. (1983) Proc. Natl Acad. Sci. USA 80, 3461-3465
34 Smith, C.A., Vogel, C.-W. and Müller-Eberhard, H.J. (1984) J. Exp. Med. 159, 324-329
35 Carroll, M.C., Campbell, R.D., Bentley, D.R. et al. (1984) Nature (London) 307, 237-241
36 Carroll, M.C., Campbell, R.D. and Porter, R.R. (1985) Proc. Natl Acad. Sci. USA 82, 521-525
37 Chaplin, D.D., Woods, D.E., Whitehead, A.S. et al. (1983) Proc. Natl Acad. Sci. USA 80, 6947-6951
38 Rodriguez de Cordoba, S., Lublin, D., Rubinstein, P. et al. (1985) J. Exp. Med. 161, 1189-1195
39 Kristensen, T., D'Eustachio, P. and Tack, B.F. (1985) Complement 2, 46
40 Rodriguez de Cordoba, S., Ferreira, A. and Rubinstein, P. (1985) Immunogenetics 21, 257-265
41 Maeda, N., Yang, F., Barnett, D.R. et al. (1984) Nature (London) 309, 131-135
42 Ichinose, A., McMullen, B.A., Fujikawa, K. and Davie, E.W. Biochemistry (in press)
43 Leytus, S.P., Kurachi, K., Sakariassen, K.S. and Davie, E.W. Biochemistry (in press)
44 Gilbert, W. (1985) Science 228, 823-824
