Volume 300, number 3.237440 FEBS 10889 53 1992 Federation or European Biochemical socictics 00145793/92155.00

April

1992

A large domain common to sperm receptors (Zp2 and Zp3) and TGF-p type III receptor Peer Bork and Chris Sander EMBL.

MeyerltoJ3r.

I. 6900 Heidelberg. Gemtany

Received 3 February

1992

A new family of mosaicproteinsis &finedby scqucnce analysis. The ramily is characterized by d 260 rcsiduc domain common to proteins of apparently diverse function and tissue specificity: spcm~ rcccptors Zp2 and rp3, betaglycan (a!so called TGF-p lypc 111 rcccp~or). uromodulin, as well as the major zymogcn granule membrane protein(GP-2). The location of the common domain is similar with respect to puta& transmembrane regions. The results lead 10 the hypothesis that this type of domain has a common tertiary structure and that there is a functional similarity in the recognition mechanism of the sperm txecptor system and the TGF-B reecptor complex. Homology: Mosaic protein; Sperm reecptor: TGF-receptor;

1. INTRODUCTION Three different glycoprotcins (ZPI , ZP2 and ZP3 j of the zona pcllucida, an extrzelLsa~ matrix surrL:inding oocytes. are responsible for inducing the acrosomc reaction (sperm exocytosis, for review see [l]). The detailed understanding of these molecular proccsscs has a direct practical aspect in the context of development of safe contraceptive agents. Indeed, vaccination of a synthetic ZP3 peptide resulted in long term contraception in female mice [2]. Zp3 first binds to specific sperm proteins [3], thus mediating sperm contacts with the oocyte. Zp2 then acts as a second sperm receptor reinforcing the tight interactions [4]. Zp I crosslinks Zp2 and Zp3, which both appear to form dimers. or perhaps higher oligomers [for review see [S]).Zp3 from different mammalia [6-S] as well as Zp2 from mouse [9] has been compktcly sequenced, but no sequcncc similarity bctween the two was observed. We have z~pwidcntificd. by pattern-based scqucnce analysis, G k;: domain common to b*th sperm rcceptors. In ac; ,. this domain c’anbc found in scvcral other recepto .i.;e proteins such as TGF-/I rcccptor III [IO,1I]. uromodulin [12,13] and the major zymogcn granule mcmbranc glycoprotcin GP-2 114,151defining a new family of mosaic proteins. Mosnic proteins have a modular architccturc. as reviewed in [ 16.17].The homologics prcscnted hcrc can bc the basis fur functional

Uromodulin

tests of the single modules in all proteins of the family. Conserved residues or segments provide useful hints for site-directed mutagenesis and DNA screening experiments. 2. MATERIALS AND METHODS For homology scarchcs SWISS-PROT 1181and PIR [19] sequence databases were used. Having dctectcd a similarity bc~wecn ZpI and Zp3 using the program FASTA 1201.the most conserved regions were described by properly consensus pattcms [21]. which arc xnsidvc and sck!!tivc enough to dcteet remolc relationship among exlraccllular mosaic prolcins (c.g. [22]). As an additional test of the results cvcry domain de~c~lcd in this way was compared with u/l protcinf of the scqucnec database using FASTA (k-tuplc-1). Because of the apparent s~yucncc llcxibility and the high contcm of hydrophobic amino acids I .OIHIhits were rlrordcd and sortcxi in Vxsnr of the ‘optimized [20) scores. The multiple alignment was carri~z out using the program PILEUP or the GCG package 1231.A liiw a&lidonal gaps were introduccvl to align conservcxlcystcircs. When attcn ~!ing to establish rcmotc rclatio,,.:rips. .I s;&i&;l:b: estimate should bc providcul. cqxtially if the prolcins predicted 10 bc homologous are not clearly r&cd in terms of biological function. Evaluating slalis~~eallyonly pain+ alignments in a set 0tsequcnNs. ralhcr than the multiple align as a whole, cx~n letid fo crroncous undcrcslimates o(*si~niBoanec. Low-level rescmblanecs become more significant as more mcmbcrs arc added 10 the alignment 1241.Although there has beqn some progrcx in the mathcmotical Ircatmcnl of signirio:tntx in multiple seicyucncv Jiynlucnts, c.g. IX]. insertions and dclclions have yet to be included in the statisliL=l cstimat~% WC thcrcrorc assessour (indings in thrw dill’crcnt cmpirisal ways. in the content of I hr~ di~l’~wztrih~t~l~gy WtWt methods:

FEW (1) Rat&x than p~in\uc cvmprkons of wqucnccs. rhc program PiZORL~4RCH worms a muhiple aleian#nt dmzn by a pr+ tile. ie raidlie frosrrc;rics at each position B]. Ihe slgllikasl~ (standard deviations estimatein~snahodi5provifMbyZz-nwa abo\e kkgrormd). with talus hi&r than 6XXIconsidcml sigvilicant m A damsearch wilh the dcriwd proiik yichkd Z-scorts higher than 1LOClfor all of the predicted membms of rhc domain family (dara not show). No olher protein with a score higher than 5.3 ma!5found. f31 Whezxas PIZOFILESWRCH takes into account the multiple sc-

quebfp alignment owx k entire lcr@h of the SC~UCIKIX.. the program PAT 1211conccn~ratts on the most cowxwd regions (motifs) only. reprexnting them in terrnsofaminoacid pro-i& and allowing large. gaps betwwn the motifs ~consensas propcny partcms’l. In this method. each of the 3 consensus regions indcpcndentls appcamd to be highly discriminating At kast 4 (6.7 respccli~~ly ) mismatches wcrc recorded for otter proteins not belonging to the family. A mismatch is a deviating amino acid profly in any of the positions in rhc motif (for details see e.g. 1271).

3. RESULTS

AND DISCUSSION

Sequence analysis by FASTA revealed a homology between large segments of Zp2 and Zp3. Describing the most consened regions of the common domain (marked in Fig. 1) by a consensus property pattern (211. further relationships to (i) transforming growth factor receptor III (TGFR3!. (ii) uromodulin/Tamm-Horsfall protein. and (iii) the major granule membrane protein GP-2. became evident. (i) TGFR3. also called betaglycan. is a component of

LEITERS

April 1992 Table I

Painkc

worn GPD GP2R 2P2 ZP3M ZP3H TGFR3

similarities bctwcn the domains in tcr’ms of optimized FA!XA scores [Xl] worn

GP2D

GP2R

ZP2

1.317

908 I.288

902 I.011 I .25-I

144. 175 161 1.413

ZF3M 54’ 132’ t26’ 158 1,321

ZP3H TGFR3 127 133’ 149 171 1,058 I.322

135’ 151‘ 142’ 103’ 112 121’ 1,620

The maximum possible score for a particular sequcnc~ is dcfincd by the score for self-comparison along the diagonal. * FASTA dots not dctcct the full length of the domain due 10 inscrtions!ddctions.

the complex receptor system which mediates the numerous functions of transforming growth factor /3 (TGF/Ii+ TGFR3 is t h ought to control the access of TGF-/I to the signalling receptors [lO,l l]. (ii) Uromodulin was first discovered in the urine of pregnant women [28], but was later found to be a normal constituent of human urine. at lower concentrations. The protein Fraction of this glycoprotein is identical to Tamm-Horsfall protein [ 12,13.29] which in turn is the major glycoprotein secreted by the mammalian kidney [30]. The functions of uromoduliniTamm-Horsfall protein remain uncer-

Volume 300. number 3

FEBS LEI-I-EM

results support a role in preventing urinary tract infections by binding mannose-sensitive fimbriated microorganisms [31]. (iii) Glycoprotein GP- 2, the major component of pancreatic secretory granule membranes, is similar to uromodulin in sequence and domain structure [14], but lacks the three N-terminal EGF-like domains (Fig. 2). The homologous region of about 260 residues common to these proteins (Figs. 1 and 2) contains eight strictly conserved cysteines, which may form disulfide bridges. The conservation of hydrophobicity. polarity or turn-forming tendency at numerous positions is consistent with a conserved three-dimensional structure (see consensus line in Fig. 1). In addition to the conserved cysteines, only a few aromatic or hydrophobic amino acids are absolutely invariant (Fig. 11, probably as a result of structural rather than functional constraints. Such a conservation pattern is typical of distantly related domains of mosaic proteins involved in binding functions [16,17]. The common domain occurs at a similar location relative to the putative membrane-spanning regions in each of these proteins (Fig. 2). This situation is suggestive of a possible common biological role of the domain. In this context, we note the follo\ving common biological properties of these pi&L. (i) They all li;*ie been detected in soluble form, but (ii) also have features of integral membrane proteins in that they contain a long hydrophobic sequence segment at or near the C- terminus (Fig. 2). For three of them, rnernbranc-bound forms, which are then further processed. have been chatain, but rcccnt

April 1992

ractertzcd: uromodulin and GP-2 are known to contain glycosylphosphatidylinositol membrane anchor 132,331, whereas the membrane-bound TGFR3 has a short cytoplasmic part homologous to that of endoglin IlO.1 1.341. In contrast, Zp2 and Zp3 have so far only been described as secreted proteins, but the short stretch of positively charged amino acids C-terminal to a hydrophobic region is typical of a small cytoplasmic extension of a membrane-bound form. Furthermore, all 5 proteins (iii) are heavily glycosylated, and (iv) appear in substantial amounts in the respective tissues (e.g. [10,14,32,35]). The identification of a sizable common domain in these proteins is strongly suggestive of functional analogies. For the sperm receptors Zp2 and Zp3, previously not known to be structurally related. a common binding function to 95 kDa sperm proteins [3] is likely, with Zp3 binding first and Zp2 reinforcing the interaction. Our results also suggest a possible functional similarity in the recognition mechanism of the sperm receptor system and the TGF-/I receptor complex. a

REFERENCES Yanagimachi. R. (19X8! in: Th< physiology ol’ rcproduaion (Knobil. E. and Ncitl. J. eds.) Vol. 1. 135.-18s. Ravcn Prc~s. NW York. [2] bjittcr. S.E.. Charnow. SM.. Baur. A.W.. Oliver. C.. Roky. F. and Dean. J. (1989) Science 242. 935 ~938. [3J Lcyus, 1 and Sating. P.M. (lYX9) Cdl 57. 1123~1t30. [d] Btclt. J.D.C.. Grcvc. J.M. and Waswman. P.M. (11181) Dcv. Bid. 86. IX9 197.

[I]

t

cs] Lisa. L-F,

Chamov. SM. and Dean, J. (19w) Mol. Cell BioL

lo. 150-1515. [to] v F_ Chcifii S Doody. L Lane. WS and Massqw. J. (19’91) Celi 67. ?8!X95. [ill Wasp X-F_ Lin, H.Y., Ng-&ton. E_ DonnwttxL J.. Lodish. ttF_ and Wdnbcrp R& (1991) CeU 67.797-805. [12] bnisia. D, Kok WJ, Kwng. WU, Cimcr. D.. Aggarwi. ELB, Cixn, EY. and C&d&i. D-V. (1987) Science 236.83-88. A-P, Kumar. S Yuc. 1131 Hcfsios c, rkckcr. J.M.. SLlbn. CC, Mattalii RJ, l-tt R, Kaxwsbla. E !sclmlcisncr,

U, lWetky, S, Chow. EP.. Burac AS. and Mu&more. A.V. (19s7) scicwr 237.1479-1484. ii41 Hoes T-C- ati Rind& MJ. (1991) J. 3iol. Chem. X6. CS7[iq “FE& S-1, F:e&nzn. S. and Scltc&. G.A. (1991) Proc. NatL Acad Sci US+ 88 X98--~?, fiq Patthy. L (1991) Cbrr- Opit~ Sttuct. Biol. I. 351-361. [lq Bark. P. (19911 FEBS L.ctt_286.47-55. 118) Bairoch. A. and beckmann B. (1991) Nuchzic Acids Rcs. 19. X47-2249. [I91 Barker. WC. George. D.G.. Hunt. L.T. and Gatii. J.S. (1991) Nuclek Acids Rcs. 19. “31-2236.

120) Pr;rrsot~ W-R. and Lipman. DJ. (1988) Pmt. Natl. Acad. Sci. USA 85. xx+&-334L pl] Botk. P.andGnmtrakl.C(l990) Eur. J. Biicm. 191.347-358. E] Bark* P. w91) FEES l&t_ 2Q 9-12 p] Dnm 3.. Hactnxli. P. and Smithii 0. (lh14) Nucleic Acids Res. 12.387-395. p4] Doolittk RF. (1969) ix Ptediiion of Protein Structutc and the Rinciples of Protein Conformation (Fastnan. G.D. ed.) Plenum

Publishing Corporalion. p2 Schulcr, G-D, AItschuL SF. and Lipman. DJ. (1991) Proteins 9.1stst90. [26j Griikov. M, McLachlan. kD., Eiibq, Natl. Acad sci USA 84.43554358.

D. (1987) Pruc.

[27j Bark. P. and Rohdc K, (1991) Biochcm. J. 279~908-910. p-sl Muchmom. A-V. and Dcckcr. J-M. (1985) Scicncc 229.479~1. [29] Tamm 1. and Hors&ii. F. (1952) J. Exp. Mai. 95.71-97. [30] Fabric&. f.. Scan D.M. and Kinne. R.K. (1989) Biol. Chcm. HoppeSeilcr 370. 151-158. 1311 Rcinhan. H.H., Obcdcanu. N.. Robinson. R., Kormiowski. 0.. Kayz. D. and Sobci, J.D. (1991) J. Ural. i46.80&808. 1321Rind& MJ.. Naik. S.S.. Lin. N., Hoops, T.C. and F’craldi, M.-N. (1990) J. Bioi. Cbcm. 265,2078+20789. [33] Rindlcr. MJ. and Hoops. T.C. (1990) Eur. J. Cell. BioI. 53, W-163. [34] Gouges. A. and Lclarte, M. (1989) J. Eliot. Chum. 265, 836i8364. [351 Wing. P.M. (1991) Biol. Reprod. 44, 246-251.

A large domain common to sperm receptors (Zp2 and Zp3) and TGF-beta type III receptor.

A new family of mosaic proteins is defined by sequence analysis. The family is characterized by a 260 residue domain common to proteins of apparently ...
442KB Sizes 0 Downloads 0 Views