Eur. J. Biochem. 210, 839-847 (1992) © FEBS 1992

Amino-acid sequence and cell-adhesion activity of a fibril-forming collagen from the tube worm Riftia pachyptila living at deep sea hydrothermal vents

Karlheinz MANN¹, Françoise GAILL² and Rupert TIMPL¹

¹ Max-Planck-Institut für Biochemie, Martinsried, Federal Republic of Germany
² Centre de Biologie Cellulaire, CNRS, Ivry-sur-Seine, France

(Received August 7/September 23, 1992) - EJB 92 1145

We have determined the amino acid sequence of the α chain of a fibril-forming collagen from the body wall of the marine invertebrate Riftia pachyptila (vestimentifera) by Edman degradation. The pepsin-solubilized collagen chain consists of a 1011-residue triple-helical domain and short remnants of N- and C-telopeptides. The triple-helical sequence showed one imperfection of the collagen Gly-Xaa-Yaa triplet repeat structure due to a Gly→Ala substitution. This imperfection is correlated to a prominent kink in the molecule observed by electron microscopy. No strong sequence similarity was found with the fibril-forming vertebrate collagen types I-III, V and XI except for the invariant Gly residues. However, one of the two consensus cross-linking sequences was well conserved. The Riftia collagen shared with the vertebrate collagens many post-translational modifications. About 50% of the Pro and Lys residues are found in the Yaa position and were extensively hydroxylated to 4-hydroxyproline (4Hyp) and hydroxylysine (Hyl). A few proline residues in Xaa position were partially hydroxylated to either 4Hyp or 3Hyp. Despite the low sequence similarity, Riftia collagen was a potent adhesion substrate for two human cell lines. Cell adhesion could be inhibited by antibodies against the integrin β1 subunit but not by RGD peptides. This biological activity is apparently conserved in fibril-forming collagens of distantly related species but does not require the two RGD sequences present in Riftia collagen.

Correspondence to R. Timpl, Max-Planck-Institut für Biochemie, W-8033 Martinsried, Federal Republic of Germany. Fax: (089) 85782422.
Abbreviations. CB and EL peptides, peptides produced by digestion with CNBr and lysyl endopeptidase, respectively; Pth, phenylthiohydantoin; SLS, segment-long-spacing.
Enzymes. Chymotrypsin from porcine pancreas (EC 3.4.21.1); collagenase from Clostridium histolyticum (EC 3.4.24.3); endoproteinase Asp-N from Pseudomonas fragi (EC 3.4.-.-); endoproteinase Glu-C from Staphylococcus aureus (EC 3.4.21.19); lysyl endopeptidase from Achromobacter lyticus (EC 3.4.21.50); pepsin from porcine stomach mucosa (EC 3.4.23.1); thermolysin from Bacillus thermoproteolyticus (EC 3.4.24.4); trypsin from bovine pancreas (EC 3.4.21.4).
Note. The amino acid sequence has been deposited in the Martinsried Institute for Protein Sequences (MIPS) data bank.

Five of the fourteen collagen types described so far in vertebrates belong to the distinct group of collagens forming large fibrils that, after staining, show a characteristic and periodic (67-nm) cross-striation pattern [1-3]. These fibril-forming or interstitial collagens include the types I-III, V and XI. They self-assemble laterally with a characteristic quarter-staggered arrangement. This basic structure of the fibrils becomes stabilized by covalent cross-links derived from a few oxidized lysine or hydroxylysine residues. The ability to form such fibrils is apparently dependent on some molecular characteristics shared by all members of this group of collagens. These include a uniform length of about 300 nm, a triple-helical domain of three chains made up from 1014-1023 residues showing a non-interrupted Gly-Xaa-Yaa sequence repeat and two conserved triple-helical cross-linking sites. Major functions of these fibrils are to provide mechanical strength to extracellular tissues but also to act as substrates for anchoring cells to these matrices via specific cellular receptors [4, 5].

Cross-striated collagen fibrils and the corresponding proteins are also known to exist in many invertebrates but are insufficiently characterized [6]. Some partial cDNA-derived sequences have been reported for a sea urchin [7] and a sponge species [8], demonstrating a reasonable homology to vertebrate fibril-forming collagens in the procollagen C-peptide region but not in the triple-helical domain. Sequence information at the protein level, which would allow the detection of important post-translational modifications, is even more scarce and restricted to a few Gly-Xaa-Yaa repeats [9, 10].

We have recently described the isolation and the properties of a fibril-forming collagen from the body wall of the invertebrate Riftia pachyptila [11], a vestimentiferan living in deep sea hydrothermal vent communities [12, 13]. This giant tube worm lives some 2600 m deep in areas where hot, acidic, anoxic and hydrogen-sulfide-rich vent water mixes with the cold, oxygenated surrounding water. It is thus exposed to constantly fluctuating oxygen concentrations and variations in ambient temperature between 2-15°C. One adaptation to such apparently hostile conditions is a high concentration of extracellular hemoglobin in both the vascular blood and the coelomic fluid, which may act as an oxygen reserve during periods of low oxygen [14]. This mechanism of constant oxygen supply apparently also allows for a sufficient hydroxylation of proline residues in the collagen. This is presumably required to raise the melting point of the triple helix to 29°C [11].

The body wall collagen of this unusual organism forms 20-nm-thick, typical cross-striated fibrils in the interstitial connective tissue underneath the epidermis. Pepsin-solubilized molecules were about 280 nm long and had a molecular mass of about 340 kDa. Like fibril-forming vertebrate collagens, the invertebrate collagen formed segment-long-spacing (SLS) segments that differed, however, considerably in cross-striation patterns, indicating substantial sequence differences [11].

Electron microscope predictions of a rather unique amino acid sequence for Riftia interstitial collagen and its origin from an unusual habitat made us decide to determine the complete amino acid sequence by Edman degradation. Previous limited stoichiometric and sequence data [11] have indicated that the collagen consists of three identical α chains. We here report the whole sequence of the triple-helical domain of this α chain, which is the first determined for an invertebrate fibril-forming collagen. In addition, we identified most of the post-translational modifications. Special features of the sequence also led us to show that the Riftia collagen is an active substrate for the integrin-mediated adhesion of vertebrate cells.

MATERIALS AND METHODS

Proteins, antibodies and synthetic peptides

Pepsin-solubilized interstitial collagen from Riftia pachyptila and its α chain were purified as described previously [11]. Neutral-salt-extracted collagen I from fetal bovine skin was purified by NaCl precipitation [15]. Pepsin-solubilized collagen VI was obtained from human placenta [16]. Hybridoma culture medium containing a monoclonal antibody (mAb) AIIB2 against human integrin β1 chain [17] was a kind gift of Dr C. H. Damsky. Purified mAb GoH3 against integrin α6 chain [18] and mAb C17 against β3 chain [19] were generously supplied by Dr A. Sonnenberg. mAb Gi9 against α2 chain was purchased from Dianova, Hamburg. Synthetic GRGDS and RGES peptides were commercial products (Bachem, Heidelberg and Peninsula Lab., Belmont, CA). Cyclic RGDDFV peptide (DF = D-Phe) [20] was provided by Dr H. Kessler.

Chemical and proteolytic cleavages

The Riftia collagen or its purified α chains were dissolved in 70% formic acid (0.5-1 mg/ml) and a 100-200-fold excess (by mass) of CNBr was added. The mixture was incubated for 4 h at 30°C. Reagents were then removed by lyophilization. The α chains were cleaved with lysyl endopeptidase (Wako Chemicals GmbH, Neuss) in 0.2 M ammonium hydrogen carbonate for 24 h at 37°C at an enzyme/substrate ratio of 1:100. Purified large fragments were further cleaved either with trypsin (treated with tosylphenylalanylchloromethane; Worthington Corp., Freehold, NJ) for 16 h at 23°C, with thermolysin (Merck, Darmstadt) for 2 h at 30°C, with α-chymotrypsin (Boehringer Mannheim GmbH) for 6 h at 25°C, with endoproteinase Asp-N (sequencing grade, Boehringer Mannheim GmbH) for 16 h at 25°C, with endoproteinase Glu-C (Boehringer Mannheim GmbH) for 16 h at 23°C or with Clostridium collagenase (type VII, Sigma Chemie GmbH, Deisenhofen) for 2 h at 25°C in 0.2 M ammonium hydrogen carbonate at an enzyme/substrate ratio of about 1:100. With substrate amounts less than 10 µg, cleavage was done regularly with 0.5 µg protease.
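As a rough illustration of how the two primary digests partition a chain, the following sketch cuts a one-letter sequence C-terminal to Met (the usual specificity of CNBr) or C-terminal to Lys (lysyl endopeptidase). The helper name `cleave_after` and the toy sequence are hypothetical and are not taken from the Riftia data; hydroxylation and glycosylation, which may block cleavage in practice, are ignored here.

```python
def cleave_after(sequence: str, residues: str) -> list[str]:
    """Split a one-letter sequence C-terminal to any residue listed in `residues`."""
    fragments = []
    start = 0
    for i, aa in enumerate(sequence):
        if aa in residues:
            # The cleaved residue stays at the C-terminus of its fragment.
            fragments.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # Keep the C-terminal remainder that ends without a cleavage site.
        fragments.append(sequence[start:])
    return fragments


# Hypothetical toy sequence, for illustration only.
toy = "GPMGERGKDGAPGLMGPKGEA"
cb_peptides = cleave_after(toy, "M")  # CNBr-like digest
el_peptides = cleave_after(toy, "K")  # lysyl-endopeptidase-like digest
print(cb_peptides)  # ['GPM', 'GERGKDGAPGLM', 'GPKGEA']
print(el_peptides)  # ['GPMGERGK', 'DGAPGLMGPK', 'GEA']
```

Because the two digests cut at different residues, the fragments of one set span the cut sites of the other, which is what makes overlapping peptide sets useful for ordering fragments.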

Chromatographic procedures

CNBr fragments were separated on a Superose 6 column (Pharmacia, Uppsala) equilibrated in 0.2 M ammonium acetate pH 6.9. Pools containing fragments of similar size were then chromatographed on a Mono S column at pH 2.5 and finally desalted and further separated by reverse-phase HPLC on a C18 column as described [21]. Fragments derived from lysyl endopeptidase cleavage were separated by size-exclusion chromatography on a tandem consisting of a TSK G-3000 and a TSK G-2000 column followed by reverse-phase HPLC [22]. Peptides derived from further cleavage of large fragments were separated by reverse-phase HPLC [21, 22].

Analytical methods

Peptide concentrations and compositions were determined after hydrolysis in 6 M HCl at 110°C for 16 h on a Biotronic LC 5001 analyzer. The amount of glycosylated hydroxylysine was determined indirectly after hydrolysis with 1 M NaOH [23]. Amino acid sequences were determined with gas/liquid-phase sequencers (Applied Biosystems, models 470A and 473A) according to the manufacturer's instructions. Sequence comparisons were done with the FASTP program [24] or the PIRALIGN program (PIR, National Biomedical Research Foundation, Washington, DC). Electrophoresis of small peptides was done according to [25].

Cell binding assays

Cell attachment was determined by a colorimetric assay in microtiter wells coated with various substrates [26]. The substrates were dissolved in 0.1 M acetic acid and then diluted into neutral coating buffer at concentrations of 0.6-40 µg/ml. Cell suspensions (1-5 × 10^5 cells/ml) were mixed in the inhibition assays with mAbs or RGD peptides and then seeded onto the coated wells. Results were expressed as the percentage residual attachment compared to a non-inhibited control. The human cell lines used were the fibrosarcoma HT1080 and the mammary epithelial HBL-100 [26, 27].
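The percentage residual attachment can be sketched as a simple ratio of colorimetric readings. The function name, the optional blank subtraction and the absorbance values below are assumptions made for illustration; the text does not specify how background was handled.

```python
def residual_attachment(inhibited: float, control: float, blank: float = 0.0) -> float:
    """Attachment in the presence of an inhibitor, as a percentage of the non-inhibited control.

    `blank` is an assumed background reading (e.g. an uncoated well);
    the default of 0.0 means no background correction.
    """
    signal_control = control - blank
    if signal_control <= 0:
        raise ValueError("control reading must exceed the blank")
    return 100.0 * (inhibited - blank) / signal_control


# Hypothetical absorbance readings, for illustration only.
print(residual_attachment(inhibited=0.25, control=1.0))  # 25.0
```

A value near 100 means the inhibitor had no effect, while a value near 0 corresponds to complete inhibition.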

RESULTS

Cleavage of the α chain and isolation of peptides

Two sets of peptides were prepared by cleavage with CNBr (CB peptides) and lysyl endopeptidase (EL peptides). Electrophoresis of both digests showed a band pattern of limited complexity and most of the peptides were smaller than the major CB peptides of calf collagen α1(I) chain (Fig. 1). With a single exception, major EL peptides were also smaller than the Riftia CB peptides. A clear size separation of EL peptides was achieved on two TSK molecular sieve columns which were operated in tandem (Fig. 2). Further purification of all indicated pools by reverse-phase HPLC allowed recovery of a total of 25 EL peptides in reasonably pure form. The CB peptides were first separated according to size on a Superose 6 column followed by chromatography on a cation exchanger and by reverse-phase HPLC (not shown). All the large (Fig. 1) and several smaller CB peptides could be purified by these procedures. Several tripeptides were not obtained and their existence became obvious from the sequence analysis of overlapping EL peptides (Fig. 3). For three of the CB peptides (CB6/13, CB4/6 and CB3/1, see Fig. 3) it has previously been shown that they are present in stoichiometric amounts in the Riftia collagen digest [11].


Fig. 1. Electrophoresis of CB and EL fragments of Riftia collagen. Lane CB, CNBr-released peptides of Riftia collagen; lane EL, lysyl endopeptidase digest of Riftia α chain. The run was calibrated with globular proteins and peptides and their molecular masses are indicated in kDa (left side). The positions of some major CNBr peptides of calf collagen α1(I) chain are shown on the right side. The 12% T, 3% C gel system of [25] was used for this analysis.

Fig. 2. Separation of lysyl-endopeptidase-generated fragments of Riftia α chain on a tandem of TSK G-3000 and TSK G-2000 columns. The columns were equilibrated in 0.2 M ammonium acetate containing 0.1% trifluoroacetic acid and operated as described previously [22]. Individual peaks as indicated by numbers were further separated by reverse-phase HPLC. (Abscissa: elution volume [ml].)

Several of the large CB and EL peptides could not be completely sequenced in a single run and were cleaved with various neutral proteases (see Materials and Methods) to allow the isolation of more fragments by reverse-phase HPLC. Thus more than 200 peptides were isolated from about 10 mg Riftia collagen and used for sequence analysis.

Sequence of the α chain

The amino acid sequence of the Riftia α chain (Fig. 3) could be determined from its CB and EL peptides and from internal peptides derived from them. This also yielded sufficient overlaps to align a continuous sequence of 1027 amino acid residues. The sequence of the α chain started with a short non-triple-helical domain (positions N1-N12) containing only a single glycine. Shorter variants of this region were found in some fragments, indicating that pepsin may cleave this N-terminal telopeptide at different positions. The telopeptide sequence was followed by a major triple-helical domain of 1011 residues that showed, with one exception, a regular Gly-Xaa-Yaa triplet repeat. The sequence is terminated by DYGA, which is very likely a residual piece of the C-telopeptide left over after pepsin solubilization.

There are several peculiarities within the sequence. The triple-helical domain contains a single triplet imperfection (position 598-600) caused by a Gly→Ala substitution, a single canonical cross-linking sequence (position 82-87) and two potential cell-adhesive RGD sequences (positions 564-566 and 603-605). The C-terminus of the domain contains four consecutive Gly-Pro-Pro triplets in which all of the prolines are hydroxylated either in the 3 or 4 position. Similar sequences are common in other collagens and are thought to serve as nucleation centers for helix formation and to stabilize the end region of the triple helix [28]. A single microheterogeneity was found at position 881, which was occupied by alanine and 4-hydroxyproline in an approximate ratio of 1:1.

Post-translational modifications of the α chain

Amino acid analysis of Riftia collagen has shown that about half of its proline and lysine residues are converted into 4-hydroxyproline and hydroxylysine [11]. Alkaline hydrolysis released only about half of the hydroxylysine, which shows that the remaining residues are glycosylated [23]. In addition, we detected in acid hydrolysates an imino acid with maximal absorption at 440 nm preceding 4-hydroxyproline on the amino acid analyzer. This component usually corresponds to 3-hydroxyproline [29, 30], indicating that Riftia collagen contains 8-10 3Hyp/1000 residues.

The Riftia α chain contained 162 prolines in its triple-helical domain, 87 of them in the Yaa position. Hydroxylation of proline in the Yaa position was in most cases complete (> 90%) except for a few residues which are mainly (≈50%) located in the N-terminal half of the α chain (Fig. 3). Six of the 4-hydroxyproline residues were also found in the Xaa position but were not fully hydroxylated (5-50%). We also noticed that several of the small fragments possessing one or a few 4-hydroxyproline residues could be separated by reverse-phase HPLC into variants that were either efficiently or not hydroxylated at a single proline position. This was particularly obvious for 4-hydroxyproline in the Xaa positions. The reasons for this microheterogeneity are unclear but could be related to the fluctuating oxygen concentrations and/or temperature in the environment of the animals [14]. This has allowed only a rough estimate of the overall hydroxylation in the Yaa position (≈90%), which is, however, in the range observed for vertebrate collagens.

While it is not difficult to identify 4-hydroxyproline due to its unequivocal peak pattern after Edman degradation, the identification of the phenylthiohydantoin (Pth) derivatives of 3-hydroxyproline is more difficult since they are scattered among several peaks and the yields are usually low [31]. We could, however, detect a similar but characteristic Pth-Xaa pattern (Fig. 4A) in 12 positions of the sequence (Fig. 3). This number is in good agreement with values for an imino acid preceding 4-hydroxyproline on the analyzer (see above). A correlation was shown directly for a small peptide (EL8/3)

[Fig. 3 sequence diagram. Rows of 60 residues were labelled 1, 61, 121, ... 961 in the original. Only the rows below survive extraction; modification symbols and most peptide-alignment lines are lost, and ~ marks an unreadable residue.]

N1-N12 (N-terminal telopeptide):
Y R A G P R V I Q A Q V

Triple-helical rows (row position not recoverable):
G P R G Q T G E R G R D G K S G L ~ G L R G V D G L A G ~ ~ G ~ ~ G P I G S T G S P G F P G T P G S K G O R G Q S G I X
G A H G E Q G D A G K D G E T G A A G P P G A A G P T G A R G P P G P R G Q Q G F Q G L A G A Q G T P G E A G K T G E R
P G L ; G L A G R ; G E R G E ; G V A G R A G S Q G L A G L H G Q R G L ; G A A G P ; G D R G E R G E A G G Q G V Q G P V

C-terminal row (ending with the C-telopeptide remnant C1-C4):
G E A G G R G S Q G P P G K D G Q P G P S G R V G P R G P S G D D G R S G P P G P P G P P G P P G N S D Y G A

Peptide labels legible beneath the C-terminal row: EL7/4; CB4/9a,T2a; CB4/9a,T2b; EL7/4,T5; CB4/9a,T6.

Fig. 3. Complete amino acid sequence of the Riftia collagen α chain. Positions within the triple-helical domain are numbered 1-1011, those in the N-terminal telopeptide N1-N12, and those in the C-terminal telopeptide C1-C4. Symbols on top of P and K denote completely hydroxylated 4Hyp in Yaa position (●), partially hydroxylated 4Hyp in Yaa position (○), 4Hyp in Xaa position (▲), 3Hyp in Xaa position (▼), completely hydroxylated Hyl (■) and partially hydroxylated Hyl (□). X denotes non-identified residues, which, as discussed, are very likely glycosylated Hyl residues. A canonical cross-linking site is underlined by a thick line and a triplet imperfection by two thick lines. Dashed lines identify positions sequenced by Edman degradation and are terminated by a vertical line in the case of a complete determination or by an arrow in the case of a partial sequence determination. Individual peptides used were generated by cleavage with CNBr (CB), lysyl endopeptidase (EL), collagenase (C), chymotrypsin (Ch), endoproteinase Asp-N (EA), endoproteinase Glu-C (EG), trypsin (T) and thermolysin (Th) and are distinguished by numbering according to different chromatographic pools. Peptides CB 6/13, CB 3/1 and CB 4/6 in this scheme have been referred to as CB 6d, CB 6b and CB 6c, respectively, in a previous study [11].

possessing this imino acid and a single residue with the characteristic Pth-3Hyp derivatives.

The α chain sequence showed all of the 19 lysine residues in Xaa position and 7 hydroxylysine residues in Yaa position, two of them partially hydroxylated (Fig. 3). This low number of hydroxylysine was the only large discrepancy between the amino acid composition determined on the analyzer (21 Hyl/1000 residues, see [11]) and that predicted from Edman degradation (Fig. 3). Yet the sequence contains 13 unidentified residues, all in the Yaa position, indicating that glycosylated hydroxylysine (see above) occupies most if not all of these positions. We have also obtained some circumstantial evidence for degraded Pth-Hyl derivatives by analyzing the partially hydroxylated lysine at position 921. This showed the distinct peaks of Pth-Lys and Pth-Hyl and a new doublet peak with the approximate retention time of Pth-Tyr (Fig. 4B). This novel doublet peak was only associated with some but not all positions denoted X in the sequence and was also observed in vertebrate collagens possessing glycosylated hydroxylysine (unpublished). We suspect that some unknown degradation product of glycosylated hydroxylysine gives rise to these unidentified derivatives. Interestingly, it was only observed in peptides stored for some time in HPLC effluents containing 0.1% trifluoroacetic acid but never in freshly prepared peptides.

We also found that lysyl endopeptidase of Achromobacter lyticus cleaved peptide bonds C-terminal to lysine and to hydroxylysine. Yet no cleavage was observed at positions obviously occupied by glycosylated hydroxylysine, indicating that the oligosaccharide side chains prevent access of the protease.

Cell adhesion activity of Riftia interstitial collagen

Because the collagen possesses two RGD sites (Fig. 3) and is located in the vicinity of cells in the body wall of R. pachyptila [11], we examined its cell binding properties with two human cell lines (HBL-100, HT1080). Comparable dose/response profiles were observed with Riftia collagen and the known cell-adhesive vertebrate collagens I and VI in an attachment assay illustrated for HBL-100 cells in Fig. 5. A considerable decrease in activity was observed for both cells after heat

Fig. 4. Typical reverse-phase HPLC profiles of phenylthiohydantoin derivatives obtained for 3-hydroxyproline, 4-hydroxyproline and proline (A) and a mixture of lysine and free and glycosylated hydroxylysine (B). The proline derivatives (A) shown represent positions 539, 540 and 542 as sequenced in peptide EL8/3. Note the poor yield of Pth-3Hyp as compared to the yields of Pth-4Hyp and Pth-Pro. The lysine residue shown in (B) is position 921. Arrows denote positions of characteristic derivatives. X denotes a possible derivative of glycosylated hydroxylysine. (Numbers on the curves indicate retention times which are not pertinent to this paper.)

denaturation of Riftia collagen. Cell adhesion to native Riftia collagen could not be inhibited by synthetic GRGDS and RGES peptides (up to 100 µM) and cyclic RGDDFV (up to 10 µM). Concentrations tenfold lower than the highest used here were previously shown to block adhesion of the same two cell lines to vitronectin and laminin P1 substrates to more than 50% [20]. Monoclonal antibodies specific for certain

integrin α/β subunits were used in further studies to block cell adhesion to Riftia collagen (Fig. 6). Complete inhibition was observed with antibodies against the β1 chain but not with those against the β3 chain. A partial inhibition (≈25%) was observed with anti-α2 antibodies. Since anti-α6 antibodies were non-inhibitory, the weak effect of anti-α2 seems to be significant. Together the data show that cell adhesion to Riftia

Fig. 5. Dose/response profile of HBL-100 cell adhesion to Riftia collagen in comparison to other cell-adhesive substrates. Wells were coated with native Riftia collagen (●), heat-denatured (20 min, 50°C) Riftia collagen (○), native calf collagen I (▲) and human collagen VI (■). Attached cells were analyzed by a colorimetric assay. (Abscissa: substrate [µg/well].)

Fig. 6. Inhibition of HBL-100 cell adhesion to native Riftia collagen by mAb against integrin subunits. The inhibitors used were mAb AIIB2 against β1 (▲), mAb C17 against β3 (●), mAb Gi9 against α2 (△) and mAb GoH3 against α6 (○). The scale on top (antibody dilution) refers to mAb AIIB2, available only as hybridoma medium, the scale on the bottom (antibody [µg/ml]) to the other mAbs.

collagen is a specific process that is mediated by distinct cellular receptors.

DISCUSSION

A large number of peptides have been sequenced for establishing the covalent structure of Riftia interstitial collagen. Since they all fitted a single ≈1000-residue sequence, there is no doubt that the collagen consists of three identical α chains. The triple-helical domain contains 1011 residues and is thus slightly shorter than those of vertebrate fibril-forming collagens (1014-1023 residues). In the latter sequences, all amino acid residues except glycine show preferences for either Xaa or Yaa positions in the Gly-Xaa-Yaa triplets or a similar distribution between both [32]. The same preferences were observed for Riftia collagen, i.e. Leu for Xaa, Arg for Yaa

and Pro about the same in both positions. A rather unique localization along the Riftia α chain was found for methionine (Fig. 7). This yields a characteristic CNBr peptide pattern not comparable to those of vertebrate collagens I, II and III (see Fig. 1). The methionine distributions are very similar for a given collagen type in mammals and birds and CNBr cleavage is therefore a good analytical tool for identification and quantitation [33]. Whether the Riftia collagen CNBr peptide pattern is also shared by related invertebrate species remains to be studied.

A unique cross-striation pattern was previously demonstrated for SLS crystallites from Riftia interstitial collagen [11]. It has been shown for vertebrate collagens that the banding patterns of SLS crystallites correlate closely with the distribution of polar amino acids along the triple helix [1, 34]. A similar correlation could now be shown to exist for the Riftia collagen (Fig. 7).

An amino acid sequence quite different from those of vertebrate collagens was predicted for Riftia interstitial collagen based on SLS banding patterns and some preliminary sequence data [11]. This was now convincingly demonstrated by aligning the triple-helical sequence of Riftia α chain with that of mammalian or chick α1(I), α2(I), α1(II), α1(III), α1(V), α2(V), α1(XI) and α2(XI) chains. By analysis with the FASTP program, the levels of identity were in the range 45-53% with the highest scores observed for α1(I) and α1(II) chains. These values are close to the basic level of similarity in the family of fibril-forming collagens caused by a 33% glycine content and the frequent occurrence of proline. Partial cDNA-derived sequences from the C-terminus of the triple helix have been reported for fibril-forming collagens of a sponge (307 residues) and sea urchin (786 residues) species [8, 7].
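The baseline contributed by glycine alone can be made concrete with a small calculation. The sketch below computes ungapped percentage identity between two equal-length sequences; it is not the FASTP algorithm, and the function name and the two toy sequences are hypothetical. The toy chains share glycine in every third position and nothing else.

```python
def percent_identity(a: str, b: str) -> float:
    """Ungapped identity between two pre-aligned sequences of equal length."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Hypothetical Gly-Xaa-Yaa chains with no shared residue in Xaa or Yaa.
chain_1 = "GAEGLKGSR"
chain_2 = "GPQGDTGFV"
print(round(percent_identity(chain_1, chain_2), 1))  # 33.3
```

Any two in-register Gly-Xaa-Yaa sequences therefore score at least one third identity, and frequent Pro/Hyp in Xaa and Yaa raises the floor further, which is why values of 45-53% indicate little specific similarity.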
A comparison with the Riftia α chain sequence showed no better identity (45-50%, depending on the gaps introduced) between these invertebrate collagens. Higher similarities (60-90% identity) are observed among the collagens of different vertebrate species comparing either the same or different types of fibril-forming collagens. This indicates that fibril-forming collagens of different animal phyla have diverged considerably during evolution except for maintaining glycine in almost every third position of the sequence.

There are several more interesting features of the Riftia α chain sequence. Most remarkable is a Gly→Ala substitution at position 598 resulting in an imperfection in the regular Gly-Xaa-Yaa repeats. This should cause a disturbance in triple-helix rigidity about 170 nm away from its N-terminus and correlates well with a sharp kink observed at the same distance in the thread-like Riftia collagen molecules when visualized by rotary shadowing [11]. The same single substitution in a synthetic peptide (Gly-Pro-Hyp)10 does not prevent triple-helical folding but decreases thermal stability [35]. Similar substitutions in mutants of human collagens I and III may have a large impact on fibril formation and stability [36, 37]. Triplet imperfections have also been described in the sequence of putative fibril-forming collagens of other invertebrates [7, 8] but are located closer to the C-terminus than in the Riftia α chain. It indicates a frequent occurrence of such triple-helix-destabilizing segments in fibril-forming collagens of invertebrates which, however, do not impair their normal function.

A few crucial hydroxylysine residues located close to both ends of the triple helix in the canonical sequence Gly-Xaa-Hyl-Gly-His-Arg are essential for intermolecular cross-linking within collagen fibrils [1]. This typical sequence is found in Riftia α chain at position 82-87 (Fig. 3) and is shifted one triplet closer to the N-terminus when compared to vertebrate collagens. The second potential site with the sequence Gly-


Fig. 7. Schematic correlation of the Riftia α chain sequence with the distribution of methionines and some post-translational modifications (top) and of basic amino acid residues with the banding pattern of collagen SLS crystallites (bottom). Top: positions of methionine are indicated by vertical lines outlining the predicted CNBr (CB) peptide sizes. Positions of 3Hyp (▼), 4Hyp in Xaa position (▲) and the single triplet imperfection (■) are denoted above. Bottom: upward and downward vertical lines show the distribution of Arg and Lys/Hyl residues, respectively. This is correlated to the SLS banding pattern observed by electron microscopy of Riftia collagen [11]. Note the good correspondence of clusters of basic residues with the dark bands of the SLS crystallite obtained by staining with phosphotungstic acid and uranyl acetate. The asterisk marks an exceptional lack of correlation in a sequence region (position 205-215) containing, however, two acidic residues. The top scale shows the amino acid numbers of the triple-helical domain (1011 residues).

Asp-Hyl-Gly-Trp-Thr (position 925-930) is less similar but is shifted by the same distance and may have the same function. That both sites are involved in the cross-linking of Riftia collagen is indicated by the need to solubilize the collagen with pepsin [11], which removes the cross-linking counterparts (lysine or hydroxylysine) present in the N- and C-telopeptides. This agrees with the observation that no lysine or hydroxylysine is found in the remnants of the telopeptides in pepsin-solubilized Riftia collagen (Fig. 3, positions N1-N12, C1-C4).

A second functional sequence of such collagens involves a Gly-Leu or Gly-Ile bond (position 775-776) which is cleaved by collagenase (matrix metalloproteinase I) during catabolism [1, 38]. In the same region of Riftia α chain this sequence could be either Gly-Glu (775-776), Gly-Gln (772-773) or Gly-Leu (784-785); it remains to be shown whether these sequences are actually cleaved by the protease. Their surrounding sequences in Riftia α chain have a low imino acid content, which is considered to be essential for efficient cleavage [1].

Enzymatic oxygen-dependent post-translational modifications of collagens are the hydroxylation of about half of their proline and lysine residues and occur apparently in the Riftia collagen to a sufficient extent as shown by amino acid analysis [11]. The sequence data now demonstrate, as expected, that essentially all Pro and Lys in the Yaa position are converted into 4Hyp and Hyl. Interestingly, six of the Pro in the Xaa position are also hydroxylated to yield 4Hyp but with a lower grade of modification. Five of these residues are clustered not far away from the N-terminal site of the triplet imperfection (Fig. 7). Since triple-helical folding of fibril-forming collagens is thought to start from the C-terminus [28], it could indicate that a delay caused by the imperfection may keep the α chains for a sufficiently long time in an unfolded state to allow this unusual modification.
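How a triplet imperfection of this kind is located, and how its sequence position translates into a distance along the molecule, can be sketched as follows. The axial rise of 0.286 nm per residue is an assumed textbook value for the collagen triple helix and is not stated in this paper; the function names and the toy sequence are likewise hypothetical. The scan only detects substitutions, since insertions or deletions would shift the reading frame.

```python
RISE_PER_RESIDUE_NM = 0.286  # assumption: typical axial rise per residue in a collagen triple helix


def triplet_imperfections(helix: str) -> list[int]:
    """Return 1-based positions (1, 4, 7, ...) where the expected Gly is missing."""
    return [i + 1 for i in range(0, len(helix), 3) if helix[i] != "G"]


def distance_from_n_terminus_nm(position: int) -> float:
    """Approximate axial distance of a residue from the start of the triple helix."""
    return position * RISE_PER_RESIDUE_NM


# Hypothetical toy helix with one Gly->Ala substitution at position 10.
toy_helix = "GPPGERGLAAKDGSP"
print(triplet_imperfections(toy_helix))          # [10]
print(round(distance_from_n_terminus_nm(598)))   # 171
```

Under this assumption, position 598 lies roughly 171 nm from the N-terminus of the helix, in line with the kink observed at about 170 nm.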
More abundant in the Xaa position were, however, 12 3Hyp residues, which require another prolyl hydroxylase than that used for 4-hydroxylation [39, 40]. They are clustered to some extent at the C-terminus but are otherwise evenly distributed over the entire sequence (Fig. 7). There is, however, no obvious consensus

sequence around this modification. The structural requirements that determine the specificity of 3-prolyl hydroxylase and the possible functions of its product therefore still remain unclear. Vertebrate fibril-forming collagens, and in particular collagen I, have also been identified as very active cell adhesion substrates in vitro [41, 421 and may have a similar function in situ. In adhesion assays with two human cell lines, a comparable activity was found for Riftia collagen and the vertebrate collagens I and VI. Cell adhesion was dependent on an intact triple-helical conformation but apparently not on RGD sequences, as indicated from inhibition assays with synthetic peptides. Major collagen receptors on vertebrate cells are the integrins alB1 and a2B1 [43, 441. Apparently, they are both involved in cell binding to Rftia collagen as shown by complete inhibitions with anti-B1 antibodies and partial inhibition with anti-ct2 antibodies. Since blocking antibodies to the a1 subunit are not available, it will require affinity chromatography to demonstrate the participation of a101 integrin. Platelet integrin a2B1 was reported to bind a synthetic peptide corresponding to a DGEA sequence in vertebrate collagen I [41]. This sequence is, however, not found in the Rijtia ct chain. The invertebrate collagen may therefore become an important substrate to analyze human cell adhesion specificity for fibrillar collagens. We thank Vera van Delden, Mischa Reiter, Hanna Wiedemann and Wolfgang Straljhofer for excellent technical help. The study was supported by the Deutsche Forschungsgemeinschaft and by a grant designed for cooperation between the Centre National de la Recherche Scientifique and the Max-Planck-Gesellschaft. We are also grateful to Drs A. M. Alayse-Danet and H. Felbeck, chief scientists of the Hydronaute cruise, for the collected material.

REFERENCES
1. Kühn, K. (1987) in Structure and function of collagen types (Mayne, R. & Burgeson, R. E., eds) pp. 1-42, Academic Press, New York.
2. Burgeson, R. E. (1988) Annu. Rev. Cell Biol. 4, 551-577.
3. Van der Rest, M. & Garrone, R. (1991) FASEB J. 5, 2814-2823.
4. Nimni, M. E. & Harkness, R. D. (1988) in Collagen, vol. 1 (Nimni, M. E., ed.) pp. 1-78, CRC Press, Boca Raton FL.
5. Grinnell, F. (1982) Methods Enzymol. 82, 499-503.
6. Bairati, A. & Garrone, R. (eds) (1985) Biology of invertebrate and lower vertebrate collagens, Plenum Press, New York.
7. D'Alessio, M., Ramirez, F., Suzuki, H. R., Solursh, M. & Gambino, R. (1990) J. Biol. Chem. 265, 7050-7054.
8. Exposito, J.-Y. & Garrone, R. (1990) Proc. Natl Acad. Sci. USA 87, 6669-6673.
9. Goldstein, A. & Adams, E. (1970) J. Biol. Chem. 245, 5478-5483.
10. Sharma, Y. D. & Tanzer, M. L. (1984) Anal. Biochem. 141, 205-212.
11. Gaill, F., Wiedemann, H., Mann, K., Kühn, K., Timpl, R. & Engel, J. (1991) J. Mol. Biol. 221, 209-223.
12. Jones, M. L. (1981) Science 213, 333-336.
13. Grassle, J. F. (1985) Science 229, 713-717.
14. Arp, A. J. & Childress, J. J. (1981) Science 213, 342-344.
15. Miller, E. J. & Rhodes, R. K. (1982) Methods Enzymol. 82, 33-64.
16. Odermatt, E., Risteli, J., van Delden, V. & Timpl, R. (1983) Biochem. J. 211, 295-302.
17. Hall, D. E., Reichardt, L. F., Crowley, E., Holley, B., Moezzi, H., Sonnenberg, A. & Damsky, C. H. (1990) J. Cell Biol. 110, 2175-2184.
18. Sonnenberg, A., Janssen, H., Hogervorst, F., Calafat, J. & Hilgers, J. (1987) J. Biol. Chem. 262, 10376-10383.
19. Tetteroo, P. A. T., Landsdorp, P. M., Leeksma, O. C. & von dem Borne, A. E. G. K. (1983) Br. J. Haematol. 55, 509-522.
20. Aumailley, M., Gurrath, M., Muller, G., Calvete, J., Timpl, R. & Kessler, H. (1991) FEBS Lett. 291, 50-54.
21. Mann, K., Jander, R., Korsching, E., Kühn, K. & Rauterberg, J. (1990) FEBS Lett. 273, 168-172.
22. Mann, K. (1992) Biol. Chem. Hoppe-Seyler 373, 69-75.
23. Butler, W. T. (1982) Methods Enzymol. 82, 339-346.
24. Lipman, D. J. & Pearson, W. R. (1985) Science 227, 1435-1441.
25. Schagger, H. & von Jagow, G. (1987) Anal. Biochem. 166, 368-379.
26. Aumailley, M., Mann, K., von der Mark, H. & Timpl, R. (1989) Exp. Cell Res. 181, 463-474.
27. Sonnenberg, A., Linders, C. J. T., Modderman, P. W., Damsky, C. H., Aumailley, M. & Timpl, R. (1990) J. Cell Biol. 110, 2145-2155.
28. Gormann, H.-P. & Heidemann, E. (1988) Biopolymers 27, 157-163.
29. Irreverre, F., Morita, K., Robertson, A. V. & Witkop, B. (1962) Biochem. Biophys. Res. Commun. 8, 453-455.
30. Piez, K. A., Eigner, E. A. & Lewis, M. S. (1963) Biochemistry 2, 58-66.
31. Schuppan, D., Glanville, R. W. & Timpl, R. (1982) Eur. J. Biochem. 123, 505-512.
32. Hofmann, H., Fietzek, P. P. & Kühn, K. (1978) J. Mol. Biol. 125, 137-165.
33. Miller, E. J. (1984) in Extracellular matrix biochemistry (Piez, K. A. & Reddi, A. H., eds) pp. 41-82, Elsevier, New York.
34. Kühn, K. (1982) Collagen Rel. Res. 2, 61-80.
35. Lang, C. G., Li, M. H., Baum, J. & Brodsky, B. (1992) J. Mol. Biol. 225, 1-4.
36. Kuivaniemi, H., Tromp, G. & Prockop, D. J. (1991) FASEB J. 5, 2052-2060.
37. Engel, J. & Prockop, D. J. (1991) Annu. Rev. Biophys. Chem. 20, 137-152.
38. Woessner, J. F. (1991) FASEB J. 5, 2145-2154.
39. Nordwig, A. & Pfab, F. K. (1969) Biochim. Biophys. Acta 181, 52-58.
40. Risteli, J., Tryggvason, K. & Kivirikko, K. (1977) Eur. J. Biochem. 73, 485-492.
41. Staatz, W. D., Fok, K. F., Zutter, M. M., Adams, S. P., Rodriguez, B. A. & Santoro, S. A. (1991) J. Biol. Chem. 266, 7363-7367.
42. Dedhar, S., Ruoslahti, E. & Pierschbacher, M. D. (1987) J. Cell Biol. 104, 585-593.
43. Hemler, M. E. (1991) in Receptors for extracellular matrix (McDonald, J. A. & Mecham, R. P., eds) pp. 256-300, Academic Press, New York.
44. Vandenberg, P., Kern, A., Ries, A., Luckenbill-Edds, L., Mann, K. & Kühn, K. (1991) J. Cell Biol. 113, 1475-1483.
