Molecular and Biochemical Parasitology, 41 (1990) 25-34

Elsevier

MOLBIO 01338

Molecular cloning and primary sequence of a cysteine protease expressed by Haemonchus contortus adult worms

George N. Cox¹, Dickson Pratt¹, Robert Hageman¹ and Rudolph J. Boisvenue²

¹Synergen, Inc., Boulder, CO, U.S.A., and ²Animal Health Research Department, Lilly Research Laboratories, Eli Lilly and Company, Greenfield, IN, U.S.A.

(Received 20 November 1989; accepted 22 January 1990)

We have cloned cDNAs encoding a 35-kilodalton cysteine protease that is a major component of protective extracts isolated from blood-feeding Haemonchus contortus adult worms. Near full-length cDNAs for the protease were isolated by immunoscreening an adult worm cDNA expression library with a rabbit antiserum prepared against the protein eluted from preparative SDS gels and by rescreening the library with oligonucleotide probes. The protein predicted from the nucleotide sequence of the cDNAs and of a genomic DNA clone comprises 342 amino acids and contains an N-terminal signal sequence, 16 cysteine residues and four potential N-linked glycosylation sites. The enzyme appears to be glycosylated in vivo. The H. contortus protease, called AC-1, displays an overall 42% sequence identity with the human lysosomal thiol protease cathepsin B. The similarities between cathepsin B and AC-1 are localized primarily to regions of cathepsin B that comprise the mature, active form of the enzyme. A stretch of six amino acids that includes the active site cysteine of cathepsin B is conserved, and is present in the same relative location in AC-1, suggesting that this region comprises the active site of the H. contortus enzyme.

Key words: Haemonchus contortus; Cysteine protease; cDNA cloning; Nucleotide sequence; Cathepsin B

Introduction

Haemonchus contortus is a member of the nematode order Strongylata, which includes many endoparasites of human and veterinary medical importance [1]. The most pathogenic members of this nematode order reside in the digestive tracts of their hosts and feed on host blood components. Because of their blood-feeding habit the worms cause severe anemia, intestinal disturbances and weight loss in infected individuals. H. contortus is a pathogen primarily of sheep, although it also infects cattle, goats and other ruminants.

We hypothesized that blood-feeding H. contortus adult worms might possess an anticoagulant mechanism to prevent the host's blood from clotting during feeding. Subsequently, we discovered that adult worms possess a thiol protease activity that preferentially degrades fibrinogen and increases the clotting time of sheep plasma in vitro, presumably because the degraded fibrinogen cannot participate in fibrin clot formation (R. Hageman, manuscript in preparation). Fibrinogen is the blood protein that is cleaved by thrombin to initiate fibrin deposition and clot formation. Thus, this H. contortus enzyme has attributes expected for an anticoagulant protease, although this function has not been demonstrated in vivo. Our interest in this enzymatic activity was stimulated further when we found that extracts purified from adult worms on the basis of enzyme activity were highly protective in sheep against H. contortus infections (Boisvenue, Stiff, Tonkinson, Cox and Hageman, manuscript submitted). In this report we describe the molecular cloning and primary sequence of a 35-kDa cysteine protease that is a major component of these extracts.

Correspondence address: George N. Cox, Synergen, Inc., 1885 33rd Street, Boulder, CO 80301 U.S.A.

Note: Nucleotide sequence data reported in this paper have been submitted to the GenBank™ data base with the accession number M31112.

Abbreviations: BSA, bovine serum albumin; MBS, Mops-buffered saline; Mops, 3-[N-morpholino]propanesulfonic acid; Rb, rabbit; SDS, sodium dodecyl sulfate; SSC, standard saline citrate; TBS, Tris-buffered saline.

0166-6851/90/$03.50 © 1990 Elsevier Science Publishers B.V. (Biomedical Division)

Materials and Methods

Preparation of anticoagulant extracts. H. contortus adult worms were recovered from young lambs infected with a pure drug-susceptible United States Department of Agriculture isolate BPL1 and stored in liquid nitrogen or at -70°C until use [2]. Adult worms were homogenized (10 ml buffer per g wet weight of worms) in a buffer of 75% MBS (10 mM Mops/150 mM NaCl, pH 7.0)/25% (w/v) glycerol/1 mM EDTA/1 mM dithiothreitol, and phenylmethanesulfonyl fluoride was then added to a final concentration of 1 mM. After low-speed centrifugation (10000 rev./min for 20 min in a JA-20 rotor), 5-ml aliquots of the supernatant were applied to a 1.5 × 50 cm Sepharose CL-4B column and proteins eluted using a buffer of 20 mM bis-Tris-propane pH 7.0, 20% (w/v) glycerol, 1 mM EDTA, 1 mM dithiothreitol. Fractions were assayed for fibrinogen-degrading activity using the assay described below. Most of the enzymatic activity elutes in the void volume, which suggests that the enzyme is part of a high-molecular-weight complex. The nature of this complex is under investigation. Active fractions were pooled and applied to an FPLC Mono Q column. Bound proteins were eluted with a gradient of 0.05-0.4 M NaCl in 20 mM bis-Tris-propane pH 7.0, 10% (w/v) glycerol, 1 mM EDTA, 1 mM dithiothreitol, and fractions were tested for fibrinogen-degrading activity. Active fractions typically eluted in the middle of the gradient and were pooled. Protein concentrations were determined using a protein assay kit purchased from BioRad Laboratories (Richmond, CA).

The fibrinogen degradation assay consisted of mixing 5-20 µl aliquots of the column fractions with an equal volume of a solution of 1% (w/v) bovine fibrinogen (Sigma) suspended in MBS, 1 mM EDTA, 1 mM dithiothreitol and incubating the samples for 1 h at 37°C. Samples were then diluted into SDS sample buffer, boiled for 5 min and analyzed on 9% SDS-polyacrylamide gels [3]. The gels were stained with Coomassie blue to identify fractions that degraded the fibrinogen.

Preparation of rabbit antisera to the 35-kDa protease. Anticoagulant proteins that had been purified through the Sepharose CL-4B column step were electrophoresed on 0.75-mm-thick SDS slab gels [3]. Proteins were made visible by staining for 15 min with 0.1% Coomassie Blue in water, the gels were destained with water, and the 35-kDa protein was excised with a razor blade. The gel band was diced and the protein electroeluted using an elution apparatus purchased from Isco, Inc. Approximately 40 µg of the eluted 35-kDa protein was emulsified with Freund's complete adjuvant and injected subcutaneously at several sites along the back of rabbit 10285 (Rb-10285). Booster injections with additional antigen mixed with Freund's incomplete adjuvant were given at monthly intervals.

Construction and screening of the λgt11:adult cDNA expression library. Adult worms were ground to a fine powder with a mortar and pestle immersed in liquid nitrogen and poly(A)+ mRNA isolated from the mixture using procedures described by Shamansky et al. [2]. 2 µg of poly(A)+ mRNA was used to prepare a library of 2 × 10^6 individual recombinants in λgt11 using a kit purchased from Amersham. The procedures used were those described in the manual that accompanies the kit, with minor modifications [2]. The library was screened with a 1:200 dilution of Rb-10285 serum essentially as described by Young and Davis [4]. To isolate longer cDNAs, the library was screened by plaque hybridization [5] using 32P-labeled, nick-translated cDNAs [6] or an oligonucleotide that had been labeled at its 5' end with [γ-32P]ATP [7]. Hybridization conditions for cDNA probes were as described in [2], except that the hybridization and wash temperatures were 32-37°C and 41-47°C, respectively. The oligonucleotide, which has the sequence 5'-CACTTCAGGGTCGGGATCTTCTTTGACCATAAGATTTAGC-3', was synthesized on an Applied Biosystems DNA synthesizer. The labeled oligomer was hybridized to nitrocellulose filters at 32-52°C using 2 × SSC/5 × Denhardt's/0.5% SDS. Filters were washed in 2 × SSC/0.5% SDS at 52°C. Phages were grown as plate lysates and purified by banding in cesium chloride gradients [8]. DNAs were released from the phages using formamide [8].

DNA sequence and Northern blot analyses. Nucleotide sequences were determined by the dideoxy chain termination method [9,10] after subcloning restriction fragments into M13 phage vectors. Denaturing formaldehyde agarose gel electrophoresis of adult worm poly(A)+ mRNA and Northern blot hybridization procedures were performed as described [2], except that nitrocellulose filters were hybridized at 32°C and washed at 37°C in hybridization buffer. An RNA ladder (0.67-1.77 kb, Bethesda Research Laboratories) was used to estimate the size of AC-1 transcripts.

Antibody elution experiments. Phages were plated at a density of 1 × 10^4 per 15 cm diameter agar plate using E. coli Y1090 and incubated at 42°C for 4 h. Nitrocellulose filters that had been impregnated with 10 mM isopropyl-β-D-thiogalactopyranoside and air-dried were placed on top of the phage and the plates incubated overnight at 37°C. After cooling, the filters were removed, cut into 77 × 100 mm strips, incubated in TBS + 2-10% BSA for 1 h, and incubated overnight with Rb-10285 serum diluted 1:200 in TBS + 2% BSA. The next day, the filter strips were washed 3 × for 15 min with TBS + 0.1% (v/v) Nonidet-P40. Bound antibodies were eluted by washing the strips 3 × with 300 µl of 5 mM glycine/500 mM NaCl/0.2% Tween-20/100 µg ml-1 BSA, pH 2.3. Eluted antibodies were neutralized with one-twentieth volume 1 M Tris-HCl pH 7.4, diluted approximately three-fold with TBS + 2% BSA and incubated overnight with nitrocellulose strips that had been cut from Western blots of total adult worm proteins or FPLC Mono Q column-purified anticoagulant proteins (separated on 12% SDS gels). Subsequent washes, secondary antibody incubations and staining with 4-chloro-1-naphthol and hydrogen peroxide were by standard procedures.

Miscellaneous procedures. SDS-polyacrylamide gel electrophoresis and Western blots were by standard procedures [3,11]. Prestained high molecular weight protein standards were purchased from Bethesda Research Laboratories and consisted of myosin (200 kDa), phosphorylase B (97.4 kDa), bovine serum albumin (68 kDa), ovalbumin (43 kDa), carbonic anhydrase (29 kDa), β-lactoglobulin (18.4 kDa) and lysozyme (14.3 kDa). Anticoagulant proteins were digested with Endoglycosidase F following the protocol supplied by the manufacturer (Genzyme, Boston, MA).

Results

Cloning of a cDNA encoding the 35-kDa cysteine protease. H. contortus adult worms possess a thiol protease activity that degrades fibrinogen and which can be partially purified by fractionating soluble worm proteins on a column of Sepharose CL-4B followed by FPLC Mono Q column chromatography as described in Materials and Methods. Details of the purification process and characterization of the enzyme will be published elsewhere (R. Hageman, manuscript in preparation). Mono Q column-purified extracts containing enzyme activity are referred to as 'anticoagulant' extracts.

A rabbit antiserum was prepared against the protease in order to clone a cDNA for the enzyme from an expression library. To do this, the fibrinogen-degrading protease activity of H. contortus adult worms was partially purified by Sepharose CL-4B column chromatography as described in Materials and Methods. The protein profile of the pooled, active fractions is shown in Fig. 1. The two predominant polypeptides in the active fractions have sizes of 55 kDa and 35 kDa. Mono Q column chromatography enriches for these two proteins. The 35-kDa protein band (marked with an arrow in Fig. 1) is the thiol protease as determined by active site labeling experiments using [14C]iodoacetic acid (R. Hageman, manuscript in preparation). The 35-kDa protein band was electrophoretically eluted from preparative SDS gels (Fig. 1) and used to immunize rabbit 10285 (Rb-10285). On Western blots, the immune sera reacted predominantly with a 35-kDa protein band in extracts of adult worms and in anticoagulant extracts that had been further purified by Mono Q column chromatography (Fig. 1). Although the predominant reactivity of the antiserum was with the 35-kDa protein, other proteins clearly reacted with the antiserum, but with less intensity.

Fig. 1. Gel purification and characterization of rabbit antisera prepared against the 35-kDa protease. Proteins present in the active fractions eluting in the void volume of a Sepharose CL-4B column are shown in lane 1. The arrow indicates the position of the 35-kDa thiol protease. The 35-kDa protein obtained by electroelution from preparative SDS gels is shown in lane 2. The eluted 35-kDa band was used to immunize rabbit 10285. The reactions of this immune serum with total adult worm proteins and with anticoagulant extracts obtained by FPLC Mono Q column chromatography are shown in lanes 3 and 4, respectively. The 37-kDa protein that reacts weakly with the antiserum is indicated with an arrow. Molecular weights of protein standards are shown to the right.

Rb-10285 antiserum was used to screen a λgt11:cDNA expression library prepared from mRNA isolated from adult worms that had been isolated from monospecifically infected sheep. Screening of 80000 phage yielded four positive clones, which were rescreened with Rb-10285 serum until plaque pure. The recombinant phage clones were used to affinity purify antibodies that react with their expressed antigen from the polyclonal rabbit serum ('antibody elution experiments'; see Materials and Methods). The eluted antibodies were then used to probe Western blots of worm proteins to identify the target antigen corresponding to the cDNA in each phage clone. These antibody elution experiments revealed that only phage 2B selected antibodies that reacted specifically with a 35-kDa protein on Western blots of adult worm extracts and in Mono Q column-purified anticoagulant extracts (Fig. 2). The antibodies selected by phage 2B also consistently reacted weakly with a 37-kDa protein in the Mono Q column-purified anticoagulant preparations (Fig. 2). As is suggested below, this 37-kDa protein may be a more heavily glycosylated form of the 35-kDa protein. From these experiments we conclude that phage 2B encodes a 35-kDa protein present in anticoagulant extracts.

Fig. 2. Western blot analyses of antibodies selected by recombinant phage clones. Phage clones isolated by screening the λgt11:H. contortus adult cDNA expression library with Rb-10285 serum were used to affinity purify antibodies reactive with their expressed antigens as described in Materials and Methods. The affinity purified antibodies were used to probe Western blots of total adult worm proteins (panel A) or Mono Q-purified anticoagulant extracts (panel B). Lane 1 shows the reaction of Rb-10285 serum with these antigens. The reactions of antibodies selected by phages λgt11, 2A, 2B and 4A are shown in lanes 2, 3, 4 and 5, respectively. Only phage 2B selected antibodies that react with the 35-kDa protein (indicated by an arrow) in both panels A and B. Antibodies selected by this phage clone also react weakly with a 37-kDa protein (arrow). Molecular weights of protein standards are indicated on the left.

Digestion of phage 2B DNA with EcoRI revealed that it contained a cDNA insert of about 180 bp. Nucleotide sequence analysis of the cDNA revealed that it encoded only 12 amino acids fused to β-galactosidase; the remainder of the cDNA consisted of 3' untranslated sequences and a poly(A) tail. The 3' untranslated region contained a canonical poly(A) addition sequence AATAAA [12], as is shown in Fig. 4. cDNA 2B was labeled with 32P by nick translation and used to screen the cDNA library in order to identify larger cDNAs. The first such screen yielded cDNA 3-1, which was about 870 bp in length. A 40-nucleotide-long oligomer that corresponds to the sequence at the 5' end of cDNA 3-1 (see Materials and Methods) was synthesized, end-labeled with 32P and used to screen the cDNA library for even larger cDNAs. Duplicate filters were screened with 32P-labeled cDNA 2B. This screen yielded cDNAs F-1, O-1 and T-1, all of which hybridized to the oligomer and to cDNA 2B. cDNA F-1 was the largest, about 1100 bp, and was chosen for further characterization.
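As a rough consistency check on the 40-nucleotide oligomer given in Materials and Methods, the sketch below reverse-complements the probe and translates one reading frame of the resulting strand. It assumes that the ambiguous characters in the printed oligomer sequence all stand for T (which gives exactly 40 nucleotides) and that the standard genetic code applies; the function and variable names are illustrative only and are not part of the original work.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons from the start of seq; trailing bases are ignored."""
    seq = seq.upper()
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


# Assumed reading of the printed oligomer (ambiguous characters taken as T).
PROBE = "CACTTCAGGGTCGGGATCTTCTTTGACCATAAGATTTAGC"

sense = reverse_complement(PROBE)
peptide = translate(sense[1:])  # reading frame starting at the second base

print(len(PROBE))  # 40
print(sense)       # GCTAAATCTTATGGTCAAAGAAGATCCCGACCCTGAAGTG
print(peptide)     # LNLMVKEDPDPEV
```

Under these assumptions the probe reads as the antisense strand, and the translated segment can be compared with the VKEDPDPEV stretch in the AC-1 row of Fig. 7.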

Fig. 3. Relationship of cDNAs 2B, 3-1 and F-1. The relative sizes and restriction maps of cDNAs 2B, 3-1 and F-1 are shown. The thick horizontal lines denote coding regions; the thin horizontal lines represent 3' untranslated sequences. Regions of the cDNAs that were sequenced are indicated by arrows. Asterisks indicate sequences that were determined using synthetic oligonucleotide primers. Restriction enzyme sites shown are EcoRI (E), HindIII (H), SalI (S) and XhoI (X). The EcoRI sites present at the 5' and 3' ends of the cDNAs, which were added during the cloning procedure, are indicated in parentheses. cDNA F-1 has a defective EcoRI site at its 5' end. Note that the lengths of the 3' untranslated regions differ in 2B vs. 3-1 and F-1.

Fig. 4. Nucleotide and predicted amino acid sequence of AC-1. The sequence shown is a composite of sequences obtained from various regions of cDNAs 2B, 3-1 and F-1. The AT of the initiator ATG shown is not present in the cDNAs and was inferred from the sequence of the gene isolated from an H. contortus:λEMBL-3 phage library (manuscript in preparation). The EcoRI linkers added during the cloning process are not shown. Potential N-linked glycosylation sites are indicated with double underlines. The asterisk denotes the termination codon. The position of a potential poly(A) addition signal, AATAAA, is underlined. The solid triangle at nucleotide 1073 indicates the location of the poly(A) tail in cDNAs F-1 and 3-1 (see text).
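Potential N-linked glycosylation sites of the kind marked in Fig. 4 follow the Asn-X-Ser/Thr pattern, so they can be located by a simple scan of a protein sequence. The following is a minimal sketch of such a scan, not the computer analysis used in this work. The function name and the example peptide are hypothetical; the peptide is a toy string and is not the AC-1 sequence. X is allowed to be any amino acid, as in the text; excluding proline at X is a common refinement that is not applied here.

```python
import re

# Asn followed by any residue and then Ser or Thr. The lookahead keeps the
# match one residue wide, so overlapping motifs are all reported.
SEQUON = re.compile(r"N(?=[A-Z][ST])")


def find_sequons(protein):
    """Return 1-based positions of Asn residues in Asn-X-Ser/Thr motifs."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


# Hypothetical toy peptide used only to exercise the scan.
example = "GNDTYYGECRGTAPNGSRPP"
sites = find_sequons(example)
print(sites)  # [2, 15]
```

A scan of this kind only identifies candidate sites; whether a given site is used in vivo has to be determined experimentally.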

The relationship of cDNAs 2B, 3-1 and F-1 is shown in Fig. 3, along with a composite restriction map of the cDNAs. Various regions of the cDNAs were sequenced and found to be identical in the regions in which the sequences overlapped. The one difference noted was that the 3' untranslated regions of cDNAs 3-1 and F-1 were shorter than that of cDNA 2B (Fig. 4). The composite nucleotide sequence and predicted amino acid sequence of the cDNAs is presented in Fig. 4. This gene has been named AC-1.

The largest cDNA, F-1, contained a single long open reading frame but was missing an initiator methionine codon at its 5' end; therefore, we presumed that the cDNA was not full-length. Northern blot hybridizations indicated that cDNA F-1 hybridized to a 1.25-kb transcript in adult worm poly(A)+ mRNA preparations (Fig. 5). Primer-extension experiments using adult poly(A)+ mRNA and the 40-nucleotide-long oligomer from the 5' end of cDNA 3-1 indicated that cDNA F-1 was about 10 nucleotides shorter than full-length at its 5' end (data not shown). Nucleotide sequence analysis of the AC-1 gene isolated from an H. contortus:λEMBL-3 library (manuscript in preparation) confirmed this result and indicated that cDNA F-1 was missing the codon for only one amino acid, the initiator methionine. For completeness, the initiator methionine has been included in the sequence presented in Fig. 4.

Fig. 5. Northern blot analysis of AC-1 mRNA transcripts. 1.5 µg of adult worm poly(A)+ mRNA was electrophoresed on a 1.5% denaturing formaldehyde agarose gel, blotted onto a nitrocellulose filter and hybridized with a 32P-labeled pBR325 plasmid containing the approx. 1.0 kb EcoRI fragment of cDNA F-1. The size of the hybridizing mRNA is about 1.25 kb. Positions of RNA size markers are indicated on the left.

The AC-1 protein comprises 342 amino acids and has a predicted molecular weight of 38.4 kDa. At the N-terminus of the protein is a stretch of about 15 hydrophobic amino acids that could function as a signal sequence for sequestration of the protein to the rough endoplasmic reticulum, as a prelude to extracellular secretion or localization to cellular organelles. Computer analysis predicts that the signal sequence would be cleaved between amino acids 18 and 19 (Ala-Asp). There are no other significant hydrophobic regions in the protein.

The AC-1 protein contains 16 cysteine residues, two of which are present in the presumed signal sequence and would not be present in the mature protein. The protein also contains four potential N-linked glycosylation sequences (Asn-X-Ser/Thr, where X can be any amino acid), which are indicated in Fig. 4. Treatment of purified anticoagulant proteins with Endoglycosidase F reduces the apparent molecular weight of the AC-1 protein to 33 kDa (Fig. 6), indicating that the protein is glycosylated in vivo. By Western blot analysis, the deglycosylated protein usually appeared as a dark band above a slightly smaller, lighter band, suggesting minor heterogeneity in the deglycosylated form of the protein. Although not clearly visible in Fig. 6, the 37-kDa protein that reacts weakly with Rb-10285 antiserum and with eluted antibodies selected by phage 2B (see above) disappears and presumably also changes mobility to 33 kDa after Endoglycosidase F treatment, suggesting that it may be a more heavily glycosylated form of AC-1. Antisera prepared against the recombinant AC-1 protein synthesized in E. coli also react with this 37-kDa protein in addition to the 35-kDa protein (manuscript in preparation).

Fig. 6. Deglycosylation of AC-1 with Endoglycosidase F. Mono Q-purified anticoagulant proteins (about 2 µg) were denatured by boiling in 1% SDS and β-mercaptoethanol and incubated overnight with buffer (- lane) or buffer + 1.5 units of Endoglycosidase F (+ lane). The next day the samples were electrophoresed on a 12% SDS gel, blotted to a nitrocellulose filter and reacted with Rb-10285 antiserum. Molecular weights of protein standards are shown to the right.

Homology of AC-1 with mammalian cathepsin B.

The primary sequence of AC-1 shows significant homology with mammalian cathepsin B (human, rat and mouse) and to a lesser extent with cathepsins H and L and with the plant protease papain [13-Z]. The overall amino acid identity between AC-1 and human cathepsin B is 42% (Fig. 7). A stretch of six amino acids that includes the active site cysteine of cathepsin B (Cys-Gly-Ser-Cys-Trp-Ala; the second Cys is the active site cysteine) is conserved in AC-1 (see Fig. 7). This sequence is also conserved in the plant cysteine protease, papain [14], and is present in the same relative location in all three proteases. Based upon these homologies, we predict that cysteine-114 is the active site cysteine of the AC-1 protease.

AC-1 can be aligned for homology with mature cathepsin B by introducing only two single amino acid gaps in the proteins (Fig. 7). When these minor alignments are introduced, all 14 cysteines in the mature cathepsin B protein align with cysteine residues in AC-1, suggesting that AC-1 and cathepsin B have similar tertiary structures. In addition, towards the C-termini of the proteins there is a histidine residue in AC-1 (residue 285) that is in the same position as the histidine residue (278) which forms part of the active site of cathepsin B [16]. The amino acids immediately surrounding these histidine residues are not as conserved as those surrounding the active site cysteine residues (Fig. 7).

Cathepsin B is synthesized as a pre-proenzyme that contains an N-terminal signal sequence, followed by a stretch of 62 amino acids that must be cleaved from the proenzyme to generate the mature, active protease [13]. The 'pro' region also may be involved in localizing cathepsin B to lysosomes. The positions of the above amino acid cleavages in cathepsin B are marked in Fig. 7. Most of the amino acids that are identical between AC-1 and cathepsin B are located in the region of cathepsin B that constitutes the mature, active enzyme (Fig. 7). Little similarity, other than length, exists between the 'pre' and 'pro' sequences of cathepsin B and AC-1. When just the mature form of cathepsin B is compared to the corresponding region of AC-1, the amino acid similarity between the two proteases increases to 49%.

   1  MKYLVLALCTYLCSQTGADENAAQGIPLEAQRLTGEPLVAY.LRRSQNLF  (AC-1)
   1  MWQLWASLCCLLVLAN.A......RSRPSFHPVSDE.I.VNYVNKRNTTWQ  (cathepsin B)

  50  EV.NSAPTPNFEQKIHDIKYKHQ.KLNLMVKEDPDPEVDIPPSYDPRDVW  (AC-1)
  43  AGHNFYNVDMSYLKRLCGTFLGGPKPPQRVMFTEDLK...LPASFDAREQW  (cathepsin B)

  97  KNC.TTFYIRDQANCGSCWAVSTAAAISDRICIASKAEKQVNISATDIMT  (AC-1)
  91  PQCPTIKEIRDQGSCGSCWAFGAVEAISDRICIHTNAHVSVEVSAEDLLT  (cathepsin B)

 147  CCRPQCGDGCEGGWPIEAWKYFIYDGWSGGEYLTKDVCRPYPIHPCGHH  (AC-1)
 141  CCGSMCGDGCNGGYPAEAWNFWTRKGLVSGGLYESHVGCRPYSIPPCEHH  (cathepsin B)

 197  GNDTYYGECRGTAPTPPCKRKCRPGVRKHYRIDKRYGKDAYIVKQSVKAI  (AC-1)
 191  VNGSRPP.CTGEGDTPKCSKICEPGYSPTYKQDKHYGYNSYSVSNSEKDI  (cathepsin B)

 247  QSEILRNGPWASFAVYEDFRHYKSGIYKHTAGELRGYHAVKHIGWGNEN  (AC-1)
 240  HAEIYKNGPVEGAFSVYSDFLLYKSGVYQHVTGEMMGGHAIRILGWGVEN  (cathepsin B)

 297  NTDFWLIANSWHNDWGEKGYFRIIRGTNDCGIEGTIAAGIVDTESL  (AC-1)
 290  GTPYWLVANSWNTDWGDNGFFKILRGQDHCGIESEWAGIPRTTYWEKI  (cathepsin B)

Fig. 7. Comparison of the predicted amino acid sequence of AC-1 with human cathepsin B. The upper sequence is AC-1, the lower sequence is human cathepsin B, which is taken from [13]. Amino acid positions are indicated to the left. Dots indicate gaps that were introduced to increase similarities between the proteins. Identical amino acids in the proteins are indicated by an asterisk. The active site cysteine of cathepsin B (amino acid 108) is marked with a diamond. Arrowheads denote positions of cleavages that occur during maturation of cathepsin B. The locations of the signal sequence, 'pro' sequence and mature enzyme sequence of cathepsin B are shown and blocked by the arrowheads. The final six amino acids of cathepsin B are not present in the mature enzyme (cleavage indicated by an arrowhead).
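Identity figures such as those quoted for Fig. 7 are obtained by counting identical residues over an alignment. The sketch below shows one way to do this for two pre-aligned rows in which dots mark gaps. It is illustrative only: the function name is hypothetical, the denominator used here (columns with no gap in either sequence) is an assumption because the convention used in this work is not stated, and the 20-column segment around the conserved Cys-Gly-Ser-Cys-Trp-Ala stretch is copied from the extracted Fig. 7 rows.

```python
def percent_identity(upper, lower, gap="."):
    """Count identical residues between two pre-aligned sequences.

    Returns (identical, aligned, percent), where aligned is the number of
    columns in which neither sequence has a gap.
    """
    if len(upper) != len(lower):
        raise ValueError("aligned sequences must have equal length")
    identical = 0
    aligned = 0
    for a, b in zip(upper, lower):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        aligned += 1
        if a == b:
            identical += 1
    percent = 100.0 * identical / aligned if aligned else 0.0
    return identical, aligned, percent


# Segment of the Fig. 7 alignment around the conserved hexapeptide.
AC1_SEGMENT = "KNC.TTFYIRDQANCGSCWA"
CATB_SEGMENT = "PQCPTIKEIRDQGSCGSCWA"

identical, aligned, percent = percent_identity(AC1_SEGMENT, CATB_SEGMENT)
print(identical, aligned, round(percent, 1))  # 12 19 63.2
```

The value for this short segment is a local figure and is not comparable to the overall 42% identity; applying the same count to the full-length alignment, or only to the mature region, is what yields whole-protein figures.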

Discussion

The studies reported here have elucidated the primary structure of a 35-kDa cysteine protease of the parasitic nematode H. contortus. The primary sequence of the enzyme reveals that it is a member of the cathepsin superfamily of thiol proteinases. This thiol protease family includes, among others, the lysosomal proteases cathepsin B, H and L, the plant enzymes papain and actinidin, and certain cytosolic calcium-dependent proteinases [U-17]. Members of this protease family have an active site that includes conserved cysteine and histidine residues. The various family members differ in their substrate specificities and perform a variety of cellular functions. The H. contortus enzyme is more similar in sequence to cathepsin B than to other members of the cathepsin protease family. The sequence of a cathepsin B-like protease of another helminth, the liver fluke Schistosoma mansoni, has recently been determined [18]. The H. contortus enzyme shares approximately the same level of primary sequence similarity with human cathepsin B as with the S. mansoni enzyme.

The predicted molecular weight of the H. contortus enzyme is 38.4 kDa, whereas the molecular weight of the mature enzyme isolated from adult worms is 33 kDa after deglycosylation. Thus, like cathepsin B, the H. contortus enzyme appears to be synthesized as a precursor that is proteolytically processed to yield a mature enzyme. We have not yet determined the N-terminal amino acid of the mature enzyme. The larger size of the deglycosylated H. contortus enzyme, 33 kDa, when compared to cathepsin B, 30 kDa, suggests that the H. contortus enzyme is proteolytically processed at a different site than cathepsin B. Preliminary expression studies in E. coli support this conclusion. When the H. contortus enzyme is expressed in E. coli beginning at isoleucine-87, which corresponds to the N-terminal leucine of mature cathepsin B (Fig. 7), the molecular weight of the bacterially produced H. contortus enzyme is similar to that of cathepsin B, but 3 kDa smaller than the deglycosylated H. contortus enzyme isolated from adult worms (unpublished results).

The primary sequence of AC-1 contains four potential glycosylation sites and Endoglycosidase F digestion experiments indicate that the protein is glycosylated in vivo. Two of the four potential glycosylation sites in AC-1 are conserved in cathepsin B and one of these conserved sites (asparagine-289) is known to be glycosylated in cathepsin B [13]. We have not determined which of the four potential glycosylation sites of AC-1 are used in vivo. There appears to be some heterogeneity in the degree to which the H. contortus enzyme is glycosylated in vivo, resulting in proteins of 35 and 37 kDa. This conclusion is based upon the observations that rabbit antiserum prepared against the 35-kDa H. contortus enzyme (and rabbit antiserum raised against recombinant AC-1 protein, unpublished results) reacts with the 37-kDa protein in anticoagulant extracts and that the 37-kDa protein also appears to be reduced to 33 kDa by Endoglycosidase F digestion. These results also are consistent with the hypothesis that the 35- and 37-kDa proteins are antigenically related, but products of distinct genes. The 35-kDa form of the protease is the predominant species that is isolated using our purification methods.

Hotez et al. [19] purified a 37-kDa elastinolytic and fibrinolytic enzyme that is secreted by the dog hookworm, Ancylostoma caninum, which is another blood-feeding member of the nematode order Strongylata. In contrast to AC-1, the A. caninum enzyme is a metalloprotease. The N-terminal sequence determined for the A. caninum enzyme also is not present in AC-1. Thus, these two nematode proteins are clearly distinct enzymes. The elastinolytic properties of the A. caninum enzyme and the fact that the protein is secreted suggest that its primary functions may be to facilitate invasion of the mucosa and to disrupt blood capillaries prior to feeding [19].

At present we can only speculate as to the function of the AC-1 protease in H. contortus adult worms. AC-1 was identified as an enzyme capable of degrading fibrinogen during a search for potential anticoagulant proteases (R. Hageman, manuscript in preparation). Because of this property, it is possible that AC-1 could function as an anticoagulant to prevent host blood from clotting during feeding; however, we have no evidence that this is the case. It is equally possible that AC-1 is a gut-associated digestive enzyme or a lysosomal protease, like cathepsin B. However, the predicted pI of AC-1 (8.7-10.0, depending upon where one postulates the N terminus of the mature protein to be) is considerably more basic than the pIs determined for two lysosomal cathepsins of the free-living nematode Caenorhabditis elegans (pIs of 4.7 and 6.8; ref. 20), suggesting that AC-1 is genetically distinct from these two nematode lysosomal cathepsins. Experiments to localize the AC-1 protease in H. contortus adult worms are in progress and should help clarify the enzyme's function.

If AC-1 is a digestive enzyme or an anticoagulant protease, then it would be critical for survival of H. contortus in sheep and would be a promising candidate for immunoprophylactic or chemotherapeutic control of this economically important parasite. In this regard, we have found that vaccination of sheep with Mono Q column-purified anticoagulant extracts confers significant protection against challenge infections with H. contortus (R.J. Boisvenue, M.I. Stiff, L.V. Tonkinson, G.N. Cox and R. Hageman, manuscript submitted). Whether protection is due to neutralization of the AC-1 protease by antibodies or to immunological reactions directed against other proteins in these extracts is currently under investigation. The availability of recombinant clones for the protease will allow production of the protease through recombinant methods for more defined vaccination studies and for structural analyses of the molecule. Since blood feeding is a characteristic shared by many strongylid nematodes, it will be of interest to determine if other blood-feeding members of this order possess a similar protease.

Acknowledgements

Drs. David Hirsh and Dan Rifkin originally suggested the anticoagulant vaccine strategy to us and we thank them for their continued interest. We also wish to thank Dr. Michael Milhausen for computer analyses, initial Northern blot studies, and discussions, Mr. Ervin Colestock and Mr. Andy Jackson for maintenance and collection of H. contortus worms, Ms. Sheila Baron for help with the rabbit antiserum preparation, Dr. Gene Armes for peptide sequence analyses, and Ms. Dhyan Atkinson and Ms. Carla Worland for preparation of figures and typing the manuscript. Financial support for this work was provided by Eli Lilly and Company.

References

1 Schmidt, G.D. and Rogers, L.S. (1981) Foundations of Parasitology. C.V. Mosby, St. Louis, MO.
2 Shamansky, L.M., Pratt, D., Boisvenue, R.J. and Cox, G.N. (1989) Cuticle collagen genes of Haemonchus contortus and Caenorhabditis elegans are highly conserved. Mol. Biochem. Parasitol. 37, 73-86.
3 Laemmli, U.K. and Favre, M. (1973) Maturation of the head of bacteriophage T4, I. DNA packaging events. J. Mol. Biol. 80, 575-599.
4 Young, R.A. and Davis, R.W. (1983) Yeast RNA polymerase II genes: isolation with antibody probes. Science 80, 1194-1198.
5 Benton, W.D. and Davis, R.W. (1977) Screening λgt recombinant clones by hybridization to single plaques in situ. Science 196, 180-182.
6 Rigby, P.W.J., Dieckmann, M., Rhodes, C. and Berg, P. (1977) Labeling deoxyribonucleic acid to high specific activity in vitro by nick translation with DNA Polymerase I. J. Mol. Biol. 113, 237-251.
7 Maxam, A.M. and Gilbert, W. (1980) Sequencing end-labeled DNA with base-specific chemical cleavages. Methods Enzymol. 65, 499-560.
8 Davis, R.W., Botstein, D. and Roth, J. (1980) Advanced Bacterial Genetics. Cold Spring Harbor Laboratory, Cold Spring Harbor, NY.
9 Sanger, F., Nicklen, S. and Coulson, A.R. (1977) DNA sequencing with chain terminating inhibitors. Proc. Natl. Acad. Sci. USA 74, 5463-5467.
10 Biggin, M.D., Gibson, T.J. and Hong, G.F. (1983) Buffer gradient gels and 35S label as an aid to rapid DNA sequence determination. Proc. Natl. Acad. Sci. USA 80, 3963-3965.
11 Towbin, H., Staehelin, T. and Gordon, J. (1979) Electrophoretic transfer of proteins from polyacrylamide gels to nitrocellulose sheets: Procedure and some applications. Proc. Natl. Acad. Sci. USA 76, 4350-4354.
12 Proudfoot, N.J. and Brownlee, G.G. (1976) 3’ non-coding region sequences in eukaryotic messenger RNA. Nature 263, 211-214.
13 Chan, S.J., Segundo, B.S., McCormick, M.B. and Steiner, D.F. (1986) Nucleotide and predicted amino acid sequences of cloned human and mouse preprocathepsin B cDNAs. Proc. Natl. Acad. Sci. USA 83, 7721-7725.
14 Cohen, L.W., Coglan, V.M. and Dihel, L.C. (1986) Cloning and sequencing of papain-encoding cDNA. Gene 48, 219-227.
15 Wada, K., Takai, T. and Tanabe, T. (1987) Amino acid sequence of chicken liver cathepsin L. Eur. J. Biochem. 167, 13-18.
16 Carne, A. and Moore, C.H. (1978) The amino acid sequence of the tryptic peptides from actinidin, a proteolytic enzyme from the fruit of Actinidia chinensis. Biochem. J. 173, 73-83.
17 Ohno, S., Emori, Y., Imajoh, S., Kawasaki, H., Kisaragi, M. and Suzuki, K. (1984) Evolutionary origin of a calcium-dependent protease by fusion of genes for a thiol protease and a calcium-binding protein? Nature 312, 566-570.
18 Klinkert, M.-Q., Felleisen, R., Link, G., Ruppel, A. and Beck, E. (1989) Primary structures of Sm 31/32 diagnostic proteins of Schistosoma mansoni and their identification as proteases. Mol. Biochem. Parasitol. 33, 113-122.
19 Hotez, P., Trang, N., McKerrow, J.H. and Cerami, A. (1985) Isolation and characterization of a proteolytic enzyme from the adult hookworm Ancylostoma caninum. J. Biol. Chem. 260, 7343-7348.
20 Sarkis, G.J., Kurpiewski, M.R., Ashcom, J.D., Jen-Jacobson, L. and Jacobson, L.A. (1988) Proteases of the nematode Caenorhabditis elegans. Arch. Biochem. Biophys. 261, 80-90.
