Proc. Natl. Acad. Sci. USA Vol. 87, pp. 4444-4448, June 1990 Biochemistry

Identification of an additional member of the protein-tyrosine-phosphatase family: Evidence for alternative splicing in the tyrosine phosphatase domain (transmembrane protein/glycoprotein/protein phosphorylation/protein evolution)

R. JAMES MATTHEWS, ELLEN D. CAHIR, AND MATTHEW L. THOMAS* Department of Pathology, Washington University School of Medicine, Saint Louis, MO 63110

Communicated by Emil R. Unanue, March 28, 1990

ABSTRACT

Protein-tyrosine-phosphatases (protein-tyrosine-phosphate phosphohydrolase, EC 3.1.3.48) have been implicated in the regulation of cell growth; however, to date few tyrosine phosphatases have been characterized. To identify additional family members, the cDNA for the human tyrosine phosphatase leukocyte common antigen (LCA; CD45) was used to screen, under low stringency, a mouse pre-B-cell cDNA library. Two cDNA clones were isolated and sequence analysis predicts a protein sequence of 793 amino acids. We have named the molecule LRP (LCA-related phosphatase). RNA transfer analysis indicates that the cDNAs were derived from a 3.2-kilobase mRNA. The LRP mRNA is transcribed in a wide variety of tissues. The predicted protein structure can be divided into the following structural features: a short 19-amino acid leader sequence, an exterior domain of 123 amino acids that is predicted to be highly glycosylated, a 24-amino acid membrane-spanning region, and a 627-amino acid cytoplasmic region. The cytoplasmic region contains two ≈260-amino acid domains, each with homology to the tyrosine phosphatase family. One of the cDNA clones differed in that it had a 108-base-pair insertion that, while preserving the reading frame, would disrupt the first protein-tyrosine-phosphatase domain. Analysis of genomic DNA indicates that the insertion is due to an alternatively spliced exon. LRP appears to be evolutionarily conserved as a putative homologue has been identified in the invertebrate Styela plicata.

Protein tyrosine phosphorylation is an important regulatory signal in cell physiology, particularly with regard to cell growth, and is controlled by two sets of enzymes: protein-tyrosine kinases (PTKases; ATP:protein-tyrosine O-phosphotransferase, EC 2.7.1.112) and protein-tyrosine-phosphatases (PTPases; protein-tyrosine-phosphate phosphohydrolase, EC 3.1.3.48). PTKases are notable in that they are the cytoplasmic, catalytic portion for many growth factor receptors, and the genes that encode them are potential oncogenes (1).

PTPases, by contrast, have only recently been identified and are divided into two groups: transmembrane PTPases and cytoplasmic PTPases (for review, see ref. 2). PTPase domains are ≈260 amino acids long with various degrees of sequence similarity, although there are certain residues that are absolutely conserved. The two subfamilies, transmembrane and cytoplasmic, differ in that all transmembrane PTPases characterized to date have two PTPase domains, while cytoplasmic PTPases have a single domain. The known transmembrane PTPases are the leukocyte common antigen [LCA; CD45 (3, 4)], LAR (LCA related) (5), and two forms isolated from Drosophila melanogaster, DLAR and DPTP (6). DLAR appears to be the Drosophila homologue to human LAR since the PTPase domains are ≈72% identical. LCA is expressed uniquely by all nucleated cells of hematopoietic origin and consists of a large, heavily glycosylated exterior domain, a single membrane-spanning region, and two cytoplasmic tyrosine phosphatase domains (for review, see ref. 7). LAR is more widely expressed than LCA and the exterior domain has a structural motif similar to neural cell adhesion molecule. PTP 1b and TCPTP are the two known cytoplasmic PTPases (8, 9). PTP 1b was isolated and sequenced from human placenta and TCPTP was isolated from a human T-cell library, although its distribution is not restricted to cells of the hematopoietic lineage. PTP 1b and TCPTP are the two most closely related PTPases and have ≈72% of their PTPase domain sequence in common.

PTKases have been implicated in positive regulation of cell growth. Logically, therefore, PTPases may act to inhibit cell growth and conceptually are potential tumor suppressor genes or anti-oncogenes. However, recent demonstrations have also indicated that some tyrosine phosphatases may act in a positive fashion to regulate the cell cycle. Pingel and Thomas (10) have described a T-cell clone that is deficient in the expression of LCA and fails to proliferate in response to antigen, indicating that LCA is required for cell growth. This effect may be mediated by the ability of LCA to regulate tyrosine kinases, as it has been demonstrated that the tyrosine kinase p56lck is a substrate for LCA (11). Furthermore, Gould and Nurse (12) have shown that in yeast, a critical tyrosine is phosphorylated in the ATP binding site of the serine/threonine kinase cdc2, thereby blocking cdc2 activity, and thus implicating a tyrosine phosphatase in the positive regulation of cdc2. An active cdc2 is required for yeast to enter into mitosis.

To identify further members of the tyrosine phosphatase family, a mouse pre-B-cell library was screened under low stringency with a human LCA cDNA probe that encodes the two tyrosine phosphatase domains. Clones were isolated and two were further characterized. We have named this molecule LRP (LCA-related phosphatase). The clones encoded an additional member of the tyrosine phosphatase family that is expressed in a wide variety of tissues.†

Abbreviations: PTKase, protein-tyrosine kinase; PTPase, protein-tyrosine-phosphatase; LCA, leukocyte common antigen; PCR, polymerase chain reaction.

*To whom reprint requests should be addressed at: Department of Pathology, Box 8118, 660 South Euclid Avenue, Saint Louis, MO 63110.

†The sequence reported in this paper has been deposited in the GenBank data base (accession no. M33671).

The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.

MATERIALS AND METHODS

Isolation of cDNA Clones. A 2-kilobase (kb) Sac II/Acc I restriction fragment from pHLC-1 (13) was used to screen 1 × 10^6 plaques from a λgt11 library derived from 70Z/3 (14). Filters were hybridized as described except that 35% formamide was used and the hybridization temperature was 30°C (13). Filters were washed in 150 mM NaCl/15 mM trisodium citrate, pH 7.0, at 42°C.

DNA Sequencing. cDNA fragments were subcloned into Bluescript (Stratagene) and sequenced as described by oligonucleotide-directed priming using plasmid DNA as template (15). To overlap restriction fragments, fragments were obtained by using oligonucleotides to prime Thermus aquaticus DNA polymerase (Taq polymerase) (Promega) in a polymerase chain reaction (PCR). Fragments were isolated by electrophoresis on low-melting point agarose gel. PCR-derived fragments were sequenced directly without subcloning.

RNA Transfer Blotting Analysis. RNA was isolated from mouse tissues by the method of Chirgwin et al. (16), transferred to nitrocellulose, and screened as described (13).

Primer-Extension Analysis. RNA was hybridized to an oligonucleotide antisense to nucleotides 62-82 (see Fig. 2) that had been end-labeled by T4 polynucleotide kinase and [γ-32P]ATP. Hybridization conditions were 1 M NaCl/100 mM Pipes, pH 6.4/25 mM EDTA at 52°C for 14 hr. The reaction was initiated by adding 50 units of avian myeloblastosis virus reverse transcriptase (Life Sciences, Saint Petersburg, FL) to a final vol of 50 μl in a buffer containing 80 mM Tris·HCl (pH 8.3), 80 mM KCl, 10 mM MgCl2, 0.5 mM dithiothreitol, 1.0 mM dNTPs, and 40 units of RNasin (Promega), and the mixture was incubated at 42°C for 1 hr. Fragments were analyzed on an 8% acrylamide/7 M urea denaturing sequence gel.

PCR Analysis of Genomic DNA. Mouse genomic DNA (1 μg) was used as template to hybridize 200 ng of oligonucleotide primers in a buffer of 20 mM Tris·HCl, pH 8.4/50 mM KCl/2.5 mM MgCl2/0.2 mM dNTPs/0.1 mg of bovine serum albumin per ml/0.1% Triton X-100. The primers were derived from the cDNA sequence 798-825 and antisense to 828-850 and from the 5' and 3' ends of the 108-bp insert in λmLRP-C26 (see Fig. 2).
The reaction was initiated by 2 units of Taq polymerase (Promega) with 30 temperature cycles of 94°C for 30 sec, 55°C for 45 sec, and 72°C for 2 min. Fragments were analyzed on agarose gels or were sequenced directly after isolation on low-melting point agarose.

Isolation of cDNA Encoding Tyrosine Phosphatase Domains from Styela plicata. Two sets of degenerate oligonucleotides were made to highly conserved sequences found in tyrosine phosphatases. Set A was derived from the compilation of PTPase protein sequences that in LCA is DFWRMIWE (see Fig. 3 Middle) and consisted of 5'-A(T/C)TT(T/C)TGG(C/A)(T/A/G)(T/G)ATG(A/G)T(T/C/A)TGG(C/G)A-3'. Likewise, set B was derived from the compilation of PTPase protein sequences that in LCA is HCSAGVGR and consisted of 5'-GTGAC(A/G)TC(T/C/A/G)CG(T/C/A/G)CC(T/C/A/G)(T/C)A(A/G)CCCGC-3'. Set B oligonucleotides (200 ng) were hybridized to 2 μg of total RNA from an adult ascidian, S. plicata, and cDNA was generated by incubating 1 hr at 42°C with 200 units of avian myeloblastosis virus reverse transcriptase in a buffer of 20 mM Tris·HCl, pH 8.4/50 mM KCl/2.5 mM MgCl2/0.1 mg of bovine serum albumin per ml/1 mM dNTPs/40 units of RNasin. cDNAs encoding PTPases were amplified by adding ≈2 μl of the reverse transcriptase reaction mixture to 100 μl of the buffer described above plus 0.1% Triton X-100/0.2 mM dNTPs/100 ng of both set A and set B oligonucleotide pools/2 units of Taq polymerase, and 30 temperature cycles of 94°C for 30 sec, 37°C for 30 sec, and 72°C for 1 min. Fragments were isolated and subcloned by EcoRI linker addition.
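The size of a degenerate oligonucleotide pool such as set A follows directly from the number of alternatives at each mixed position. The following is a minimal Python sketch, offered as an illustration rather than as part of the original methods; the helper names `parse_degenerate`, `degeneracy`, and `expand` are hypothetical, and the primer string is the set A sequence as transcribed in the text, with whitespace removed.

```python
import re
from itertools import product
from math import prod

# Set A primer as transcribed in the text (whitespace removed).
# Parenthesised groups are mixed positions, e.g. (T/C) means T or C.
SET_A = "A(T/C)TT(T/C)TGG(C/A)(T/A/G)(T/G)ATG(A/G)T(T/C/A)TGG(C/G)A"

def parse_degenerate(primer):
    """Split a primer into positions; each position is a tuple of allowed bases."""
    tokens = re.findall(r"\(([ACGT/]+)\)|([ACGT])", primer)
    return [tuple(group.split("/")) if group else (base,) for group, base in tokens]

def degeneracy(positions):
    """Number of distinct sequences represented by the pool."""
    return prod(len(p) for p in positions)

def expand(positions):
    """Yield every concrete sequence represented by the degenerate primer."""
    for combo in product(*positions):
        yield "".join(combo)

positions = parse_degenerate(SET_A)
pool = list(expand(positions))
print(len(positions), degeneracy(positions))
```

For set A as transcribed here, the sketch reports a 22-nucleotide primer representing 576 distinct sequences. The same functions can be applied to the set B string.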

RESULTS AND DISCUSSION

Nucleotide and Predicted Amino Acid Sequence. Twenty-four non-LCA cDNA clones were isolated from the mouse 70Z/3 pre-B-cell library by low-stringency hybridization using a human LCA probe encoding the tyrosine phosphatase domains. Two clones were initially characterized, λmLRP-B20 and λmLRP-C26, and were found to be related by sequence analysis (Fig. 1). The λmLRP-B20 cDNA was used to rescreen the 24 λ clones, and 17 hybridized under high stringency, indicating that they were all derived from the same mRNA. Since the library was amplified, some of these clones may be identical. The other 7 clones isolated from the library have not been analyzed.

The complete sequence was derived from analysis of the two clones (Fig. 2A). The clones differed by the insertion of 108 bases in λmLRP-C26 (Fig. 2B). Interestingly, the analogous position in the LCA sequence coincides exactly with an exon/intron boundary (16). The intron in the LCA gene is 87 base pairs (bp), but unlike the insert in λmLRP-C26, it does not preserve the reading frame. To determine whether the insertion was due to an alternatively spliced exon, genomic DNA was used as a template for PCR, priming with various combinations of oligonucleotides derived from sequences 5' and 3' of the insertion and from the insertion. Fragments were analyzed by agarose gel electrophoresis and indicated that the inserted sequence was an exon with a 5' flanking intron of ≈300 bp and a 3' flanking intron of ≈500 bp (data not shown). To confirm that the inserted sequence was an exon, PCR fragments were derived from genomic DNA and sequenced directly. Sequences immediately adjacent to the putative exon contained the consensus 5' and 3' splice site sequences, AG and GT, respectively, indicating that the inserted sequence is an alternatively spliced exon (Fig. 2B).

Examination of the cDNA sequence indicates one long open reading frame from nucleotide residues 1-2404. The first methionine is encoded at positions 26-28 and this site is in good agreement with the canonical sequence for translation initiation, ACCATGG (17). Therefore, the methionine at positions 26-28 is presumptively assigned as the translation initiation codon.
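The position of the first methionine codon and its sequence context can be reproduced from the first 100 nucleotides of the cDNA sequence in Fig. 2A. The following Python sketch is an illustration only, not the authors' analysis; `translate`, `CODON_TABLE`, and the constant names are hypothetical, and the nucleotide string is transcribed from the figure.

```python
# First 100 nucleotides of the LRP cDNA, transcribed from Fig. 2A.
CDNA_5PRIME = (
    "CCGCCCAGCGCCGGGCTCGGTCAGCATGGATTCCTGGTTCATTCTTGTCC"
    "TGTTTGGCAGTGGTCTAATACATGTTAGTGCCAACAATGCTACTACAGTT"
)

# Standard genetic code, built from the usual TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(seq):
    """Translate complete codons from the start of seq; a trailing partial codon is ignored."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))

start = CDNA_5PRIME.find("ATG")             # 0-based index of the first ATG
first_atg = (start + 1, start + 3)          # 1-based positions, as used in the text
context = CDNA_5PRIME[start - 3:start + 4]  # positions -3 to +4 around the ATG
protein = translate(CDNA_5PRIME[start:])

print(first_atg, context, protein)
```

With the sequence as transcribed, the first ATG falls at nucleotides 26-28, the surrounding context is AGCATGG (compare the canonical ACCATGG), and the translation begins MDSWFILVLF, matching the amino acid sequence shown above the nucleotides in Fig. 2A.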
A hydrophobic-rich sequence follows the methionine at positions 26-28 and is, therefore, likely to be a signal sequence. Using consensus rules for signal sequence cleavage sites (18, 19), amino acid 20 is predicted to be the first amino acid of the mature protein. One potential membrane-spanning region exists in the predicted protein sequence at positions 124-147. Carboxyl terminal to the potential membrane-spanning region is a sequence with high homology to PTPases and, by analogy to LCA, is likely to be on the cytoplasmic side of the membrane. An exterior domain of 123 amino acids is therefore predicted. Using the exterior domain sequence, a screen of the National Biomedical Research Foundation data base with the SEARCH and ALIGN programs (20) found no significant homology with other proteins. LRP is predicted to be highly glycosylated with eight potential sites for asparagine-linked glycosylation and many potential sites for serine/threonine-linked glycosylation. Indeed, this region is extremely rich (36%) in serines and threonines and is high (8%) in proline content. This region is

[Fig. 1 diagram: LRP mRNA drawn against a scale in kb, with a partial restriction map and the extents of clones λmLRP-B20 and λmLRP-C26.]

FIG. 1. Schematic diagram of the LRP mRNA, sequencing strategy, and the relationship of the derived cDNA clones. Shaded rectangle indicates the coding region. A partial restriction map of the cDNA is indicated. B, BamHI; Bg, BglII; C, ClaI; E, EcoRI.
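The segment lengths reported for the predicted protein (a 19-amino acid leader, a 123-amino acid exterior domain, a 24-amino acid membrane-spanning region, and a 627-amino acid cytoplasmic region) can be cross-checked with simple arithmetic. The sketch below is an illustrative Python calculation, not part of the original analysis; it assumes that residue numbering starts at the first amino acid of the mature protein, and the names `LEADER`, `MATURE_SEGMENTS`, and `mature_coordinates` are hypothetical.

```python
# Segment lengths in amino acids, as reported in the text.
LEADER = 19
MATURE_SEGMENTS = [
    ("exterior", 123),
    ("membrane-spanning", 24),
    ("cytoplasmic", 627),
]

def mature_coordinates(segments):
    """Assign 1-based inclusive (start, end) positions to consecutive segments."""
    coords = {}
    start = 1
    for name, length in segments:
        coords[name] = (start, start + length - 1)
        start += length
    return coords

coords = mature_coordinates(MATURE_SEGMENTS)
mature_length = sum(length for _, length in MATURE_SEGMENTS)
precursor_length = LEADER + mature_length

for name, (first, last) in coords.items():
    print(f"{name}: {first}-{last}")
print(mature_length, precursor_length)
```

Under that numbering assumption the membrane-spanning region falls at positions 124-147, as stated in the text, the cytoplasmic region spans positions 148-774, and the precursor totals 793 amino acids.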


[Fig. 2 sequence panels: (A) the 3002-nucleotide LRP cDNA sequence with the predicted amino acid sequence shown above it (GenBank accession no. M33671); (B) the genomic sequence of the 108-bp inserted exon, encoding FLSLAVSKDAVKALNKTTPLLERRFIGKSNSRGCLS, with flanking intron sequence ...gcttactcaaacgatcatacaaag (5') and gtcagagaaaac... (3').]

FIG. 2. Nucleotide and predicted protein sequence derived from the LRP cDNA clones. (A) Combined sequence analysis of λmLRP-B20 and λmLRP-C26. Single arrow indicates the predicted amino terminus of the mature protein. Potential asparagine-linked carbohydrate sites are indicated by superscript dots. The potential membrane-spanning region is boxed and the sites of the two tyrosine phosphatase domains are indicated by half brackets. Open arrow indicates the site of insertion of the 108 bp found in λmLRP-C26 but not in λmLRP-B20. The polyadenylylation site is underlined. (B) Genomic sequence encoding the inserted sequence found in λmLRP-C26. The intron sequence is in lowercase letters and the splice sites are indicated by half brackets and arrows. The single-letter amino acid code is used.
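The observation that the 108-bp insert preserves the reading frame can be checked against the exon sequence of Fig. 2B. The Python sketch below is an illustration, not the authors' analysis. It assumes that the exon interrupts a codon after its first nucleotide, so a single upstream `T` is supplied to complete the first codon; that nucleotide is inferred from the first printed amino acid (F) and is an assumption, as are the names `EXON`, `UPSTREAM_BASE`, and `translate`. The sequences are transcribed from the figure.

```python
# 108-bp inserted exon and flanking intron ends, transcribed from Fig. 2B.
EXON = (
    "TTCTCTCTTTAGCTGTGAGCAAGGATGCAGTGAAAGCACTGAACAAAACCACTCCATTGTTAGAAAGAAGGTT"
    "TATTGGGAAATCAAACTCCAGAGGCTGTCTCTCAG"
)
INTRON_5PRIME_END = "gcttactcaaacgatcatacaaag"  # intron sequence preceding the exon
INTRON_3PRIME_START = "gtcagagaaaac"            # intron sequence following the exon
UPSTREAM_BASE = "T"  # assumed last coding nucleotide before the insertion site

# Standard genetic code, built from the usual TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(seq):
    """Translate complete codons from the start of seq; a trailing partial codon is ignored."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))

frame_preserved = len(EXON) % 3 == 0
peptide = translate(UPSTREAM_BASE + EXON)
has_acceptor = INTRON_5PRIME_END.endswith("ag")
has_donor = INTRON_3PRIME_START.startswith("gt")

print(len(EXON), frame_preserved, peptide, has_acceptor, has_donor)
```

With the sequence as transcribed, the exon is 108 nucleotides long (a multiple of three), the translation contains no stop codon and reproduces the 36 amino acids printed in Fig. 2B, and the flanking intron sequences end in ag and begin with gt.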

predicted to have many areas of random coil structure, typical of O-linked carbohydrate sites, and is reminiscent of the amino-terminal domains of the LCA glycoprotein, a region that is also rich in O-linked glycosylation (21).

The predicted cytoplasmic domain is 627 amino acids long and contains two tandem domains with similarity to tyrosine phosphatases (Fig. 3). Comparison with other mammalian tyrosine phosphatase domains reveals that sequence patterns conserved in all tyrosine phosphatases are found in both of the LRP cytoplasmic domains. This suggests that LRP also will contain tyrosine phosphatase activity. There are regions within the PTPase domains that are more highly conserved than others. It is likely that the conserved regions will be important for enzymatic activity while the regions that are more divergent may be involved in determining substrate specificity and regulation. The alternatively spliced exon found in λmLRP-C26 encodes a sequence that disrupts a highly conserved sequence (position 248 in LRP), Pro-Tyr-Asp, and potentially, therefore, the inserted sequence may affect the putative tyrosine phosphatase activity. A comparison matrix of each of the mammalian tyrosine phosphatase domains using the ALIGN program (20) indicates

[Fig. 3 alignment: PTPase domain sequences of mouse LRP D1 and D2, mouse LCA D1 and D2, human LAR D1 and D2, human PTP 1b, and human TCPTP, shown above a consensus line.]

FIG. 3. Comparison of mammalian tyrosine phosphatase domains. The two domains from LRP are compared with the two domains from LCA (22) and LAR (5) and the single domains from PTP 1b (23) and TCPTP (9). Conserved residues are indicated by boldface type and the capital letters in the consensus sequence, and those residues found in the majority of domains are indicated by the lowercase letters in the consensus sequence. The single-letter amino acid code is used.
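Percent-identity values such as those quoted for the domain comparisons come down to counting identical columns in a pairwise alignment. The sketch below is a generic Python illustration and is not the ALIGN program used in the paper. The helper `percent_identity` is hypothetical and gaps are assumed to be written as `-`. The two motifs are the LCA sequence HCSAGVGR named in Materials and Methods and the corresponding LRP domain 2 sequence HCSAGAGR from Fig. 2A, and the Styela value uses the 63 identical residues out of 107 reported in the text.

```python
def percent_identity(a, b, gap="-"):
    """Percent of aligned columns with identical residues, skipping gapped columns."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not columns:
        raise ValueError("no ungapped columns to compare")
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)

LCA_MOTIF = "HCSAGVGR"     # conserved LCA sequence used to design primer set B
LRP_D2_MOTIF = "HCSAGAGR"  # corresponding sequence in LRP domain 2 (Fig. 2A)

motif_identity = percent_identity(LCA_MOTIF, LRP_D2_MOTIF)

# Styela fragment versus LRP domain 2: counts as reported in the text.
styela_identity = 100.0 * 63 / 107

print(f"{motif_identity:.1f} {styela_identity:.1f}")
```

The short motif differs at one of eight positions (87.5% identity), and 63 identical residues out of 107 corresponds to about 59% identity.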

that LRP is more similar to transmembrane PTPases than cytoplasmic PTPases (data not shown). Within the transmembrane tyrosine phosphatase subfamily, LRP is more related to LAR than to LCA, 52% versus 43% identical residues. Also, LRP is more closely related to both LCA and LAR than LCA is related to LAR.

LRP mRNA Size and Expression. RNA transfer blot analysis reveals a 3.2-kb mRNA transcribed in a wide variety of tissues (Fig. 4). The size of the mRNA is in good agreement with that predicted from the analysis of the cDNAs. To verify that λmLRP-C26 represented the most 5' clone, all 17 clones were subjected to PCR analysis using an oligonucleotide primer derived from the antisense orientation to bases 286-293 and primers from either end of the λ arms. Three clones yielded fragments the same size as λmLRP-C26; two were derived from the same λ arm, while the third was derived

[Fig. 4 blot image: lanes 1-9, with ribosomal RNA size markers at 28S, 23S, 18S, and 16S.]

FIG. 4. RNA transfer blot analysis of LRP mRNA from various tissues. RNA was separated on a 1% agarose formaldehyde gel, blotted onto nitrocellulose, and probed with a 1.7-kb EcoRI fragment from λmLRP-C26. All lanes were loaded with 5 μg of total RNA with the exception of the 70Z/3 pre-B-cell RNA, which was loaded with 1 μg of RNA isolated by oligo(dT) affinity chromatography. RNA was from the following sources: lane 1, 70Z/3 pre-B cell; lane 2, brain; lane 3, kidney; lane 4, liver; lane 5, lung; lane 6, spleen; lane 7, thymus; lane 8, bone marrow; lane 9, lymph node. Bacterial and mouse ribosomal RNAs were used as markers.

from the opposite arm and, therefore, must be a different cDNA clone from λmLRP-C26. However, sequence analysis showed that the 5' end of this new clone, λmLRP-D25, ended exactly in the same position as λmLRP-C26. To identify the site of transcription initiation, primer-extension analysis was performed (data not shown). The derived fragment extended 123 bases beyond the 5' end of λmLRP-C26. This would account for the small size difference between the cDNA sequence and that predicted from RNA transfer analysis.

Isolation of a cDNA from S. plicata with High Homology to Mouse LRP. The evolutionary conservation of tyrosine phosphatases was investigated by isolating cDNA encoding tyrosine phosphatases from the protochordate S. plicata (sea squirt). cDNAs were isolated by priming RNA isolated from soft tissue with degenerate oligonucleotides derived from conserved tyrosine phosphatase sequences and synthesis with reverse transcriptase. This material was used to prime a PCR and cDNA fragments were subcloned and sequenced. A number of cDNAs were found to be similar to tyrosine phosphatases; one was very similar to LRP (Fig. 5). Although the analysis only compares 107 amino acids, there were 63 identical residues. This level of identity is significantly higher than between any other tyrosine phosphatase domains from transmembrane proteins and is on the order of similarity between human LAR and Drosophila DLAR (9). It is noteworthy that this region includes an area that is highly divergent between PTPases. It is likely, therefore, that this represents the Styela homologue of LRP and indicates that, similar to LCA (24) and LAR (9), LRP is also highly conserved in evolution. Accordingly, it appears that the function of tyrosine phosphatases is under evolutionary pressures that disallow extensive divergence, most likely reflecting a critical importance in cell physiology.

MOUSE LRP DOMAIN 2  WKSCSIVMLTELEERGQEKCAQYWPSDGLVSYGDITVELKKEEECESYTVRDLL  649
STYELA              S A V N ILTF E FLSVN S D SE FK
MOUSE LRP DOMAIN 2  VTNTRENKSRQ--IRQFHFHGWPEVGIPSDGKGMINIIAAVQKQQQQSGNHPITV  702
STYELA              Y S RSD NGS LIVK A VS YS ELVED N VI

FIG. 5. Comparison of predicted amino acid sequences between a S. plicata cDNA, derived by PCR, and sequence from the second domain of LRP. Identical residues in the S. plicata sequence are not shown.

Structure of Transmembrane Tyrosine Phosphatases and Implication for Function. The external domains of the transmembrane PTPases appear to be fundamentally different from those of the growth factor receptor PTKases. LAR, DLAR, and DPTP have structural motifs similar to neural cell adhesion molecule, suggesting that they may have similar adhesion properties. LCA is highly glycosylated and the cell type-specific variation in the LCA external domain, which is due to alternative splicing, gives rise to changes in the size of the O-linked carbohydrate region, arguing that the carbohydrate moieties will be important in function. Similar to LCA, the external domain of LRP also is predicted to be heavily glycosylated; however, the LRP external domain does not contain any cysteine residues and, therefore, does not contain any disulfide bridges. LRP is a candidate molecule for presenting carbohydrates for carbohydrate-lectin interactions. It appears that the transmembrane PTPases so far characterized are distinct from the transmembrane PTKases and, rather than binding soluble ligands, may be involved in cell-cell interactions.

We thank Casey Weaver for comments on the manuscript, Edwin Flores for RNA preparation, Matt Smith for help with figures and computer analysis, and our colleagues for helpful suggestions and criticism. This work was supported by Grant AI 26363 from the U.S. Public Health Service and by grants from the Council for Tobacco Research. R.J.M. is the recipient of a Cancer Research Institute Fellowship and M.L.T. is the recipient of an Established Investigator Award from the American Heart Association.

1. Hunter, T. & Cooper, J. A. (1985) Annu. Rev. Biochem. 54, 897-930.
2. Hunter, T. (1989) Cell 58, 1013-1016.
3. Thomas, M. L., Barclay, A. N., Gagnon, J. & Williams, A. F. (1985) Cell 41, 83-93.
4. Tonks, N. K., Charbonneau, H., Diltz, C. D., Fischer, E. H. & Walsh, K. A. (1988) Biochemistry 27, 8696-8701.

5. Streuli, M., Krueger, N. X., Hall, L. R., Schlossman, S. F. & Saito, H. (1988) J. Exp. Med. 168, 1553-1562.
6. Streuli, M., Krueger, N. X., Tsai, A. Y. & Saito, H. (1989) Proc. Natl. Acad. Sci. USA 86, 8698-8702.
7. Thomas, M. L. (1989) Annu. Rev. Immunol. 7, 339-369.
8. Charbonneau, H., Tonks, N. K., Walsh, K. A. & Fischer, E. H. (1988) Proc. Natl. Acad. Sci. USA 85, 7182-7186.
9. Cool, D. E., Tonks, N. K., Charbonneau, H., Walsh, K. A., Fischer, E. H. & Krebs, E. G. (1989) Proc. Natl. Acad. Sci. USA 86, 5257-5261.
10. Pingel, J. T. & Thomas, M. L. (1989) Cell 58, 1055-1065.
11. Ostergaard, H. L., Shackelford, D. A., Hurley, T., Johnson, P., Hyman, R., Sefton, B. M. & Trowbridge, I. S. (1989) Proc. Natl. Acad. Sci. USA 86, 8959-8963.
12. Gould, K. L. & Nurse, P. (1989) Nature (London) 342, 39-45.
13. Ralph, S. J., Thomas, M. L., Morton, C. C. & Trowbridge, I. S. (1987) EMBO J. 6, 1251-1257.
14. Ben-Neriah, Y., Bernard, S. A., Paskind, M., Daley, G. Q. & Baltimore, D. (1986) Cell 44, 577-586.
15. Johnson, N. A., Meyer, C. M., Pingel, J. T. & Thomas, M. L. (1989) J. Biol. Chem. 264, 6220-6229.
16. Chirgwin, J. M., Przybyla, A. E., MacDonald, R. J. & Rutter, W. J. (1979) Biochemistry 18, 5294-5299.
17. Kozak, M. (1984) Nucleic Acids Res. 12, 857-872.
18. Von Heijne, G. (1983) Eur. J. Biochem. 133, 17-21.
19. Watson, M. E. E. (1984) Nucleic Acids Res. 12, 5145-5164.
20. Dayhoff, M. O., Barker, W. C. & Hunt, L. T. (1983) Methods Enzymol. 91, 524-545.
21. Jackson, D. I. & Barclay, A. N. (1989) Immunogenetics 29, 281-287.
22. Thomas, M. L., Reynolds, P. J., Chain, A., Ben-Neriah, Y. & Trowbridge, I. S. (1987) Proc. Natl. Acad. Sci. USA 84, 5360-5363.
23. Charbonneau, H., Tonks, N. K., Kumar, S., Diltz, C. D., Harrylock, M., Cool, D. E., Krebs, E. G., Fischer, E. H. & Walsh, K. A. (1989) Proc. Natl. Acad. Sci. USA 86, 5252-5256.
24. Matthews, R. J., Pingel, J. T., Meyer, C. M. & Thomas, M. L. (1989) Cold Spring Harbor Symp. Quant. Biol. 54, 675-682.
