Proc. Natl. Acad. Sci. USA Vol. 89, pp. 9804-9808, October 1992 Biochemistry

Cloning and expression of chicken erythrocyte transglutaminase
(protein-glutamine γ-glutamyltransferase/mRNA expression/erythroid development)

N. WERAARCHAKUL-BOONMARK, J.-M. JEONG, S. N. P. MURTHY, J. D. ENGEL, AND L. LORAND
Department of Biochemistry, Molecular Biology and Cell Biology, Northwestern University, Evanston, IL 60208-3500

Contributed by L. Lorand, July 29, 1992

ABSTRACT We report the sequences of cDNAs encoding chicken erythrocyte transglutaminase (EC 2.3.2.13). The complete mRNA consists of 3345/3349 nucleotides and predicts a single open reading frame. Nine peptide sequences derived from partial digests of the isolated protein agreed with the corresponding translation of the open reading frame. Approximately 60% identity was found between the avian protein and three related mammalian enzymes. Chicken erythrocyte transglutaminase mRNA is most abundant in red blood cells and kidney, and it accumulates during erythroid cell differentiation.

Cytosolic transglutaminases (protein-glutamine γ-glutamyltransferases, EC 2.3.2.13) were first described by Waelsch and collaborators in the 1950s (1) and were assumed to be primarily involved in the metabolism of amines (such as histamine) by conjugation to proteins (2). These Ca2+-dependent thiol enzymes are widely distributed in vertebrates and are also present in a number of invertebrates [e.g., sea urchin egg, Homarus hemocyte, Limulus amoebocyte (3), Homarus muscle (4), and marine sponge cells (5)], some of which served as a rich source for their purification. The intracellular function of transglutaminases is now thought to be the posttranslational crosslinking of proteins, triggered perhaps by a substantial increase in free Ca2+ (approaching 0.1 mM), the appearance of an activating metabolite, or the removal of an inhibiting substance (6). Significant amounts of N6-(γ-glutamyl)lysine, the product of the transglutaminase-mediated crosslinking of proteins, were identified in a variety of cellular structures [e.g., in membrane skeletal polymers in human red blood cells (7, 8), in the cornified envelope of human keratinocytes (9), in polymers from human lens cataracts (10), and in apoptotic bodies of degenerated liver cells (11)]. Membrane-bound variants of intracellular transglutaminases have been identified in keratinocytes (12). Secreted forms of transglutaminases participate in the clotting of seminal vesicle secretory proteins of the prostatic fluid of rodents (13-16) and in the clotting of Homarus and Limulus blood (3, 17, 18). One of the subunits (designated A) of the factor XIII zymogen (fibrin-stabilizing factor) circulating in human plasma also belongs to this class of gene products (19, 20).
Transglutaminase was shown to become activated in sea urchin eggs soon after fertilization (21, 22) and in A431 epidermal carcinoma cells exposed to epidermal growth factor (23), and it is expressed in induced murine erythroleukemia cells well before the appearance of hemoglobin (24). This paper deals with the identification, isolation, and sequence analysis of cDNAs encoding chicken red blood cell transglutaminase.* It shows that the mRNA for this protein is present only in very low amounts in embryonic erythroid cells and in retrovirally transformed erythroid progenitor cells but increases significantly during erythroid cell maturation.

MATERIALS AND METHODS

Purification and Peptide Sequencing of Chicken Red Blood Cell Transglutaminase. Erythrocyte transglutaminase was purified from chicken blood (collected in heparin; Pel-Freez Biologicals) to apparent SDS/PAGE homogeneity by David Schilling, using slight modifications of the procedure described for human red cells (25). To obtain partial peptide sequences, purified protein was digested with 100:1 weight ratios of either Staphylococcus aureus V8 protease, endoproteinase Lys-C (Boehringer Mannheim), L-1-tosylamido-2-phenylethyl chloromethyl ketone (TPCK)-treated trypsin (Worthington), or 7-amino-1-chloro-3-tosylamido-2-heptanone ("Nα-p-tosyl-L-lysine chloromethyl ketone," TLCK)-treated α-chymotrypsin (Sigma). Some of the peptides were purified by reverse-phase (C3 column) HPLC. Other fragments were separated by SDS/PAGE (26) and were electroblotted onto poly(vinylidene difluoride) transfer membranes (27) prior to sequencing (Applied Biosystems model 177A; Northwestern University Biotechnology Facility).

Screening of cDNA Libraries. Two types of cDNA libraries were used to isolate the chicken erythrocyte transglutaminase cDNA clones. The BV4 cDNA library in λgt11 was derived from poly(A)+ RNA isolated from a pool of 11-day-old chicken embryos (28), whereas the B21 cDNA library in the λZAPII vector was derived from poly(A)+ RNA isolated from erythroid cells of 13- to 14-day-old B21 chicken embryos (29). Recombinant cDNAs were identified by immunoscreening (30) of the BV4 library with a rabbit antiserum raised against purified chicken erythrocyte transglutaminase. After treatment with dithiothreitol, the protein was subjected to SDS/PAGE, and rabbits were injected with the gel slice corresponding to a band of Mr 78,000 (using Freund's complete adjuvant for initial injection and Freund's incomplete adjuvant for subsequent injections).
The antiserum recognized chicken erythrocyte transglutaminase preferentially in both the native and the denatured form; it crossreacted with human red cell transglutaminase and with guinea pig liver transglutaminase but did not recognize human factor XIII subunit a or human erythrocyte membrane protein 4.2. Positive clones were plaque-purified, and phage DNA was isolated (31). The BV4 library was then screened successively by plaque hybridization using as probes either the EcoRI-Sac I [297 nucleotides (nt)] cDNA fragment of clone 27c (Fig. 1), the 54 nt of DNAsynI (5'-GAATTTGGGGTTGATGTCCAGCATCTCGAGGCAGATGGCCAAGATCTCATCTTC-3') complementary to nt 1034-1087 in the compiled cDNA sequence (Fig. 2), or the EcoRI-Kpn I fragment (374 nt) from clone NW1. The B21 library was screened using the synthetic 21-base oligonucleotide TG1 (5'-GTCTCCAGCACCAGCTCTTCG-3', complementary to nt 484-504 in the cDNA sequence; Fig. 2) and the EcoRI-Kpn I fragment (374 nt) as probes. The positive clones derived from hybridization screening were amplified by PCR using an internal cDNA

The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.

Abbreviations: aa, amino acid(s); nt, nucleotide(s). *The sequence reported in this paper has been deposited in the GenBank data base (accession no. L02270).


[Fig. 1 graphic: restriction map of the Ckrbctg cDNA (sites EcoRI, Kpn I, Xho I, BamHI, Bal I, Sac I; 1-kb scale bar) with clones 27c, D2040, NW1, and NW3 and probes EcoRI-Sac I (297 nt), DNAsynI (54 nt), EcoRI-Kpn I (374 nt), TG1 (21 nt), and TG6 (21 nt).]

[Fig. 2 graphic: nucleotide sequence listing with the deduced amino acid sequence (GenBank accession no. L02270).]

sequence and a primer within the λ vector. Clones that contained the longest 5' inserts were then plaque-purified and further characterized.

DNA Sequencing. All independent cDNA clones were subcloned into pGEM-3Z(+) or pGEM-7Zf(+) (Promega) or pBluescript SK(+) (Stratagene). Plasmid DNA was sequenced (32) using 2'-deoxyadenosine 5'-[α-[35S]thio]triphosphate and Sequenase (United States Biochemical). Occasionally the Taq polymerase sequencing system (Promega) was employed. For three clones (D2040, NW1, and NW3), nucleotide sequence determinations were derived from both DNA strands by using a combination of random deletion subcloning and synthetic oligonucleotide priming. Conveniently located restriction sites (Fig. 1) were used to prepare most subclones. Additional subclones were prepared using exonuclease III (33). Three other clones (27c, 30, and 31a) were sequenced for their full length, but only on one DNA strand. The nucleotide sequences were assembled and analyzed with the IBI/Pustell sequence-analysis software.

Northern Blot Analysis. Samples of poly(A)+ RNA (2 μg) from a variety of chicken tissues and cell lines were denatured, electrophoresed in a 1.3% agarose gel containing 0.2 M formaldehyde (34), and transferred to a nitrocellulose membrane (GeneScreenPlus, DuPont). The blot was hybridized with a 32P random-labeled EcoRI-Sac I cDNA fragment (297 nt) from clone 27c and washed as described (35). The blot was stripped and rehybridized to a mixture of cDNA probes encoding the chicken β-actin and erythrocyte band 3 proteins in order to determine the integrity, size, and relative amounts of the mRNA samples. The band 3 hybridization signal was used to check the degree of erythroid cell contamination of the various nonerythroid chicken tissues.

Primer Extension Analysis. The 21-mer TG6 (5'-GAACTGGTACAGCTTCAGCAC-3'), complementary to nt 72-92 of


FIG. 1. Restriction map and sequencing strategy for the chicken red blood cell transglutaminase (Ckrbctg) cDNA clones. The shaded box represents the coding region, and the unshaded boxes show the 5' and 3' untranslated regions. Originally, clone 27c was isolated by immunological screening (30) of a λgt11 cDNA library prepared from 11-day-old chicken embryos (BV4; ref. 29) using an antiserum raised in rabbits against purified chicken red blood cell transglutaminase. D2040 was isolated by using an α-32P random-labeled EcoRI-Sac I (297 bases) cDNA fragment of 27c as probe; NW1 was isolated using a γ-32P end-labeled oligonucleotide, DNAsynI (54 nt), as probe. NW3 was isolated from a λZAPII cDNA library (B21; ref. 29) prepared from erythroid cells of 13- to 14-day-old chicken embryos by using an α-32P random-labeled EcoRI-Kpn I (374 nt) cDNA fragment of NW1 and a γ-32P end-labeled oligonucleotide, TG1 (21 nt), as probes. Solid arrows indicate the extent and direction of DNA sequencing for each strand of the individual cDNA clones; dotted arrows indicate the relative position and orientation of synthetic oligonucleotides. Restriction sites employed for subcloning and DNA sequencing are indicated. kb, Kilobase.


Proc. Natl. Acad. Sci. USA 89 (1992)

Biochemistry: Weraarchakul-Boonmark et al.

FIG. 2. Nucleotide and deduced amino acid sequence of chicken red blood cell transglutaminase cDNA. Sequences of the overlapping cDNA clones (Fig. 1) were determined by the dideoxynucleotide technique (32). Nucleotide residues are shown in the 5' to 3' orientation beginning at the 5' end of recombinant NW3 (Fig. 1). The sequence reveals a single open reading frame of 2094 nt [698 amino acids (aa)] flanked by 451 nt at the 5' end and by 700 nt at the 3' end. The stop codon is indicated by an open circle. The consensus polyadenylation signal AATAAA is located 26 nt 5' to the poly(A) tail. The pentapeptide sequence containing the active-site Cys is shaded. Matching amino acid sequences of peptides isolated from partial proteolytic digests of purified chicken red blood cell transglutaminase (Materials and Methods) are underlined. A peptide isolated from the endoproteinase Lys-C digest of the protein (PNLHGPEILDVP), and three other peptides obtained in very low yields, did not match the cDNA-derived sequence.
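The segment lengths quoted in this legend can be cross-checked with simple arithmetic. The Python sketch below is illustrative only: the function and variable names are ours, numbering is assumed to be 1-based and inclusive, and the 700 nt at the 3' end is assumed to include the stop codon.

```python
# Segment lengths quoted in the Fig. 2 legend.
FIVE_PRIME_NT = 451    # untranslated nt preceding the initiation codon
ORF_NT = 2094          # open reading frame, stop codon not counted
THREE_PRIME_NT = 700   # assumption: stop codon plus 3' untranslated region


def orf_layout(five_prime, orf, three_prime):
    """Derive 1-based, inclusive coordinates for a cDNA with a single ORF."""
    if orf % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    start = five_prime + 1          # first nt of the initiation codon
    last_coding = five_prime + orf  # last nt of the final sense codon
    return {
        "total_nt": five_prime + orf + three_prime,
        "start_codon": (start, start + 2),
        "stop_codon": (last_coding + 1, last_coding + 3),
        "residues": orf // 3,
    }


layout = orf_layout(FIVE_PRIME_NT, ORF_NT, THREE_PRIME_NT)
for key, value in layout.items():
    print(f"{key}: {value}")
```

Under these assumptions the derived values are a compiled length of 3245 nt, an initiation codon at nt 452-454, a stop codon at nt 2546-2548, and 698 residues, in agreement with the coordinates given in Results and Discussion.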

the chicken transglutaminase cDNA, was 32P-end-labeled by polynucleotide kinase. RNA from MSB-1 lymphoid cells (36) or adult chicken (definitive) reticulocytes was hybridized to this oligonucleotide, and cDNA was synthesized by avian myeloblastosis virus reverse transcriptase (37). To accurately assess the position of the mRNA cap site (the 5' end of the primary transcript), the primer extension products were coelectrophoresed in a DNA sequencing gel directly adjacent to a "ladder" produced by dideoxy sequencing of clone NW3 with the same primer.
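Because each synthetic oligonucleotide is described as complementary to a stated interval of the cDNA, two properties can be checked directly: its length should equal the length of the interval, and its reverse complement should read as the sense strand. The Python sketch below illustrates this. The helper `reverse_complement` and the dictionary layout are ours, and the 21-nt sense segment for nt 72-92 is transcribed from the Fig. 2 listing on the assumption that its first printed row spans nt 18-92.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# Oligonucleotide sequences (5' to 3') and the cDNA intervals they are
# stated to complement (1-based, inclusive).
OLIGOS = {
    "TG6": ("GAACTGGTACAGCTTCAGCAC", 72, 92),
    "TG1": ("GTCTCCAGCACCAGCTCTTCG", 484, 504),
    "DNAsynI": (
        "GAATTTGGGGTTGATGTCCAGCATCTCGAGGCAGATGGCCAAGATCTCATCTTC",
        1034,
        1087,
    ),
}

# Sense strand at nt 72-92 as read from Fig. 2 (assumed row alignment).
SENSE_72_92 = "GTGCTGAAGCTGTACCAGTTC"

for name, (seq, first, last) in OLIGOS.items():
    interval = last - first + 1
    print(f"{name}: {len(seq)} nt, interval {interval} nt, "
          f"sense strand {reverse_complement(seq)}")

tg6_matches = reverse_complement(OLIGOS["TG6"][0]) == SENSE_72_92
print("TG6 reverse complement matches nt 72-92:", tg6_matches)
```

The same length test applies to any primer or probe defined by a coordinate range; a mismatch usually points to a transcription error in either the sequence or the coordinates.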

RESULTS AND DISCUSSION

Isolation of Chicken Transglutaminase cDNA Clones. Immunoscreening with a rabbit antiserum raised against chicken erythrocyte transglutaminase allowed the initial identification of 20 clones in the BV4 cDNA library, and these were plaque-purified. Three clones (27c, 30, and 31a) harboring the largest fusion proteins (on Western blots) were further analyzed and found to have overlapping nucleotide sequences (Fig. 1). Clone 27c contained an insert of 1704 nt, whereas clones 30 and 31a contained overlapping inserts of 270 and 250 nt (data not shown). The 1704 nt of 27c represented a single open reading frame followed by a stop codon and an untranslated sequence of 678 nt at the 3' end. This clone was confirmed to be an authentic cDNA segment encoding chicken erythroid transglutaminase by matching to peptide sequences derived from a partial digest of the purified protein.

A DNA fragment (297 nt, EcoRI-Sac I; Fig. 1) corresponding to the 5' end of clone 27c was then used to rescreen the same library, resulting in the identification of four more positive clones. One of these (D2040) contained 3446 nt with an open reading frame from nt 1219 to nt 2746 (Fig. 1). This clone encoded 509 aa, including the conserved pentapeptide around the active-site Cys (38) for transglutaminase (aa 283-287 in Fig. 2) as well as all the 3' sequence encoded within 27c. Eight peptide fragments from the purified enzyme [aa 189-199 (chymotrypsin digest), 250-270 (trypsin digest), 391-409 (S. aureus V8 protease digest), 414-424 (chymotrypsin digest), 438-456 (endoproteinase Lys-C digest), 465-480 (chymotrypsin digest), and 472-491 and 658-669 (endoproteinase Lys-C digest) as indicated in Fig. 2] matched segments of the conceptually translated nucleotide sequence. A sequence of 1218 nt at the 5' end of this clone (Fig. 1; dashed line on the map) was subsequently found to represent an unspliced intron of 647 nt, as well as additional 5' cDNA coding sequence of 571 nt.
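The extent to which the eight matched peptides cover the deduced protein can be tallied by merging overlapping residue ranges. The Python sketch below is a bookkeeping illustration only. The helper `merge_intervals` is ours, the third chymotryptic range is read as aa 465-480, and the protein length of 698 aa is taken from the Fig. 2 legend.

```python
PROTEIN_LENGTH = 698  # deduced residues (Fig. 2 legend)

# (first residue, last residue, protease) for the eight matched peptides.
PEPTIDES = [
    (189, 199, "chymotrypsin"),
    (250, 270, "trypsin"),
    (391, 409, "S. aureus V8 protease"),
    (414, 424, "chymotrypsin"),
    (438, 456, "endoproteinase Lys-C"),
    (465, 480, "chymotrypsin"),
    (472, 491, "endoproteinase Lys-C"),
    (658, 669, "endoproteinase Lys-C"),
]


def merge_intervals(intervals):
    """Merge overlapping or directly adjacent inclusive integer intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(pair) for pair in merged]


merged = merge_intervals((first, last) for first, last, _ in PEPTIDES)
covered = sum(last - first + 1 for first, last in merged)
fraction = covered / PROTEIN_LENGTH

print("merged ranges:", merged)
print(f"residues covered: {covered} of {PROTEIN_LENGTH} ({fraction:.1%})")
```

Only the chymotryptic peptide at aa 465-480 and the Lys-C peptide at aa 472-491 overlap, so the eight peptides collapse to seven distinct stretches of the deduced sequence.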
After screening of the same library with a 54-base oligonucleotide (DNAsynI, Fig. 1), two additional clones were obtained, of which the longer (NW1) was partially sequenced (a total of 1250 nt; of these, 1244 nt matched the 5' end of D2040, corresponding to aa 13-426 of Fig. 2). Unlike D2040, the recombinant NW1 contained no intron, but in conceptual translation NW1 was found to be only 6 nt longer than D2040. An EcoRI-Kpn I fragment (374 nt) of clone NW1 and oligonucleotide TG1 (see Materials and Methods) were next used as probes to screen a different (B21) cDNA library (29). Sixty independent recombinants were isolated, and 10 of these were further characterized by PCR. Three clones were found to contain longer inserts than NW1; the longest one, NW3 (484 nt longer than NW1), was subcloned, and the 5' end was sequenced (a total of 856 nt; of these, 372 nt matched the 5' end of NW1, corresponding to aa 11-134 of Fig. 2). NW3 contained the Met initiation codon preceded by an untranslated region of 451 nt.

Nucleotide Sequence of cDNA and the Deduced Amino Acid Sequence. The nucleotide sequence of the cDNAs coding for chicken erythrocyte transglutaminase was constructed from overlapping sequences of cDNA clones D2040, NW1, and NW3 (Fig. 2). The compiled sequence contained 3245 nt with a single open reading frame beginning with an ATG initiation codon (nt 452-454) and ending with a TGA stop codon (nt 2546-2548), predicting a coding sequence of 698 aa. Analysis of NW1 and NW3 provided further results matching one peptide sequence (aa 17-30) isolated from the V8 protease digests of purified chicken erythrocyte transglutaminase. Two ATG triplets were found, at nt 452-454 and 479-481. According to Kozak's rule (39), either of these could serve as an initiation codon; the latter exactly fits the consensus (ACCATGG) as the translation initiating site, while the former is divergent (AATATGG). However, the location of a purine residue at position -2 has been shown to have a
