GENOMICS 6, 159-167 (1990)

Chromosomal Organization and Localization of the Human Urokinase Inhibitor Gene: Perfect Structural Conservation with Ovalbumin

JULIE A. SAMIA,* SUSAN J. ALEXANDER,*,¹ KRISTIN W. HORTON,* PHILIP E. AURON,† MARY G. BYERS,‡ THOMAS B. SHOWS,‡ AND ANDREW C. WEBB*,²

*Department of Biological Sciences, Wellesley College, Wellesley, Massachusetts 02181; †Department of Medicine, Lovett Group, Massachusetts General Hospital, Charlestown, Massachusetts 02129; and ‡Department of Human Genetics, Roswell Park Memorial Institute, Buffalo, New York 14263

Received June 19, 1989; revised September 8, 1989

Plasminogen activator inhibitor 2 (PAI-2) plays an essential role in the regulation of localized extracellular proteolysis by its inactivation of urokinase. Using probes derived from a cDNA we isolated from lipopolysaccharide (LPS)-stimulated human peripheral blood monocytes, we have mapped, isolated, and determined the molecular organization of the gene for PAI-2 (PLANH2).³ In situ hybridization of the cDNA to normal metaphase chromosomes has confirmed our prior assignment of the gene for PAI-2 to chromosome 18 and further localized it to the long arm at 18q21.2-18q22. We have isolated nine independent genomic clones, two of which were found to contain the entire PAI-2 transcriptional unit of approximately 16.4 kilobase pairs (kbp). Analysis of the gene organization by restriction enzyme mapping, Southern blotting, and DNA sequencing revealed that the cDNA sequence is divided among eight exons interrupted by seven introns, the junctions of which all conform to the “GT-AG” consensus rule. In common with the arrangement found throughout the serpin superfamily, of which PAI-2 is a member, the first intron is located just 5’ to the initiator methionine residue, and the 3’ untranslated region (UTR) is not interrupted by a splice junction. Determination of the transcription initiation site by primer extension analysis of monocytic mRNA indicated that our PAI-2 cDNA was, at most, only three nucleotides short of full length, yielding a primary PAI-2 transcript with a 66-bp first exon. A promoter “TATAAA box” is located 30 bp upstream of the “cap” site. The intron-exon arrangement of the gene for PAI-2 was found to be identical to that of the chicken ovalbumin and Y genes and distinct from that of other members of the serpin superfamily. This remarkable similarity suggests a common ancestral gene and conservation of structural organization for over 260 million years of evolution. © 1990 Academic Press, Inc.

Sequence data from this article have been deposited with the EMBL/GenBank Data Libraries under Accession No. J04752.
¹ Present address: Boston University School of Medicine, 80 East Concord Street, Boston, MA 02118.
² To whom correspondence should be addressed.
³ The gene symbols PLANH1 and PLANH2 have been assigned for PAI-1 and PAI-2, respectively, at the Human Gene Mapping Workshop 10, New Haven, CT, USA, June 11-17, 1989.

0888-7543/90 $3.00 Copyright © 1990 by Academic Press, Inc. All rights of reproduction in any form reserved.

INTRODUCTION

Two types of plasminogen activators (PA), namely, tissue-type (tPA) and urokinase (uPA), encoded by separate genes, have been identified in mammals. The PAs are potent arginine-specific serine proteases capable of generating large amounts of proteolysis at localized sites. Their only known natural substrate is the plasma zymogen plasminogen, itself a precursor of the powerful endopeptidase plasmin. tPA has a high affinity for fibrin and is intimately involved with the fibrinolytic cascade. Urokinase, on the other hand, has been primarily associated with extracellular proteolytic events that mediate tissue reorganization and morphogenesis, inflammation, cellular migration, and infiltration (both metastatic and nonneoplastic). The absence of congenital defects in this enzyme system argues strongly for its essential roles in tissue and vascular biology (reviewed in Danø et al., 1985; Travis and Salvesen, 1983).

The regulation of PA activity is a multilevel process from biosynthesis and secretion to binding with plasma membrane receptors and modulation by specific, fast-acting inhibitors (PAI). These serine protease inhibitors are part of a large gene superfamily called the serpins (Carrell and Travis, 1985; Hunt and Dayhoff, 1980), which includes such diverse members as chicken ovalbumin and the barley proZ gene with unknown functions, as well as the collection of proteolytic antagonists integral to the complex regulation of mammalian homeostasis (e.g., angiotensinogen, antithrombin III, C1 inhibitor).

Three forms of PAI exist in mammals. Type-1 (PAI-1), derived mainly from endothelial cells, inhibits tPA more strongly than uPA (Erickson et al., 1985; Kruithof et al., 1986), whereas type-2 (PAI-2), expressed predominantly by mononuclear cells, is more relevant to uPA than to tPA control (Chapman et al., 1982; Vassalli et al., 1984; Wohlwend et al., 1987). The third, protease nexin (PN), binds preferentially to thrombin and has only a secondary action on PA (Scott et al., 1985). PAI-2 is a major product of monocytes and macrophages following stimulation with phorbol esters, endotoxin, steroids, and an array of growth factors (e.g., IL-1, TNF, TGF). Over 95% of the ~50-kDa polypeptide remains nonglycosylated and intracellular with no known function. The remainder of the PAI-2 is the extracellular, glycosylated form that rapidly inactivates non-receptor-bound uPA (reviewed in Blasi et al., 1987).

Several groups, including ours, have reported the molecular cloning of a PAI-2 cDNA (Webb et al., 1987; Ye et al., 1987; Schleuning et al., 1987; Antalis et al., 1988). The gene for PAI-1 (PLANH1) has been found to be encoded within nine exons on a 12.2-kbp segment of the long arm of human chromosome 7 (Klinger et al., 1987; Loskutoff et al., 1987; Bosma et al., 1988). Here we report the localization of the human gene for PAI-2 to the long arm of chromosome 18, together with its complete molecular structure. The PAI-2 transcriptional unit is approximately 16.4 kbp in length and is split by seven introns into eight exons, all but one of which lie within the first two-thirds of the coding region. However, the precise location of introns in the gene for PAI-2 bears little resemblance to that of any other serpins for which a gene arrangement has been determined, with the notable exception of the chicken ovalbumin and Y genes.
The placement of intron-exon boundaries in the human PAI-2 and these bird genes was found to be highly conserved. With the exception of minor differences in the position of the transcriptional initiation site and the approximate sizes of some of the introns, the data presented here for the genomic localization and arrangement of human PAI-2 are essentially identical to those reported recently by Ye et al. (1989).

MATERIALS AND METHODS

Chromosomal Localization

The 1256-bp PstI-DraI fragment from the plasmid pcD-1214 harboring a full-length PAI-2 cDNA (Webb et al., 1987) was labeled by nick-translation in the presence of ³H-labeled nucleotide triphosphates and hybridized to cytological preparations of human metaphase chromosomes, which were subsequently prepared for autoradiography as previously described (Webb et al., 1986; Zabel et al., 1983). Silver grains were scored over 100 independent cells and a total of 151 were assigned to chromosomal bands.

Screening Genomic Libraries

Two separate human genomic bacteriophage libraries (one a leukocyte library cloned into EMBL3 from Clontech Labs. Inc., Palo Alto, CA, Cat. No. HL1006d, Lot No. 1087; the other of lung fibroblast origin (WI-38) in the λ-Fix vector obtained from Stratagene Cloning Systems, La Jolla, CA) were screened for the PAI-2 gene using established techniques (Davis et al., 1986; Maniatis et al., 1982). In excess of 1 × 10⁶ recombinant phage from each library were probed with either pcD-1214 restriction fragments or oligonucleotides (22- to 35-mers) synthesized to cDNA sequences on a MilliGen 7500 DNA synthesizer (MilliGen/Biosearch, Burlington, MA) and labeled with ³²P (Amersham Corp., Arlington Heights, IL) using random-priming or 5’-end-labeling kits (Boehringer Mannheim Biochemicals, Indianapolis, IN), respectively.

Phage DNA Mapping and Sequencing

Conventional cesium chloride banding of phage particles was used to prepare DNA from the EMBL/PAI2-1 clone (Davis et al., 1986), but for purified λ-Fix clones, the more rapid and efficient ion-exchange method described by Manfioletti and Schneider (1988) was adopted, except that we utilized DEAE-Sephacel (Sigma Chemical Co., St. Louis, MO) in preference to the DEAE- or TEAE-cellulose of the original protocol. Recombinant phage DNA was mapped using the LambdaMap method (Promega, Madison, WI) on partial restriction enzyme digests following pulse-field electrophoresis in 1% vertical agarose gels, as suggested by Hoefer Scientific Instruments (San Francisco, CA). The approximate lengths of introns were determined by hybridization of oligonucleotide probes to Southern blots of restriction fragments generated by cutting at sites in flanking exons.
Restriction fragments were subcloned into either M13 or pBluescript (Stratagene) vectors for sequencing by the dideoxynucleotide chain termination method (Sanger et al., 1977) using [³⁵S]dATP (Amersham) as tracer in either the Sequenase (US Biochemical Corp., Cleveland, OH) or T7 DNA polymerase (Pharmacia-LKB Biotechnology Inc., Piscataway, NJ) kits, as directed by the manufacturers. Plasmid DNA from pBluescript subclones to be sequenced directly, with either commercial primers flanking the inserts or internal synthetic oligonucleotides (both sense and antisense), was prepared according to Kraft et al. (1988). Terminal deletions of the pBluescript inserts were obtained by the ExoIII/mung bean nuclease method using the reagents and protocol supplied by Stratagene. DNA sequence was assembled and analyzed with the PC/Gene software package from IntelliGenetics, Inc. (Mountain View, CA).

Primer Extension

A synthetic 26-mer (5’-GTGTGTTTGCCACACAAAGATCCTCC) was 5’-end-labeled with [γ-³²P]ATP to a specific activity of approximately 4 × 10⁶ dpm/pmol. The procedures for annealing this primer to the RNA and reverse transcription were performed according to the method of Calzone et al. (1987). Poly(A)+ RNA (2 µg) from either unstimulated or LPS-stimulated peripheral blood monocytes or toxic-shock toxin-1-stimulated U-937 cells was hybridized to 0.5 pmol of labeled primer and extended with either 20 U AMV (Pharmacia-LKB) or 200 U MMLV (Bethesda Research Labs, Life Technologies, Inc., Gaithersburg, MD) reverse transcriptase at 37°C. M13 subclones of PAI-2 cDNA were primed with the same end-labeled 26-mer for conventional dideoxy sequencing, except that unlabeled dATP was substituted for radiolabeled tracer in the reactions (Pharmacia-LKB). These M13 sequencing reactions were run on conventional polyacrylamide-urea sequencing gels alongside the primer extension products as length reference markers.

RESULTS AND DISCUSSION

Chromosomal Localization of the Gene for PAI-2 (PLANH2)

We have already assigned this gene to human chromosome 18 by hybridization of our monocyte-derived PAI-2 cDNA to Southern blots of genomic DNA isolated from a panel of human-mouse somatic cell hybrids (Webb et al., 1987). As a prelude to isolation and molecular characterization, the regional location of PLANH2 on chromosome 18 was determined by in situ hybridization of the same PAI-2 cDNA probe to metaphase chromosomes. A total of 100 cells were counted and the chromosomal distribution of 151 silver grains (an average of 1.5 grains/metaphase) was scored (Fig. 1a). These data support our prior assignment to human chromosome 18 and locate the PAI-2 gene to the distal portion of the long arm. Forty-three grains (28.5% of total) were found over chromosome 18, and 32 of these (74.4%) formed a peak over bands 18q21.2-q22. Of the metaphases examined, 31% showed grains over this locus. These data agree very well with a recent, independent assignment of the gene for PAI-2 to essentially the same locus (18q21-23) on flow-sorted human chromosomes (Ye et al., 1989).

Isolation of PAI-2 Genomic Clones

Initial screening of the EMBL3 library with PAI-2 cDNA fragment probes yielded three truncated clones, which were missing the first two exons of the gene. The longest of these clones (EMBL/PAI2-1) contained an insert of ~16 kbp in length, but terminated just upstream of the intron II/exon 3 boundary. The bulk of the pcD-1214 cDNA sequence (nucleotides 490-1900) was located within a 7.3-kbp BamHI fragment (KS + 100) as exons 5-8, and another subclone containing the terminal SalI-EcoRI 2.3-kbp fragment (ES-1) carried only the 120-bp exon 3 (pcD-1214 nucleotides 241-360) flanked by extensive intronic regions (see Fig. 2).

In order to obtain the remaining 240 nucleotides of 5’ PAI-2 mRNA sequence and additional upstream DNA, a synthetic oligonucleotide probe (ES45) was synthesized using the intron II sequence beginning 45 bp back from the terminal MboI site of the EMBL3/PAI2-1 clone. Rescreening of the EMBL3 library with the ES45 probe revealed no further PAI-2 clones, but a primary screening of the λ-Fix library yielded 11 positives. Subsequent plaque purification produced nine independent λ-Fix clones that were ES45 positive and contained inserts from 10 to 18 kbp in length. Hybridization with a series of oligonucleotide probes made to sequences along the length of the PAI-2 cDNA (A-G; see Fig. 2) indicated that seven of the nine ES45-positive clones carried sequence complementary to the 5’ UTR of the PAI-2 cDNA, since they hybridized to oligonucleotide probe A (nucleotides 1-30 of pcD-1214). Furthermore, two of the λ-Fix phage (17-2 and 18) appeared to contain “full-length” PAI-2 genomic clones by virtue of their hybridization to our most 3’ oligonucleotide probe G (pcD-1214 residues 1739-1776). Clone 18 was selected for detailed characterization of the complete PAI-2 gene organization.

Structure and Organization of the Gene for PAI-2

The human PAI-2 transcriptional unit is contained within an approximately 16.4-kbp segment of the long arm of chromosome 18 and is divided into eight exons by seven intervening sequences (Fig. 2). This organization of the gene was determined by a combination of restriction enzyme mapping, Southern blotting, and direct DNA sequencing, followed by comparison with the PAI-2 cDNA sequence we reported previously (Webb et al., 1987). Table 1 is a compilation of these data for intron-exon sizes and their boundary sequences. With respect to the overall length of the transcriptional unit and the number, placement, and sizes of the exons, these data are in almost perfect agreement with the other recently published report for the complete structure of the gene for PAI-2 (Ye et al., 1989). The two sets of data differ only in their estimates of the sizes of some introns, presumably as a result of the error implicit in measuring restriction fragment lengths on gels and blots. However, this does not explain the additional kilobase of sequence we find in intron VI compared to the estimate of Ye and associates (1989) for this intron (intron F by their designation), which may be symptomatic of gene polymorphism or simply the product of a cloning artifact.

FIG. 1. Chromosomal localization of the human gene for PAI-2 (PLANH2) to the long arm of chromosome 18 by in situ hybridization. (a) Histogram showing the distribution of 151 silver grains associated with 100 human metaphases hybridized with the ³H-labeled PstI-DraI fragment PAI-2 cDNA probe. The abscissa represents the human chromosomes displayed with their sizes to scale; the ordinate shows the frequency of silver grains. (b) Idiogram showing the distribution of silver grains along chromosome 18. Forty-three of the 151 silver grains (28.5%) were scored over chromosome 18, and of these 74.4% (32/43) were localized over bands 18q21.2-q22.

It can be seen from the data in Table 1 that all the intron-exon boundaries in the gene for PAI-2 conform to the canonical “GT-AG” rule (Breathnach and Chambon, 1981) and also show a high degree of homology to the consensus sequence adjacent to splice donor and acceptor sites (Mount, 1982; Padgett et al., 1986). Of the six splice junctions in the coding region of this gene, five occur between codons (class 0) and only intron V is a class 1 site, which interrupts amino acid residue 179 (glycine) between the first and second nucleotides of the codon (Sharp, 1981). With the exception of ATIII, which has intron I within its signal peptide, all serpin genes isolated to date have their first intron positioned in the 5’ UTR close to the initiator methionine residue (Prochownik et al., 1985). PAI-2 adheres to this pattern and also to another common feature of serpin gene organization, namely, that the 3’ UTR is uninterrupted by splice junctions (Breathnach et al., 1978; Heilig et al., 1982; Leicht et al., 1982; Tanaka et al., 1984; Prochownik et al., 1985; Loskutoff et al., 1987). In addition to the entire 3’ UTR, exon 8 contains the reactive center arginine residue (380) and almost a third of the polypeptide coding region (amino acid residues 282-415).

The only discrepancy between our cDNA and genomic sequences was found in the 3’ UTR in exon 8, where an additional T residue between nucleotides 1745 and 1746 on the sense strand of the pcD-1214 sequence was detected in clones from both the genomic libraries used in this study. This change confirms the sequence reported by other groups for PAI-2 cDNAs (Ye et al., 1987; Schleuning et al., 1987; Antalis et al., 1988) and results in the loss of the unique KpnI site we reported in our cDNA clone (Webb et al., 1987). Reevaluation of our cDNA sequence data from both strands in this region confirms the original observation of a KpnI site; the difference may therefore represent either a reverse transcription error in the pcD-1214 clone or bona fide sequence polymorphism.
In all other regards, our published monocyte PAI-2 cDNA sequence (Webb et al., 1987) was confirmed by the genomic exon sequences reported here and those reported by Ye et al. (1989). This is particularly noteworthy since of the four PAI-2 cDNA sequences published to date, the one characterized by Antalis et al. (1988) from U-937 cells was identical in the coding region to our monocyte sequence, but differed by four codons (one silent change) from the identical sequences reported by Ye et al. (1987, 1989) and Schleuning et al. (1987) from placental and U-937-derived cDNAs, respectively. This apparent polymorphism reflected in the PAI-2 mRNA from various sources may be indicative of allelic forms of the gene, but unfortunately none of these differences lie within a palindromic sequence that would be detectable in genomic blots as a restriction fragment length polymorphism (RFLP).

TABLE 1. Intron-Exon Boundary Sequences of Human Gene for PAI-2. Columns: Intron; 5’ Boundary; Intron length; 3’ Boundary. [Table body not legible in this copy.]
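The GT-AG rule and the intron classes discussed for Table 1 lend themselves to a small mechanical check. The following is a minimal Python sketch under stated assumptions: the function names and the example boundary strings are hypothetical illustrations and are not entries from Table 1; only the codon positions (intron V within codon 179, exon 8 beginning at residue 282) are taken from the text.

```python
def conforms_to_gt_ag(intron_start: str, intron_end: str) -> bool:
    """Return True if an intron begins with GT and ends with AG (case-insensitive)."""
    return intron_start.lower().startswith("gt") and intron_end.lower().endswith("ag")


def intron_class(coding_nt_before_intron: int) -> int:
    """Intron class from the number of coding nucleotides preceding the intron.

    0 = intron lies between codons,
    1 = intron follows the first nucleotide of a codon,
    2 = intron follows the second nucleotide of a codon.
    """
    return coding_nt_before_intron % 3


# Hypothetical boundary strings for illustration only (not Table 1 entries).
example_ok = conforms_to_gt_ag("gtaagt", "ttttcag")
example_bad = conforms_to_gt_ag("gcaagt", "ttttcag")

# Intron V splits codon 179 (glycine) after its first nucleotide:
# 178 complete codons plus one nucleotide precede it.
intron_v_class = intron_class(178 * 3 + 1)

# Exon 8 begins at residue 282, so intron VII follows 281 complete codons.
intron_vii_class = intron_class(281 * 3)

print(example_ok, example_bad, intron_v_class, intron_vii_class)
```

Under these assumptions the sketch reports class 1 for intron V and class 0 for intron VII, in line with the classification given in the text.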

FIG. 2. Map showing the genomic organization of the human PAI-2 locus. The positions of the eight exons (1-8) are represented by boxes (solid = coding region; open = 5’ and 3’ UTR) and the seven introns (I-VII) by the horizontal line connecting them. The precise lengths of the exons and approximate lengths of the introns are presented in Table 1. The locations of restriction enzyme sites used for subcloning and mapping are indicated by vertical lines (B, BamHI; E, EcoRI), and oligonucleotides (oligos) used in Southern blots and as sequencing primers are indicated by vertical arrows (A-G and ES45). The extent of the two phage clones characterized (EMBL/PAI2-1 and λ-Fix 18) and their relevant subclones are shown as horizontal lines below the map.

However, another difference between our PAI-2 genomic sequence and that reported by Ye et al. (1989) should be detectable as an RFLP, namely, the presence or absence of an EcoRI site within intron IV. We have sequenced completely through this 507-bp intron on both DNA strands and have confirmed the presence of an EcoRI recognition sequence 75 bp 3’ to the BamHI site in intron IV. One further difference in the PAI-2 cDNAs reported previously was the additional 581 bp of 3’ UTR found in the Antalis et al. (1988) clone that was not present in any of the other three published sequences. Examination of our sequence flanking the 3’ end of the PAI-2 gene showed no homology with this extended 3’ UTR.

Mapping of the Transcriptional Initiation Site

Of the four PAI-2 cDNA sequences published, our clone pcD-1214 had the longest 5’ UTR by 17-48 nucleotides (Webb et al., 1987; Ye et al., 1987; Schleuning et al., 1987; Antalis et al., 1988). Consequently, the antisense oligonucleotide A2 (nucleotides 75-100 of pcD-1214) was annealed to monocytic mRNA and used in a primer extension reaction to definitively identify the 5’ end of the gene. Figure 3a illustrates that in both stimulated peripheral blood monocytes and U-937 cells, transcription initiates most frequently at the G residue 3 nucleotides beyond the 5’ end of our reported PAI-2 cDNA sequence (Webb et al., 1987). Minor bands seen below the primary extension product in this analysis (103 nucleotides from the 5’ end of the primer) indicate either secondary cap sites for PAI-2 located 1 or 2 nucleotides downstream from this position or incomplete extension due to 5’ capping of the mRNA. Figure 3b shows the nucleotide sequence surrounding the PAI-2 transcriptional start site derived from λ-Fix clone 18, which ends with the Sau3AI restriction site at -83. It can be seen that the PAI-2 gene has a conventional TATAAA box (Breathnach and Chambon, 1981) located at -25 to -30 nucleotides 5’ to the G at +1 and a 66-bp first exon which encompasses the bulk of the 5’ UTR. These data are only slightly at variance with those recently published by others using PMA-stimulated U-937 mRNA. Kruithof and Cousin (1988) suggested that PAI-2 transcription initiates at the T residue just 3’ to this G, yielding a 65-bp exon 1, whereas Ye et al. (1989), using both primer extension and S1 protection assays, identified the primary cap site at the 5’ C residue (nucleotide position -2 in our analysis, making exon 1 68 bp in length).
It should be noted that by far the majority of eukaryotic RNA polymerase II transcripts have been found to initiate at a purine flanked by pyrimidines, rather than the reverse (Breathnach and Chambon, 1981), and that our data from both stimulated monocytes and U-937 cells are identical (Fig. 3a).
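The three proposed cap sites differ only in coordinate bookkeeping, which is easy to get wrong because the +1/-1 numbering has no position 0. The following is a minimal Python sketch, not part of the original analysis; the function and parameter names are hypothetical, and the inputs (a 66-bp exon 1 when transcription starts at the G at +1, a primer whose 5’ end pairs with cDNA nucleotide 100, and a cDNA three nucleotides short of full length) are taken from the text.

```python
def exon1_length(start_pos: int, length_from_plus1: int = 66) -> int:
    """Length of exon 1 if transcription starts at start_pos.

    Positions follow the usual convention: +1 is the reference start site,
    -1 is the nucleotide immediately upstream, and there is no position 0.
    """
    if start_pos == 0:
        raise ValueError("position 0 does not exist in +1/-1 numbering")
    # Number of nucleotides by which the start lies downstream of +1
    # (negative when the start lies upstream).
    offset = start_pos - 1 if start_pos > 0 else start_pos
    return length_from_plus1 - offset


def extension_product_length(primer_5prime_cdna_pos: int, missing_5prime_nt: int) -> int:
    """Length of a primer extension product, measured from the primer's 5' end.

    primer_5prime_cdna_pos: cDNA nucleotide paired with the primer's 5' end.
    missing_5prime_nt: nucleotides of mRNA upstream of the first cDNA nucleotide.
    """
    return primer_5prime_cdna_pos + missing_5prime_nt


g_start = exon1_length(+1)   # G at +1 (this study)
t_start = exon1_length(+2)   # T just 3' to the G (Kruithof and Cousin, 1988)
c_start = exon1_length(-2)   # C at -2 (Ye et al., 1989)
product = extension_product_length(100, 3)

print(g_start, t_start, c_start, product)
```

With these inputs the sketch reproduces the exon 1 lengths of 66, 65, and 68 bp quoted above, and the 103-nucleotide primary extension product.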

[Fig. 3b: nucleotide sequence surrounding the PAI-2 transcriptional start site, from the Sau3AI site at -83 through exon 1 (+1) into intron I (~3.5 kb); sequence not legible in this copy.]

FIG. 3. Mapping of PAI-2 transcriptional initiation site. (a) The synthetic oligonucleotide complementary to nucleotides 75-100 of the pcD-1214 plasmid sequence (A2) was 5’-labeled with polynucleotide kinase in the presence of [γ-³²P]ATP and then annealed to various poly(A)+ RNAs (Lane 1, primer alone; Lane 2, rat renal cell carcinoma; Lanes 3 and 4, LPS-stimulated human peripheral blood monocyte; Lanes 5 and 6, unstimulated human peripheral blood monocyte; Lanes 7 and 8, toxic shock toxin-1-stimulated U-937 cells) and extended with either MMLV (lanes 3, 5, and 7) or AMV (lanes 2, 4, 6, and 8) reverse transcriptase as indicated under Materials and Methods. The lower intensity of extension products seen in Lane 8 was due to the addition of suboptimal amounts of enzyme (
