Structural and Functional Analysis of the Insulin Receptor Promoter

Catherine McKeon, Victoria Moncada*, Thang Pham†, Paola Salvatore‡, Takashi Kadowaki, Domenico Accili§, and Simeon I. Taylor

Diabetes Branch, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland 20892

The insulin receptor plays a critical role in the maintenance of glucose homeostasis. Regulation of this key function must be under stringent controls. In order to study the regulation of insulin receptor gene expression, we have cloned, sequenced and characterized its promoter. The first exon of the insulin receptor gene is embedded in an unusual segment of DNA composed of Alu repeats. The promoter has the characteristics typical of a housekeeping gene. It is GC-rich and has multiple start sites of transcription. A 574 base pair fragment immediately upstream of the translation initiation site contains promoter activity when transfected into eukaryotic cell lines. Deletion analysis was performed to study promoter function. These studies showed that only 150 base pairs of promoter sequence were necessary for promoter function. This region contains three potential binding sites for the transcription factor Sp1 and a TC box sequence. Furthermore, the fragment functions equally well in either orientation. We have defined an element in this region with enhancer function for both its homologous and a heterologous promoter. In addition, this region seems to contribute some degree of tissue specificity to insulin receptor gene expression. (Molecular Endocrinology 4: 647-656, 1990)

INTRODUCTION

The insulin receptor is an integral membrane protein whose role is to bind insulin and subsequently transmit a signal across the plasma membrane. This protein is synthesized as a single polypeptide precursor that is proteolytically cleaved into an α- and β-subunit (1-3). Since the receptor is synthesized in a single unit, control of the receptor must lie at least in part in the regulation of transcription of the mRNA. Recently, the cDNA for the insulin receptor has been isolated which has allowed molecular investigation of transcriptional regulation (2, 3). In at least one instance, we and others have shown that the number of receptors is transcriptionally regulated. When the insulin receptor is induced by glucocorticoids, the 3-fold stimulation is the result of a comparable increase in transcription of insulin receptor mRNA (4, 5). Recently, mutations in the insulin receptor gene have been shown to cause two clinical syndromes of extreme insulin resistance (6, 7). Although the primary defect in non-insulin dependent diabetes remains unclear, there appears to be a decrease in the number of receptors on the cell surface (8). This defect could in principle be caused by impaired regulation of the insulin receptor gene.

In order to identify other factors that may regulate insulin receptor gene transcription, we have cloned 40 kilobases (kb) of genomic DNA encompassing exons 1 and 2. This DNA contains many repetitive sequences homologous to the Alu repeat family that surround the promoter region. We have demonstrated that the 574 base pair (bp) fragment immediately upstream of the translation initiation site contains promoter activity. In fact, this promoter is surprisingly strong when compared to the Simian virus 40 (SV40) early promoter. By deletion analysis, we have shown that promoter activity requires the presence of a 150 bp region that contains three putative binding sites for the transcription factor Sp1 and an element with enhancer function.

RESULTS

Isolation and Characterization of Insulin Receptor Genomic Clones

We have isolated six overlapping phage clones spanning 40 kb of genomic DNA encompassing exons 1 and 2 of the human insulin receptor gene. These clones also contain 7 kb of 5'-flanking region, 17 kb of intron A, and 15 kb of intron B (Fig. 1). Because there were repetitive sequences in intron A, we were not able to isolate clones that span the gap in this intron by chromosome walking. The exon/intron junctions were sequenced and were aligned following the GT/AG rule (9)

0888-8809/90/0647-0656S02.00/0 Molecular Endocrinology Copyright © 1990 by The Endocrine Society


The Endocrine Society. Downloaded from press.endocrine.org by [${individualUser.displayName}] on 01 August 2016. at 10:04 For personal use only. No other uses without permission. . All rights reserved.

Vol 4 No. 4

MOL ENDO-1990 648

Fig. 1. Exon/Intron Structure of the 5'-End of the Insulin Receptor Gene
This schematic diagrams the six overlapping phage clones covering exons 1 and 2. The sequences of the exon/intron splice junctions are shown. A partial restriction map is provided. HindIII (circles) and SacI (arrows) are shown on the upper map and EcoRI (boxes) is shown on the lower map. Fragments that hybridized to an Alu sequence probe are denoted with hatched boxes. Alu sequences that were identified by sequencing are shown with the orientation denoted by L (left arm) and R (right arm).
[Figure: restriction map with clones labeled λHIR205, λHIR26, λHIR31, and λHIR83; scale bar, 1 kb. Splice junctions shown: end of exon 1, GAG Ggtgagt (Glu 6, Val 7); start of exon 2, ttgtagTG (Val 7); end of exon 2, AAA Ggtacgc (Lys 190, Val 191).]

(Fig. 1). Both exons 1 and 2 terminate after the first G of the codon for valine at amino acids number 7 and 191, respectively.

Repetitive Sequences

The presence of repetitive sequences both upstream and downstream of exon 1 made the isolation of this DNA region particularly difficult. Therefore we sought to characterize these sequences. Southern blot analysis of the phage clones under normal stringency revealed that there were bands with homology to Alu sequences. These sequences were mapped using several restriction enzymes and were shown to be located throughout the region more than 1700 bp upstream and more than 1400 bp downstream of the translation initiation site. The downstream Alu hybridizing region extended through clone λHIR44 and was again found in λHIR26 ending 800 bp upstream of exon 2 (Fig. 1). Sequencing the unique/repetitive junctions showed the presence of complete Alu sequences with approximately 86% homology to the consensus sequence both upstream and downstream (10) (Fig. 1). Additional Alu sequences have been identified by sequencing and their location is denoted in Fig. 1. Other Alu sequences were detected by hybridization in regions that have not been sequenced. The Alu sequences bracket a unique region of DNA which is approximately 3 kb in length. This region is likely to contain most of the regulatory sequences for the insulin receptor promoter.

Start Sites of Transcription

There has been considerable disagreement in the published literature over the location and number of transcriptional start sites for this gene (11-14). In order to construct an expression vector to study this promoter, it was important to confirm the location of the start sites. We have used the technique of primer extension to determine the end of the insulin receptor mRNA isolated from an Epstein-Barr virus-transformed lymphocyte cell line. The result of this experiment using a 23 base antisense oligonucleotide primer homologous to the sequence between -32 and -10 is shown in Fig. 2. There are two extended products at nucleotide positions -249 and -236 and a cluster of eight bands between -537 and -388. We have previously isolated a cDNA clone beginning at -536 which is consistent with initiation at the most 5'-start site (15). These results are in close agreement with the start sites determined by Mamula et al. (12). The two smaller bands terminate in a region where there is an inverted repeat (Fig. 3). These may represent sites where reverse transcriptase pauses due to secondary structure in the mRNA rather than genuine start sites. In fact, a recent publication by Tewari et al. (14) studied this question using S1 analysis and only confirmed start sites upstream of -387.

Fig. 2. Mapping the 5'-End of the mRNA by Primer Extension
A shows the products of a primer extension reaction in the lane labeled PE and the sequencing reaction initiating from the same primer. The primer contains the antisense sequence from -32 to -10. There are two bands at nucleotide positions -249 and -236 and a cluster of bands between -537 and -388. B shows a densitometer tracing of the cluster of bands and demonstrates that there are at least eight discernible bands. The nucleotide position as determined from the sequence is given. The location of the primer and the start sites is diagrammed in Fig. 3.
[Figure: panel A, lanes PE, G, A, T, C with bands marked at -537, -388, -249, and -236; panel B, densitometer tracing with peaks labeled -537, -501, -483, -450, -428, -416, -408, and -388.]

Promoter Sequence

The sequence of the 5'-flanking region from λHIR44 is shown in Fig. 3. This clone contains a 4.4 kb HindIII genomic fragment. Southern blot analysis of DNA from five unrelated individuals consistently gave a fragment of 9 kb (data not shown). Either this clone represents a rare polymorphism or a rearrangement has occurred during the cloning process. The clone reported by Mamula et al. (12) was isolated from the same library and seems to be identical in size to λHIR44. However, there are many differences between the two sequences as shown in Table 1. The restriction map of these clones is colinear for the first 2.1 kb with the promoter clone reported by Tewari et al. (14). Therefore, the alteration must have occurred in the repetitive region at the 3'-end of the clone.

The promoter sequence is composed of 82% GC and contains clusters of the dinucleotide CpG which is characteristic of an "HTF island" (for a review see 16). Therefore, the HIR promoter should be classified as a housekeeping promoter as described previously (17, 18). Inspection of the sequence reveals that this promoter has no TATA sequence. Instead, three potential binding sites for the transcription factor Sp1 (19) and two consensus TC boxes (20, 21) are found in the first 574 bp. Although there is general agreement between the published sequences in the promoter region, there are several differences (11-14). These are denoted in Table 1. Whether these represent polymorphisms, cloning artifacts, or sequencing errors remains to be determined.

Eukaryotic Expression

The region from -575 to -2 upstream of the translation initiation site has been cloned into the eukaryotic expression vector, pSV0-chloramphenicol acetyl transferase (CAT). This fragment contains all of the start sites of transcription as determined by the primer extension assay. This region contained substantial promoter activity in several cell lines as demonstrated in Fig. 4, lane D. We compared this construct in five eukaryotic cell lines: 3T3, MCF-7, HeLa, HepG2, and 293. Each cell line was derived from a different tissue and had a different number of insulin receptors as indicated in Table 2. One would expect a direct correlation between receptor number and the strength of this promoter. In fact, there is a trend for the cells with the most receptors on their cell surface to have the highest activity of the transfected insulin receptor promoter. However, the magnitude of expression did not correspond to that seen with receptor number.

The relative strength of the HIR promoter was compared to the SV-40 early promoter (SVE) containing vector, pSV2-CAT, and a promoterless vector, pSV0-CAT. Although insulin receptor mRNA is found in low abundance, the promoter is very strong. In two cell lines, 3T3 and 293, the insulin receptor promoter was as strong as or stronger than the SVE promoter. An example of the activity in 3T3 cells is shown in Fig. 4. The ratio of the activity of these two promoters shows that the regulation is discordant despite the presence of similar Sp1 type elements in both promoters (Table 2). In addition, this ratio shows that the apparent correlation between transcriptional activity and receptor number is not just a function of transfection efficiency of particular cell lines. Like promoter activity, this ratio also decreases as receptor number decreases with the exception of HepG2. These results may suggest that this region of the promoter contributes to tissue specificity.
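The HIR/SV ratio discussed here can be recomputed directly from the CAT activities listed in Table 2. The short Python sketch below is only an illustration of that arithmetic and is not part of the original analysis; the dictionary layout and the function name `hir_sv_ratio` are assumptions introduced for the example, and the percentages are copied from the HIR and SV-40 columns of Table 2.

```python
# CAT activity (percent chloramphenicol converted) copied from Table 2.
# The data structure and names below are illustrative assumptions.
cat_activity = {
    "293":   {"HIR": 72.0, "SV40": 24.0},
    "HepG2": {"HIR": 27.0, "SV40": 59.0},
    "3T3":   {"HIR": 19.0, "SV40": 17.0},
    "MCF-7": {"HIR": 6.3,  "SV40": 40.0},
    "HeLa":  {"HIR": 0.4,  "SV40": 10.0},
}


def hir_sv_ratio(activity):
    """Return the HIR/SV-40 activity ratio for each cell line."""
    return {cell: v["HIR"] / v["SV40"] for cell, v in activity.items()}


ratios = hir_sv_ratio(cat_activity)
for cell, ratio in ratios.items():
    # Normalizing to the SV-40 early promoter removes differences in
    # transfection efficiency between cell lines.
    print(f"{cell}: HIR/SV = {ratio:.2f}")
```

Dividing by the SV-40 value within each cell line is what allows the comparison across lines: a line that simply takes up DNA poorly lowers both numbers, leaving the ratio unchanged.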
The addition of further upstream sequences has been reported to significantly increase HIR promoter function (14). We added 875 bp of additional sequence upstream and assayed for promoter activity in both HepG2 and 3T3 cells. In both cell lines we saw an increase in activity as seen in Fig. 4 by comparing lanes D and E.


-575  AGCTTTCCCTCCCTCTCCTGGGCCT
-550  CTCCCGGGCGCAGAGTCCCTTCCTAGGCCAGATCCGCGCCGCCTTTTCCC
-500  GCGGCCCGCACGGGCCCAGCTGACGGGCCGCGTTGTTTACGGGCCCGAGC  (Pvu II)
-450  AGCCCTCTCTCCCGCCGCCCGCCCGCCACCCGCCAGCCCAGGTGCCCNNC  (Bst NI)
-400  CGCCAGTCAGCTAGTCCGTCGGTCCGCGCGTCCCTCTGTCCCGGAGCCCG  (Ava II)
-350  CAGATCGCGACCCAGAGCGCGCGGGGCCGAGAGCCGAGAGACAGTCCCGG
-300  GCGCAGCGCGGAGCTCCGGGCCCCGAGATCCTGGGACGGGGCCCGGGCCG  (Sac I)
-250  CAGCGGCCGGGGGGTCGGGGCCACCACCGCAAGGGCCTCCGCTCAGTATT
-200  TGTAGCTGGCGAAGCCGCGCGCGCCCTTCCCGGGGCTGCCTCTGGGCCCT
-150  CCCCGGCAGGGGGGCTGCGGCCGCGGGTCGCGGGCGTGGAAGAGAAGGAC
-100  GCGCGGCCCCCAGCGCCTCTTGGGTGGCCGCCTCGGAGCATGACCCCCGC
 -50  GGGCCAGCGCCGCGCGCTCTGATCCGAGGAGACCCCGCGCTCCCGCAGC

Fig. 3. Sequence of the 574 bp Promoter Region and the Start Sites of Transcription
The sequence from -575 to -2 which was introduced into the eukaryotic expression vector is shown. The location of several restriction sites used to construct some of the deletions are denoted. The primer used in Fig. 2 is underlined, an inverted repeat is marked by a line above, potential Sp1 sites are outlined, and TC boxes are marked by wavy lines underneath. The asterisks represent the start sites of transcription as determined by the primer extension experiment shown in Fig. 2.
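The GC-rich, CpG-clustered character described for this promoter can be checked numerically on any stretch of the Fig. 3 sequence. The Python sketch below is a minimal illustration, not the authors' method: the function names are assumptions, the observed/expected CpG ratio is one commonly used index for CpG islands and is not taken from this paper, and the example segment is the 50-nucleotide line of Fig. 3 that contains the Pvu II site.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in seq) / len(seq)


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CpG * length) / (#C * #G).

    This index is an assumption of the sketch, not a measure used in the paper.
    """
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c * g)


# One 50-nucleotide line of the Fig. 3 sequence (contains the Pvu II site).
segment = "GCGGCCCGCACGGGCCCAGCTGACGGGCCGCGTTGTTTACGGGCCCGAGC"

print(f"length      = {len(segment)}")
print(f"GC fraction = {gc_fraction(segment):.2f}")
print(f"CpG obs/exp = {cpg_obs_exp(segment):.2f}")
```

Applying the same functions to the full 574 bp sequence would give a figure comparable to the 82% GC stated in the text, provided the sequence is transcribed without errors.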

The mean values for three separate experiments were 137 ± 26 for HepG2 and 159 ± 9 for 3T3 when HIR94 was set equal to 100%.

Deletion Analysis

Because the first 574 bp region contained most of the promoter activity, the start sites of transcription, and the ability to determine some degree of tissue specificity, we decided to study the effect of deletion on the function of this promoter fragment. Deletion of bases 3' to -287 had no effect upon promoter activity in the various cell lines (Fig. 5). This deletion removes both the start site of transcription reported by Araki et al. (11) and the ones reported by Seino et al. (13). Small interstitial deletions within this region also had no effect on promoter activity (data not shown).

Deletion toward the 5'-end of this fragment had a large effect on activity. Deletion of the 39 bp upstream of -536 had no effect. However, the deletion of 69 bp upstream of -506 which removes a TC box reduced activity in all cell lines by 50%. Removal of an additional 120 bp upstream of -386 reduced activity 2- to 8-fold depending on the cell line. The major cluster of start sites as well as the Sp1 binding sites are found in this 120 bp region. Therefore the core promoter element is confined to the 150 bp region between -536 and -387.

We constructed a vector with the sequence between -536 and -387 that contains all the structurally important promoter sequences as defined by our deletion analysis. These included one TC box, three Sp1 binding sites, and most of the start sites of transcription. As expected, the


Table 1. Sequence Comparison

Position: -498, -486", -455, -444, -291, 292, -264, -252, -187, -186, -185, -178, -128, -115, 116, 117, -94, -52, -33, +106, +107, +119, +133, +135

Columns: McKeon et al.; Araki et al. (11); Mamula et al. (12); Seino et al. (13); Tewari et al. (14)

A

G

C

C

GC C GCC

C GCC

GCG

GCG C

C T C G C

C

A G" C G" GC G" CGC G"

A G" G

A

GC

GC

C GCC

C CCG

C

C GCG T

GCG A

G A

C

C T C

C T C G G

C

Between +135 and +407 there are many sequence differences between Tewari et al. and ourselves. Our sequence agrees with that of Seino et al. (22). In our sequence there are five potential Sp1 binding sites and a TC box sequence.
a We have sequenced this region in both directions; however, it is possible that there is some gel compression in this region. We have not used a guanosine analog to vigorously study this difference.
" Indicates a base insertion compared to our sequence.

vector with the promoter in the correct orientation, HIR49, retained most of the promoter activity (Fig. 5). Two small deletions were tested to further localize the important sequences. In HIR177, deletion of 23 bp downstream from -410 including the 3'-potential Sp1 binding site resulted in no loss of activity. In contrast, in HIR176 removal of 55 bp from the 5'-end of this fragment resulted in a 50% loss of activity similar to that seen in HIR91. These results suggest that the sequence from -481 to -410 contains the promoter element that also determines some degree of tissue specificity.

The Insulin Receptor Promoter Contains an Enhancer

Deletion of the sequence upstream of -481 seemed to reduce promoter activity in two different overlapping deletions. To investigate the role of this region further, the sequence between -575 and -481 was cloned in both orientations into the expression vector pA10-CAT, in which the enhancer of the SV-40 early promoter has been deleted. This sequence was able to stimulate this unrelated promoter at least 3-fold in either orientation in both HepG2 and 3T3 cells. Figure 4 shows an autoradiograph of typical experiments in HepG2 and 3T3 cells (compare lanes A and B to lane C). The mean values of three separate experiments are shown in Table 3. The ability to stimulate a heterologous promoter in both orientations demonstrates that this fragment contains an enhancer. Taking all the deletion data together, this element should be located between -536 and -506, the same region as one of the TC boxes.

The Core Element is Orientation Independent

We constructed two expression vectors in which the promoter sequence was inverted. One contained the core 150 bp sequence and is denoted HIR49R. The other contained the 574 bp fragment and is denoted HIR94R. When the core promoter fragment was inverted, it had the same if not more promoter activity than the natural orientation. When the 574 bp fragment was inverted (HIR94R), there was some decrease in activity, with approximately 80% activity in 3T3 and 293 cells and lower activity in HepG2 and MCF-7 cells. These findings demonstrate that the core element functions bidirectionally.
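The fragment and deletion sizes quoted in the deletion analysis follow from simple inclusive coordinate arithmetic. The Python sketch below is an illustrative check only: the function name `span_bp` and the exact endpoints assigned to each segment are assumptions read from the coordinates given in the text, with positions numbered relative to the translation initiation site.

```python
def span_bp(first: int, last: int) -> int:
    """Inclusive length in bp of a segment from `first` to `last`.

    Assumes both coordinates lie upstream of the initiation site
    (negative numbering), so no position is skipped between them.
    """
    if first > last:
        raise ValueError("first must not be downstream of last")
    return last - first + 1


# Endpoints are this sketch's reading of the text, not the authors' notation.
segments = {
    "promoter fragment in HIR94":      (-575, -2),
    "core promoter (HIR49)":           (-536, -387),
    "deleted upstream of -536":        (-575, -537),
    "deleted upstream of -506":        (-575, -507),
    "additional deletion to -386":     (-506, -387),
    "5' deletion in HIR176":           (-536, -482),
    "3' deletion in HIR177":           (-409, -387),
}

lengths = {name: span_bp(a, b) for name, (a, b) in segments.items()}
for name, bp in lengths.items():
    print(f"{name}: {bp} bp")
```

Under these assumed endpoints the computed lengths match the 574, 150, 39, 69, 120, 55, and 23 bp figures given in the text.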

DISCUSSION

The insulin receptor promoter is GC-rich and belongs to a class of housekeeping promoters. Like other housekeeping promoters which lack TATA boxes, the insulin receptor has many start sites of transcription. By primer extension, we have mapped two clusters of start sites, one between -537 and -388 and one consisting of two sites at -249 and -236. These start sites are consistent with the sizes reported in IM9 cells by Mamula et al. (12). We have precisely mapped the


Fig. 4. CAT Activity of Various Insulin Receptor Promoter Constructions
Autoradiographs of CAT assays show the activity of various recombinant CAT vectors when transfected into either HepG2 cells (upper panel) or 3T3 cells (lower panel). Lane A, Sequences -575 to -481 of the insulin receptor promoter were cloned into pA10-CAT in the reverse orientation; lane B, same construction as A except the fragment is in the normal orientation; lane C, pA10-CAT; lane D, HIR94 (-575 to -2 in pSV0-CAT); lane E, HIR170 (-1450 to -2 in pSV0-CAT); lane F, pSV2-CAT was run in the same experiment but on a separate TLC plate. The numbers under each lane give the percent of 14C-chloramphenicol converted as quantified by β-scanning.
[Figure: panels labeled HEPG2 and 3T3; percent conversion values legible in the extracted text: 1.0, 3.0, 0.3, 11.2, 16.7, 36.1.]

start sites to individual nucleotides by comparison with a sequencing ladder. Deletion analysis of the expression of this promoter fragment is consistent with the assignment of major start sites to the region between -537 and -388. The two putative start sites at -249 and -236 occur in a region of secondary structure and may represent pause sites for reverse transcriptase, which is a common artifact of this technique. Since submission of this manuscript, Tewari et al. (14) reported S1 analysis in conjunction with primer extension to determine the transcriptional start sites in the insulin receptor gene. Although there were multiple start sites by primer extension, only four major start sites between -570 and -390 were confirmed by S1 analysis using HepG2 mRNA. The three most abundant sites that we determined in IM9 mRNA correspond to three of the four major sites they identified.

The insulin receptor promoter is located in a 3 kb segment of unique DNA which is surrounded by Alu sequences that extend for a distance in both directions. The first intron of the gene is greater than 17 kb. Every restriction fragment generated from the phage clones covering this region hybridized to the Alu probe under stringent conditions. Recently Alu sequences have been shown to be enriched in a segment of DNA which is associated with early replication (24). In addition, repetitive sequences have been found in regions thought to contain origins of replication (25). This unusual arrangement of Alu sequences may indicate that an origin of replication lies near the promoter of this growth factor receptor gene.

The region from -575 to -2 upstream of the translation initiation site has promoter activity in several eukaryotic cell lines. This activity seems to require the presence of the region from -536 to -410 which includes the major start sites of transcription, a TC box, as well as two putative Sp1 binding sites. This region functions as a promoter in both directions, probably due to the presence of the orientation independent Sp1 sites and an enhancer element. Several genes that have Sp1 sites in their promoters have been shown to have bidirectional function in vivo including the SV-40 early promoter and dihydrofolate reductase gene (26, 27). Additional studies are underway to determine whether the insulin receptor promoter functions bidirectionally in vivo as it does in vitro.

The insulin receptor promoter is also surprisingly efficient. In 3T3 and 293 cells, the insulin receptor promoter was 1.1 and 3.0 times as strong, respectively, as the SV-40 promoter. Since insulin receptor mRNA is not expressed at very high levels, this finding may suggest that there are additional negative regulatory sequences yet to be identified in this promoter. The SV-40 early promoter also has Sp1 binding sites. In fact, the transcription factor Sp1 was first identified because of its interaction with this promoter (19). Interestingly, these two promoters are not coordinately regulated, suggesting that additional factors also play an important role in their expression.

In addition to the Sp1 binding sites there are also two consensus sequences for a TC binding factor (28). Deletion of the first TC box has no effect on activity whereas deletion of the region that includes the second TC box results in a 2-fold reduction in CAT activity. This moderate effect is similar to that demonstrated in the human epidermal growth factor receptor promoter (28) and the chicken α2(I) collagen promoter (McKeon, C., unpublished data). In addition, we have shown that this region of the insulin receptor promoter contains a weak enhancer. Additional TC boxes and Sp1 binding sites are also found in the upstream region (11, 13) and in the first intron sequence (14, 22; McKeon, C., unpublished


Table 2. Insulin Receptor Promoter Activity in Different Cell Types

             CAT Activity*
Cell Line    HIR      SV-40    SV0      Ratio HIR/SV
293          72%      24%      7.4%     3.0
HEPG2        27%      59%      4.9%     0.5
3T3          19%      17%      5.3%     1.1
MCF-7        6.3%     40%      0.5%     0.2
HeLa         0.4%     10%      ND       0.04

Receptor Number (incomplete): 3 x 10^5; 7 x 10^4 (23); 1 x 10^4 (6)
Cell Type (values missing)
