J. Mol. Biol. (1990) 214, 673-683

Establishment of de Novo DNA Methylation Patterns

Transcription Factor Binding and Deoxycytidine Methylation at CpG and Non-CpG Sequences in an Integrated Adenovirus Promoter

Miklos Toth†, Ulrich Müller‡ and Walter Doerfler§

Institute for Genetics, University of Cologne, D-5000 Cologne, West Germany

(Received 19 December 1989; accepted 5 February 1990)

The establishment of de novo patterns of DNA methylation in mammalian genomes is characterized by the gradual spreading of methylation, which has been documented to occur across an entire integrated adenovirus genome as well as at the nucleotide level in the integrated late E2A promoter of adenovirus type 2. By applying the techniques of genomic sequencing and dimethylsulfate or DNase I genomic footprinting in vivo, we have now demonstrated that the spreading of methylation in cell lines that carry the late E2A promoter with three in vitro pre-methylated 5’-CCGG-3’ sequences initially involves a DNA domain of this promoter that is devoid of bound proteins. Subsequently, methylation further spreads to neighboring regions, and the patterns of complexed transcription factors are altered. Evidence has been adduced that DNA methylation at sequences homologous to the AP-1 and octamer binding factor sites interferes with protein binding. In contrast, the methylation of sequences in the vicinity of but not involving sequences homologous to an AP-2 site still permits the binding of proteins to these sites. It is significant that during the spreading of methylation a few 5’-CG-3’ sequences can remain hemimethylated for several cell generations, before they also become methylated in both complements. Moreover, in cell line HE2, the integrated, heavily methylated late E2A promoter has been shown by the genomic sequencing technique to contain 5-methyldeoxycytidine residues, not only in all 5’-CG-3’ dinucleotides but also in a 5’-CA-3’ and a 5’-CT-3’ dinucleotide sequence. Hence, 5-methyldeoxycytidine occurs in a silenced mammalian DNA sequence also in dinucleotides other than 5’-CG-3’. This finding raises the question of whether 5-methyldeoxycytidine in non-5’-CG-3’ dinucleotides can be maintained in the methylated state during continuous cell propagation.

1. Introduction

With sequence-specific promoter methylations being recognized as an important factor in the regulation of gene expression (Doerfler, 1981, 1983), it has become essential to elucidate the mechanisms of establishing de novo patterns of DNA methylation. We have shown that adenovirus type 12 (Ad12‖) DNA, which has recently been inserted into the mammalian genome, becomes methylated in distinct patterns, and that methylation spreads throughout the integrated foreign genome progressively with time of subcultivation (Sutter et al., 1978; Sutter & Doerfler, 1979, 1980; Kuhlmann & Doerfler, 1982, 1983). The spreading of DNA methylation has also been demonstrated at the nucleotide level. The in vitro 5’-CCGG-3’ methylated late E2A promoter of adenovirus type 2 (Ad2) DNA has been genomically fixed in hamster cells (Müller & Doerfler, 1987). Starting from the premethylated seminal sites, methylation spreads to other 5’-CG-3’ dinucleotides (Toth et al., 1989). It is thus likely that complex patterns of DNA methylation can be established by gradual spreading. How is spreading regulated? It is tempting to deliberate whether the gradual extension of inactivation over an entire human X chromosome (Gartler & Riggs, 1983; Monk, 1986) might also be related to the spreading of DNA methylation.

In this paper, the spreading of DNA methylation from in vitro premethylated sites in the genomically integrated late E2A promoter of Ad2 DNA has been correlated by the genomic sequencing and footprinting techniques to binding patterns of transcription factors to this promoter in independently established cell lines. Methylation initially extends primarily to a DNA domain that is not protected by bound proteins. After further cell divisions, DNA methylation spreads from this focus and from the in vitro premethylated sites across the entire promoter, which thus becomes fully or almost fully methylated. Subsequently, protein binding is abrogated at certain promoter sequences. DNA methylation may directly or indirectly influence the ensemble of transcription factors binding to the late E2A promoter.

†On leave from the Institute of Biochemistry, The Hungarian Academy of Sciences, Szeged, Hungary.
‡Present address: Department of Molecular Biology, Princeton University, Princeton, NJ 08544, U.S.A.
§Author to whom all correspondence should be addressed.
‖Abbreviations used: Ad12, adenovirus type 12; Ad2, adenovirus type 2; 5-mC, 5-methyldeoxycytidine; OBF, octamer binding factor; DMS, dimethylsulfate.

0022-2836/90/150673-11 $03.00/0 © 1990 Academic Press Limited

2. Materials and Methods

(a) Cell lines

The origins and characteristics of the cell lines used in this study were described: HE1, HE2 (Cook & Lewis, 1979; Vardimon & Doerfler, 1981; Toth et al., 1989), uc2, uc20, mc23, mc40 (Müller & Doerfler, 1987; Toth et al., 1989). Briefly, in the uc cell lines the Ad2 late E2A promoter-chloramphenicol acetyltransferase (CAT) gene (pAd2E2ALCAT) construct (Langner et al., 1986) was genomically fixed in the unmethylated form in BHK21 cells. These cell lines expressed the CAT gene. In the mc cell lines, the in vitro 5’-CCGG-3’ premethylated pAd2E2ALCAT construct was integrated. All cells were passaged at a split ratio of 1 : 4. Cell lines HE1 and HE2 contained about 2 to 4 Ad2 genome equivalents per cell (Vardimon & Doerfler, 1981). The uc and mc cell lines carried 20 or fewer integrated copies of the late E2A promoter-CAT gene assembly per cell, except cell line mc40, which harbored more than 120 copies (Toth et al., 1989).

(b) Sequencing technique

This method has been described elsewhere (Church & Gilbert, 1984; Saluz & Jost, 1987; Toth et al., 1989).

(c) Treatment in vivo of cells with dimethylsulfate

Cells growing exponentially in monolayer cultures were treated at 20°C for 5 min with 0.05, 0.005 or 0.0005% dimethylsulfate (DMS) in Dulbecco’s medium without serum (Ephrussi et al., 1985; Saluz et al., 1988). The reaction was quenched by adding ice-cold phosphate-buffered saline (PBS). After 2 washes with PBS, the DNA was extracted and the late E2A promoter fragment was isolated as described (Toth et al., 1989). The β-elimination reaction was performed with 1 M-piperidine. The generated DNA fragments were electrophoretically separated on 6% (w/v) polyacrylamide gels in 5 M-urea, transferred to nylon membranes, and hybridized to different segments of 32P-labeled, single-stranded sequences from the E2A promoter as described for genomic sequencing experiments (Toth et al., 1989). In the reaction of DMS with chromatin, nucleotides other than G were occasionally involved. This reaction possibly reflected peculiarities in the secondary structure of DNA.

(d) Genomic DNase I footprinting

Nuclei of uc2 cells (Fig. 4) were isolated and treated with 30 or 100 µg DNase I/ml for 3 min at 25°C. The reaction was terminated by adding SDS and EDTA, the DNA was re-extracted by the SDS/proteinase K/phenol method and cleaved with SacI. The DNA fragments were separated by electrophoresis on DNA sequencing gels (Maxam & Gilbert, 1980), and the DNase I-sensitive sites were visualized as described above.

(e) RNA transfer (Northern blot) and CAT expression experiments

These standard techniques have been described (Schirm & Doerfler, 1981; Gorman et al., 1982).

3. Results

(a) Methylation focus and spreading of DNA methylation

It has been shown that in cell lines HE1 and uc2 all 5’-CG-3’ dinucleotide pairs between nucleotides +30 and -160, relative to the cap site in the active late E2A promoter region of Ad2 DNA, were unmethylated, whereas in cell line HE2 the same sequences in the inactive E2A promoter were all methylated (Toth et al., 1989). The cell lines now investigated for the stability of methylation in the late E2A promoter were again carried from low passage numbers to passages 94, 22 and 134 for cell lines HE1, uc2 and HE2, respectively. At these passage levels, the extent of methylation at the 5’-CG-3’ sequences in the late E2A promoter was unaltered in these cell lines (Figs 1(a) and 2), when compared with earlier data (Toth et al., 1989). Thus, the patterns of methylation in the integrated late E2A promoter region in cell line HE2, and the lack of methylation in this region in cell lines HE1 and uc2, were stable over many cell generations.

The methylation of the late E2A promoter sequences in the same region was also analyzed in cell line uc20. The results for the nucleotides between positions +40 and -30 are presented in Figure 1(a) and demonstrate that these 5’-CG-3’ sequences in the late E2A promoter have remained unmethylated. Additional 5’-CpG-3’ sites up to position -160 were also unmethylated (data not shown). This lack of methylation was stable over many passages. In cell lines uc2 and uc20, the unmethylated state of the late E2A promoter correlated with promoter activity as demonstrated by RNA transfer (Northern blot) experiments and CAT assays (Fig. 1(c); Müller & Doerfler, 1987).

Cell lines mc23 and mc40 were established by the transfection of BHK21 cells with the 5’-CCGG-3’ methylated pAd2E2ALCAT construct (Langner et al., 1986). Thus, initially only the 5’-CG-3’ dinucleotides +II and +I in 5’-CCGG-3’ sequences in this segment of the late E2A promoter of Ad2 DNA had


Figure 1. Autoradiograms of genomic sequencing gels (a) and (b). DNA (a) from cell lines HE1, HE2, uc2 and uc20, (b) from cell lines mc23 and mc40 in different passages (p), was analyzed in the late E2A promoter regions of integrated Ad2 DNA. For the C reactions in (a), DNA was sequenced from the BstXI site, for the G reactions in (a) and for the C reactions in (b) from the SacI site (Toth et al., 1989). Roman numerals and the longer horizontal lines refer to the 5’-CG-3’ dinucleotide positions on the map (see Fig. 2). The short horizontal lines designate C residues. The 5-mC residues in the 5’-CA-3’ (m) and 5’-CT-3’ (@) dinucleotides in cell line HE2 (a) are designated. The results documenting the presence of G residues in these positions on the opposite (Bottom) strand are also shown. Depending on the hybridization probe used, the Top or the Bottom strand signals were visualized. Unexplained bands were marked by small open circles. In the control lanes, the 5’-CCGG-3’ methylated (P met) or the unmethylated (P) late E2A promoter fragment was genomically sequenced as a plasmid clone. (c) Expression of the late E2A promoter in cell lines HE1 and HE2 was documented by RNA transfer (Northern blot) analyses. CAT gene expression was tested in extracts of the uc and mc cell lines by CAT assays. CAT control reactions (enzyme control) were done with commercial CAT. CAM, chloramphenicol.

been premethylated in vitro in these cell lines (Fig. 2; mc23, mc40 in vitro). Cell lines mc23 and mc40 were passaged continuously in culture up to passages 31 and 35, respectively (Fig. 2). The results of the present genomic sequencing experiments in this promoter revealed that the methylation patterns observed during early passage levels of cell lines mc23 and mc40 (Fig. 2; Toth et al., 1989) were not stable in that DNA methylation expanded from the preimposed 5’-5-mCG-3’ dinucleotides +II and +I at positions +24 and +6, respectively (Figs 1(b) and 2). The 5’-CG-3’ positions -IV to -X, in cell line mc40 extending to position -XI, were modified at relatively early passages, thereby creating a secondary focus of promoter methylation. In later passages, methylation spread further in either direction of established domains of methylated dinucleotides. Finally, e.g. in passage 31 for cell line mc23 or in passage 35 for cell line mc40, methylation in individual foci had coalesced, approaching nearly complete methylation of all 5’-CG-3’ dinucleotides in the analyzed segment of the late E2A promoter (Figs 1(b) and 2). This pattern of promoter methylation resembled that observed in cell line HE2 that had been kept in culture for many passages (Figs 1(a) and 2; Toth et al., 1989). It was concluded that, in the late E2A promoter of both cell lines, the methylation of 5’-CG-3’ dinucleotides spread in similar patterns starting from a seminal focus of only a few methylated nucleotides, then gradually modifying the unmethylated sequences between foci, and finally encompassing all 5’-CG-3’ sequences in the entire promoter segment.

In cell line mc23, the 5’-CCGG-3’ sites in the late E2A promoter remained methylated (Fig. 2, p13 and p22), and the CAT gene, which was controlled by this promoter, was not expressed (Fig. 1(c), p13 and p17). In cell line mc40, the +I and +II 5’-CG-3’ sequences became partly demethylated during earlier passages (Fig. 2, p10), and the CAT gene was actively expressed (Fig. 1(c), p10; Müller & Doerfler, 1987). In consecutive passages and concomitant with the spreading of methylation to upstream sequences, these two 5’-CG-3’ sequences


Figure 2. Summary of the genomic sequencing data. Fully and partially methylated or unmethylated 5’-CG-3’ dinucleotides in the late E2A promoter of Ad2 DNA were presented for the transformed cell lines HE1, HE2, uc2, uc20, or for cell lines mc23 and mc40 in different passages (p), as shown by genomic sequencing. The scale refers to nucleotide numbers in the late E2A promoter relative to the cap site (marked on the map). The 5’-CCGG-3’ sequences are at nucleotides +6 (+I) and +24 (+II), which have been premethylated in vitro in the generation of cell lines mc23 and mc40. Horizontal lines represent the late E2A promoter segment in individual cell lines. The 5’-CG-3’ sequences in this segment (+III to -XI) are represented by vertical bars: (0) unmethylated; ( n ) completely methylated; and ( q ) 5’-CG-3’ sequences which are methylated in only some of the integrated promoter copies. The bars above the horizontal line designate 5’-CG-3’ dinucleotides in the top strand of the promoter sequence, the bars below the line represent the same dinucleotides in the bottom strand. In the E2A region of Ad2 DNA the bottom strand is the transcribed strand, the top strand the nontranscribed complement. All other symbols are as described in the legend to Fig. 1. The methylation maps of cell lines mc23p13 and mc40p10 were reproduced from Toth et al. (1989). The data for positions +I and +II in cell lines mc23p22 and mc40p20 are not shown in Fig. 1(b).

were remethylated (Fig. 2, p20). At the same time, CAT activity was markedly reduced (Fig. 1(c), p23). However, CAT activity was not completely abolished, possibly because a few of the 120 copies of the integrated late E2A promoter-CAT gene assemblies in cell line mc40 (Müller & Doerfler, 1987) were not methylated or remained undermethylated. By using the genomic sequencing technique, undermethylation in a few copies of about 120 copies would of course not be detectable.
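The copy-number argument above can be made concrete with a little arithmetic. The following Python sketch is illustrative only: the function name and the example count of three unmethylated copies are assumptions made for this example and are not values from this study; only the figure of about 120 copies comes from the text. It also assumes that every integrated copy contributes equally to the band intensity on the sequencing gel.

```python
def unmethylated_signal_fraction(unmethylated_copies: int, total_copies: int) -> float:
    """Fraction of the pooled signal at one site that stems from unmethylated copies.

    Assumption (not from the paper): all integrated copies contribute equally.
    """
    if total_copies <= 0:
        raise ValueError("total_copies must be positive")
    if not 0 <= unmethylated_copies <= total_copies:
        raise ValueError("unmethylated_copies must lie between 0 and total_copies")
    return unmethylated_copies / total_copies


# Hypothetical example: 3 of about 120 integrated copies escape methylation.
fraction = unmethylated_signal_fraction(3, 120)
print(f"{fraction:.1%} of the signal")  # 2.5% of the signal
```

Under these assumptions a handful of undermethylated copies would contribute only a few per cent of the signal at a given position, which is consistent with the statement that such copies would go unnoticed in the pooled genomic sequencing pattern.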

(b) A few 5’-CA-3’ and 5’-CT-3’ dinucleotides in the late E2A promoter in cell line HE2 contain 5-methyldeoxycytidine

The analysis of the data in Figure 1(a) revealed two 5-methyldeoxycytidine (5-mC) residues in the late E2A promoter of Ad2 DNA in cell line HE2 at passage 134 also in positions +11 (complete methylation) and +13 (partial methylation), relative to the cap site. These positions corresponded to the C


residues in 5’-CA-3’ and 5’-CT-3’ dinucleotides, respectively. This finding was ascertained by demonstrating the presence of matching G residues at positions +11 and +13 in the complementary strand of this region (Fig. 1(a)). The 5-mCT dinucleotide was also found in passage 170 of HE2 cells (see Discussion). Thus, the results of genomic sequencing experiments demonstrated that 5-mC occurred in mammalian DNA also in dinucleotide combinations other than 5’-CG-3’.

(c) Unprotected late E2A promoter segment served as in vivo established methylation focus

We intended to correlate the positions and extent of late E2A promoter methylation to the binding of proteins at specific sequences in this promoter. Initially, the non-methylated late E2A promoter in cell lines uc20 and HE1 was analyzed since this non-modified promoter could presumably interact with all transcriptional factors whose presence might influence the spreading of methylation. The cells were treated with varying concentrations of DMS; the DNA was then isolated and analyzed on 6% polyacrylamide gels in 7 M-urea by the indirect end-labeling method. Proteins that bound at specific sequence motifs could prevent or enhance the methylation by DMS of specific guanosine residues at the N-7 positions in the major groove of DNA (Ephrussi et al., 1985). Similar observations were made with specific adenosine residues. The results obtained (Fig. 3), and summarized in Figure 5, demonstrate that in the unmethylated late E2A promoter, three sequences with homologies to the recognition sites for factors OBF, AP-2 and AP-1 (Jones et al., 1988) responded to DMS


Figure 3. Autoradiograms of DMS genomic footprinting gels. (a) Cell lines HE1, HE2; (b) cell lines uc20 and mc23. N, isolated genomic DNA; C, chromatin; P, control plasmid DNA. DMS was used in increasing concentrations in vivo. The wedge symbols covering (chromatin) lanes 2 to 4, 6 to 8, 14 to 16, 18 to 20 in (a), and lanes 2 to 4, 8 to 10, 14 to 16, 20 to 22 in (b) correspond to DMS concentrations of 0.0005, 0.005 and 0.05%. The decreasing wedge symbols in lanes 6 and 7, and 18 and 19 (b), refer to DMS concentrations of 0.05 and 0.005%. Unless especially indicated, DMS was used at 0.005% for chromatin (C) and 0.05% for isolated genomic DNA (N). For control plasmid DNAs (P) the concentration of DMS was 0.5%. Protection against and enhancement of the DMS reaction are indicated by symbols ( n ) and (!J), respectively. Sequences homologous to factor binding sites are indicated by the nucleotide sequences. Protein binding to individual sequence motifs is indicated by the symbol (+), absence of binding by (-). S, the DNA was sequenced from the SacI site; B, from the BstXI site. The numbers 1 to 24 designate individual lanes on the gels. G refers to the base-specific chemical cleavage used in genomic footprinting. Intense bands that did not correspond to G residues and occurred predominantly in chromatin samples probably represented DMS hyperreactive sites which might be caused by a certain DNA conformation. All other symbols are as described in the legend to Fig. 1.

modifications in chromatin in a way different from that of isolated DNA. Only the AP-1 binding site in the late E2A promoter had been identified (Bhat et al., 1987; Goding et al., 1987; Huang & Roeder, 1988). The late E2A promoter motif TACAAA, the non-canonical TATAA equivalent, was reported to bind the TFIID factor in vitro very weakly (Huang et al., 1988). In vivo DMS footprinting did not reveal protein binding at this site (Figs 3 and 5). The DMS footprinting patterns for the late E2A promoter in cell line uc2 were identical with those in cell line uc20 (data not shown). The relatively high copy number of integrated Ad2 DNA in cell line uc2 (20 copies/cell) permitted us also to analyze the


Figure 4. Autoradiogram of DNase I footprinting experiments. Isolated nuclei (C for chromatin) or purified cellular DNA (N for naked DNA) from uc2 cells was treated with DNase I at the following concentrations (µg/ml): 30 (lanes 1 and 5), 100 (lanes 2 and 6), 1 (lanes 3 and 7), or 3 (lanes 4 and 8). As a marker, late E2A promoter-containing plasmid DNA was reacted with DMS (G reaction). Depending on which probe was used, the top or the bottom strand sequences in the late E2A promoter were visualized. Sequences with homologies to known factor binding sites (ATF, OBF, AP-2, SP-1 or AP-1) are indicated by vertical lines. Protection against (0) or enhancements (0) of the DNase I reaction in isolated nuclei is indicated. All other symbols are explained in the legend to Fig. 1.

binding of transcription factors in the late E2A promoter by DNase I footprinting (Figs 4 and 5). With this method, additional ATF and SP-1 binding sites were identified at positions -2 to -9 and -42 to -31, respectively. These sites were not detected by DMS treatment. The SP-1 binding site in the late E2A promoter was documented by in vitro analyses (Goding et al., 1987). From these results it was concluded that up to the 5’-CG-3’ sequence in position -V, the upstream sequence of the non-methylated late E2A promoter in cell lines HE1, uc2 and uc20 appeared to be protected by proteins. However, from positions -VI to -X a major unprotected region was apparent (Fig. 5).


Figure 5. Summary of the in vivo DMS and DNase I genomic footprinting data. The nucleotide sequence analyzed was that of the late E2A promoter of Ad2 DNA between nucleotide positions +6 and -113. Sequences with homologies to binding sites of transcriptional factors are framed. Factor binding sites, which were determined by in vivo DMS footprinting (see Fig. 3), are boxed (continuous line boxes), those identified by DNase I footprinting (cell line uc2) (Fig. 4) are designated by a broken box. The known consensus sequences for transcription factor binding sites are bracketed above or below the nucleotide sequences. This promoter region includes a CCAAT motif, which interacts with proteins in in vitro binding studies (Goding et al., 1987), but is not protected in in vivo experiments (this paper). Roman numerals refer to the 5’-CG-3’ dinucleotide sequences that define positions -I to -XI. The symbols in the graph indicate: (+) protein binding; (-) absence of protein binding; ( n ) protection against DMS; (0) enhanced DMS reaction; N.D., not done.

This latter domain coincided with the newly methylated focus in early passages of cell lines mc23 and mc40 (Fig. 2).

(d) Promoter methylation abrogates the binding of certain transcription factors

Protein binding to sequences in the late E2A promoter was also analyzed in the methylated late E2A promoter in cell lines mc23 (p22) and HE2. Extensive E2A promoter methylations interfered with factor binding at promoter sequences homologous to OBF and AP-1 sites (Figs 3 and 5). The sequence homologous to an AP-2 site in the E2A promoter contained no 5’-CG-3’ dinucleotide and was not methylated. Factor binding at this motif was observed in all cell lines investigated including cell lines mc23 and HE2, although the E2A promoter in these cells was highly methylated in other motifs (Figs 2 and 5). Apparently, the methylation of DNA motifs in the late E2A promoter abrogated the binding of some of the transcription factors in vivo.
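The observation that a motif without a 5’-CG-3’ dinucleotide cannot acquire CpG methylation lends itself to a simple sequence check. The sketch below is a minimal illustration in Python; the function names and the two toy motif sequences are invented for this example and are not the actual OBF, AP-1 or AP-2 homologous sequences of the late E2A promoter.

```python
def cpg_positions(sequence: str) -> list[int]:
    """Return the 0-based index of the C in every 5'-CG-3' dinucleotide."""
    seq = sequence.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def can_carry_cpg_methylation(motif: str) -> bool:
    """A motif can be CpG-methylated only if it contains at least one 5'-CG-3'."""
    return bool(cpg_positions(motif))


# Toy sequences, hypothetical and NOT taken from the E2A promoter.
toy_motifs = {
    "motif_with_cpg": "TGACGTCA",
    "motif_without_cpg": "GCCTGGGC",
}
for name, motif in toy_motifs.items():
    print(name, cpg_positions(motif), can_carry_cpg_methylation(motif))
```

Because 5’-CG-3’ is its own reverse complement, a hit on one strand implies a 5’-CG-3’ at the same site on the complementary strand, so scanning a single strand is sufficient for this check. The check deliberately ignores the non-CpG methylation reported in section (b).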

4. Discussion

(a) The establishment of de novo patterns of methylation in mammalian cells

The interplay of the cellular DNA methyltransferase system with proteins bound to chromosomal DNA and the regulation of the spreading of DNA methylation in newly inserted DNA have previously not been investigated. The data presented in this paper demonstrate, at the nucleotide level, that many cell generations are required to establish de novo patterns of DNA methylation. Evidence has been adduced that these patterns develop gradually (Sutter et al., 1978; Kuhlmann & Doerfler, 1982, 1983). The spreading of DNA methylation, starting from a seminal site, has now been documented at the nucleotide level. It is likely that the spreading progresses in distinct patterns that are dependent on proteins bound to specific sequence motifs in DNA.

The results presented indicate that the in vitro 5’-CCGG-3’ premethylated sequences at 5’-CG-3’ sites could serve as the initial foci of methylation for the late E2A promoter in cell lines mc23 and mc40. In vivo the secondary focus of methylated 5’-CG-3’ sequences in the late E2A promoter has been located between 5’-CG-3’ sites -IV and -X. This focus coincides with a protein-free domain (Figs 3 to 5). It is conceivable that the absence of bound factors predisposes this region to the action of the DNA methyltransferase system. Subsequently, methylation spreads to many, eventually to all the adjacent 5’-CG-3’ dinucleotides, as documented for the late E2A promoter in cell lines mc23 and mc40. We do not know how far this spreading of DNA methylation can proceed and if or how it is eventually controlled.

In other cell lines, uc2, uc20 (Müller & Doerfler, 1987) or HE1 (Cook & Lewis, 1979), in which the unmethylated late E2A promoter has been introduced, methylation is not observed (Toth et al., 1989). The late E2A promoter in cell line HE2 is completely methylated at all 5’-CG-3’ sequences, and at one 5’-CA-3’ and one 5’-CT-3’ sequence. In this cell line, the Ad2 DNA has possibly been integrated close to a cellular focus of methylation. Alternatively, it is conceivable that structural signals in the integrated viral DNA serve as signals for the initiation of the DNA methyltransferase reaction in vivo. From in vitro experiments, evidence has been adduced that a certain spacing of 5’-CG-3’ sequences in the globin promoter facilitates the generation of a very specific pattern of DNA methylation in vitro (Ward et al., 1987). However, in the late E2A promoter the signal to methylate cannot be provided by sequence alone, since cell lines HE1 and HE2 or the uc and mc cell lines, which carry the same late E2A promoters, show decisive differences in promoter methylations.

Interestingly, in cell line HE2 the late E2A promoter contains a 5-mC residue in positions +11 and +13 of the top strand in a 5’-CA-3’ and a 5’-CT-3’ dinucleotide, respectively (Figs 1(a) and 2). In these instances, 5-mC has been demonstrated by genomic sequencing in positions other than 5’-CG-3’ dinucleotides. These modifications may reflect the high degree of methylation in the inactive late E2A promoter in cell line HE2 and the gradual spreading also to non-5’-CG-3’ sequences. The 5-mC residue in position +11 (5’-5-mCA-3’) has been lost, 5-mC in position +13 (5’-5-mCT-3’) has been maintained at least over several cell generations. It is not known how the 5’-5-mCT-3’ dinucleotide in position +13 can be maintained in the methylated form over several cell passages. There is no apparent sequence symmetry, not even in a trinucleotide, that might serve as a methylation signal. Evidence for the presence of 5-mC residues in non-5’-CG-3’ dinucleotides in human DNA has been adduced by the restriction enzyme and by nearest-neighbor sequence analyses (Woodcock et al., 1987, 1988). In the present paper we show the presence of 5-mC in 5’-CA-3’ and 5’-CT-3’ dinucleotides in a specific promoter segment directly by genomic sequencing. In a previous report, the 5-mC residue in the 5’-CT-3’ dinucleotide in position +13 has been observed (see Fig. 2; cell line HE2 p170, Toth et al., 1989) but not commented upon, since at the time it could have been a singular phenomenon. The DNA from cell line HE2 that was analyzed in the experiment illustrated in Figure 1(a) stems from cells in a different passage.

(b) Persistence of hemimethylated sequences

The gradual spreading of DNA methylation yields hemimethylated sequences in the late E2A promoter, e.g. in cell lines mc23 and mc40 (Fig. 2). These hemimethylated 5’-CG-3’ sequences can persist for several cell generations. Eventually, however, even these sequences become methylated in both complements. The hemimethylated state of sites in the integrated adenovirus promoter suggests a more complex mechanism for DNA methylation than a semiconservative copying by the DNA methyltransferase system.
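The distinction between unmethylated, hemimethylated and fully methylated 5’-CG-3’ sites can be written down as a small classification over the two strands. The following Python sketch is purely illustrative: the type and function names are assumptions made for this example, and the per-site calls are hypothetical rather than data read from Figure 2.

```python
from enum import Enum


class CpGState(Enum):
    UNMETHYLATED = "unmethylated"
    HEMIMETHYLATED = "hemimethylated"
    FULLY_METHYLATED = "methylated in both complements"


def classify_cpg(top_methylated: bool, bottom_methylated: bool) -> CpGState:
    """Classify one 5'-CG-3' site from the methylation call on each strand."""
    if top_methylated and bottom_methylated:
        return CpGState.FULLY_METHYLATED
    if top_methylated or bottom_methylated:
        return CpGState.HEMIMETHYLATED
    return CpGState.UNMETHYLATED


# Hypothetical calls as (top strand, bottom strand); not data from this paper.
example_sites = {
    "site_A": (True, True),
    "site_B": (True, False),
    "site_C": (False, False),
}
for name, (top, bottom) in example_sites.items():
    print(name, classify_cpg(top, bottom).value)
```

In this representation a site that stays in the hemimethylated state across successive passages corresponds to the persistence described above, whereas strictly semiconservative copying would be expected to convert such a site to the fully methylated state after replication.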

(c) DNA methylation and promoter-protein interactions

By in vivo DMS footprinting (Ephrussi et al., 1985; Becker et al., 1987; Saluz et al., 1988), protein association of certain promoter sequences corresponding to factor binding motifs (OBF, AP-1 and AP-2) in the methylated or non-methylated state has been compared. In cell lines mc23 and HE2, the late E2A promoter is highly methylated and only the site homologous to an AP-2 sequence, which is devoid of 5’-CG-3’ sequences and hence of 5-mC, is complexed with protein (Figs 4 and 5). DNA methylation can interfere with the in vitro binding of the E2F, MLTF and CREB factors to their respective binding sites (Kovesdi et al., 1987; Watt & Molloy, 1988; Iguchi-Ariga & Schaffner, 1989), and of protein factors to the +37 to -13 region in the late E2A promoter of Ad2 DNA (Hermann et al., 1989). In contrast, the SP-1 factor can bind to the promoter and activate transcription, even when the binding site is methylated (Harrington et al., 1988; Höller et al., 1988). Hence a methylated promoter is probably partly depleted of transcription factors both in vitro and in vivo (this work).

(d) Promoter methylation and inhibition

In the late E2A promoter, the methylation of three specific 5’-CG-3’ sequences at positions +24 (+II), +6 (+I) and -215, relative to the cap site of the promoter, suffices to inhibit or inactivate this promoter (Langner et al., 1984, 1986). The region around position -215 in the late E2A promoter has not been investigated by the genomic sequencing method in transformed cell lines. The extensive methylations of 5’-CG-3’ sequences between promoter positions +30 and -160, which interfere with factor binding, are thus either a safeguard for permanent promoter inactivation or serve some other function not related to promoter inactivation.

The present findings also relate to the argument whether promoter methylation can be the cause or consequence of promoter inactivation or inhibition. Both ways of reasoning appear to apply in that the methylation of a few 5’-CG-3’ dinucleotides causes, directly or indirectly, promoter inactivation. Subsequently, these methylated sites can serve as foci for the spreading of methylation to neighboring sequences. Thus, the spreading of DNA methylation can also be viewed as a consequence of promoter inactivation.

M.T. was on leave from the Hungarian Academy of Sciences, Szeged, Hungary, and was supported by a fellowship of the Alexander-von-Humboldt Foundation, Bonn, Germany. We thank Petra Böhm for editorial assistance, and Hanna Mansi-Wothke for the preparation of media. This research was supported by the Deutsche Forschungsgemeinschaft through SFB274-TP2.

References

Becker, P. B., Ruppert, S. & Schütz, G. (1987). Cell, 51, 435-443.
Bhat, G., SivaRaman, L., Murthy, S., Domer, P. & Thimmappaya, B. (1987). EMBO J. 6, 2045-2052.
Church, G. M. & Gilbert, W. (1984). Proc. Nat. Acad. Sci., U.S.A. 81, 1991-1995.
Cook, J. L. & Lewis, A. M., Jr (1979). Cancer Res. 39, 1455-1461.
Doerfler, W. (1981). J. Gen. Virol. 57, 1-20.

Doerfler, W. (1983). Annu. Rev. Biochem. 52, 93-124.
Ephrussi, A., Church, G. M., Tonegawa, S. & Gilbert, W. (1985). Science, 227, 134-140.
Gartler, S. M. & Riggs, A. D. (1983). Annu. Rev. Genet. 17, 155-190.
Goding, C. R., Temperley, S. M. & Fisher, F. (1987). Nucl. Acids Res. 15, 7761-7780.
Gorman, C. M., Moffat, L. F. & Howard, B. H. (1982). Mol. Cell. Biol. 2, 1044-1051.
Harrington, M. A., Jones, P. A., Imagawa, M. & Karin, M. (1988). Proc. Nat. Acad. Sci., U.S.A. 85, 2066-2070.
Hermann, R., Hoeveler, A. & Doerfler, W. (1989). J. Mol. Biol. 210, 411-415.
Höller, M., Westin, G., Jiricny, J. & Schaffner, W. (1988). Genes Develop. 2, 1127-1135.
Huang, D.-H. & Roeder, R. G. (1988). Mol. Cell. Biol. 8, 1906-1914.
Huang, D.-H., Horikoshi, M. & Roeder, R. G. (1988). J. Biol. Chem. 263, 12596-12601.
Iguchi-Ariga, S. M. M. & Schaffner, W. (1989). Genes Develop. 3, 612-619.
Jones, N. C., Rigby, P. W. J. & Ziff, E. B. (1988). Genes Develop. 2, 267-281.
Kovesdi, I., Reichel, R. & Nevins, J. R. (1987). Proc. Nat. Acad. Sci., U.S.A. 84, 2180-2184.
Kuhlmann, I. & Doerfler, W. (1982). Virology, 118, 169-180.
Kuhlmann, I. & Doerfler, W. (1983). J. Virol. 47, 631-636.
Langner, K.-D., Vardimon, L., Renz, D. & Doerfler, W. (1984). Proc. Nat. Acad. Sci., U.S.A. 81, 2950-2954.
Langner, K.-D., Weyer, U. & Doerfler, W. (1986). Proc. Nat. Acad. Sci., U.S.A. 83, 1598-1602.
Maxam, A. M. & Gilbert, W. (1980). Methods Enzymol. 65, 499-560.
Monk, M. (1986). BioEssays, 4, 204-208.
Müller, U. & Doerfler, W. (1987). J. Virol. 61, 3710-3720.
Saluz, H. P. & Jost, J. P. (1987). A Laboratory Guide to Genomic Sequencing, Birkhäuser, Basel.
Saluz, H. P., Feavers, I. M., Jiricny, J. & Jost, J. P. (1988). Proc. Nat. Acad. Sci., U.S.A. 85, 6697-6700.
Schirm, S. & Doerfler, W. (1981). J. Virol. 39, 694-702.
Sutter, D. & Doerfler, W. (1979). Cold Spring Harbor Symp. Quant. Biol. 44, 565-568.
Sutter, D. & Doerfler, W. (1980). Proc. Nat. Acad. Sci., U.S.A. 77, 253-256.
Sutter, D., Westphal, M. & Doerfler, W. (1978). Cell, 14, 569-585.
Toth, M., Lichtenberg, U. & Doerfler, W. (1989). Proc. Nat. Acad. Sci., U.S.A. 86, 3728-3732.
Vardimon, L. & Doerfler, W. (1981). J. Mol. Biol. 147, 227-246.
Ward, C., Bolden, A., Nalin, C. M. & Weissbach, A. (1987). J. Biol. Chem. 262, 11057-11063.
Watt, F. & Molloy, P. L. (1988). Genes Develop. 2, 1136-1143.
Woodcock, D. M., Crowther, P. J. & Diver, W. P. (1987). Biochem. Biophys. Res. Commun. 145, 888-894.
Woodcock, D. M., Crowther, P. J., Jefferson, S. & Diver, W. P. (1988). Gene, 74, 151-152.

Edited by P. Chambon
