
Biochimica et Biophysica Acta, 564 (1979) 372–389 © Elsevier/North-Holland Biomedical Press

BBA 99526

ORGANISATION OF INVERTED REPEAT SEQUENCES IN HAMSTER CELL NUCLEAR DNA

NORMAN HARDMAN, ANNE J. BELL and ALAN McLACHLAN

Department of Biochemistry, University of Aberdeen, Marischal College, Aberdeen AB9 1AS (U.K.)

(Received February 12th, 1979)

Key words: Nuclear DNA; Inverted repeat sequence; Duplex; Foldback molecule

Summary

Hamster cell nuclear DNA is shown to contain inverted repeat (foldback) sequences, in some respects similar to the foldback fraction in DNA from other animal cell types. Using electron microscopy the majority of foldback duplexes are shown to be located in simple hairpin-like DNA structures, formed from individual pairs of complementary inverted repeated sequences 50–1000 nucleotides in length, in some cases arranged in tandem, and in other cases separated by intervening sequences up to 16 000 nucleotide residues long. In addition, a novel class of foldback structure, referred to as 'bubbled hairpins', is reported; these appear to be formed from clusters of inverted repeat sequences that are separated from adjacent clusters of complementary inverted repeats by large intervening sequences which vary in length from 5000 to over 20 000 nucleotide residues. Due to the special pattern of distribution of these latter inverted repeat sequences, 'bubbled hairpins' are observed only in long foldback DNA. Evidence is presented that the distribution of foldback sequences in hamster cell DNA is highly ordered. The lengths of the intervening single chains in foldback structures appear to vary non-randomly. This gives rise to a localised periodic pattern of organisation that is believed to be a consequence of regular alternating arrangements of foldback and non-foldback sequences in the segments of DNA from which foldback structures are derived.

Abbreviation: Cot is the product of the DNA concentration (mol nucleotide/l) and the time of annealing (in seconds).

Introduction

When nuclear DNA from eukaryotic cells is fragmented, denatured under appropriate conditions, and the DNA chains are then allowed to anneal under conditions favouring duplex formation, a proportion of the polynucleotide chains anneal extremely rapidly, and independently of DNA concentration, to form molecules with partly duplex character. Many studies have been carried out to determine the nature of the sequences that give rise to these duplex-containing molecules [1–8]. These studies lead to the general conclusion that some homologous repeated sequences are closely located in segments of eukaryotic DNA, and are sometimes arranged as inverted, complementary repeats juxtaposed in the same polynucleotide chain. On denaturation, these sequences give rise to foldback molecules due to 'spontaneous' intrachain annealing of the complementary inverted repeated elements. When inverted repeat sequences are immediately adjacent (sequence palindromes) they result in the formation of unlooped hairpin molecules [2], and when located some observably finite distance apart they give rise to structures referred to as looped hairpins [3], which contain intervening single chain loops defining the distance separating the inverted repeated sequences in the native duplex. For both types of molecule the size of the duplex stem in the foldback structure, determined by electron microscopy, has been used to estimate the physical length of the region of intrastrand sequence homology. Such measurements have revealed that the regions of complementary inverted sequence are short, mostly between 100 and 600 nucleotides [2–5,7,9]. Some studies on the distribution of foldback sequences in eukaryotic DNA have utilised the technique of hydroxyapatite chromatography to determine the extent to which foldback sequences and other sequence components are mutually interspersed [2,6,10].
It is concluded from such experiments that in some cases foldback sequences are clustered together in segments of the genome, whereas in other instances foldback foci are widely distributed in the genome and are located in segments of DNA containing a spectrum of all the different reassociation-kinetic components present in total DNA [10]. The function of foldback sequences is not known. However, the fact that they are present in DNA from all eukaryotic organisms studied, both primitive and complex, argues that sequences of this type may have been conserved during evolution, possibly for some special purpose, or purposes. Motivated by this possibility, we have undertaken a study of these sequences in nuclear DNA isolated from cultured hamster cells. In the present paper, experiments are described which reveal the presence of a novel foldback structure formed exclusively from longer strands of hamster cell DNA. From the properties of these and other foldback structures, it is suggested that foldback duplexes are derived from inverted repeat sequences that are arranged in a highly regular manner in hamster cell DNA. The reasons for an apparent discrepancy in the amount of foldback DNA that can be detected in hamster cell DNA by hydroxyapatite chromatography and electron microscopy are discussed.
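As a rough illustration of the quantity Cot (DNA concentration in mol nucleotide/l multiplied by annealing time in seconds), the following sketch converts a mass concentration of DNA into a Cot value and into the time needed to reach a chosen Cot. The mean nucleotide residue mass of 330 g/mol is an assumed round figure that is not taken from this paper, and the function names are hypothetical.

```python
# Sketch: relate DNA mass concentration, annealing time and Cot.
# ASSUMPTION: mean mass of one nucleotide residue is about 330 g/mol
# (a round textbook figure, not a value given in the paper).
NUCLEOTIDE_G_PER_MOL = 330.0


def molar_nucleotide_conc(ug_per_ml: float) -> float:
    """Convert a DNA concentration in µg/ml to mol nucleotide/l."""
    grams_per_litre = ug_per_ml * 1e-3  # 1 µg/ml equals 1e-3 g/l
    return grams_per_litre / NUCLEOTIDE_G_PER_MOL


def cot(ug_per_ml: float, seconds: float) -> float:
    """Cot (M · s) reached after annealing for the given time."""
    return molar_nucleotide_conc(ug_per_ml) * seconds


def seconds_to_reach(cot_target: float, ug_per_ml: float) -> float:
    """Annealing time (s) needed to reach cot_target at this concentration."""
    return cot_target / molar_nucleotide_conc(ug_per_ml)


if __name__ == "__main__":
    # Illustrative concentration of 2 µg DNA/ml.
    print(seconds_to_reach(1e-4, 2.0))
```

Under the stated assumption, a sample at 2 µg DNA/ml would reach Cot = 1 · 10⁻⁴ M · s after roughly 16.5 s, which gives a sense of how brief the annealing periods involved are.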

Experimental methods

Labelling and isolation of DNA. BHK-21/C13 cells were grown in monolayer culture at 37°C for 2.5 days in the presence of [methyl-3H]thymidine (0.25 µCi/ml). Cells were screened routinely for the absence of contamination by mycoplasmas. Nuclei were prepared and nuclear DNA isolated as described previously [7].

Alkaline sucrose gradient centrifugation. 3H-Labelled hamster cell nuclear DNA (specific activity 4.4 · 10³ cpm/µg) was denatured in the presence of 0.2 M NaOH, 1 mM EDTA and layered on to 16.4 ml 5–20% (w/v) sucrose gradients containing 0.9 M NaCl, 0.1 M NaOH and 1 mM EDTA. Gradients were centrifuged at 2 · 10⁴ rev./min at 20°C for 5 h using the SW27.1 rotor in a Beckman preparative ultracentrifuge. Gradient fractions containing 0.58 ml were collected and neutralised by the addition of 0.15 ml of 1.0 M sodium phosphate buffer, pH 6.8 at 4°C. Recovery of radioactivity was greater than 98%.

Hydroxyapatite chromatography. The method is based on the routine procedure used previously [7]. Appropriate neutralised alkaline sucrose gradient fractions containing 3H-labelled DNA were dialysed into 0.12 M sodium phosphate buffer, pH 6.8. Samples of the dialysed material were heat denatured for 10 min at 97–100°C, annealed to Cot 1 · 10⁻⁴ M · s, and applied to hydroxyapatite crystals at 60°C. Unbound single-stranded DNA was removed completely with several washes of 0.12 M sodium phosphate buffer, pH 6.8, at 60°C, after which the duplex fraction was eluted using 0.4 M sodium phosphate buffer, pH 6.8, at the same temperature. Recovery of radioactivity was routinely in excess of 94%.

Digestion of DNA with S1 nuclease. 2 ml of a solution containing 148 µg unsheared 3H-labelled hamster cell nuclear DNA and 12.3 µg 14C-labelled Escherichia coli DNA in 0.18 M NaCl, 0.01 M piperazine-N,N'-bis(2-ethanesulphonic acid) (Pipes), pH 6.8, was heated at 100°C for 10 min to denature the DNA mixture. The solution was quickly cooled to 37°C and immediately added to 2 ml of a solution containing 120 mM NaCl, 0.3 mM ZnSO4, 60 mM sodium acetate, pH 4.5. The mixture was incubated in the presence of 1920 units of S1 nuclease (Boehringer Corp., London) at 37°C, and digestion allowed to proceed. The Cot value up to the digestion step is estimated to be 1–2 · 10⁻³ M · s. To monitor the production of radioactive acid-soluble nucleotides, 0.1 ml samples of the mixture were removed periodically, added to 0.3 ml water and 0.2 ml of a 0.3% (w/v) solution of bovine serum albumin. After mixing, each sample was precipitated by addition of 1.0 ml of 8.0% (w/v) trichloroacetic acid. The precipitate was centrifuged at 0°C in an MSE 4L centrifuge at 3000 rev./min for 10 min, the pellet washed in 1 ml of 5% (w/v) trichloroacetic acid, and the acid-soluble radioactivity determined in the combined supernatants. Digestion for 45 min was sufficient to render all except 0.48 µg of the 14C-labelled DNA and all except 4.65 µg of the 3H-labelled DNA acid soluble, corresponding to 96.8% digestion of the 3H-labelled hamster cell DNA. In separate, control digestions, using tracer 14C-labelled native E. coli DNA, 2–6% of the 14C label was rendered acid soluble. The products of digestion of the denatured, fast-annealed DNA were applied to a column containing 1 g of hydroxyapatite (Bio-Rad HTP, Bio-Rad Laboratories, Bromley, Kent) previously equilibrated at 60°C with 0.05 M sodium phosphate, pH 6.8. The column was washed at 60°C with 10 ml of the same buffer, followed by 10 ml of 0.12 M sodium phosphate, pH 6.8, to remove residual single-stranded DNA. Duplexes were then removed by elution at 60°C with 10 ml of 0.4 M sodium phosphate, pH 6.8. This fraction contained 3.2% of the total recovered 3H radioactivity, and 0.2% of the 14C radioactivity.

Electron microscopy. Neutralised alkaline sucrose gradient fractions, at 20°C, containing DNA, were prepared immediately for electron microscopy without intentional annealing. To avoid possible shear degradation of DNA, fractions were used directly without removal of sucrose, after dilution of the samples at least two-fold with 0.4 M sodium phosphate buffer, pH 6.8, in order to bring the fractions to a uniform DNA concentration of about 2 µg DNA/ml. For the gradient fractions referred to in this study, the estimated maximum Cot value for reassociation for the samples varied between 1 · 10⁻⁴ and 1 · 10⁻³ M · s. For unfractionated DNA, its concentration was adjusted appropriately to give an estimated Cot value of 1 · 10⁻⁴ M · s during spreading for electron microscopy. A modified protein monolayer technique was used for electron microscope sample preparation, similar to that devised by Davis and Hyman [11]. Hyperphase solutions contained 0.6 µg DNA/ml, 0.12–0.15 M sodium phosphate, 30–50 µg cytochrome c/ml, 55% (v/v) formamide, 10 mM EDTA and 100 mM Tris-HCl, pH 8.5. The hypophase contained 20% (v/v) formamide and 10 mM Tris-HCl, pH 8.5. Specimens were mounted on 200-mesh copper grids coated with Parlodion support film, then stained for 30 s in a fresh solution of 50 µM uranyl acetate, 50 µM HCl in 90% (v/v) ethanol, washed for 10 s in 2-methylbutane and air-dried. Grids were rotary shadowed with Pt/Pd (4 : 1, w/w) and viewed with a Philips EM400 electron microscope. Length measurements were made using a Depose-HC map measurer. Using λ-DNA as a duplex standard, and φX174 single-stranded circles as a single chain standard, DNA size was calibrated for the conditions of DNA spreading described above. Accordingly, the nucleotide equivalent/unit length of duplex DNA was shown to be 3100 bases/µm and for single-stranded DNA 3370 bases/µm [12].

Results

Hydroxyapatite chromatography of fast-annealed DNA

Previous observations have shown that foldback sequences are possibly clustered in segments of some eukaryotic DNAs [2,10]. Accordingly, attempts were made during the course of the present study to obtain sufficiently long DNA chains to investigate the possibility that foldback foci in hamster cell DNA might also be clustered. Due to difficulties in obtaining homogeneously long DNA molecules by ordinary methods, we resorted to sedimenting DNA through alkaline sucrose gradients to obtain fractions containing single-stranded molecules of reasonably uniform lengths for further study. After neutralisation of fractions selected from across the gradient, the proportion of the DNA that contained duplexes was determined from the amount of material which bound selectively to hydroxyapatite crystals at 60°C, after annealing to an approximate Cot of 1 · 10⁻⁴ M · s. The results, shown in Fig. 1, demonstrate that the yield of spontaneously annealed, hydroxyapatite-bound duplex-containing DNA increases disproportionately at long chain lengths. This indicates that longer strands of DNA

[Fig. 1 plot. Abscissa: DNA chain length (nucleotides × 10⁻³).]

Fig. 1. Hydroxyapatite binding of briefly annealed hamster cell DNA. Fractions containing 3H-labelled hamster cell DNA were taken from alkaline sucrose gradients after centrifugation, as described in the text. The size of DNA in each fraction was determined by sedimenting appropriate samples in further alkaline sucrose gradients by reference to intact single chains of bacteriophage λ-DNA (sedimentation coefficient ~40 S), using the relationship between sedimentation coefficient and chain length for denatured DNA [13]. Neutralised fractions were re-denatured by heat, annealed to Cot = 1 · 10⁻⁴ M · s, applied to hydroxyapatite at 60°C, and the fraction of radioactivity in the duplex-containing fraction determined. Recovery of radioactivity was routinely over 90%. DNA of length 61 000 bases (data not shown) gave 28% binding in the duplex fraction.

contain additional segments of hydroxyapatite-bindable duplex, formed under conditions of fast annealing, that are absent from smaller single chains. The result prompted the following detailed investigation of the structural properties of foldback molecules which result from fast annealing of DNA fractions containing long single chains.

Electron microscopy of foldback molecules

Samples of fractions from alkaline sucrose gradients were taken, neutralised, and spread immediately for observation in the electron microscope in the presence of formamide, without intentional annealing. The DNA molecules observed were classified as shown in Table I. Data from two DNA fractions of different sizes are compared with smaller, heat-denatured fragments of unfractionated hamster cell DNA under similar conditions of spreading for electron microscopy. Electron micrographs of samples taken from fractions containing longer molecules showed structures broadly similar to those classified in Table I. However, these molecules were extremely tangled, contained many duplex regions and single chain crossovers, and were difficult to interpret completely. Further analysis was therefore restricted to samples containing DNA of intermediate size. There are several features of note in the data shown in Table I. First, most of the structures in both sucrose gradient-fractionated samples are basically of similar types to those present in total, unfractionated DNA. It is therefore considered unlikely that the fractionation procedure selects unrepresentative subfractions of foldback sequences from the DNA, although such a possibility cannot be excluded from the type of experiment described here. Most of the DNA molecules formed structures which were interpretable, the remaining structures possessing many single chain crossovers which precluded a completely unambiguous analysis. However, some of these complex, uninterpreted molecules also contained foldback elements. A significant increase in the observed frequency of looped hairpins compared with unlooped hairpins was found in

TABLE I

CLASSIFICATION OF STRUCTURES IN HAMSTER CELL FOLDBACK DNA

The number-average length of molecules in each case (in parentheses) should be taken as approximate, and was determined from the length of fifty single chain molecules in each of the samples taken at random. Linear ds/ss refers to linear duplex molecules with single chains at each terminus. The origin of forked structures, which contain duplexes bounded by three or more single chain tails, and other structural types, is discussed in the text. Fractionated samples were obtained from alkaline sucrose gradients, and were neutralised and treated as described in Experimental methods. The unfractionated sample was heat denatured for 10 min at 100°C in 0.12 M sodium phosphate, pH 6.8, and then annealed to Cot 1 · 10⁻⁴ M · s at 60°C before spreading for observation in the electron microscope under conditions similar to those used for fractionated samples. Numbers of structures observed were: unfractionated, 798; fraction (a), 342; fraction (b), 548.

Structural types        Source of DNA
(% of total)            Unfractionated    Alkaline sucrose    Alkaline sucrose
                        (1.5 µm)          fraction (a)        fraction (b)
                                          (4.5 µm)            (8.2 µm)

Simple hairpins
  Looped                 3.9               9.9                10.9
  Unlooped               5.5               1.8                 3.6
Multiple hairpins        0.9               1.2                 1.9
Bubbled hairpins         0                 1.2                 4.0
Linear ds/ss             0.9               1.3                 1.2
Forks                    0.1               2.4                 1.1
Forks + hairpins         0                 0                   0.4
Single chains           88.2              79.3                74.8
Uninterpreted            0.6               2.9                 2.1

long, sucrose gradient-fractionated DNA as opposed to unfractionated DNA. This may partly account for the increase in foldback DNA yield in longer chains (see below), since the yield of looped hairpins is expected to be dependent upon the relative size of the DNA compared with the length of single chains that separate sequences potentially able to form looped foldback duplexes. As expected, a significant increase in the number of structures containing multiple foldback foci was observed for longer, fractionated DNA. Occasional forked molecules were detected, interpreted as having arisen from intermolecular annealing of separate DNA chains [4,14]. A small proportion of long linear duplexes with single chain terminals were seen, which have been found previously as a component in hamster cell foldback DNA [7]. The most striking difference in comparing the data in Table I is the appearance of a new class of foldback molecule, present exclusively in long chains, which contains multiple regions of DNA duplex, many of which are arranged in clusters. These structures will be referred to as 'bubbled hairpins', some examples of which are shown in Figs. 2 and 3. From electron micrographs taken of alkaline sucrose-fractionated DNA, containing 8.2-µm fragments, referred to in Table I, estimates were obtained for duplex stem lengths, single chain loop lengths and interhairpin distances for simple hairpins and bubbled hairpins. The results are shown in Figs. 4 and 5. The frequency distribution of duplex stem lengths on simple hairpins, illustrated in Fig. 4a, resembles that observed in our previous study using much shorter DNA chains [7]. Most of the stem lengths fall within the range 50–1000 nucleotide pairs, similar to the majority of the individual duplex stems present in bubbled hairpins, also shown in Fig. 4a. A somewhat similar distribution of lengths is observed for the foldback duplexes recovered by hydroxyapatite chromatography following removal of single-stranded tails on foldback molecules with S1 nuclease (Fig. 4b), although smaller duplexes are preferentially lost after this treatment. An electron micrograph of the S1 nuclease-resistant duplexes is shown in Fig. 6. A similar length distribution to those presented in Fig. 4 is also observed for the single chain loops, which we have termed substitution loops, present between successive duplexes along the stems of bubbled hairpins. These data are shown in Fig. 5a. Combined measurements of interhairpin distances, the central loop lengths of simple hairpins, and the central loop lengths of bubbled hairpins, are shown in Fig. 5b.

Discussion

Duplexes in fast-annealing hamster cell DNA have a foldback configuration

Several explanations might account for the presence of a duplex-containing fraction when DNA is denatured and allowed to anneal briefly. The presence of interchain DNA cross-links causes denatured DNA to renature 'spontaneously'

Figs. 2 and 3. Electron micrographs of large bubbled hairpins in 8.2 µm long foldback DNA. Molecules were observed in electron micrographs taken of neutralised alkaline sucrose gradient fraction (b), referred to in Table I. The bars correspond to a length of 0.5 µm.

[Fig. 4 histograms, panels (a) and (b). Abscissa: length (nucleotide pairs × 10⁻³).]

Fig. 4. Length distribution of duplex stems in 8.2 µm long hamster cell foldback DNA. (a) Shows the size distribution of duplex stems measured in molecules containing unlooped hairpins, and simple looped hairpin duplexes (unshaded areas), and individual duplex stems in bubbled hairpins (shaded areas) as indicated in the chain diagrams. (b) Shows the size distribution of the duplexes following limit digestion of foldback DNA with S1 nuclease, and recovery of the duplex fraction by hydroxyapatite chromatography.

[Fig. 5 histograms, panels (a) and (b). Abscissa: length (nucleotides × 10⁻³).]

Fig. 5. Length distribution of interspersed single chains in the segments of hamster cell DNA containing foldback sequences. (a) Single chain substitution loops in the stems of bubbled hairpins. (b) Collective data for central loops of all looped hairpins, and interhairpin distances.

under appropriate conditions [2]. Similarly, highly repetitive satellite-like sequences in DNA from some eukaryotic organisms anneal rapidly, in this case with second-order kinetics [15]. As discussed previously [7], the latter explanation might conceivably account for the configuration of the minor fraction of the duplex-containing structures in fast-annealed DNA, listed as forks and linear double-stranded/single-stranded structures in Table I. However, the overwhelming majority of the duplex structures classified have a clearly distinguishable foldback configuration. Hence, the DNA binding to hydroxyapatite after denaturation and annealing (Fig. 1) is believed to be due primarily to the presence of foldback structures formed from inverted repeat sequences in hamster cell DNA. One unfortunate, and serious, disadvantage of hydroxyapatite fractionation, however, is that it does not appear to be entirely selective. Structures containing smaller foldback duplexes do not bind to hydroxyapatite as effectively as longer foldback duplexes [7]. This can account for the larger proportion of DNA chains presently scored as containing foldback foci by electron microscopy (Table I) compared with the amount of material bound to hydroxyapatite in the foldback fraction (Fig. 1) for DNA chains of a given length. This effect may similarly account for the results presented in Fig. 4, where the foldback duplex fraction recovered by hydroxyapatite chromatography after S1 nuclease digestion contains a reduced proportion of duplexes in the size range 50–500 base pairs. As will be discussed below, some of these observations may offer a possible explanation for the somewhat unusual nature of the foldback DNA binding curve shown in Fig. 1.
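The proportion of chains scored by electron microscopy as containing foldback foci can be tallied directly from the percentages in Table I. The sketch below is one way to do this; the choice of which structural classes count as "foldback-containing" is an assumption made here for illustration, and the dictionary keys and function name are hypothetical.

```python
# Percentages of structural types transcribed from Table I.
TABLE_I = {
    "unfractionated": {
        "looped": 3.9, "unlooped": 5.5, "multiple": 0.9, "bubbled": 0.0,
        "linear_ds_ss": 0.9, "forks": 0.1, "forks_hairpins": 0.0,
        "single_chains": 88.2, "uninterpreted": 0.6,
    },
    "fraction_a": {
        "looped": 9.9, "unlooped": 1.8, "multiple": 1.2, "bubbled": 1.2,
        "linear_ds_ss": 1.3, "forks": 2.4, "forks_hairpins": 0.0,
        "single_chains": 79.3, "uninterpreted": 2.9,
    },
    "fraction_b": {
        "looped": 10.9, "unlooped": 3.6, "multiple": 1.9, "bubbled": 4.0,
        "linear_ds_ss": 1.2, "forks": 1.1, "forks_hairpins": 0.4,
        "single_chains": 74.8, "uninterpreted": 2.1,
    },
}

# ASSUMPTION: these classes are the ones counted as containing at least
# one foldback duplex; forks and linear ds/ss molecules are excluded.
FOLDBACK_CLASSES = ("looped", "unlooped", "multiple", "bubbled", "forks_hairpins")


def foldback_percent(sample: str) -> float:
    """Sum the Table I percentages of the foldback-containing classes."""
    return sum(TABLE_I[sample][name] for name in FOLDBACK_CLASSES)


if __name__ == "__main__":
    for sample in TABLE_I:
        print(sample, round(foldback_percent(sample), 1))
```

On this counting the foldback-containing classes make up about 10.3%, 14.1% and 20.8% of the structures in the unfractionated sample, fraction (a) and fraction (b) respectively. These sums leave out the uninterpreted molecules, some of which also contained foldback elements, so they are best read as lower bounds.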

Fig. 6. Electron micrograph of S1 nuclease-resistant duplexes in hamster cell foldback DNA. Denatured hamster cell DNA was annealed briefly, treated with S1 nuclease, and the resistant material from the limit digest applied to hydroxyapatite at 60°C. The duplex fraction was eluted and examined by electron microscopy. The bar corresponds to a length of 0.5 µm.

Binding of foldback DNA to hydroxyapatite is not monophasic

The binding of foldback sequences to hydroxyapatite as a function of DNA chain length has been used to indicate the pattern of distribution of foldback sequences in eukaryotic DNA [2]. The almost linear nature of the DNA binding curve in the range 0–10 000 nucleotides in Fig. 1, and the clearly positive intercept on extrapolation to the ordinate axis, are similar to the results obtained by Wilson and Thomas [2] for mouse main-band foldback DNA. This type of observation has been used as evidence to indicate that foldback sequences are clustered together [16]. However, the observations outlined above with regard to the quantitative efficiency of the hydroxyapatite fractionation method cast some doubt on the validity of the technique on which this conclusion is based. In view of the indeterminate nature of the factors which would appear to govern the binding of duplex-containing foldback molecules to hydroxyapatite, as indicated above, we regard the electron microscopic method as being possibly a more reliable assay for the proportion of molecules containing foldback duplexes. An additional factor indicated by the results in Fig. 1, which is not taken into account in the simple treatment of hydroxyapatite binding data by Hamer and Thomas [16], is the upturn in slope of the DNA binding curve for DNA chains longer than 10 000 bases. The observation demonstrates that longer DNA chains contain new populations of foldback structures that are absent from shorter molecules, formed by the pairing of complementary inverted repeat sequences that are separated by intervening sequences in excess of 10 000 nucleotide residues in length. In qualitative terms, this coincides with the appearance in the electron microscope of bubbled hairpins and many additional looped hairpins (Table I). The chain diagrams of a number of representative bubbled hairpins are shown in Fig. 7. It can be seen that the larger intervening sequences present in these structures can account readily for the absence of bubbled hairpins from short DNA chains. Moreover, these structures contain several duplex regions with larger than average foldback duplexes (Fig. 4a) and might therefore be expected to bind to hydroxyapatite more efficiently than the foldback structures formed from small DNA fragments which contain shorter foldback duplexes [7]. The additional looped hairpins that are formed in longer DNA chains (Table I) similarly may contribute to this effect.
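To relate the contour lengths quoted in Table I (in µm) to the chain lengths in nucleotides used in Fig. 1 and in the argument above, the calibration factors given in Experimental methods (3100 bases/µm for duplex DNA, 3370 bases/µm for single-stranded DNA) can be applied. The following is a minimal sketch of that conversion; the function name and the default of treating a length as single-stranded are choices made here for illustration.

```python
# Calibration factors quoted in Experimental methods [12].
BASES_PER_UM_DUPLEX = 3100.0
BASES_PER_UM_SINGLE = 3370.0


def contour_to_bases(length_um: float, single_stranded: bool = True) -> float:
    """Convert a measured contour length (µm) to nucleotides (or nucleotide pairs)."""
    factor = BASES_PER_UM_SINGLE if single_stranded else BASES_PER_UM_DUPLEX
    return length_um * factor


if __name__ == "__main__":
    # Number-average single chain lengths from Table I.
    for label, um in (("unfractionated", 1.5), ("fraction (a)", 4.5), ("fraction (b)", 8.2)):
        print(label, round(contour_to_bases(um)))
    # The 0.5 µm scale bar on the micrographs, read as duplex DNA.
    print("scale bar", round(contour_to_bases(0.5, single_stranded=False)))
```

With these factors the 8.2 µm chains of fraction (b) correspond to about 27 600 bases, and the 4.5 µm and 1.5 µm samples to about 15 200 and 5100 bases, so only the fractionated samples comfortably exceed the 10 000-base region where the binding curve turns upward.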

Periodic organisation of foldback sequences

The length distributions illustrated in Figs. 4 and 5 show apparently nonuniform distributions of lengths for duplex hairpin stems, and also for the interspersed single chain segments in hairpin-containing regions of DNA. The results show evidence of periodicities in the length determinations, with regular intervals between the peak lengths. Limitations, both in the number of molecules which can be scored using this approach, and in the precision of the mapping procedure, have precluded an accurate determination of the periodic intervals. We have ruled out the possibility, however, that non-random length distributions may arise due either to an artifact in the method of measurement, or to the way in which lengths are sorted into groups. We regard the above evidence as preliminary, although we have obtained similar results in studies of the organisation of foldback sequences in Physarum DNA [9,17]. The observations suggest that the lengths of separate foldback duplexes, and the size of intervening sequences in the same locality as foldback sequences in DNA, may be related in a manner analogous to the terms of an arithmetic series. One of two possible explanations might account for this result. They are not mutually exclusive. The first and most straightforward explanation is that

[Fig. 7 chain diagrams. Abscissa: length (nucleotides × 10⁻³).]

Fig. 7. DNA chain diagrams of bubbled hairpin structures. A series of bubbled hairpin structures are mapped, as illustrated, to show the variation in the length of foldback duplexes and interspersed single chains. Most bubbled hairpin foci occupy regions of DNA less than 14 000 bases, and many of the duplexes appear to be arranged in clusters (see text). Larger foldback structures extend over much larger segments of DNA, up to 50 000 bases. This is the same order of size as the DNA chains used in these experiments (8.2 µm, Table I). Duplexes are indicated by thickened segments, and arrows show the midpoint of the 'turn-around' in the bubbled hairpins. Single chain substitutions in structures which contained many bubbles, where necessary, were ascribed arbitrarily to either arm of the structure, since it is not possible to determine from which arm a given loop is derived.

regions of the DNA in which some, or all, foldback sequences reside may share a similar structure. For example, these regions may consist of tandemly repeated sequences that can generate foldback DNA structures whose properties reflect the periodic nature of the component sequences. Sequences of this general type are already known in eukaryotic DNA. For example, certain rodent satellite DNA species consist of sequences that have evolved continuous long-range tandem repeats superimposed upon divergent, shorter-range repeats [18,19]. Alternatively, sequences unrelated to (or diverged from) foldback sequences may be interspersed in a given region of DNA containing inverted repeat sequences, and these sequence elements may share the same highly ordered pattern of organisation with the surrounding foldback sequences. It is stressed that the nature of the sequences in these regions in either case, whether repetitive or 'single copy' in the normal sense, is not the subject of this study. The tandemly arranged sequences in different segments of DNA would be required to share a common unit length in order to explain the results. It is not necessary, however, to postulate that the nucleotide sequences in these regions are related; again, by analogy with DNA satellites, distinct, unrelated tandemly repeated sequences, even in different eukaryotic organisms, share a common long-range periodic pattern of organisation [20]. A plausible scheme which can explain the properties of foldback sequences based on these ideas is shown in Fig. 8.
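The arithmetic-series relationship suggested above can be made concrete with a small sketch. If foldback duplexes and intervening segments are built from tandem units of a common length, their lengths should fall near whole multiples of that unit. The code below generates such a series and scores how closely a set of lengths sits on multiples of a candidate unit. The unit length of 300 nucleotides and the function names are purely hypothetical; the paper states that the periodic interval could not be determined accurately.

```python
from typing import List, Sequence


def arithmetic_series(unit: float, n_terms: int) -> List[float]:
    """Lengths expected for 1..n_terms tandem units of a common length."""
    return [unit * n for n in range(1, n_terms + 1)]


def mean_offset_from_multiples(lengths: Sequence[float], unit: float) -> float:
    """Mean distance of each length from the nearest multiple of `unit`,
    expressed as a fraction of `unit`.

    0.0 means every length is an exact multiple; lengths spread uniformly
    at random would give a value near 0.25; 0.5 is the maximum.
    """
    offsets = []
    for length in lengths:
        remainder = length % unit
        offsets.append(min(remainder, unit - remainder) / unit)
    return sum(offsets) / len(offsets)


if __name__ == "__main__":
    hypothetical_unit = 300  # illustrative only, not a measured value
    print(arithmetic_series(hypothetical_unit, 4))
    print(mean_offset_from_multiples([310, 590, 905], hypothetical_unit))
```

A score of this kind is only meaningful when the candidate unit is large compared with the measurement error, since almost any set of lengths will appear to fit a sufficiently small unit.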


Fig. 8. A scheme to illustrate the possible arrangement of inverted repeat sequences in hamster cell DNA. It is supposed that foldback duplexes may be derived from regions of DNA, R-regions, that are arranged intermittently and contain tandemly arranged sequences of a common unit length, s. Some R-regions contain inverted tandem repeats. On denaturation, different types of foldback structure may result by annealing complementary inverted sequences in R-regions. The structural properties of foldback molecules depend upon whether tandem repeats in adjacent R-regions are related, whether they are in direct register or in an inverted orientation in the native duplex, and on the location and extent of single chain scissions in the DNA. In the scheme shown here, unlooped hairpins (c) and most looped hairpins (d), both simple and bubbled, can be derived from sequences within an R-region. Large bubbled hairpin structures (b, see Figs. 2 and 3) and larger simple looped hairpin structures (a), can result when separate and distinct R-regions, containing several inverted complementary sequences, are present in the same DNA chain. When present in separate DNA chains, such sequences may fail to form foldback duplexes entirely, or may form smaller foldback duplexes too unstable to bind to hydroxyapatite, as described in the text.


Possible origin and function of foldback sequences

A parallel has been drawn between the possible nature of foldback sequence elements, and the properties of tandemly repeated sequences in defined DNA satellites. No distinct satellite species have yet been resolved in hamster cell DNA [21]. There is no reason to suppose, however, that satellite-like sequences should not be present, since the organisation of the hamster genome would not be expected to be fundamentally different from that of closely related species such as mouse or rat, both of which contain collections of satellite sequences in their DNA [19,22]. Several explanations could account for the absence of satellite components in DNA; for example, they may be covalently attached to other sequence components, or they may have a base composition indistinguishable from main-band DNA.

An alternative explanation for the origin of a periodic pattern of organisation of foldback sequences which should be considered is that special classes of sequences with a periodic structure, such as repetitive gene clusters, might account for these observations. Potential candidates that have so far been characterised include the genes for rRNA, tRNA, 5 S RNA and histone genes. However, based on the known structure of these genetic units, this explanation fails to account for the present results. Moreover, the experiment involving the selection of foldback molecules by hydroxyapatite chromatography, shown in Fig. 1, shows that at least 11% of long fragments of hamster cell DNA contain foldback elements, a proportion which in any event is too great to ascribe to DNA containing defined repetitive gene clusters.

Other independent evidence suggests that nucleotide sequences other than defined satellites in eukaryotic DNA may be arranged in a periodic fashion. In experiments similar to those described here, Thomas et al. [23], in Fig. 13 of their paper, show that the distribution of foldback sequences in HeLa cell DNA is seemingly periodic. Denaturation bubbles produced in Chinese hamster cell nuclear DNA by thermal melting and fixation with formaldehyde have also been shown to be located in comparatively discrete, periodic intervals [24]. It is also of interest that the sites for the initiation of DNA synthesis in Drosophila DNA are arranged in spaced clusters that possess a periodic pattern of organisation similar to that of the foldback elements described here [25].

If the regions of DNA from which the foldback sequences are derived resemble satellite sequences in character, then, similar to satellite DNA, foldback sequences may serve a largely 'structural' role [26]. In other systems, the presence of foldback elements in RNA transcripts has been used as evidence for their involvement in RNA processing [27], although it may well be too simplistic to ascribe a single function to foldback sequences. Perhaps, in some cases at least, their presence in eukaryotic DNA is a consequence of complex evolutionary processes acting on DNA in chromatin at a structural level, rather than their configuration being related to the function of foldback sequences per se.

Acknowledgements

The authors are grateful to the Medical Research Council and the Scottish Hospitals Endowments Research Trust for the support of this work. We also thank Henry Eichelberger and Doris Duncan for skilful preparation of electron


micrographs, Alison Slater for help with tissue culture, Sandra Short for typing the manuscript, and Professor H.M. Keir for provision of facilities in the Department of Biochemistry.

References

1 Cech, T.R., Rosenfeld, A. and Hearst, J.E. (1973) J. Mol. Biol. 81, 299--325
2 Wilson, D.A. and Thomas, C.A. (1974) J. Mol. Biol. 84, 115--138
3 Cech, T.R. and Hearst, J.E. (1975) Cell 5, 429--446
4 Schmid, C.W., Manning, J.E. and Davidson, N. (1975) Cell 5, 159--172
5 Deininger, P.L. and Schmid, C.W. (1976) J. Mol. Biol. 106, 773--790
6 Perlman, S., Phillips, C. and Bishop, J.O. (1976) Cell 8, 33--42
7 Bell, A.J. and Hardman, N. (1977) Nucleic Acids Res. 4, 247--268
8 Hardman, N. and Jack, P.L. (1977) Eur. J. Biochem. 74, 275--283
9 Hardman, N. and Jack, P.L. (1978) Nucleic Acids Res. 5, 2405--2424
10 Schmid, C.W. and Deininger, P.L. (1975) Cell 6, 345--358
11 Davis, R.W. and Hyman, R.W. (1971) J. Mol. Biol. 62, 287--301
12 Hardman, N., Jack, P.L., Brown, A.J.P. and McLachlan, A. (1979) Biochim. Biophys. Acta 562, 365--376
13 Studier, F.W. (1965) J. Mol. Biol. 11, 373--390
14 Hardman, N. (1974) Biochem. J. 143, 521--534
15 Waring, M. and Britten, R.J. (1966) Science 154, 791--794
16 Hamer, D.H. and Thomas, C.A., Jr. (1974) J. Mol. Biol. 84, 139--144
17 Hardman, N., Jack, P.L., Brown, A.J.P. and McLachlan, A. (1979) Eur. J. Biochem. 94, 179--187
18 Cooke, H.J. (1975) J. Mol. Biol. 94, 87--99
19 Southern, E.M. (1975) J. Mol. Biol. 94, 51--69
20 Maio, J.J., Brown, F.L. and Musich, P.L. (1977) J. Mol. Biol. 117, 637--655
21 Thiery, J.P., Macaya, G. and Bernardi, G. (1976) J. Mol. Biol. 108, 219--254
22 Hörz, W., Hess, I. and Zachau, H.G. (1974) Eur. J. Biochem. 45, 501--517
23 Thomas, C.A., Jr., Pyeritz, R.E., Wilson, D.A., Dancis, B.M., Lee, C.S., Bick, M.D., Huang, H.L. and Zimm, B.H. (1973) Cold Spring Harbour Symp. Quant. Biol. 38, 353--370
24 Evenson, D.P., Mego, W.A. and Taylor, J.H. (1972) Chromosoma 39, 225--235
25 Zakian, V. (1976) J. Mol. Biol. 108, 305--331
26 Walker, P.M.B., Flamm, W.G. and McLaren, A. (1969) in Handbook of Molecular Cytology (Lima de Faria, A., ed.), pp. 52--67, North Holland, Amsterdam
27 Jelinek, W. (1977) J. Mol. Biol. 115, 591--600
