J. Mol. Biol. (1976) 108, 665-682

Fluctuational Opening of the Double Helix as Revealed by Theoretical and Experimental Study of DNA Interaction with Formaldehyde

ALEXANDER V. LUKASHIN, ALEXANDER V. VOLOGODSKII, MAXIM D. FRANK-KAMENETSKII AND YURII L. LYUBCHENKO

Department of Biology
I. V. Kurchatov Institute of Atomic Energy, Moscow 123182, U.S.S.R.

(Received 2 March 1976, and in revised form 7 July 1976)

It follows from the theory of helix-coil transition that local opening of the double helix due to thermal fluctuations must take place at temperatures well below the melting range. Provided that formaldehyde cannot react with hydrogen-bonded bases and can react only with the exposed ones, such fluctuational opening of base-pairs may be probed by formaldehyde. To test the adequacy of the theory for describing the fluctuational opening of the double helix, the process of DNA interaction with formaldehyde has been simulated by the Monte Carlo method with the aid of a computer. On the basis of kinetic constants for the forward and reverse reactions of formaldehyde with all four nucleotides measured by McGhee & von Hippel (1975a,b), the kinetic curves of DNA unwinding by formaldehyde have been calculated theoretically without the use of any adjustable parameter. The calculations have demonstrated that the highly reversible but very fast reaction of formaldehyde with the imino group of thymine plays a very important role in the process as a whole. The results of computations of the characteristic time of the process of DNA unwinding by formaldehyde have been compared with experimental data obtained for bacteriophage T7 DNA for different values of pH, temperature and concentration of formaldehyde. This comparison leads to the conclusion that the theory offers a correct general description of fluctuational opening of the double helix. The main characteristics of this process calculated by the theory are as follows. At room temperatures only individual base-pairs are opened and the mean distance between adjacent unpaired bases is as great as 2 × 10^5 base-pairs. Each A·T pair is shown to be opened with a frequency of about 10~ s^-1 and each G·C pair with a frequency of about 10 s^-1. At elevated temperatures, in the DNA premelting region, the probability of fluctuational opening, as well as the average number of base-pairs in an opened region, increases considerably. The possible bearing of these results on the analysis of DNA function in the cell is discussed.

1. Introduction

During recent years interest has been aroused in the investigation of the conformational motility of the DNA double helix (for references see review by Frank-Kamenetskii & Lazurkin, 1974). Such conformational motility is believed to play an important role in DNA functioning in the cell and has been used recently to explain the DNA organization in chromatin, in virus particles and so on (see, for example, Crick & Klug, 1975). However, investigation of these effects presents severe problems because the conformational changes under ordinary conditions are of a fluctuating nature and their stationary concentration for isolated DNA in solution is very small; the overwhelming majority of base-pairs belongs to the helical B conformation. Consequently they cannot as a rule be studied directly by physical methods, and different indirect approaches are generally used. Among such indirect approaches to the fluctuational motility of the double helix, the hydrogen exchange method and chemical modification of DNA are widely applied. Although such experiments undoubtedly indicate some sorts of transient conformational changes, their quantitative interpretation in terms of the particular types of conformational changes involved is a difficult problem.

In this paper we have restricted ourselves to considering only one particular type of conformational motility, a fluctuational opening of the double helix. There are two distinct lines of evidence for such transient openings. On the one hand, there are experimental and theoretical studies of the helix-coil transition in oligonucleotides (Gralla & Crothers, 1973) and polynucleotides (see review by Frank-Kamenetskii & Lazurkin, 1974); these studies have led to the conclusion that melting out of small regions of double helix occurs at temperatures below the melting range. On the other hand, investigations of the kinetics of DNA unwinding by formaldehyde have led to similar conclusions (Lazurkin et al., 1970; Utiyama & Doty, 1971; von Hippel & Wong, 1971; Frank-Kamenetskii & Lazurkin, 1974). The use of formaldehyde as a probe for fluctuational openings of the DNA molecule is based on the assumption that any base may react with formaldehyde if and only if it is positioned in an opened state (with disrupted hydrogen bonds and stacking interactions). Recently McGhee & von Hippel (1975a,b) have presented additional evidence for this assumption.
However, it is very difficult to relate the process of fluctuational opening of the double helix to the experimentally followed process of DNA interacting with formaldehyde, owing to a host of elementary reactions of formaldehyde with different chemical groups (amino and imino) of the bases (McGhee & von Hippel, 1975a,b). We believe that the variety of the elementary reactions makes insoluble the inverse problem of extracting the characteristics of the fluctuational opening directly from the experimental data on DNA interaction with formaldehyde. Because of this complexity the attempts to solve the inverse problem by Utiyama & Doty (1971) and von Hippel & Wong (1971) have led to contradictory results (see also Frank-Kamenetskii & Lazurkin, 1974).

In this paper we describe a different approach to the problem. We assume, as has been done by Lazurkin et al. (1970), Frank-Kamenetskii & Lazurkin (1974) and Vologodskii & Frank-Kamenetskii (1975), that the model used in the theory of helix-coil transition is valid, in its main features, at temperatures below the melting range. The characteristics of the process of fluctuational opening of the double helix are calculated with the aid of this theory. Then the results of these calculations, in conjunction with kinetic constants of the formaldehyde reaction with free nucleotides (measured by McGhee & von Hippel, 1975a,b), are used to compute kinetic curves of DNA interacting with formaldehyde. In the course of these calculations the reversibility of the reaction of formaldehyde with nucleotides has been taken into account. In this point our treatment differs radically from the preceding ones. Taking the reversibility of the reaction into consideration has proved to be indispensable for computing the quantitative characteristics of the interaction of DNA with formaldehyde. It has made possible a calculation of the quantitative values of the characteristic time of the process and their dependence on temperature, pH and concentration of formaldehyde, without the use of adjustable parameters. The results of these calculations are compared with experimental data. The degree of agreement between the theory and experiment may be used as a measure of the validity of the assumptions which form the basis of the theory: the first and major assumption is that the results of the theory of helix-coil transition may be extrapolated to the temperature region well below the melting range.

2. Materials and Methods

A homogeneous preparation of bacteriophage T7 DNA was obtained by hot phenol extraction (Massie & Zimm, 1965) from phage T7 purified in a density gradient. The DNA samples contained not more than 1% of protein. A formaldehyde stock solution with a concentration of about 10 M was obtained by dissolution of pure gaseous formaldehyde in deionized water. Any formic acid present was removed by passing the solution through an ion-exchange column of IRA-68. Different concentrations of formaldehyde were used in 0.2 M-phosphate buffer. Kinetic curves were obtained using a Unicam SP800 spectrophotometer interfaced with an SP22 recorder. The pH of each formaldehyde solution, both before the experiment and in the reaction mixture after it, was measured with a pH-340. In all cases the pH value remained practically unchanged in the course of the experiment.

3. Method of Calculation

Our calculations of the interaction of DNA with formaldehyde are based on the following main assumptions. The formaldehyde molecule may react only with opened nucleotides and its reaction blocks the Watson-Crick hydrogen bonding. The probability of fluctuational opening of the double helix may be obtained from the theory of helix-coil transition.

The calculations have been performed by the Monte Carlo method with the aid of a BESM-6 computer. First, a random sequence of nucleotide pairs was generated. Then the probability $W_i$ of fluctuational opening for each base-pair was computed using the theory of the helix-coil transition (see Appendix 1). Provided that the rate of Watson-Crick base-pair closing after fluctuational opening is much greater than the reaction rate of formaldehyde with an opened nucleotide, the modification rate constant for one of the two nucleotides in the $i$th pair of the generated double helix is equal to (see, for example, Frank-Kamenetskii & Lazurkin, 1974):

$$P_{i,\alpha} = W_i \, K^{(f)}_{\alpha} \, \varphi, \qquad (1)$$

where $\varphi$ is the formaldehyde concentration in solution and $K^{(f)}_{\alpha}$ is the rate constant of the forward reaction of formaldehyde with an opened nucleotide. The index $\alpha$ may have, generally speaking, one of four values (A, T, G or C). For a given value of $i$ it may have only one of two values, according to which particular base-pair (A·T or G·C) occupies the $i$th position of the chain. The rate constants $K^{(f)}_{\alpha}$ as well as the reverse rate constants $K^{(r)}_{\alpha}$ have been measured recently by McGhee & von Hippel (1975a,b) for all four nucleotides at different values of temperature and pH. Table 1 shows the rate constants and their activation enthalpies that have been chosen by us on the basis of these and other data (see Appendix 2).
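As an illustration of equation (1), the following minimal Python sketch evaluates the modification rate constant of one nucleotide from an opening probability, a forward rate constant and a formaldehyde concentration. The function names are hypothetical, the helper for thymine simply encodes the pH scaling stated in the footnote to Table 1, and any numbers passed in are placeholders rather than data from this paper.

```python
import math


def modification_rate(w_open: float, k_forward: float, phi: float) -> float:
    """Equation (1): rate constant for modifying one nucleotide of a closed pair.

    w_open    -- probability W_i that pair i is fluctuationally open (0..1)
    k_forward -- forward rate constant K^(f) of the free nucleotide (min^-1 M^-1)
    phi       -- formaldehyde concentration (M)
    """
    if not 0.0 <= w_open <= 1.0:
        raise ValueError("w_open must be a probability between 0 and 1")
    if k_forward < 0.0 or phi < 0.0:
        raise ValueError("rate constant and concentration must be non-negative")
    return w_open * k_forward * phi


def thymine_rate_constant(k_at_ph7: float, ph: float) -> float:
    """pH scaling quoted for TMP in Table 1: K = K_0 * 10**(pH - 7)."""
    return k_at_ph7 * 10.0 ** (ph - 7.0)


# Placeholder inputs (assumptions for illustration only).
example_rate = modification_rate(
    w_open=1e-5,
    k_forward=thymine_rate_constant(k_at_ph7=2.0, ph=6.0),
    phi=5.7,
)
```

Keeping the opening probability as a separate argument reflects the structure of the method: the helix-coil theory supplies $W_i$, while the chemistry of the free nucleotide supplies $K^{(f)}_{\alpha}$.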


The instant of time $t_1$ of the first event of modification of a nucleotide in the chain by formaldehyde is a random value which may be obtained from the equation (see, for example, Hammersley & Handscomb, 1964):

$$t = -\frac{\ln \gamma}{\Sigma}, \qquad (2)$$

where $\gamma$ is a random number uniformly distributed on the interval [0,1] and $\Sigma$ is the sum of all $2N$ rate constants $P_{i,\alpha}$ ($N$ is the total number of base-pairs in the chain). As soon as the value of $t_1$ is obtained, the particular nucleotide which has been modified at this first instant of time is determined. In doing so, the probability of modification of a nucleotide $(i,\alpha)$, which is equal to $P_{i,\alpha}/\Sigma$, is used.

After determination of the particular nucleotide modified first, the interval of time $t_2$ between the first and the second event of modification is determined. To do so, new values of $W_i$ are calculated, taking into account that a pair in which one of the nucleotides is modified must stay opened. Using these quantities, new values of $P_{i,\alpha}$ are computed via equation (1). Two additional processes should be taken into consideration: the forward reaction of formaldehyde with the second, unmodified nucleotide of the base-pair opened by the preceding modification, which has a rate constant $K^{(f)}_{\alpha}\varphi$, and the reverse reaction of formaldehyde desorption from the modified nucleotide, with the rate constant $K^{(r)}_{\alpha}$. The $t_2$ value is determined from equation (2), where $\Sigma$ is now the sum of the rate constants of all processes mentioned above. Then the particular nucleotide which has reacted at this second instant of time is chosen, and corresponding changes in the state of the chain are introduced. For this new state the rate constants of all possible elementary processes are calculated, and from their sum the value $t_3$ is determined, and so on. In the course of this process all possible elementary events were taken into account in accordance with the basic assumptions.

To determine the mean degree of denaturation the calculated probabilities were summed and divided by the total number of base-pairs in the chain. All data presented in this paper were obtained for chains 4000 base-pairs long. For chains of such length the end effects were shown to be negligible.
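The two random choices described above (the waiting time from equation (2) and the selection of a process with probability proportional to its rate constant) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' BESM-6 program: the function name `next_event` is hypothetical, the list of rates is assumed to have been assembled by the caller from equation (1) and the additional forward and reverse processes, and the uniform number is drawn from (0, 1] so that the logarithm stays finite.

```python
import math
import random
from typing import Sequence, Tuple


def next_event(rates: Sequence[float], rng: random.Random) -> Tuple[float, int]:
    """Draw the waiting time and the index of the next elementary process.

    rates -- rate constants of all currently possible processes (min^-1)
    rng   -- source of uniform random numbers
    """
    if any(rate < 0.0 for rate in rates):
        raise ValueError("rates must be non-negative")
    total = math.fsum(rates)
    if total <= 0.0:
        raise ValueError("at least one rate must be positive")

    # Equation (2): t = -ln(gamma) / Sigma, with gamma uniform on (0, 1].
    gamma = 1.0 - rng.random()
    waiting_time = -math.log(gamma) / total

    # Pick process k with probability rates[k] / Sigma.
    threshold = rng.random() * total
    running = 0.0
    chosen = -1
    for index, rate in enumerate(rates):
        if rate == 0.0:
            continue                  # impossible processes are never chosen
        chosen = index                # fallback if round-off prevents a break
        running += rate
        if threshold < running:
            break
    return waiting_time, chosen
```

In a full simulation the list of rates would be rebuilt after every event, because a modification or a desorption changes the opening probabilities of the neighbouring base-pairs.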
The process was repeated several times to obtain a statistically sound result, and for each repetition a new set of random numbers was used. The BESM-6 computer time required for these calculations depended considerably on the particular relation between the rate constants of the elementary reactions used, and varied over a wide range from several minutes to hours.

4. Results and Discussion

(a) Role of reversibility of the reaction of DNA with formaldehyde

Formaldehyde reacts with amino groups of adenine, guanine and cytosine and with imino groups of thymine and guanine. Table 1 gives the rate constants of these reactions and their activation enthalpies; these have been chosen mainly on the basis of recent data of McGhee & von Hippel (1975a,b); see also Appendix 2. The reactions both with amino groups and with imino groups are reversible. However, there are important quantitative differences between them. On the one hand, the equilibrium constant for reaction with an amino group is several times greater than the equilibrium constant for reaction with an imino group. On the other hand, rate constants of forward as well as reverse reactions for the thymine imino group are some tens of times greater than the rate constants for amino groups. The rate constants of reaction with the thymine imino group are highly pH-dependent (both forward and reverse rate constants increase tenfold when the pH value increases by one unit, but their ratio, the equilibrium constant, remains unchanged). By contrast, the rate constants of the reaction with amino groups are practically independent of pH (at least in the acidic and neutral regions).

TABLE 1

Rate constants of forward, K^(f), and reverse, K^(r), formaldehyde reaction with nucleotides and their enthalpies†

Nucleotide   K^(f) × 10~ (min^-1 M^-1)   K^(r) × 10^3 (min^-1)   E^(f) (kcal/mol)   E^(r) (kcal/mol)
AMP          9.36                        1.38                    19.8               24.1
CMP          73.8                        8.4                     18.8               25.3
GMP          18.0                        5.4                     16.9               21.2
TMP‡         1680.0                      1400.0                  23.4               26.8

See Appendix 2.
† Temperature 25°C; pH 7.
‡ K^(f) and K^(r) for TMP depend on pH as K = K_0 × 10^(pH − 7).

Each base-pair has one amino group for which equilibrium, in real experimental conditions, is shifted significantly in the direction of the forward reaction. As a result of the reactions with the amino groups of adenine and cytosine, the equilibrium state for the DNA plus formaldehyde system, in the experimental conditions under consideration, corresponds to completely unwound DNA. Reactions with guanine and thymine have much smaller equilibrium constants and cannot substantially affect the final state of the system. However, the reaction with the thymine imino groups has a dramatic effect on the rate of DNA unwinding by formaldehyde, because this reaction has a very high rate constant.

The peculiar properties of the reactions of formaldehyde with different nucleotides, in conjunction with some properties of the fluctuational opening of the double helix, lead to a very specific and complex detailed picture of DNA interacting with formaldehyde. This is illustrated by Figure 1, which shows a calculation of the process of formaldehyde interacting with an arbitrarily chosen region of 22 base-pairs, which is part of a long random sequence. At the left, the numbers of elementary events and the instants of time (in minutes) when they took place are given.
Filled circles above the letters indicate that a given nucleotide is modified by formaldehyde. Note that a region between two adjacent modified nucleotides is practically unwound for all cases shown in Figure 1. Each line differs from the preceding one by the number of unwound base-pairs. The intermediate situations, which differ in the states of unwound nucleotides (sorption or desorption of formaldehyde molecules), but do not differ in the dimensions of unwound regions, are not shown. A difference between the order numbers of two adjacent lines may be used as a measure of the number of such intermediate events between them.

[Figure 1 shows successive states of the 22 base-pair fragment CTCTGCTCAGTCTGAATATGTC / GAGACGAGTCAGACTTATACAG, from 0.000 min to event 101 at 33.217 min.]

FIG. 1. The calculated results of formaldehyde interaction with a fragment 22 base-pairs long of a specific random sequence. At the left, the numbers of elementary events and the instants of time (min) when they took place are presented. The nucleotides which have reacted with formaldehyde are marked by filled circles. The computation has been performed for the following conditions: t = 33°C; formaldehyde concentration φ = 5.7 M; pH 6.5. Hyphens have been omitted for the sake of clarity.

The first event in the region under consideration took place at 22.853 minutes. As in all cases inspected, the thymine nucleotide was the first to be modified. This is a natural consequence both of the higher rate constant and of the higher opening probability for the thymine nucleotide. The probability of fluctuational opening of adjacent base-pairs increased considerably after the modification event, and before long another thymine was modified. After it, the G·C pair situated between these two modified thymines became fully opened and cytosine was modified almost immediately. Subsequent desorption of formaldehyde from thymine (line 4) is a very natural process because the reaction of formaldehyde with thymine is highly reversible. On the other hand, as a rule the modification of cytosine reliably fixes an unwound region. However, in the particular case under consideration, the cytosine was released from formaldehyde very quickly (line 5, Fig. 1). This event could lead to the disappearance of the unwound region because modification of a single thymine could not fix it reliably. However, before the formaldehyde molecule was released from thymine (line 11), the modification of guanine and cytosine took place (lines 6, 7). These events fixed the region reliably.

After that, the growth of the open region began from its ends. As a rule each propagation step was started by the reaction with thymine. Then, before long, formaldehyde reacted with some nucleotide inside the newly created, unwound region. This sequence of events is demonstrated clearly by lines 27 to 39 of Figure 1. Appreciable reversibility of the reactions leads to highly irregular growth of the unwound regions from their ends. Temporary shortening of unwound regions occurred frequently, as may be seen by inspection of lines 85 and 97. However, as a whole, the process led to the growth of unwound regions.

The above example illustrates the characteristic properties of the interaction of DNA with formaldehyde. The important role of the reaction with the thymine imino group became absolutely clear.
In spite of its strong reversibility, it is this reaction which provides the initiation and propagation of unwound regions. The reactions with other nucleotides lead practically only to the fixation of these unwound regions. However, the latter reactions are also of principal importance because without them the reaction with thymine alone would not lead to DNA unwinding at all (at the formaldehyde concentration used).

(b) Comparison of calculated and experimental kinetic curves of DNA unwinding by formaldehyde

Using the computer algorithm outlined in Method of Calculation we have calculated the kinetic curves of DNA unwinding by formaldehyde. In Figure 2 three curves calculated for different pH values are shown. These curves are sigmoidal, in qualitative agreement with numerous experimental observations (see, e.g. Trifonov et al., 1967,1968; Lazurkin et al., 1970; Utiyama & Doty, 1971). The observed shape of the kinetic curves has been interpreted as an indication of two types of unwinding processes: initiation of small unwound regions and their subsequent growth (Trifonov et al., 1967,1968; Lazurkin et al., 1970; Frank-Kamenetskii & Lazurkin, 1974). It was seen in the preceding section that, in spite of the great complexity of the process, phenomenologically it proceeds in general agreement with such a simple mechanism.

[Figure 2: vertical scale 0 to 1.0; horizontal axis, Time (min), 0 to 800.]

FIG. 2. Calculated kinetic curves of DNA unwinding by formaldehyde. The calculations have been performed for a random sequence with (G+C) content x_0 = 0.48 (corresponding to T7 DNA) for the following conditions: [Na+] = 0.04 M, t = 33°C, φ = 5.7 M. Different curves correspond to different values of pH: 5.5 (curve 1); 6.0 (curve 2); and 6.5 (curve 3).

If the theory of helix-coil transition describes adequately the process of fluctuational opening of the double helix, the detailed scheme proposed in this paper should give not only the general shape of kinetic curves but should also predict correctly the quantitative value of the characteristic time of the process and its dependence on the environmental conditions (pH, temperature and concentration of formaldehyde), without the use of adjustable parameters.

Statistically reliable results cannot be obtained for arbitrary values of kinetic constants. As the rate constants of forward and reverse reactions with the thymine imino group increase sharply with pH (see Table 1), the computer time also increases very rapidly. At the same time it is well known that a considerable shift of pH values from neutral produces changes in the thermodynamic properties of DNA. Consequently we have had to restrict ourselves to a rather narrow region of pH values, for which the changes in thermodynamic properties of DNA are negligible but the computations are not too time-consuming. This region proved to be between pH 5.5 and pH 6.5. Inside this region the kinetic curves for phage T7 DNA have been measured. Three of them are shown in Figure 3.

[Figure 3: vertical scale 0 to 1.0; horizontal axis, Time (min), 0 to 400.]

FIG. 3. Experimental kinetic curves for T7 DNA (x_0 = 0.48) at t = 33°C, [Na+] = 0.04 M and φ = 5.7 M and different values of pH: 5.5, 6.0 and 6.5.

Experimentally an increase of optical absorption at some wavelength (at 260 nm in our particular case) is recorded. This value includes not only the optical changes due to DNA unwinding but also the optical changes due to chemical modification of bases by formaldehyde. Special calculations have shown, however, that, at least for our particular conditions, the shape of the kinetic curve is virtually unaffected by taking into account the optical effects of chemical reactions, or the difference in optical changes due to melting of different base-pairs. So in routine calculations we have restricted ourselves to computation of the degree of helicity, but not the optical changes.

As a characteristic time of the process, the time τ of half-denaturation has been chosen. Tables 2 to 4 show the values of the parameter τ determined experimentally and theoretically for different environmental conditions. One can see that the theory not only correctly predicts the dependences of the kinetic parameter on pH, temperature and formaldehyde concentration, but also gives quantitative values of the parameter which are close to experimental ones. The only significant discrepancy between theory and experiment is observed for the dependence of τ on pH.

TABLE 2

Comparison of theory with experimental data for T7 DNA (x_0 = 0.48)

pH     τ × 10^-2 (min), Theory   τ × 10^-2 (min), Experimental†
5.5    4.9                       2.9 ± 0.3
6.0    3.0                       2.5 ± 0.3
6.5    1.8                       2.0 ±

† Values averaged over several independently measured kinetic curves.
Conditions were as follows: φ = 5.7 M; [Na+] = 0.04 M; t = 33°C.

TABLE 3

Theoretical and experimental dependence of τ on temperature for T7 DNA

t (°C)   τ/τ_0, Theory   τ/τ_0, Experimental
33       1               1
38       0.41            0.37 ± 0.05
43       0.16            0.16 ± 0.03

Conditions were pH 5.5, φ = 5.7 M. The value τ_0 corresponds to t = 33°C (see Table 2).

TABLE 4

Theoretical and experimental dependence of τ on formaldehyde concentration for T7 DNA

φ (M)   τ/τ_0, Theory   τ/τ_0, Experimental
4.0     1.7             1.4 ± 0.1
5.7     1               1
7.3     0.72            0.79 ± 0.07
9.0     0.60            0.59 ± 0.06

Conditions were as follows: pH 5.5; t = 38°C. The value τ_0 corresponds to τ at φ = 5.7 M.

We also compared the theoretical predictions with experimental data for some very different conditions of temperature, formaldehyde concentration and DNA (G+C) content. In all cases the discrepancy between predicted and experimental values was not more than threefold. Note that the rates of elementary reactions with free nucleotides differ from the rates of the final process of DNA unwinding by a factor of a hundred or more. The former are known only approximately (see Appendix 2), and the latter depend drastically on environmental conditions. Taking all this into account, the correlation obtained between theory and experiment seems satisfactory.

It should be noted that though the shapes of observed and calculated curves are qualitatively alike, they differ systematically in their initial regions. Indeed, experimental curves have in our particular conditions a much more distinct lag phase than calculated ones (compare Figs 2 and 3). In this connection it should be emphasised that our model of the interaction of formaldehyde with DNA is based on some detailed assumptions which do not so far have direct experimental proof. Most important, there is no direct demonstration that for the reaction of formaldehyde with any base a complete opening of a base-pair, i.e. disruption both of the hydrogen bonds and of the stacking interactions, is needed. At present we may judge the degree of validity of this hypothesis only indirectly on the basis of the correlation of calculations based on it with the experimental data†.

Once a nucleotide in an opened pair has reacted with formaldehyde this pair may, in principle, insert again into the double helix forming, of course, a less stable base-pair‡.
We have introduced corresponding changes into our calculation algorithm to take into account the possibility of the formation of weakened base-pairs which include h y d r o x y m e t h y l a t e d adenine, cytosine and guanine. The calculations have shown that, within a wide range of variation of the p a r a m e t e r which determines the degree of weakening of such base-pairs, only a small increase of the half-denaturation time, in comparison with the values indicated in Tables 2, 3 and 4, is observed. At the same time the introduction of such a process m a y lead to some changes in the initial region of the kinetic curves. Note t h a t the p a r a m e t e r z, the characteristic time of the process as a whole, which has been chosen b y us for comparing the theory with the experimental data (being comparatively insensitive to the details of the interaction of formaldehyde with D N A mentioned above) is v e r y sensitive, as special calculations have shown (see Appendix 1), to the probability of fluctuational opening of base-pairs. (c) Characteristics of the process of fluctuational opening of the double helix The results obtained seem to be indicative of the general correctness of our basic assumption t h a t the results of the theory of helix-coil transition m a y be extrapolated to temperatures well below the melting range (for a more detailed discussion see ~fWhen the first version of this paper was completed a dissertation by J. D. McGhee entitled Formaldehyde as a probe of DNA structure (Oregon University, December 1975) became available to us. In this fundamental contribution much experimental evidence for the above hypothesis is presented, and the problem as a whole is discussed extensively. :~Such a process for exocyelie amino groups has been demonstrated b y McGhee in his dissertation.

676

A.V.

ET AL.

LUKASHIN

Appendix 1). Here we would like to present some general characteristics of the process of fluctuational opening of the double helix predicted by the theory. Figure 4 shows the opening probability maps for some arbitrary chosen sequence at two temperatures. Table 5 contains the average characteristics of the process, namely the average number of nucleotides in the fluctuationally opened region and the correlation length, i.e. the number of base-pairs between two nucleotides which prevents the state of one of them influencing the probability of fluctuational opening of the other (for details see Appendix 1). One can see that at low temperature only individual base-pairs are opening, with overwhelming probability. So the opening probability depends only on the particular type of base-pair (A. T or G" C) which is clearly seen in Figure 4 where, e.g. for 20~ all A . T pairs have an opening probabihty of about 8 • 10 -6 and all G.C pairs about 1.5 • 10- 6. The correlation length at low temperatures does not exceed, on average, ten pairs. I t means that if a defect of secondary structure (locally denatured region) is introduced into the DNA double helix, it should affect the probability of fluctuational opening of only the nearby base-pairs which are not more than ten bases apart. It should be remembered, however, that the correlation length depends significantly on the (G~C) content of a region situated between the pair under consideration. The process of fluctuational opening begins to vary only at elevated temperatures immediately below the melting range (see Table 5). In this premelting zone the

Fig. 4. Probability of fluctuational opening of base-pairs for a randomly chosen sequence at two temperatures: 20°C (a) and 80°C (b). The DNA melting temperature for the conditions used is equal to 87°C.
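As a rough numerical illustration (a sketch, not part of the original calculations), the opening probabilities quoted in the text for 20°C can be converted into a mean spacing between opened base-pairs. The sketch assumes that different pairs open independently at this temperature, so that the spacing is simply the reciprocal of the composition-weighted opening probability. The function names and the choice of G+C fractions are hypothetical.

```python
# Illustrative arithmetic only; names are hypothetical and not taken from the paper.

# Approximate opening probabilities at 20 °C quoted in the text for Figure 4.
P_OPEN_AT = 8e-6
P_OPEN_GC = 1.5e-6


def mean_opening_probability(p_at, p_gc, gc_fraction):
    """Composition-weighted probability that a randomly chosen pair is open."""
    if not 0.0 <= gc_fraction <= 1.0:
        raise ValueError("gc_fraction must lie between 0 and 1")
    return (1.0 - gc_fraction) * p_at + gc_fraction * p_gc


def mean_spacing(p_at, p_gc, gc_fraction=0.5):
    """Mean number of base-pairs per opened pair, assuming independent openings."""
    return 1.0 / mean_opening_probability(p_at, p_gc, gc_fraction)


if __name__ == "__main__":
    # The G+C fractions below are arbitrary illustrative choices.
    for gc in (0.3, 0.5, 0.7):
        spacing = mean_spacing(P_OPEN_AT, P_OPEN_GC, gc)
        print(f"G+C = {gc:.0%}: about {spacing:.1e} base-pairs between opened pairs")
```

For a 50% G+C sequence this gives a spacing of the order of 10⁵ base-pairs, which illustrates how rare the opened state is at room temperature.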

TABLE 5

The values of mean correlation length (r) (in base-pairs) and the mean number (n) of bases in the opened region for DNA with random sequence (50% G+C) having a melting temperature of 87°C

t (°C)    r       n
0         4.1     1.01
20        5.2     1.04
40        7.1     1.10
60        12.2    1.29
80        40.3    4.54
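The temperature trend in Table 5 can be made explicit by computing how much r and n grow over each 20°C step. The following is a minimal sketch that only re-expresses the tabulated values; the helper name `fold_increases` is hypothetical.

```python
# Table 5 of the text: temperature (°C), mean correlation length r (base-pairs),
# and mean number n of bases in the opened region.
TABLE_5 = [
    (0, 4.1, 1.01),
    (20, 5.2, 1.04),
    (40, 7.1, 1.10),
    (60, 12.2, 1.29),
    (80, 40.3, 4.54),
]


def fold_increases(values):
    """Return the ratio of each value to the one preceding it."""
    return [later / earlier for earlier, later in zip(values, values[1:])]


if __name__ == "__main__":
    temperatures = [row[0] for row in TABLE_5]
    r_values = [row[1] for row in TABLE_5]
    n_values = [row[2] for row in TABLE_5]
    steps = zip(temperatures, temperatures[1:],
                fold_increases(r_values), fold_increases(n_values))
    for t_low, t_high, r_ratio, n_ratio in steps:
        print(f"{t_low:>2} -> {t_high:>2} °C: r x{r_ratio:.2f}, n x{n_ratio:.2f}")
```

The ratios stay modest over the low-temperature steps and become much larger over the last step, between 60°C and 80°C, where both r and n increase more than threefold.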


average number of base-pairs in the opened region, the correlation length, as well as the overall probability of opening of a base-pair in the double helix, increase significantly. The latter value now depends not only on the type of the base-pair but also on its nearest neighbour (Fig. 4).

The characteristics of the process of fluctuational opening of the double helix outlined above are in general agreement with rough estimates made earlier by Lazurkin et al. (1970) and Frank-Kamenetskii & Lazurkin (1974), but do not agree with the estimates made by Utiyama & Doty (1971) and von Hippel & Wong (1971), who also probed the fluctuational opening by formaldehyde. Some causes of this disagreement have been discussed by Frank-Kamenetskii & Lazurkin (1974). Our present calculations have further clarified the situation. As an example, von Hippel & Wong (1971) concluded that large regions containing nine base-pairs open fluctuationally, on the basis of the extremely high values of the apparent activation enthalpy of the kinetic parameters for the process of DNA unwinding by formaldehyde. It is clear now that such an effect is connected with a rapid growth of the correlation length with temperature in the premelting zone. Indeed, special calculations have shown a considerable increase in the apparent activation enthalpy for the rate of growth of the unwound region at elevated temperatures.

Our results for DNA helices agree, as regards the main points, with the conclusions made by Gralla & Crothers (1973) for the fluctuational opening of RNA helices. The principal conclusion is that only single base-pairs may be opened, with overwhelming probability, at room temperatures. As regards quantitative estimates, our results correlate very well with the results of Gralla & Crothers (1973) for G·C pairs but disagree considerably for A·T (A·U) pairs.
The latter may be attributed at least partly to the greater difference in the mean stabilities between G·C and A·U pairs in RNA than between G·C and A·T pairs in DNA (Kallenbach, 1968) and to the much greater heterogeneity of the stacking interaction in RNA than in DNA helices (Belintsev et al., 1976).

(d) Possible role of the opening of DNA double helix

One can see that under physiological conditions the process of fluctuational opening of nucleotide pairs produces only a slight steady-state disruption of the double helix. Thus at 20°C the opened base-pairs are spaced about 2 × 10⁵ nucleotide pairs apart. However, this is by no means an indication that the process plays a small role in the functioning of the DNA molecule. An important role of the process is connected first of all with the large rate constants of the conformational transitions in DNA. Indeed, it follows from relaxation (Craig et al., 1971) and nuclear magnetic resonance (Hilbers & Patel, 1975) data that the rate constant of base-pair closing at the end of the helix (κ) is about 10⁷ s⁻¹. The rate constant of the opening of a base-pair from the helix is κW, where W is the thermodynamic probability of the process. One can see that under physiological conditions the rate constant for the opening of a base-pair would be about 10¹ to 10² s⁻¹. This constant decreases sharply as the number of base-pairs in the opened region increases (approximately tenfold as the length of the opened region is increased by one base-pair). The rate constant of the opening of a base-pair situated at the end of the helix is about 10⁶ s⁻¹, and this value also decreases sharply as a base-pair recedes from the end.

The fluctuational opening of the double helix should play a decisive role in a variety of processes of DNA damage by chemical agents which, just as formaldehyde,


react with nucleotide bases. Such chemical modifications are known to lead to lethal effects or, after some further processes, to mutations.

The effect of fluctuational opening may also play an important role in the normal functioning of DNA, in the processes of replication, transcription and modification of DNA, because these processes are apparently accompanied by a local separation of DNA strands. However, to clarify this role a detailed investigation is needed of the interaction of the double helix with proteins participating in the processes of replication and transcription. So our knowledge of the properties of pure DNA may allow analysis of only a very limited number of possible mechanisms. Among them is the hypothesis, proposed by Gierer (1966), concerning the existence of stable or fluctuational cruciform structures of palindromic sequences about 20 nucleotides long at the operator regions. This hypothesis is apparently inconsistent with the thermodynamic properties of DNA because it supposes a transient appearance of large opened regions. We cannot exclude theoretically the possibility of the formation of such a cruciform structure under the protein, but it is inconsistent with the experimental data presented by Wang et al. (1974). So large unwound or cruciform regions should exist neither before nor after binding of repressor to operator.

In this paper we have discussed only one particular type of conformational motility of the double helix. Many other types of fluctuational changes have been discussed in the literature. Among them are the local B to A transition (Ivanov et al., 1974), fluctuational changes in the winding angle of the DNA helix (Depew & Wang, 1975), and more exotic processes such as those proposed by Teitelbaum & Englander (1975a,b). Crick & Klug (1975) have assumed the existence of kinks in the DNA double helix which result from a disruption of the stacking interaction between adjacent base-pairs without disruption of hydrogen bonds.
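The orders of magnitude quoted earlier in this section for the opening rate can be checked with a short calculation. The sketch below is illustrative only and rests on stated assumptions. It takes the thermodynamic probability W to be the 20°C opening probabilities read from Figure 4, it uses the relation given in the text that the opening rate constant is the closing rate constant (about 10⁷ s⁻¹) multiplied by W, and it estimates the free energy cost of opening as ΔG = −RT ln W, which is an added assumption valid for W much smaller than 1. All names are hypothetical.

```python
import math

# Illustrative arithmetic only; names are hypothetical and not taken from the paper.
GAS_CONSTANT = 1.987e-3  # kcal/(mol*K)
CLOSING_RATE = 1e7       # s^-1, order of magnitude quoted in the text

# Opening probabilities at 20 °C quoted for Figure 4; using them as W is an assumption.
W_AT = 8e-6
W_GC = 1.5e-6


def opening_rate(closing_rate, w):
    """Opening rate constant (s^-1) as closing rate times opening probability."""
    return closing_rate * w


def opening_free_energy(w, temperature_c):
    """Free energy cost of opening (kcal/mol), assuming dG = -RT ln W."""
    temperature_k = temperature_c + 273.15
    return -GAS_CONSTANT * temperature_k * math.log(w)


if __name__ == "__main__":
    for label, w in (("A·T", W_AT), ("G·C", W_GC)):
        rate = opening_rate(CLOSING_RATE, w)
        energy = opening_free_energy(w, 20.0)
        print(f"{label}: opening rate ~{rate:.0f} s^-1, dG ~{energy:.1f} kcal/mol")
```

Under these assumptions both rates fall between 10¹ and 10² s⁻¹, in line with the range stated in the text, and the corresponding free energies are of the order of several kcal/mol.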
The opening of a base-pair studied in this paper should not differ considerably from the formation of kinks as regards the free energy required. At the same time, the opening of a base-pair must be followed by formation of a hinge in the double helix with a considerable degree of freedom, much higher than in the case of the kinks proposed by Crick & Klug (1975). So, if inside chromatin or virus particles such hinges could exist, as Crick & Klug (1975) have proposed, they may be formed successfully by the opened base-pairs. Note that our calculations show that such hinges should be formed due to opening of an A·T pair with a much greater probability than due to the opening of a G·C pair.

As we have pointed out, the quantitative characteristics of the process should be strongly affected by interaction with specific proteins. A formation of the opened bases capable of reacting with formaldehyde, induced by the binding of protein (RNA polymerase) to DNA, has been shown and studied in detail by Zarudnaya et al. (1976).

APPENDIX 1

Computation of the Probability of Opening of a Base-pair

To calculate the probability of opening of a base-pair we have used the theory and the values of parameters which were used earlier for description of the helix-coil transition in DNA. The theory and its comparison with experimental data are reviewed in detail by Lazurkin et al. (1970); Vedenov et al. (1971); Frank-Kamenetskii & Lazurkin (1974); Vologodskii & Frank-Kamenetskii (1975) and Lyubchenko et al. (1976). The theory describes practically all details of the experimentally observed


behaviour of DNA inside the melting range. It should be emphasized that the theory is based, strictly speaking, on an assumption that the opened regions are sufficiently large (see e.g. Vedenov et al., 1971) and its applicability to the temperature region well below the melting range is questionable. Nevertheless we have assumed that the theory of helix-coil transition may be used at temperatures below the melting range, and this assumption is the main one in our treatment. It may be tested by the comparison of the results of calculations with experimental data.

The calculations of the probability of opening of a base-pair inside the helical section have been performed using the equations of the theory of the helix-coil transition which strictly take into account the effect of loop formation (Poland, 1974) as well as the variant of the theory which ignores this effect (Vologodskii & Frank-Kamenetskii, 1975). Because at low temperatures only short regions are melted out, the results of calculations were virtually unaffected by taking into account the effect of loop formation. So we used in our calculations of kinetic curves the simpler model (Vologodskii & Frank-Kamenetskii, 1975). In the present work we have used the single equation

$$W_i = 1 - \sum_{k=1}^{N} \vartheta_k$$

from the paper by Vologodskii & Frank-Kamenetskii (1975) to calculate the probability of opening of any base-pair including the boundary ones. The effect of loop formation becomes important at elevated temperatures, so the characteristics of fluctuational opening presented in Figure 4 and Table 5 were calculated by the method of Poland (1974).

The mean correlation length is an important characteristic of the process of fluctuational opening. To obtain this value the correlation function F for the states of two base-pairs in the chain is calculated. This value is defined by

$$F(i) = \left\langle \frac{W_{i1}}{W_{i0}} \right\rangle$$

where W_{i1} is the probability of opening of the base-pair situated i nucleotides away from the given base-pair when the latter is opened, and W_{i0} is the same value when the given base-pair is closed. The symbol ⟨...⟩ denotes the average over all sequences. The correlation length r is defined to be equal to the minimum number i for which F(i) ≤ 1.

For calculations the standard set of DNA thermodynamic parameters was used (Frank-Kamenetskii & Lazurkin, 1974): U_AT = 8 kcal/mol; σ = 5 × 10⁻⁵. The values T_AT and T_GC depend on the sodium ion concentration in solution and they were determined with the aid of an empirical relation proposed by Frank-Kamenetskii (1971). We did not take into account the heterogeneity of stacking interactions. This effect was shown by Belintsev et al. (1976) to be very small for DNA, and at ordinary conditions it is masked completely by the great difference in stability between A·T and G·C pairs.

In what respect are the results of calculations of kinetic curves presented in the main part of the present paper sensitive to the particular assumptions used as the basis of these calculations? The probability of opening of the double helix depends mainly on the value of the co-operativity factor σ. Special calculations have shown that the characteristic time τ of the process of DNA unwinding by formaldehyde depends strongly on σ (τ ∼ 1/√σ). For example, a variation by a factor of ten or more from the value of σ used in our calculations should produce a large gap between theory


and experiment. On the other hand, the probability of opening of a boundary base-pair, which governs the rate of growth of a despiralization centre, may be expected to depend considerably on the size of the adjacent opened region when the latter is small. Such an effect follows from the experimental data presented by Gralla & Crothers (1973). To test the possible influence of this effect on our results we have introduced corresponding correction terms into the values of probability of base-pair opening at the boundaries of small loops. Introduction of these corrections with the use of loop-weighting factors in the form proposed by Gralla & Crothers (1973) was shown to have no effect on the kinetic curves of DNA unwinding by formaldehyde, within the limits of accuracy of the calculations.

So the values of probability of opening of base-pairs calculated by the theory of helix-coil transition cannot differ considerably from the real values, but agreement between calculated and experimental kinetic curves cannot guarantee the correctness of the calculated values of the probabilities of opening of boundary base-pairs for very small loops. The latter values may differ from the real ones by a factor of ten or more. It should be emphasized that this may be valid only for very small loops, containing few base-pairs, and as the size of a loop increases the probability of opening of a boundary base-pair must approach the value predicted by the theory of helix-coil transition.

APPENDIX 2

Rate Constants of Formaldehyde Reaction with Nucleotides

The rate constants of the forward and reverse reactions of formaldehyde with nucleotides, collected in Table 1, have been chosen mainly on the basis of data presented by McGhee & von Hippel (1975a,b).
An unambiguous choice of the parameters presents difficulties because the values of the ratio K^(f)/K^(r) obtained by McGhee & von Hippel (1975a,b), and the values of equilibrium constants directly measured by them, differ from one another by a factor of 1.5 to 2. McGhee & von Hippel (1975a) have discussed the possible causes of this discrepancy but have not come to any definite conclusion. Besides, some of the parameters needed have been measured not for nucleotides but for nucleosides. We have used a set of the data obtained by McGhee & von Hippel (1975a,b) as well as some results from earlier papers by Haselkorn & Doty (1961), Grossman et al. (1961) and Levine & Barnes (1966) to choose the values of parameters. Within the limits of the ambiguity mentioned, and within the accuracy of the experiments, the values of parameters chosen agree with the values listed by McGhee & von Hippel (1975a,b). However, some parameters were not directly measured at all. Therefore activation enthalpies for the reverse reactions have been calculated from the activation enthalpies of forward reactions and equilibrium enthalpies of reaction. For GMP the equilibrium enthalpy was not measured, so we have chosen it to be equal to the equilibrium enthalpy for AMP. Note in this connection that the particular values of parameters for guanine are least important. It is cytosine which plays the leading role in the reaction of the G·C pair with formaldehyde, because it has a much greater rate constant as well as equilibrium constant of modification. In other cases we have also introduced controls such that a variation of parameters of reactions within the limits of accuracy of their


determination does not lead to considerable changes in the results presented in this paper.

In what respect may the values of parameters determined for free nucleotides and nucleosides be applied to opened nucleotides inserted into a polynucleotide chain? The use of these parameters seems to be reasonable for A, C and G, because McGhee & von Hippel (1975a) have shown that these values do not change appreciably when one passes from nucleoside to nucleotide, and so they may be expected to remain unchanged under further passage from free nucleotides to nucleotides inserted into the chain. This may not be the case for thymine. For this base the rate constants of both forward and reverse reactions change by a factor of two under the passage from nucleoside to nucleotide (McGhee & von Hippel, 1975b). A further change may be expected for TMP inserted into the polynucleotide chain. Unfortunately, the rate constants of formaldehyde reaction with TMP inserted into a polynucleotide chain are not known and we have used the values for free TMP. This has introduced an additional uncertainty into our results. This uncertainty, however, cannot be great because for UMP the rate constants change about fourfold after its insertion into the polynucleotide chain (McGhee & von Hippel, 1975b). This change leads to a minor increase of τ at pH 5.5 and leads to an increase of τ for higher pH values by a factor of 1.5 to 2.

We thank Dr Y. S. Lazurkin and our colleagues in the laboratory for helpful discussions and Drs P. H. von Hippel and J. D. McGhee for sending us copies of their work before publication.

REFERENCES

Belintsev, B. N., Vologodskii, A. V. & Frank-Kamenetskii, M. D. (1976). Molek. Biol. 10, 764-769.
Craig, M. E., Crothers, D. M. & Doty, P. (1971). J. Mol. Biol. 62, 383-401.
Crick, F. H. C. & Klug, A. (1975). Nature (London), 255, 530-533.
Depew, R. E. & Wang, J. C. (1975). Proc. Nat. Acad. Sci., U.S.A. 72, 4275-4279.
Frank-Kamenetskii, M. D. (1971). Biopolymers, 10, 2623-2624.
Frank-Kamenetskii, M. D. & Lazurkin, Yu. S. (1974). Annu. Rev. Biophys. Bioeng. 3, 127-150.
Gierer, A. (1966). Nature (London), 212, 1480-1481.
Gralla, J. & Crothers, D. M. (1973). J. Mol. Biol. 78, 303-319.
Grossman, L., Levine, S. S. & Allison, W. S. (1961). J. Mol. Biol. 3, 47-60.
Hammersley, J. M. & Handscomb, D. C. (1964). Monte Carlo Methods, Academic Press, London and New York.
Haselkorn, R. & Doty, P. (1961). J. Biol. Chem. 236, 2738-2745.
Hilbers, C. W. & Patel, D. J. (1975). Biochemistry, 14, 2656-2660.
Ivanov, V. I., Minchenkova, L. E., Minyat, E. E., Frank-Kamenetskii, M. D. & Schyolkina, A. K. (1974). J. Mol. Biol. 87, 817-833.
Kallenbach, N. R. (1968). J. Mol. Biol. 37, 445-466.
Lazurkin, Yu. S., Frank-Kamenetskii, M. D. & Trifonov, E. N. (1970). Biopolymers, 9, 1253-1306.
Levine, S. & Barnes, M. A. (1966). J. Chem. Soc. B6, 478-485.
Lyubchenko, Yu. L., Frank-Kamenetskii, M. D., Vologodskii, A. V., Lazurkin, Yu. S. & Gause, G. G. (1976). Biopolymers, 15, 1019-1036.
Massie, H. R. & Zimm, B. H. (1965). Proc. Nat. Acad. Sci., U.S.A. 54, 1641-1643.
McGhee, J. D. & von Hippel, P. H. (1975a). Biochemistry, 14, 1281-1296.
McGhee, J. D. & von Hippel, P. H. (1975b). Biochemistry, 14, 1297-1303.
Poland, D. (1974). Biopolymers, 13, 1859-1871.
Teitelbaum, H. & Englander, S. W. (1975a). J. Mol. Biol. 92, 55-78.


Teitelbaum, H. & Englander, S. W. (1975b). J. Mol. Biol. 92, 79-92.
Trifonov, E. N., Frank-Kamenetskii, M. D. & Lazurkin, Yu. S. (1967). Molek. Biol. 1, 164-176.
Trifonov, E. N., Shafranovskaya, N. N., Frank-Kamenetskii, M. D. & Lazurkin, Yu. S. (1968). Molek. Biol. 2, 887-987.
Utiyama, H. & Doty, P. (1971). Biochemistry, 10, 1254-1264.
Vedenov, A. A., Dykhne, A. M. & Frank-Kamenetskii, M. D. (1971). Uspekhi Fizich. Nauk, 105, 479-519.
Vologodskii, A. V. & Frank-Kamenetskii, M. D. (1975). J. Theor. Biol. 55, 153-166.
von Hippel, P. H. & Wong, K.-Y. (1971). J. Mol. Biol. 61, 587-613.
Wang, J., Barkley, M. D. & Bourgeois, S. (1974). Nature (London), 251, 247-249.
Zarudnaya, M. I., Kosaganov, Yu. N., Lazurkin, Yu. S., Frank-Kamenetskii, M. D., Beabealashvilli, R. S. & Savochkina, L. P. (1976). Eur. J. Biochem. 63, 607-615.
