J. Mol. Biol. (1992) 226, 1161-1173

Molecular Structure of the B-DNA Dodecamer d(CGCAAATTTGCG)2

An Examination of Propeller Twist and Minor-groove Water Structure at 2.2 Å Resolution

Karen J. Edwards, David G. Brown, Neil Spink, Jane V. Skelly and Stephen Neidle†

Cancer Research Campaign Biomolecular Structure Unit
The Institute of Cancer Research
Sutton, Surrey SM2 5NG, U.K.

(Received 23 December 1991; accepted 21 April 1992)

The crystal structure of the dodecanucleotide duplex d(CGCAAATTTGCG)2 has been solved to 2.2 Å resolution and refined to an R-factor of 18.1% with the inclusion of 71 water molecules. The structure shows propeller twists of up to -20° for the A·T base-pairs, although there is probably only one (weak) three-centre hydrogen bond in the six base-pair AT narrow minor-groove region. An extensive ribbon of hydration has been located in this groove that has features distinctive from the classic "spine of hydration". Solvation around phosphate groups is described, with several instances of water molecules bridging between phosphates.

Keywords: B-DNA dodecamer; d(CGCAAATTTGCG)2; crystal structure; hydration

1. Introduction

It has long been established that oligo(dA) tracts in DNA, either as poly(dA)·poly(dT) or within a B-DNA sequence, exhibit a variety of anomalous properties. The former has a reduced periodicity in solution compared to random-sequence DNA, of 10.1 (±0.1) base-pairs per turn (Rhodes & Klug, 1980; Travers & Klug, 1987; Travers, 1989), and cannot be reconstituted into nucleosomes (Simpson & Kunzler, 1979). All these properties are indicative of unusual structural features. Detailed structural models for poly(dA)·poly(dT) have been derived from X-ray fibre diffraction studies (Arnott et al., 1983; Alexeev et al., 1987a,b; Park et al., 1987), computer simulation studies (Chuprina, 1987; Fritsch & Westhof, 1991) and molecular modelling based on crystallographic data (Aymami et al., 1989).

Intrinsic bending has been found in oligo(dA)·oligo(dT) tracts that are at least four base-pairs in length and are in phase with the helical repeat (Koo et al., 1986). It has been concluded on the basis of gel retardation and nuclear magnetic resonance studies (Nadeau & Crothers, 1989; Crothers et al., 1990) that the bending is towards the minor groove direction, within the A·T tract. There is some evidence that it does not require highly propeller-twisted base-pairs (Nadeau & Crothers, 1989) or bifurcated cross-strand hydrogen bonds (Diekmann et al., 1987). Hydroxyl-radical cleavage studies (Burkhoff & Tullius, 1987) have suggested that the minor groove is narrowed in oligo(dA) tracts in kinetoplast DNA.

Several single-crystal studies have been reported on oligonucleotides containing varying lengths of oligo(dA) tract. All are dodecamers. The structure of d(CGCAAAAAAGCG)·d(CGCTTTTTTGCG) shows high propeller twists for the A·T base-pairs, ranging up to 25°, together with an extensive network of bifurcated hydrogen bonds linking adjacent A·T base-pairs (Nelson et al., 1987). Similar features have been reported in the structure of d(CGCAAATTTGCG)2 and its distamycin complex (Coll et al., 1987). A more complex situation has been found in the structure of d(CGCAAAAATGCG)·d(CGCATTTTTGCG), with two orientations for the duplex in the crystal (DiGabriele et al., 1989). Again, high propeller twists and bifurcated hydrogen bonds were observed in the oligo(dA) tract of the structure, although the helix bending of 20° in the major groove direction was ascribed to crystal packing forces rather than being representative of A-tract bending in solution.

During the course of crystallographic studies on minor-groove drug-oligonucleotide complexes in this laboratory, attempts were made to co-crystallize the sequence d(CGCAAATTTGCG)2 (termed A3T3) with the drug pentamidine. (We had previously solved the structure of this sequence co-crystallized with the drug berenil (Brown et al., 1992).) Crystals of this "pentamidine" complex were shown by h.p.l.c.‡ analysis not to contain any drug. These crystals of the native A3T3 structure, which diffracted in our hands to a significantly higher resolution (2.2 Å; 1 Å = 0.1 nm) than had been previously reported (Coll et al., 1987; 2.5 Å) have been re-solved by us and refined in order to compare it with the berenil complex. We report here the structure of the native sequence in detail. It is now possible to describe the extensive water network in the oligo(dA) tract minor groove, and a number of structural features that differ significantly from the previous brief report of Coll et al. (1987). In particular we examine whether the spine of hydration first observed in the AT region of the dodecamer d(CGCGAATTCGCG)2 (termed A2T2) (Drew & Dickerson, 1981), has an equivalent in the present structure, in the light of more recent theoretical studies (Subramanian et al., 1988; Chuprina et al., 1991). We also examine detailed base-pair flexibility features in the AAATTT region, in the light of the evidence that a minimum run of four consecutive adenine residues is required for the manifestation of oligo(dA) tract bending (Crothers et al., 1990).

†Author to whom all correspondence should be addressed.

0022-2836/92/161161-13 $03.00/0 © 1992 Academic Press Limited

2. Experimental

(a) Synthesis and crystallization

The DNA dodecamer d(CGCAAATTTGCG)2 was synthesized on the 2 µmol scale on an Applied Biosystems automatic DNA synthesizer using phosphoramidite chemistry and purchased from Oswel DNA Services (Edinburgh, U.K.). Prior to use, the DNA was annealed by immersion of a solution of DNA in buffer in a water bath at 80°C for 5 min and allowing it to cool slowly overnight to room temperature. Crystals were grown by vapour diffusion at 16°C using hanging drops, during attempts to co-crystallize this sequence with the minor-groove-binding drug pentamidine. Rectangular, block-like, colourless crystals grew in a droplet containing 5 µl of DNA (10 mg/ml; 3 mM), 2 µl of MgCl2 (100 mM), 2 µl of spermine (2 mM), 6 µl of pentamidine (5 mM) and 5 µl of 30% (v/v) MPD made up in 30 mM sodium cacodylate buffer at pH 7.0 (the DNA and drug were made up in buffer while the spermine and MgCl2 were made up in distilled water). All solutions were filtered through a 2 µm filter prior to use. The reservoir contained 1 ml of 30% (v/v) MPD and 0.5 ml of 60% (v/v) MPD that was added after 2 weeks. The concentrations of pentamidine and reservoir MPD were increased after 2 weeks when no crystals grew. Thereafter crystals grew in 2 days after increasing the reservoir concentration. One crystal of size 0.1 mm × 0.15 mm × 0.15 mm was mounted in a 0.5 mm quartz capillary and used for data collection.

‡Abbreviations used: h.p.l.c., high pressure liquid chromatography; MPD, 2-methyl-2,4-pentanediol.

(b) Data collection

Intensity data were collected at 19°C on a Xentronics Multiwire Area Detector using an Enraf-Nonius GX21 rotating anode X-ray generator with the power at 40 kV and 75 mA. A crystal-to-detector distance of 8 cm was used with the 2θ swing angle being set at 8°. Two data sets of 500 frames each were collected at 2 phi settings of 0° and 60°, each over a 100° rotation in omega in order to collect the unique data set, the step size being set at 0.2°. Frame scanning, unit-cell determination, integration, data reduction and scaling were performed with the XENGEN V1.3 program package. The crystal remained stable in the beam for the duration of the data collection, but exhibited a certain amount of mosaic spread at low resolution, suggesting internal crystalline disorder. A total of 9704 reflections out of 12,896 possible reflections were collected, to a resolution of 2.2 Å. These were merged to give 2943 unique reflections out of a possible total of 3402 (86.5%) with an overall merging agreement of 4.42%, of which 2502 reflections had intensities over the 2σ level. Between 2.4 and 2.2 Å, a total of 220 out of a possible 540 reflections were observed.

(c) Structure solution and refinement

The unit cell dimensions of the crystal are: a = 24.87 Å, b = 40.90 Å and c = 65.64 Å, in the orthorhombic space group P2₁2₁2₁. These overall unit cell dimensions are similar to those published by Coll et al. (1987) for this sequence (a = 25.20 Å, b = 41.65 Å, c = 65.81 Å) and also to those of other dodecamer sequences (Dickerson et al., 1991), suggesting an isomorphous structure. We decided to solve and refine the structure using co-ordinates from idealized, fibre-diffraction B-DNA (Chandrasekaran & Arnott, 1989) as a starting-point, in order to minimize any bias towards existing dodecanucleotide structures. An idealized B-DNA duplex was generated by the program GENHELIX (S. Neidle & L. H. Pearl, unpublished results) and transformed to the position reported by Coll et al. for the distamycin complex, as a starting model for the refinement.

Rigid-body refinement was initiated using the program CORELS (Sussman, 1984). The model was initially refined as a duplex against the experimental data, first in the resolution range 9 Å to 4 Å with 902 reflections and subsequently in the range 8 Å to 3 Å with 1329 reflections. After scaling and 5 cycles in each resolution range, the model gave an R-value of 36.5%. The model was then partitioned into 48 groups comprising 24 nucleosides (i.e. base and sugar) and 22 phosphate groups with the 2 O-5' atoms being defined separately, and refined further against the experimental data to a resolution of 2.5 Å, with 2212 reflections using all data (i.e. no sigma cut-off). The R-factor converged to 27.0% for the 8 Å to 2.5 Å resolution data.

At this stage the refinement was continued using a restrained least-squares refinement procedure with the program NUCLSQ (Westhof et al., 199). All observed data with I > 2σ(I) initially between 8 Å and 2.5 Å were used (a total of 2179 reflections). Several rounds of positional and thermal refinement reduced the R-factor to 24%. (Fo - Fc) and (2Fo - Fc) difference maps were calculated using the PROTEIN (Steigemann, 1974) package and displayed on a Silicon Graphics Iris 3130 workstation using the graphics package TOM (Cambillau, 1988). At this stage, all base, sugar and phosphate groups were well fitted to the density with the exception of base 19. An "omit" map leaving out the base/sugar group 19 was calculated and the base manually fitted into the density. The Fo - Fc and 2Fo - Fc maps were used to locate solvent molecules, and only peaks that were within 2.2 to 3.4 Å of possible hydrogen bonding partners and that showed acceptable hydration geometries were accepted as possible water molecules. These were included in subsequent refinements. Positional and temperature factor refinement was performed for all DNA atoms and oxygen atoms representing water molecules. Over the next few rounds of refinement the resolution limit was gradually increased, firstly to 2.4 Å with 2383 reflections, and finally to 2.2 Å with 2502 reflections, updating the AFSIG, BFSIG and scale values as necessary.

Major-groove and phosphate-backbone water molecules were included after every 5 cycles. Possible water molecules in the minor groove were initially excluded. No significant continuous density developed to correspond to drug in the minor groove. Instead, a series of discrete resolved peaks were observed and assigned as water molecules. Water molecules in this region were then included in the refinement. Subsequent Fo - Fc difference "omit" maps, excluding the water molecules in the minor groove, also showed a spine of discrete peaks. We were therefore confident that no drug molecule was present in the minor groove. There was no evidence of disorder in the DNA structure.

The temperature factors of the water molecule oxygen atoms ranged from 31.5 Å² to 71.4 Å², with an average of 51.3 Å². A total of 71 water molecules were included, giving a final R-factor of 18.1% using 2σ data in the range 8 Å to 2.2 Å. It was not possible to identify any sodium, magnesium or spermine ions. The details of the refinement statistics are shown in Table 1. Final refined co-ordinates and structure factors have been deposited in the Brookhaven Protein Data Bank as entry no. 1D65.

Table 1
Crystallographic and stereochemical refinement parameters for d(CGCAAATTTGCG)2

Resolution range                        8 to 2.2 Å
No. reflections [I > 2σ(I)]             2502
Temperature                             19°C
Final R-factor                          18.1%
Final weighted R-factor                 18.5%
Distances > 2σ                          61

                                        r.m.s. dev.   σ value
Sugar-base bond distances               0.016 Å       0.025 Å
Sugar-base bond angle distances         0.033 Å       0.040 Å
Phosphate bond distances                ...           0.025 Å
Phosphate angle and H-bond distances    ...           0.040 Å
Planar groups                           ...           ...
Chiral volumes                          0.067         ...
Single torsion contacts                 0.088 Å       0.250 Å
Multiple torsion contacts               0.221 Å       0.250 Å
Isotropic temperature factors:
  sugar-base bonds                      2.728 Å²      3.00 Å²
  sugar-base angles                     3.846 Å²      5.00 Å²
  phosphate bonds                       5.274 Å²      5.00 Å²
  phosphate angles/H-bonds              4.206 Å²      5.00 Å²
Weighting scheme applied to structure factors [1/(SIGAPP)²]:
  AFSIG                                 2.5
  BFSIG                                 -49.0

$R = \sum |F_o - F_c| / \sum F_o$. $R_w = [\sum w(F_o - F_c)^2]^{1/2} / [\sum w(F_o)^2]^{1/2}$. $w = 1/\mathrm{SIGAPP}^2$. $\mathrm{SIGAPP} = \mathrm{AFSIG} + \mathrm{BFSIG}\,(\mathrm{STHOL} - 0.1666667)$. r.m.s. dev., root-mean-square deviation.

3. Results

(a) Overall structure

The helix is of the right-handed B-DNA type with the crystallographic asymmetric unit consisting of two equivalent self-complementary dodecanucleotide strands forming an antiparallel duplex. Bases are numbered from C1 through G12 in the 5' to 3' direction for one strand and C13 through G24 in the 5' to 3' direction for the other strand. The solvent molecules are labelled W25 to W95. Figure 1 shows a stereo view of the structure. There is an average of 10.0 base-pairs per turn over the entire helix, with an overall bending in the helix axis of 19°. The mean helical twist angle between base-pairs is 35.9° and the mean rise per base-pair is 3.38 Å. There is a slight negative average roll value of -0.6°.

Figure 1. Stereo view of the molecular structure of d(CGCAAATTTGCG)2.

(b) Sugar-phosphate backbone

Conformational parameters for the phosphodiester backbone, together with sugar pucker information, are given in Table 2. Glycosidic angles are all in the anti range; as with the parent A2T2 and other dodecanucleotide structures (see for example, Narayana et al., 1991), there is marked variation along the sequence. The average glycosidic angle of 258° is significantly greater than that for A2T2 (243°; Dickerson & Drew, 1981), with a greater spread of values. In contrast to A2T2, a trend is apparent for purine residues to adopt consistently higher χ values (mean of 272°) than pyrimidine residues (mean of 247°). Other backbone angles show the variation typical of oligonucleotide structures, with average values close to those reported for the A2T2 structure. Values for the δ angle, which relate to sugar pucker, reflect the variations in pucker. These vary between C-2' endo and C-4' exo, with the majority clustering around the former range. There is a definite tendency for the thymidine sugars to be in the C-1' exo/O-4' endo range and the adenosine sugars to be C-2' endo, as noted for the A2T2 structure (Drew et al., 1981). Significantly different mobilities for the adenosine and thymine residues are indicated by their average temperature factors of 20 Å² and 24 Å², respectively.

Table 2
Conformational parameters for d(CGCAAATTTGCG)2

Main-chain torsion angles (°): α, β, γ, δ, ε, ζ. P, pseudorotation angle (°). A "?" marks a digit that is illegible in the source.

Residue  Glycosyl χ (°)  α      β      γ      δ      ε      ζ      P      Sugar conformation
C1       223.5           -      -      131.4  104.5  174.1  282.0  113.7  C-1' exo
G2       288.6           277.0  192.4  44.7   155.7  219.2  203.1  169.4  C-2' endo
C3       222.2           301.1  147.6  47.2   89.4   198.0  261.1  68.5   C-4' exo
A4       272.8           309.3  159.6  65.2   128.7  187.0  255.7  153.7  C-2' endo
A5       278.9           320.1  178.8  23.6   146.7  189.4  253.8  170.0  C-2' endo
A6       255.8           296.4  176.6  53.1   127.1  175.4  272.2  139.9  C-1' exo
T7       244.3           293.3  175.3  54.0   100.1  168.3  275.8  103.9  O-4' endo
T8       238.5           282.1  175.3  64.0   125.0  193.5  239.7  132.1  C-1' exo
T9       256.8           339.4  148.9  25.6   103.7  179.1  280.9  10?.6  O-4' endo
G10      293.5           307.0  190.4  23.3   169.9  251.9  158.0  182.4  C-2' endo
C11      304.3           304.3  147.1  34.?   144.6  206.4  253.4  168.9  C-2' endo
G12      282.6           292.9  181.6  27.7   139.0  -      -      152.3  C-1' endo
C13      242.8           -      -      356.4  125.8  321.6  131.9  128.6  C-1' exo
G14      250.3           244.0  147.2  56.2   132.2  197.6  233.6  140.3  C-2' endo
C15      227.1           309.4  153.7  53.9   86.7   164.5  284.1  54.7   C-4' exo
A16      276.3           290.3  175.2  65.6   143.8  183.9  232.0  166.0  C-2' endo
A17      270.0           339.1  161.6  25.5   146.7  179.2  246.1  162.1  C-2' endo
A18      267.5           317.2  167.2  32.9   133.6  178.2  249.2  156.0  C-2' endo
T19      233.5           325.1  145.9  3?.5   107.3  190.0  250.1  114.3  C-1' exo
T20      261.4           329.7  176.9  24.9   148.4  188.9  249.1  164.9  C-2' endo
T21      253.4           309.8  167.7  44.1   110.0  180.6  261.5  116.0  C-1' exo
G22      276.4           328.1  158.6  29.9   137.8  219.3  18?.1  146.7  C-2' endo
C23      260.8           328.6  149.7  20.6   145.9  206.1  227.8  176.3  C-2' endo
G24      253.6           310.7  1?0.4  33.9   ?0.7   -      -      27.0   C-3' endo
Average  258.4           306.6  164.9  57.4   126.8  197.6  239.6  133.4

P-P separation (Å), in sequence order. Strand 1: 6.8, 6.2, 7.1, 6.5, 7.0, 6.6, 6.9, 6.3, 6.8, 6.8. Strand 2: 6.6, 6.5, 6.9, 6.7, 6.7, 6.7, 6.7, 6.5, 6.7, 6.4. Average: 6.7.

(c) Minor groove width

The duplex has a narrow minor groove in the AAATTT region (Fig. 2(a)), with a minimum P...P interstrand separation of 4.6 Å (taking into account the van der Waals' radii of phosphate groups). The extent of this narrowed AT stretch, covering four to five of the A·T base-pairs, is overall similar to that reported for the Dickerson-Drew A2T2 dodecamer (Dickerson & Drew, 1981). However, the minimum groove width of 3 Å in A2T2 is much less than in the present A3T3 one (4.6 Å), as is the average width of 3.4 Å compared to 4.6 Å in A3T3, which has a markedly more even AT region. A similar pattern is evident in the plot of interstrand H-4'...H-5' distances (Fig. 2(b)). Atoms H-4' and H-5' (whose positions were generated using standard geometric criteria) are situated on and form the outer edge surface of the minor groove. Their positions are thus a sensitive indication of changes in groove width (Neidle, 1992). Figure 2(b) indicates a minimum average groove width of about 5.5 Å in the A3T3 structure, being roughly constant over three A·T base-pairs. The A2T2 structure has a slightly longer narrow central minor groove region, extending in the 3' direction, although the two structures have very similar widths at the extremities of the central six base-pair region. This shortening of the narrow region in A3T3 may be at least in part due to the slightly reduced propeller twists at the 3' end of the A·T stretch (Fig. 3(a)), in agreement with the proposed dependence of groove width on propeller twist (Fratini et al., 1982; Yoon et al., 1988). Widening at the 3' end has also been observed in the structure of the alternating A·T duplex d(CGCATATATGCG)2 (Yoon et al., 1988), although here the propeller twists at base-pairs 8 and 9 are high compared to A3T3. Nonetheless, this structure has lower average A·T propeller twist than the A2T2 structure, or the A-tract structure (Nelson et al., 1987), which, as with the present A3T3 one, results in an increased average minor groove width.

Figure 2. (a) Plot of minor-groove width in terms of inter-strand P...P distances, less 5.8 Å, for A3T3-distamycin; A3T3, native; A3T3-berenil; A2T2, native; AT6 (Yoon et al., 1988); and R6 (Nelson et al., 1987). (b) Plot of minor-groove width in terms of inter-strand H-4'...H-5' distances, for A2T2, native, and A3T3, native. H-4' on one strand has been paired with H-5' on the second strand, n+3 residues distant.

(d) Base-pair and step morphology

The values for several base-pair parameters§ closely parallel those found in the A2T2 structure as well as the related dodecamer d(CGTGAATTCACG)2 (Larsen et al., 1991; Narendra et al., 1991), as instanced by local helical twist values for A3T3 (Fig. 3(b)). The helical twist at the C-A step in A3T3 is 36°, which is larger than in most other dodecamers (average 31°) yet smaller than in decamer structures (Yanagi et al., 1991, average of 49°). Propeller twist values are all negative; as in A2T2, the CG base-pairs at the ends of the AAATTT sequence in the present structure have low (-2° to -6°) propeller twist at the AT junctions. A propeller twist of -26° was reported for the A3T3-distamycin complex (Coll et al., 1987), and similarly high values (-24° to -26°) in the two A-tract dodecamer structures (Nelson et al., 1987; DiGabriele et al., 1989; Fig. 3(b)). Figure 3(a) shows that the pattern of propeller twists along the sequence of A3T3 is close to that in the A3T3-berenil drug complex (Brown et al., 1992), as indeed are other base-pair and step parameters. The large differences that we see in propeller twist (and these other parameters) between the A3T3 structure and its complex with distamycin (Coll et al., 1987) may of course be ascribable to distortions induced by drug binding. The minimal changes in these parameters that berenil binding produces may be a consequence of the smaller size of this drug molecule compared to distamycin, which makes contacts with five out of the six A·T base-pairs, compared to the four contacted by berenil.

Other base-pair and base-pair step parameters show significant differences from the A2T2 structure (Fig. 3(c) to (g)). Differences in buckle are mainly confined to base-pairs T8·A17 to C11·G14, although A4·T21 in the present structure has a high buckle of -12°. Base-pair T8·A17 has a 7° increase compared to the same pair in A2T2. The values for base-pair tip (Fig. 3(e)) are generally closer to 0° for the A·T base-pairs in A3T3 compared to A2T2. Roll values are significantly different in the A·T region of A3T3 compared to A2T2 (Fig. 3(f)), with, for example, a reversal in roll sign at step 8 at the 3' end of the AT sequence, with a change in magnitude of 6°. Differences in tilt angles for the two structures (Fig. 3(g)) are again the greatest in the AT region; these are relatively small in absolute terms (2.5°), and thus we cannot be sure whether they represent significant differences.
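The mean helical twist and rise quoted for the whole helix can be cross-checked against the stated number of base-pairs per turn with a few lines of arithmetic. The following is a minimal sketch only; the function names are hypothetical and are not part of NEWHELIX or any other program used in this work.

```python
def base_pairs_per_turn(mean_twist_deg: float) -> float:
    """Number of base-pairs needed to complete one 360 degree helical turn."""
    return 360.0 / mean_twist_deg


def helical_pitch(mean_twist_deg: float, mean_rise: float) -> float:
    """Helical pitch, in the same length unit as mean_rise."""
    return base_pairs_per_turn(mean_twist_deg) * mean_rise


if __name__ == "__main__":
    twist = 35.9  # mean helical twist in degrees, as reported for A3T3
    rise = 3.38   # mean rise per base-pair in angstroms, as reported for A3T3
    print(f"{base_pairs_per_turn(twist):.1f} base-pairs per turn")
    print(f"{helical_pitch(twist, rise):.1f} angstrom pitch")
```

With the reported mean twist of 35.9° this gives about 10.0 base-pairs per turn, consistent with the value stated for the entire helix.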

§Helical parameters were calculated with the NEWHELIX program of R. E. Dickerson, available from the Protein Data Bank, Brookhaven National Laboratory, Upton, NY 11973, U.S.A.

Figure 3(a) to (g). Plots of base-pair morphological parameters. (a) A3T3-distamycin; A3T3, native; A3T3-berenil. (b) A3T3, native; A2T2, native; AT6 (Yoon et al., 1988); R6 (Nelson et al., 1987). (c) to (g) A3T3, native; A2T2, native.

Figure 4. The geometries of major-groove Watson-Crick and potential 3-centre hydrogen bonds in the central 8 base-pairs of the structure. The left-hand side gives angles (°), and the right-hand side distances (Å), with hydrogen atoms in generated positions.

(e) Three-centre hydrogen bonding

All possible major-groove three-centre hydrogen bonding geometries have been examined, and are shown in Figure 4. Potential hydrogen bonds are between atom N-6 of adenine and O-4 of thymine residues on the next base-pair and of the complementary strand, and between N-4 of cytosine and O-4 of thymine residues. There are a total of six such potential hydrogen bonding situations. We have chosen the geometric criteria that a hydrogen bond is likely if the Y-H...X distance is shorter than the sum of the van der Waals' radii for hydrogen and X (Taylor et al., 1984) and the Y...X distance is less than the sum of the van der Waals' radii for X and Y. On this basis, there is only one plausible three-centre hydrogen bond in the present structure, between N-6 of A5 and O-4 of T19. The N-6...O-4 distance of 3.12 Å is well within standard hydrogen bonding range, although the N-H...O-4 distance of 2.87 Å, with a N-H...O-4 angle of 96°, suggests a weak interaction between the hydrogen and oxygen atoms. The A5·T20 base-pair has the highest propeller twist in the structure (Fig. 3(a) and (b)); the fact that it is only marginally greater than at base-pairs A6·T19 and T8·A17 suggests that it is not a critical factor in three-centre hydrogen bonding in this structure.
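The two-part distance criterion described above can be written as a simple test. The sketch below is illustrative only: the function name is hypothetical, and the van der Waals radii are placeholder values chosen for the example, not the radii of Taylor et al. (1984). The outcome for any real contact depends on the radii adopted.

```python
def is_likely_hbond(h_to_x: float, y_to_x: float,
                    r_h: float, r_x: float, r_y: float) -> bool:
    """Apply both distance conditions for a Y-H...X contact (all values in angstroms).

    h_to_x: H...X distance; y_to_x: Y...X distance;
    r_h, r_x, r_y: van der Waals radii of H, X and Y.
    """
    hydrogen_close = h_to_x < (r_h + r_x)  # H...X shorter than radius sum
    heavy_close = y_to_x < (r_x + r_y)     # Y...X shorter than radius sum
    return hydrogen_close and heavy_close


# Placeholder radii in angstroms (assumed for illustration only).
R_H, R_O, R_N = 1.2, 1.5, 1.6

if __name__ == "__main__":
    # Illustrative N-H...O contacts given as (H...O, N...O) distances.
    for h_o, n_o in [(1.9, 2.9), (2.6, 3.3), (3.5, 3.8)]:
        print(h_o, n_o, is_likely_hbond(h_o, n_o, R_H, R_O, R_N))
```

Both conditions must hold, so a contact with a short H...X distance is still rejected when the heavy atoms are too far apart.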

Figure 5. Schematic view of the hydration scheme surrounding the phosphate groups. Continuous contact lines indicate distances between 2.4 and 3.2 Å; broken contact lines indicate distances between 3.2 and 3.5 Å.

(f) Hydration

The majority of the 71 water molecules found in this structure are in hydrogen-bonding distance to the phosphate groups (Fig. 5 and Table 3). Hydration is principally to atoms O-1P and O-2P of the phosphate (with a total of 38 contacts). There are also 14 contacts to O-3' and O-5' ester oxygen atoms, and a few to sugar O-4' atoms. Even though the 71 water molecules only represent a small fraction of those present in the crystal structure, it is possible to discern part of the extensive networks of water molecules emanating from the phosphate groups. Thus, phosphates P3 and P4 are linked by W59 and W80 via W69 to W70 and W84, and finally to W89 and W68. Phosphates P22 and P23


Environment Possible bonding

Solvent W25

31.92

W26

4317

w27

51.89

W28 w29

3213 62.15

w30 w31

60.01 4849

W32

ml5

w33

5207

w34

3288

w35

44.84

W36

58.86

w37

4622

W38

31.49

w39

66.89

w40 w41

44.26 3913

W42

3393

w43

4403

w44

5613

w45 W46

3989 4579

w47

W48

4567

3990

hydrogen partners

(GlO)O-4” (GlO)N-2’ (GlO)N-3’ (Cl 1)0X’ (A16)0-3” (A17)0-2P’ W71’ (G24)0-1P (AlS)N-7’ (A16)N-6a W38’

6% (G22)0-3’. (C23)0-2p W73’ (Cl)O-3” (T9)0-2pL w52d W58d (Tl9)0-1P’ w3v W64’ (T20)0-1P’ (T20)0-5” (C3)N-4’ (G22)0-6’ W42’ W96 (G12)0-3” (A4)N-3f (A5)0-4’ (GlO)O-2p (AlS)O-2pd W37d W52d (A18)0-2P’ (A18)O-5’a W52” W36’ (T8)0-4’ W27” W32^ W64” (A5)0-1P’ (T21)0-2P’ WW W68b W6Sb W8gb (G22)N-7’ W34’ (T8)0-3” W76’ (A5)N-7’ (A5)N-6’ W45” W44’ (A6)N-3’ (TZO)O-2’ W47’ W66’ (A5)N-3’ (A6)0-4” W46’ (G2)0-2P’ W86’

Table 3 of solvent vositions in d(CGCAAATTTGCG), Distance (4 329 345 287 255 225 2.50 2.22 2.37 2.47 347 336 262 345 2.56 >41 343 292 249 249 288 2.32 347 260 339 329 2.35 344 321 320 245 2.68 2.49 340 1.99 333 2.50 337 296 1.99 306 336 2.32 325 2.51 323 213 2.95 2.42 217 266 344 341 271 318 320 332 332 345 327 349 323 2.31 2.90 349 288 268

Possible bonding

Solvent w49

56.18

W50 W52

42.76 57.95

w53

44.79

w54

3501

w55

3905

W56

6520

w57 W58

6324 5295

w59

41.09

W60 W61

7o80 4507

W62

60-92

W63 W64

658.5 61.99

W65

3968

W666

6543

W68

45.89

W69

3574

w70

49.56

w71

5502

w72 w73

47.17 5831

w74

3239

w75

61.22

hydrogen partners

(cl)o-ga (T7)0-5’ w53* WW (c11)0-1pP W37’ W56 (T9)0-3” W31C W36’ (T7)0-1P W4gh (A4)0-1P W68 (C3)0-1P’ W85” (Gl2)0-1P’ (Gl2)0-5* (Cl3)0-3’n W52’ W31C (C3)0-2P’ W69” (G22)0-lPe W80’ (Al7)0-2P (T9)0-4” W63” W75’ (T8)0-5’” (TS)O-1P’ W61’ (T19)0-5’a W32’ W39’ (T20)0-4” W66” W46’ W65” (A4)0-lP= W54” W8Y W41’ W59’ W70’ W84’ (T21)0-3’” (G22)0-1P’ (G22)0-2Pc W41” W80” (C3)0-3” (A4)0-2p” W69’ W&p” (G22)0-2P’ W26’ W81” (G24)0-1P’ W75” (G22)0-2P’ W29^ (T9)0-2” (Al7)N-3’ W75a (A18)0-4” W61” W72” w74*

Distance (4 2.91 332 2.28 265 330 2.96 2.94 3.31 2.49 3.33 2.60 2.28 3.49 252 2.65 3.10 2.83 3.02 3.31 2.94 2.49 2.87 3.35 3.37 1.93 3.05 332 3.13 3.03 3.25 3.12 3.13 3.01 3.47 3.25 3.14 2.96 3.23 2.96 2.61 2.52 3.10 2.95 3.35 2.60 2.97 3.34 3.37 2.98 2.42 2.49 340 2.70 2.60 258 2.96 2.22 298 2.93 3.11 3.07 2.41 3.14 285 2.57 297 303 3.11 2.57


Table 3 (continued) Possible bonding

Solvent W76 W77

32.67 37.28

W78 WI9 W80

6510 64.44 4977

W81

7949

W82

5362

W82 W84

5362 69.37

W85 W86 W87

70.19 71.41 64.19

B. Some useful

distances

Solvent W25 w47 W61

31.92 4567 4507

W63

65.85

W65

39.68

hydrogen partners

Distance (h

W43” W8F W95’ (C23)0-1pS (T21)0-2P’ W41” W5gb W5gb W6gb W71’ (C23)0-5’ (G24)0-1P’ (Cl)N-4= W77’ W95” (A4)0-2Y W69’ W70’ W89’ W55’ W48” (T20)0-2P

in

the minor

groove

Possible bonding

hydrogen partners

(C15)0-2’ (T21)0-2’ (A18)N-3’ (TS)O-2’ (T19)0-2’ (T7)0-2’ (T19)0-2’ (T7)0-2’

2.71 2.30 2.62 306 269 2.13 244 1.93 2.49 298 3.16 3.04 2.99 2.30 307 314 297 2.58 312 310 2.68 2.68

that lie

outside

Possible bonding

Solvent

the range

Distance (4 4.53 372 611 449 520 584 4.19 4.23

W88

3997

W89

5911

W90

5644

w91

54.18

w92 w94

52.35 5240

w95

4479

of hydrogen

bonding

(G12)0-2P’ (C23)0-3” (G24)0-2P’ (G24)0-5” W68” W84” (T21)0-2P’ W41’ (Cl)O-5’h W49h (GlO)N-2’ (Cll)O-2” (G24)0-3’ (G24)0-4’ (A18)0-1PS (C3)N-4’ (T21)0-4’ (G2)0-6’ (G22)0-6a W34’ W77* W82’

(for

we with

Fig.

Possible bonding

Solvent W66

6543

w74 w75

43.39 61.22

w91

54.18

hydrogen partners

Distance (A) 283 1.84 2.10 303 310 312 327 217 %38 264 3.44 2.75 2.61 2.19 345 3.32 3.18 2.70 3.45 3.21 262 307

7) hydrogen partners

(T20)0-2’ (A5)N-3’ (A6)N-3’ (AlS)O-3’ (A17)N-3” (T8)0-2’ (C15)0-2’

Distance (4 402 7.53 463 4.66 411 525 453

Symmetry operations and cell translatkww: % y 2. bl +z y 2. ‘--l+syz. d1/2+z 1+1/2-y --t. ‘-1/2+x 1+1/2-y --z. ‘1+1/2--x l-y 1/2+2. 81-z -1/e+y 112-z. hl --z 1/2+y 1/2-z.

are linked by just two water molecules, W29 and W73. In one instance, a single water molecule bridges adjacent phosphate groups such as W81 between P23 and P24. Table 3 shows that a number of water molecules and their networks serve to hold duplexes together in the crystal lattice. Relatively few (11) water molecules were located in the major groove of the structure (Fig. 6). This is in contrast to the more extensive arrangements found in the A2T2 structure (Drew t Dickerson, 1981) and in d(CGTGAATTCACG)2 (Narayana et al., 1991). This difference may reflect, in part, the lower water mobilities in the latter low-temperaturF study. The A2T2 structu!e was refined at 1.9A resolution compared to 2.2A here, which would have materially aided the unambiguous determination of water molecules. Nonetheless, it is surprising that none have been found in the central AT major

groove region in the A3T3 structure. A short firstshell network was found at the C + G-rich 5’ end of the duplex, linking adjacent bases via N-4 and O-6 atoms; there is also interstrand bridging between O-6G-2 and o-6022, and a weak interstrand bridge involving W94 between N-4C3 and O-4T2 1. There is an extended ribbon of water molecules in the minor groove, one water molecule in width, which extends from base-pair A4*T21 to Cl l*GlP at the 3’ end of the duplex (Fig. 7 and 8). This arrangement is similar to the spine of hydration reported for the parent A2T2 dodecamer (Drew & Dickerson, 1981). The ribbon in the present structure is probeably not fully continuous, with a distance of 4.5A between W63 and W65. Most of the water molecules in the ribbon are in contact with one strand or the other, in contrast to the water spine in A2T2. Only one water molecule here (W74), bridges N-3 and O-2

Figure 6. Schematic view of water molecules in the major groove, showing their contact distances to the DNA.

Figure 7. Schematic view of the ribbon of hydration in the minor groove.

Figure 8. Stereo view of the duplex showing the ribbon of minor-groove water molecules.

in an interstrand manner, whereas four consecutive base-pairs are so bridged in A2T2. In the A2T2 structure, the second-shell water molecules bridge between the first-shell ones to create the spine of hydration. By contrast, in the present structure, the first-shell water molecules are directly hydrogen-bonded to each other. There is extensive involvement of sugar ring O-4' atoms and a clear asymmetry of the network in the 3' half of the duplex, which makes five hydrogen-bonded contacts to strand 1 yet only two to strand 2.
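The distinction drawn above between water molecules that bridge the two strands and those that contact only one strand can be illustrated with a minimal sketch. It is not the procedure used in this study: the 3.5 Å contact cutoff, the function name `classify_waters`, and all coordinates below are assumptions chosen purely for illustration.

```python
import math

# Assumed hydrogen-bond contact cutoff in Å (illustrative, not from the paper).
HBOND_CUTOFF = 3.5


def classify_waters(waters, dna_atoms, cutoff=HBOND_CUTOFF):
    """Classify each water by the strands it contacts within `cutoff`.

    waters:    dict mapping water name -> (x, y, z) in Å
    dna_atoms: list of (strand_number, atom_label, (x, y, z)) tuples
    Returns a dict mapping water name -> one of
    "interstrand bridge", "single strand", "no first-shell contact".
    """
    result = {}
    for name, w_xyz in waters.items():
        # Collect the set of strands with at least one atom inside the cutoff.
        strands = {
            strand
            for strand, _label, a_xyz in dna_atoms
            if math.dist(w_xyz, a_xyz) <= cutoff
        }
        if len(strands) >= 2:
            result[name] = "interstrand bridge"
        elif len(strands) == 1:
            result[name] = "single strand"
        else:
            result[name] = "no first-shell contact"
    return result


# Hypothetical minor-groove acceptor atoms on opposite strands.
dna_atoms = [
    (1, "A N-3", (0.0, 0.0, 0.0)),
    (2, "T O-2", (5.0, 0.0, 0.0)),
]
# Hypothetical water positions.
waters = {
    "Wa": (2.5, 0.0, 0.0),   # 2.5 Å from both atoms
    "Wb": (0.0, 3.0, 0.0),   # 3.0 Å from strand 1 only
    "Wc": (0.0, 10.0, 0.0),  # beyond the cutoff of both
}
print(classify_waters(waters, dna_atoms))
```

In this picture a spine of the A2T2 type would show a run of consecutive "interstrand bridge" waters, whereas a ribbon of the kind described here would be dominated by "single strand" contacts.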

4. Discussion

The crystal structure of the sequence d(CGCAAATTTGCG)2 shows it to be similar in overall structure and minor-groove width to that of d(CGCGAATTCGCG)2. In some respects, this is not surprising since the sequence changes are conservative, replacing purine by purine and pyrimidine by pyrimidine. The AAATTT region in the present structure shows little in the way of abnormally high base-pair propeller twist or three-centred hydrogen bonding, compared to the longer oligo(dA) tract dodecamer structure (Nelson et al., 1987). This may well indicate a requirement for the tract to have at least four consecutive adenine residues in order to show such features, in general accord with solution data on oligo(dA) tracts (Crothers et al., 1990), which indicate that four is the minimal length required for significant bending to be possible. However, extrapolations from the crystalline to the solution state must, at least for DNA bending, be made with care; it remains controversial whether and how the major-groove bending seen in several oligo(dA)-containing structures relates to the minor-groove bending that occurs in solution (DiGabriele et al., 1989).

The present structure does not enable us to assess directly the role or importance of three-centred hydrogen bonding in stabilizing the high propeller twists in oligo(dA) tracts. Our results do suggest, though, that propeller twists of up to -20° do not require such hydrogen bonding for their structural integrity, although we must add the caveat (Yanagi et al., 1991) that since hydrogen bonding is electrostatic in origin, some small energetic contribution from weak bifurcation cannot be ruled out. The recent molecular dynamics analysis of poly(dA)·poly(dT) (Fritsch & Westhof, 1991) has suggested that these hydrogen bonds are relatively transient and are not determinants of high propeller twist stability. This study also found that adenosine residues tend to adopt C-2'-endo sugar puckers and thymine residues O-4'-endo ones, in general accord with the pattern of puckers found here and in the A2T2 structure (Drew et al., 1981).

The location of a significant length of minor-groove ribbon of hydration lends further support to the view that structured water in the minor groove is of general significance for AT stretches of DNA. The manner of water co-ordination to the bases forming the minor-groove floor is quite distinct from


that observed in the A2T2 structure, with extensive first-shell hydrogen bonding yet little interstrand bridging being apparent. This, together with the tendency for one strand in A3T3 to be preferentially co-ordinated by water, represents a new class of minor-groove hydration motif, a ribbon rather than a spine. It is reminiscent of one of the two water ribbons found in the rather wider minor grooves of several decamer crystal structures (Grzeskowiak et al., 1991). The present analysis also confirms and extends several previous findings that networks of water molecules are not always observable in the central region of the major groove, either in decamer or dodecamer crystal structures (see, for example, Heinemann & Alings, 1989), even with high-resolution data.

The extensive location here of water molecules around phosphate groups enables us to examine theories concerning the role of phosphate hydration in determining DNA helical conformation (Saenger et al., 1986; Vovelle et al., 1989; Westhof, 1987). The location here of several instances of one or two water molecules bridging adjacent phosphate groups (e.g. W29 and W73 between P22 and P23; W81 between P23 and P24) is in general accord with findings in A-DNA structures. It has been suggested (Saenger et al., 1986) that B-DNA phosphates cannot be bridged by one or two water molecules, in contrast to A or Z-DNA. The fact that the phosphate groups in the present B-DNA oligomer are not independently hydrated but have bridging water molecules similar to those in A-DNA oligomer structures suggests that these water molecules may not be the major factors in A to B-DNA and other structural transitions.

We are grateful to the Cancer Research Campaign for support and the provision of research studentships (to KJE, DJB and KS). We thank Alex Rich and Andy Wang for correspondence and discussion regarding their studies on this sequence, and Terry
