
Annu. Rev. Biophys. Biophys. Chem. 1990. 19: 405-21 Copyright © 1990 by Annual Reviews Inc. All rights reserved

Annu. Rev. Biophys. Biophys. Chem. 1990.19:405-421. Downloaded from www.annualreviews.org Access provided by Kent State University on 04/04/18. For personal use only.

ZINC FINGER DOMAINS: Hypotheses and Current Knowledge

Jeremy M. Berg

Department of Chemistry, The Johns Hopkins University, 34th and Charles Streets, Baltimore, MD 21218

KEY WORDS: DNA binding proteins, metal ions, nucleic acid binding proteins.

CONTENTS

PERSPECTIVES AND OVERVIEW
EXISTENCE OF SMALL ZINC-BASED DOMAINS IN TFIIIA
OCCURRENCE OF SIMILAR ZINC-BINDING DOMAINS IN OTHER PROTEINS
THREE-DIMENSIONAL STRUCTURE OF ZINC FINGER DOMAINS
INTERACTIONS BETWEEN ZINC FINGER PROTEINS AND DNA
A MODEL FOR THE TFIIIA-DNA COMPLEX
SUMMARY

PERSPECTIVES AND OVERVIEW

In 1980, Engelke et al (24) purified a protein, called transcription factor IIIA (TFIIIA), from Xenopus oocytes that was specifically required for the accurate transcription of 5S RNA genes by RNA polymerase III. This protein was shown to be identical to a protein that accumulates in immature oocytes as a specific complex with 5S RNA (63). A single molecule of this protein (72), with a molecular weight of approximately 40 kilodaltons, was found to bind to and protect approximately 50 base pairs of DNA from enzymatic attack, suggesting that TFIIIA has a very elongated structure (24). In 1983, Hanas et al (39) demonstrated that TFIIIA contained bound zinc ions and showed that these ions were required for specific DNA binding. A cDNA clone for TFIIIA was isolated and sequenced in 1984 (35). Subsequent analysis by two groups (12, 51) revealed an intriguing feature. The amino-terminal three-fourths of the protein consists of nine tandem sequences that closely match the consensus sequence (Tyr, Phe)-X-Cys-X2-4-Cys-X3-Phe-X5-Leu-X2-His-X3,4-His-X2-6, where X represents relatively variable amino acids. Based on this observation, the hypotheses made in one or both of these papers concerning the structure and function of this protein included:

1. Each of the tandem sequences binds a zinc ion through the invariant cysteine and histidine residues to form a relatively independent structural domain (12, 51). These domains were termed zinc fingers (51).
2. Other gene regulatory proteins would be found to contain variable numbers of sequences that match the consensus sequence noted above (51).
3. The large loop of 12 amino acids folds up into a well-defined structure, perhaps a long twisted beta sheet (51) or a structure that includes an alpha helix (12), that contacts the DNA.
4. The protein interacts with DNA in an extended fashion (12, 51). Each domain interacts with one half turn of DNA (5.0-5.5 base pairs) (51).
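The consensus sequence quoted above can be written as a pattern over one-letter amino acid codes. The following is a minimal sketch, not a method used in the work reviewed here: it covers only the core from the first aromatic residue to the second histidine, it demands an exact match (whereas the TFIIIA repeats only "closely match" the consensus, so a strict pattern would miss some authentic domains), and the test peptide is hypothetical, with alanine placed at every variable position.

```python
import re

# Core of the consensus in one-letter code:
# (Tyr,Phe)-X-Cys-X2-4-Cys-X3-Phe-X5-Leu-X2-His-X3,4-His
ZINC_FINGER_CORE = re.compile(r"[YF].C.{2,4}C.{3}F.{5}L.{2}H.{3,4}H")


def find_zinc_finger_cores(protein):
    """Return (start_index, matched_text) for each non-overlapping strict match."""
    return [(m.start(), m.group()) for m in ZINC_FINGER_CORE.finditer(protein)]


# Hypothetical peptide (not a real protein): alanine fills every X position.
toy_finger = "YACAACAAAFAAAAALAAHAAAH"
# Two copies joined by an arbitrary five-residue spacer, after a two-residue leader.
toy_protein = "MG" + toy_finger + "TGEKP" + toy_finger

for start, text in find_zinc_finger_cores(toy_protein):
    print(start, text)
```

A tolerant search (for example, scoring each window by the number of consensus positions it satisfies) would be closer in spirit to "closely match", at the cost of more false positives.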

This review critically examines these hypotheses. In addition, a model for the complex between TFIIIA and the DNA in the internal control region of 5S RNA genes is developed.

EXISTENCE OF SMALL ZINC-BASED DOMAINS IN TFIIIA

The first hypothesis is strongly supported by an extremely wide variety of evidence ranging from spectroscopic studies of TFIIIA to studies of the intron-exon structure of the TFIIIA gene. The first evidence for the existence of small domains in TFIIIA came from limited proteolysis studies of the TFIIIA-RNA complex (51). Digestion with a number of proteases revealed the presence of metastable fragments that had molecular weights consistent with the presence of structural units of 3-4 kilodaltons and integer multiples thereof. This is approximately the size expected for single and multiple zinc finger domains. Subsequent studies of a single zinc finger peptide revealed that such units resist proteolytic attack in the presence of appropriate metal ions (29). A second, completely different, line of evidence suggesting that each sequence repeat corresponds to a relatively independent structural domain came from structural analysis of the TFIIIA gene (79). Each of the first six zinc finger sequences is encoded on a separate exon, which supports the notion that modern TFIIIA may have evolved by gene duplication from a single zinc finger domain-encoding exon.
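The fragment sizes quoted for the proteolysis experiments can be checked with simple arithmetic. The sketch below assumes a repeat of roughly 30 residues and an average residue mass of about 110 daltons; the latter is a common rule of thumb and is not a figure taken from this review.

```python
# Assumption: rule-of-thumb average mass of one amino acid residue, in daltons.
AVG_RESIDUE_MASS_DA = 110.0
# Assumption: approximate length of one sequence repeat.
RESIDUES_PER_DOMAIN = 30


def fragment_mass_kda(n_domains, residues_per_domain=RESIDUES_PER_DOMAIN):
    """Approximate mass, in kilodaltons, of a fragment spanning n_domains repeats."""
    return n_domains * residues_per_domain * AVG_RESIDUE_MASS_DA / 1000.0


for n in range(1, 4):
    print(n, "domain(s):", round(fragment_mass_kda(n), 1), "kDa")
```

Under these assumptions a single domain falls within the 3-4 kilodalton range, and larger fragments would be expected near integer multiples of that value.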


Further evidence comes from studies of single zinc finger peptides containing 25-31 amino acids (29, 47, 61). Studies of such peptides by a variety of techniques strongly support the hypothesis that each sequence repeat is, in fact, an independent structural domain. In particular, these studies have revealed that such peptides are apparently unfolded in the absence of bound metal ions but fold into highly structured units upon addition of stoichiometric amounts of zinc or other appropriate metal ions. The first studies reported involved a 30-amino acid peptide corresponding to the second zinc finger from TFIIIA (29). The metal-dependent folding process was observed using circular dichroism spectroscopy and changes in sensitivity of the peptide to proteolysis. In other work, proton nuclear magnetic resonance (NMR) methods demonstrated the metal-dependent folding of a single zinc finger peptide from the yeast transcription factor ADR1 (61), which has two TFIIIA-like zinc finger sequences. In this case, an apparently unique conformation was adopted in the presence of zinc that was lost upon treatment with EDTA (61). These initial studies laid the groundwork for more complete structural studies of zinc finger peptides by two-dimensional NMR methods, discussed below (47). Finally, these single domain peptides have been quite useful for studies with spectroscopically active metal ions such as Co2+. The absorption spectra of Co2+-substituted single domain peptides (29, 61) are entirely consistent with tetrahedral two cysteinate-two histidine coordination, especially when compared with appropriate model complexes (20, 21).

Perhaps the most direct evidence concerning the coordination environment around the zinc ions in TFIIIA comes from x-ray absorption studies of the TFIIIA-5S RNA complex (23). Analysis reveals that the extended x-ray absorption fine structure (EXAFS) of the x-ray absorption at the zinc K edge is consistent with zinc coordination by two sulfur atoms at 2.30 Å and two nitrogen atoms at 2.00 Å. Furthermore, the observed intensity of the x-ray fluorescence was consistent with approximately nine zinc ions per complex. These observations suggest that the zinc sites in TFIIIA are quite similar to one another and that each zinc is coordinated to two cysteinates and two histidines, as proposed.

A final line of evidence comes from footprinting studies of a series of deletion mutants of TFIIIA bound to 5S RNA genes (81). The details of this study are discussed in a later section; however, the key observation, with regard to the existence of small structural domains in TFIIIA, is that truncation of a single zinc finger domain from TFIIIA produces small and localized changes in the footprint produced by Fe(II)-EDTA-generated hydroxyl radical. This result nicely illustrates the modular nature of the TFIIIA DNA-binding domain.

OCCURRENCE OF SIMILAR ZINC-BINDING DOMAINS IN OTHER PROTEINS

After the discovery of the periodic nature of the TFIIIA sequence, Miller et al (51) suggested that "it would not be surprising if the same 30-residue units were later found to occur in varying numbers in other related gene control proteins." However, even these workers could not have anticipated the virtual avalanche of DNA sequences that have been found to contain regions that encode TFIIIA-like protein sequences (2, 5, 10, 13-19, 40, 42, 43, 45, 46, 48, 50, 52-54, 56, 57, 59, 60, 62, 65-69, 71, 74, 75, 76, 78, 80). These coding sequences were discovered in two ways. First, a large number of sequences that were identified and cloned on the basis of some biochemical or genetic property have been found to contain one or more TFIIIA-like potential zinc-binding sequences (2, 10, 13, 19, 40, 42, 43, 48, 50, 52-54, 56, 57, 60, 62, 65, 68, 69, 74, 76, 80). These sequences represent essentially independent discoveries of the zinc finger motif, and the large number of cases reporting this observation suggests that a typical eukaryotic genome has many of these sequences. Second, the degree of sequence similarity within at least some members of this protein superfamily is high enough to allow intentional cloning of zinc finger-encoding genes by hybridization (5, 14-18, 45, 46, 56, 59, 66, 67, 71, 75, 78). The probes used include previously isolated zinc finger-encoding genes and oligonucleotides derived from consensus sequences. The most highly conserved region for many zinc finger proteins is the linker region between zinc finger domains. This region often has the sequence His-Thr-Gly-Glu-Lys-Pro-(Tyr, Phe)-X-Cys, where the histidine and cysteine residues are metal-binding residues from the preceding and succeeding zinc finger domains and X is relatively variable. This sequence has been termed the H-C link (71). Hybridization to sequences encoding this region is primarily responsible for the isolation of other zinc finger- (and H-C link-) encoding genes.
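Because the H-C link is described as a sequence that the linker "often has", a search for it at the protein level is better framed as scoring than as exact matching. The sketch below is only an illustration at the amino acid level; the cloning work described above relied on nucleic acid hybridization, which this does not model. The threshold `min_score` and the test peptide are hypothetical.

```python
# H-C link template: His-Thr-Gly-Glu-Lys-Pro-(Tyr,Phe)-X-Cys.
# None marks the variable position X, which is not scored.
HC_LINK_TEMPLATE = ["H", "T", "G", "E", "K", "P", "YF", None, "C"]


def hc_link_score(window):
    """Count the template positions that a 9-residue window satisfies (maximum 8)."""
    score = 0
    for residue, allowed in zip(window, HC_LINK_TEMPLATE):
        if allowed is not None and residue in allowed:
            score += 1
    return score


def find_hc_links(protein, min_score=7):
    """Return (start_index, window, score) for windows scoring at least min_score.

    min_score is an arbitrary illustrative threshold, not a value from the text.
    """
    width = len(HC_LINK_TEMPLATE)
    hits = []
    for i in range(len(protein) - width + 1):
        window = protein[i:i + width]
        score = hc_link_score(window)
        if score >= min_score:
            hits.append((i, window, score))
    return hits


# Hypothetical peptide containing one perfect H-C link flanked by alanines.
print(find_hc_links("AAHTGEKPYACAA"))
```

Lowering `min_score` admits more divergent linkers, which parallels the trade-off between stringency and recovery in a hybridization screen.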
Many of the zinc finger-containing genes that have been isolated on the basis of genetic or biochemical characteristics are involved in significant biological processes. Some of the proteins are known to be transcription factors like TFIIIA. Thus, yeast ADR1 (40) is a transcription factor required for expression of alcohol dehydrogenase II genes, whereas human Sp1 (42) is a general transcription factor that regulates the expression of a variety of cellular and viral genes. Several genes involved in control of development in Drosophila have been identified genetically. Examples include the segmentation genes Krüppel (65) and Hunchback (76). Binding sites for these proteins have been identified in regions that suggest they regulate both their own and each other's expression (73, 77). Another example is the human gene termed ZFY, which encodes a protein that contains 13 tandem zinc finger domains (57). This gene is found on the Y chromosome and it appears to be the testis-determining factor, the major sex-determining gene in humans. A highly similar gene (termed ZFX) is found on the X chromosome (69). These proteins probably act by binding to (and competing for) specific DNA or perhaps RNA sequences, although these sequences have not yet been identified. Another zinc finger-encoding gene was isolated on the basis of specific amplification. Thus, the human GLI gene (43) is normally present as a single copy, whereas it occurs 50-100 times in certain tumor-derived cell lines. This suggests that this gene may act as an oncogene when overexpressed. Finally, a protein called RNA11 present in the yeast spliceosome was found to contain a single copy of a sequence quite similar to the zinc finger consensus (13). These examples illustrate the important systems that include proteins that contain one or more zinc finger sequences.

The intentional cloning of zinc finger-encoding genes has also enhanced the realization of the central and widespread roles these proteins must play. First, it appears that all eukaryotes tested contain one or more sequences that hybridize to zinc finger-derived probes strongly enough to be detected (16, 71). Second, a number of regions encoding authentic zinc finger sequences have been identified and characterized. As one example, sets of over 100 different cDNAs have been isolated from Xenopus and human libraries derived from various developmental stages (5, 45, 46, 56). Estimates based on the statistics of recovering certain of the cDNAs predict approximately 300 human cDNAs that encode zinc fingers that include the H-C link (5). In addition, some spectacular examples of zinc finger proteins have been discovered, such as a Xenopus protein called Xfin that has 37 sequences that closely approximate the zinc finger consensus (66).
A major disadvantage of the hybridization-based cloning of zinc finger-encoding genes is the difficulty in determining the specific functions of such genes, although these proteins probably act via interactions with nucleic acids.

Several other classes of nucleic acid binding and gene regulatory proteins have been discovered that also contain patterns of cysteine and histidine residues apparently involved in the formation of metal-binding domains (6, 9). These proteins are often grouped together with the TFIIIA-like proteins and termed zinc finger proteins. These different classes, however, may each have small metal-binding domains involved in nucleic acid binding or gene regulatory processes without being structurally or evolutionarily related to one another. Some of these classes have recently been reviewed (9, 26, 44), and leading references cover the steroid-thyroid hormone receptor superfamily (4, 8, 25, 31), retroviral gag gene-encoded nucleic acid binding proteins (36, 38), the GAL4 family (41, 58), the human immunodeficiency virus TAT protein (30), the adenovirus E1a gene product (22), and the bacteriophage T4 gene 32 single-stranded nucleic acid binding protein (34). New classes and subclasses are appearing quite frequently.


THREE-DIMENSIONAL STRUCTURE OF ZINC FINGER DOMAINS

Both groups reporting the periodic nature of the TFIIIA sequence (12, 51) speculated about the secondary structure of the zinc finger motif. After unsuccessfully searching for evidence for a helix-turn-helix-like structure, Miller et al (51) suggested that the large loop between the cysteines and the histidines might "fold into a long twisted ribbon of beta sheet which wraps in some way around the zinc-binding pocket." Brown et al (12) proposed the presence of an alpha helix extending from the invariant phenylalanine to just past the invariant leucine residue based on a secondary structure prediction algorithm. Later, Brown & Argos (11) altered the position of this putative helix somewhat, based on an analysis of 39 zinc finger sequences, to begin just before the leucine and extend two residues past the first histidine residue. The observation that such a helix would have strong amphipathic character strengthened the prediction that this helix resided in this region.

More complete predictions of the tertiary structure of zinc finger domains have been made. The first was developed based on the observation of recurring substructures in crystallographically characterized metalloproteins (7). The zinc finger consensus was divided into the following regions:

[(Tyr, Phe)-X-Cys-X2,4-Cys-X3-Phe]-X4-[X-Leu-X2-His-X3,4-His]-Xn

Cys-Cys Loop: (Tyr, Phe)-X-Cys-X2,4-Cys-X3-Phe
Tip: X4
His-His Loop: X-Leu-X2-His-X3,4-His
Linker: Xn

It was observed that the proteins aspartate transcarbamylase and rubredoxin contained ten-amino acid sequences of the form Hyd-X-Cys-X2-Cys-X3-Aro in which the cysteines are coordinated to metal ions and where Hyd is a hydrophobic residue (Leu and Tyr, respectively) and Aro is an aromatic residue (Phe and Tyr). Examination of the three-dimensional structures revealed that these regions both consist of two antiparallel beta strands with the first and tenth residues (Hyd and Aro) joined by two backbone-to-backbone hydrogen bonds. Similarly, thermolysin and hemerythrin (as well as hemocyanin) contain sequences of the form X-Hyd-X2-His-X3-His in which each of the histidine residues coordinates to a metal ion. In each case, this region was helical with each histidine coordinated through its ε-nitrogen. Attaching the two substructures corresponding to the Cys-Cys loop and the His-His loop to a tetrahedral metal ion results in packing the three hydrophobic residues with one another and with the relatively hydrophobic parts of the metal-ligating residues. Figure 1 summarizes construction of the model.

A Model for the "Zinc Finger" Domain Based on Recurring Substructures in Metalloproteins

Figure 1  Development of a model for the three-dimensional structure of a zinc finger domain based on the observation of recurring substructures in crystallographically characterized metalloproteins (7). A beta hairpin substructure including two metal-coordinating cysteinate residues was observed to occur in both rubredoxin and the regulatory subunit of aspartate transcarbamylase, while an alpha helical substructure including two metal-coordinating histidine residues was observed to occur in both hemerythrin and thermolysin. Connecting these two substructures around a tetrahedral metal center produced an appealing model with the invariant hydrophobic residues packed together above the metal coordination unit. Subsequent experimental studies (47) have proved this structure to be essentially correct. Alpha carbon positions are shown.

A similar analysis was made independently by Gibson et al (32), who used molecular dynamics calculations to refine the structure obtained by "cutting and pasting" substructures from metalloproteins. The structure produced in this manner was quite similar to that discussed above, although the alignment of residues across the beta sheet was offset by two residues and the helix was slightly longer.

Recent two-dimensional NMR studies of a single domain peptide (corresponding to domain 31 from Xfin) have revealed a structure strikingly similar to the proposed model (47). The structure consists of a two-stranded beta sheet followed by a helix with the metal ion coordinated to the two cysteines in the loop at the base of the sheet and to the two histidines via their ε-nitrogens. The overall chirality at the metal center is as predicted. The alignment across the beta sheet corresponds to that proposed in the first model (7) in which two hydrogen bonds join the two invariant hydrophobic residues (a Tyr and a Phe in the peptide studied). As noted in a previous section, preliminary NMR studies have also examined a single-domain peptide derived from yeast ADR1 (61). A helical region that includes the invariant leucine and extends past the first of the histidine residues was observed. The NMR features characteristic of a classical beta sheet structure involving the two invariant aromatic residues were not observed, although the region does appear to have a hairpinlike conformation.

Thus, from the observations used in making the structure predictions and from the NMR experimental results, the structure of single zinc finger domains seems to be relatively well in hand. However, an important remaining question is how zinc finger domains in a tandemly repeated array are related to one another. As noted above, domains are often joined by the H-C link sequence His-Thr-Gly-Glu-Lys-Pro-(Tyr, Phe)-X-Cys. Since almost all of the amino acids in this sequence are strongly conserved, a structural model for this region should provide a role for each of the conserved side chains. Figure 2 shows one possible model. The structure consists of a type II beta turn formed by the sequence His-Thr-Gly-Glu. The hydroxyl group of the Thr residue donates a hydrogen bond to the unpaired carbonyl group at the end of the helix, while the Gly residue provides the conformational flexibility necessary for formation of a type II turn.
The side chain of the Glu residue accepts a hydrogen bond from the NH group of the His residue and participates in a salt bridge with the Lys residue that follows it. The Pro residue occurs just prior to the beginning of the first of the beta strands of the next domain. Repetition of the structure for the zinc finger domain with this structure for the linker leads to the formation of a right-handed superhelix with approximately the right pitch and radius to allow intimate interactions with DNA. The model for the complex between zinc finger domains joined by H-C links aids the considerations in the final two sections.

Figure 2  A model for the three-dimensional structure of the H-C link that often connects zinc finger domains. The sequence shown is X-X-X-His-Thr-Gly-Glu-Lys. Key hydrogen bonding interactions (solid bonds) include one between the hydroxyl group of the Thr residue and an unpaired carbonyl group at the end of the alpha helix, one between the amide hydrogen of the Glu residue and the carbonyl group of the His residue that defines the His-Thr-Gly-Glu type II beta bend, and one between the carboxylate group of the Glu residue and the NH group of the His imidazole ring.

INTERACTIONS BETWEEN ZINC FINGER PROTEINS AND DNA

Zinc finger proteins appear to interact with DNA primarily through interactions in the major groove. The most direct evidence for this interaction comes from methylation interference experiments with TFIIIA and a 5S RNA gene, which revealed that methylation at N7 of numerous guanine residues significantly interfered with protein binding (72). Similar observations have been made for Sp1 (33).

Miller et al (51) originally suggested that each zinc finger domain binds to approximately one-half turn or 5.0-5.5 base pairs on DNA. They based this hypothesis largely on the observation that the TFIIIA binding site contains approximately 45-50 base pairs of DNA and has nine zinc finger domains. This relationship was also consistent with interpretations of a series of chemical and enzymatic footprinting experiments on TFIIIA bound to a 5S RNA gene (28). In addition, one report observed that the TFIIIA binding site as well as other protein binding sites have sequences that show detectable periodicity, particularly with regard to guanine residues with periods of 5.0-5.5 base pairs (64). However, several other lines of evidence suggest a ratio of three base pairs of DNA per zinc finger domain. First, binding sites have been identified for several other proteins and the relationship between the number of domains and the size of the binding site appears to be more consistent with three base pairs per domain. Thus, Sp1 has three zinc fingers in its DNA binding domain and recognizes a site of 9-10 base pairs (42). Yeast SWI5, which also has three zinc fingers, binds to an 18-base pair site, but this site has a nearly perfect dyad axis, suggesting that the protein may bind as a dimer with nine base pairs per protein monomer (55). Second, as noted by Gibson et al (32), and independently discovered in my own laboratory, the structure for a single domain, discussed above, can easily interact with three base pairs, but it is difficult to construct models that involve more than three base pairs per domain due to the limited length of the linker region. A model is discussed below that appears to account for the large size of the TFIIIA binding site, given the three base pairs of DNA per domain ratio for tandemly repeated zinc finger domains with "normal" linker regions.

Since TFIIIA binds both 5S RNA and DNA, questions have been raised about the structure of the DNA in the TFIIIA binding site. The possibility that TFIIIA recognizes similar structures in 5S RNA and DNA suggests that the internal control region might have an A-like conformation (64). Several studies have examined the problem (1, 27, 37, 49, 64). Methods utilized have included circular dichroism spectroscopy (1, 27, 37), nuclear magnetic resonance spectroscopy (1), x-ray crystallography on a nine base pair fragment with sequence derived from a portion of the internal control region (49), as well as chemical and enzymatic techniques (1, 27, 64). The results obtained from these studies have contradicted each other somewhat. Currently, the internal control region appears to have a structure that is not completely the classical B-form conformation, but this conclusion may simply represent structural variability present in many nonrandom DNA sequences.
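The competing estimates of base pairs per zinc finger domain come down to a few multiplications, sketched below. All numbers are those quoted in the text; the function name is merely illustrative.

```python
def predicted_site_bp(n_fingers, bp_per_finger):
    """Binding-site length predicted for tandem fingers at a fixed spacing."""
    return n_fingers * bp_per_finger


# TFIIIA: nine fingers, observed site of roughly 45-50 base pairs.
print("TFIIIA, half-turn hypothesis:",
      predicted_site_bp(9, 5.0), "to", predicted_site_bp(9, 5.5), "bp")
print("TFIIIA, three-bp hypothesis: ", predicted_site_bp(9, 3), "bp")

# Sp1: three fingers, observed site of 9-10 base pairs.
print("Sp1, three-bp hypothesis:    ", predicted_site_bp(3, 3), "bp")

# SWI5: three fingers, 18 bp site with a dyad axis, read as two monomers.
print("SWI5, bp per monomer:        ", 18 // 2, "bp")
```

The half-turn spacing reproduces the TFIIIA site size but overestimates the Sp1 site, whereas the three-base pair spacing fits Sp1 and SWI5 but leaves TFIIIA well short of its observed footprint. That shortfall is what the model for the TFIIIA-DNA complex has to explain.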
An additional complication has arisen for the TFIIIA-DNA complex. Electron microscopic observations (3) as well as solution studies (70) have suggested that the binding of TFIIIA induces significant bending in the DNA. Further experimentation, including the determination of the structure of a zinc finger protein-DNA co-crystal using diffraction methods, will probably be required before the importance of DNA structure in zinc finger protein site-specific binding is settled.

Based on these observations and on the structure of a single zinc finger including the model for the H-C link discussed above, a proposal for the complex between a protein containing tandemly repeated zinc finger domains joined by H-C links and approximately B-form DNA has been developed and is shown in Figure 3. Each domain has its helical portion lying in the major groove of the DNA. The beta sheet region lies further away from the DNA helical axis. Residues along one face of the helix and in the turn at the top of the zinc finger are positioned so that they might make specificity-determining contacts with the edges of base pairs in the major groove.

Figure 3  The structure of four tandem zinc finger domains wrapping around a DNA double helix. Alpha carbon atom positions are shown for the protein, and phosphate positions are shown for the DNA. The positions of the highly conserved cysteine, histidine, and hydrophobic residues are shown in one-letter code (C = Cys, H = His, Y = Tyr, F = Phe, L = Leu). The zinc ions are shown with cross-hatched atoms. This configuration of the protein is a consequence of the structure of the individual domains with the H-C link structure shown in Figure 2. This arrangement corresponds to three base pairs of DNA per zinc finger domain.

A MODEL FOR THE TFIIIA-DNA COMPLEX

As noted previously, a variety of studies, especially the hydroxyl radical footprinting studies of the TFIIIA-DNA complex, reveal that the nine domains do not interact with the DNA in the same manner (81). In particular, the footprint consists of a large broad protected area extending approximately from base pairs 46 to 58, an essentially unprotected region centered at position 62, a relatively sharp, protected region centered at position 67, another unprotected region around position 74, and finally, another broad protected area extending approximately from positions 75 to 92.

Figure 4 shows the amino acid sequence of TFIIIA. One of the striking features of this sequence involves the sixth zinc finger domain. This finger matches the consensus least well of any of the nine domains. In addition, the alignment shown in Figure 4 (which is based on the single domain structure) reveals that the linkers going into and out of this domain are unusually short. All of the other linkers approximate the H-C link in length and often in sequence. Model building studies based on this observation reveal that the short linkers do not allow domain 6 to remain in the major groove following domain 5.

    MetGlyGluLysAlaLeuProValValTyrLys
(1) ArgTyrIleCysSerPhe AlaAspCysGlyAlaAlaTyrAsnLysAsnTrpLysLeuGlnAlaHisLeuCys LysHisThrGlyGluLys
(2) ProPheProCysLysGlu GluGlyCysGluLysGlyPheThrSerLeuHisHisLeuThrArgHisSerLeu ThrHisThrGlyGluLys
(3) AsnPheThrCysAspSer AspGlyCysAspLeuArgPheThrThrLysAlaAsnMetLysLysHisPheAsnArgPheHisAsnIleLysIleCys
(4) ValTyrValCysHisPhe GluAsnCysGlyLysAlaPheLysLysHisAsnGlnLeuLysValHisGlnPhe SerHisThrGlnGlnLeu
(5) ProTyrGluCysProHis GluGlyCysAspLysArgPheSerLeuProSerArgLeuLysArgHisGluLys ValHisAla
(6) GlyTyrProCysLysLysAspAspSerCysSerPheValGlyLysThrTrpThrLeuTyrLeuLysHisValAlaGluCysHisGlnAsp
(7) LeuAlaValCysAspVal CysAsnArgLysPheArgHisLysAspTyrLeuArgAspHisGlnLys ThrHisGluLysGluArgThr
(8) ValTyrLeuCysProArg AspGlyCysAspArgSerTyrThrThrAlaPheAsnLeuArgSerHisIleGlnSerPheHisGluGluGlnArg
(9) ProPheValCysGluHis AlaGlyCysGlyLysCysPheAlaMetLysLysSerLeuGluArgHisSerVal ValHisAspProGluLys
    ArgLysLeuLysGluLysCysProArgProLysArgSerLeuAlaSerArgLeuThrGlyTyrIleProProLysSerLysGluLysAsnAlaSerVal
    SerGlyThrGluLysThrAspSerLeuValLysAsnLysProSerGlyThrGluThrAsnGlySerLeuValLeuAspLysLeuThrIleGln

Figure 4  The amino acid sequence of transcription factor IIIA (TFIIIA) (12, 35, 51). The invariant cysteine and histidine residues are shown in boldface and the conserved hydrophobic residues are underlined. The alignment is based on the three-dimensional structure of the zinc finger domains beginning with the first of the beta strands. Note the short linker regions going into and out of the sixth zinc finger domain.

Studies of proteolytic fragments and the deletion mutants have clearly revealed that the amino terminal end of TFIIIA contacts the 5' end of the internal control region, whereas the carboxyl terminal zinc fingers contact the 3' end (72, 81). Because of these observations and the unusual nature of the sixth zinc finger and its linkers, zinc fingers 1-5 probably form a set that is responsible for the protected region of base pairs 75-92; zinc fingers 7-9 form a set that is responsible for the protected area of base pairs 46-58; and the sixth zinc finger is responsible for the sharp protected region around base pair 67. Each of the two sets of zinc fingers is expected to wrap around the DNA in the major groove much like the model shown in Figure 3. The short linker between the fifth and sixth zinc fingers does not allow the protein to remain in the major groove. Instead, the protein crosses over the minor groove. The sixth zinc finger lies approximately parallel to the DNA helical axis outside the grooves. After the sixth zinc finger, the protein reenters the major groove and remains there for the final three zinc finger domains. Model building studies based on this idea have revealed that a suitable structure cannot be constructed without bending the DNA, because of the relatively large space between the regions contacted by the fifth and seventh zinc fingers. However, bending the DNA in the region around that contacted by the sixth zinc finger allows construction of the model shown in Figure 5. This bending may account for the TFIIIA-induced bending (3, 70) noted above. DNA bending may not be a general property of zinc finger proteins (70) but instead may be peculiar to TFIIIA due to the unusual structure around the sixth zinc finger.
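The assignment of finger sets to footprint regions can be tabulated and compared with the three base pairs per domain ratio. The sketch below uses only the approximate footprint boundaries quoted above; the class and variable names are hypothetical, and the comparison is a rough consistency check rather than part of the model building described in the text.

```python
from dataclasses import dataclass

# Ratio argued for in the text for fingers joined by "normal" linkers.
BP_PER_FINGER = 3


@dataclass(frozen=True)
class FingerSet:
    fingers: tuple   # zinc finger numbers assigned to this set
    first_bp: int    # approximate start of the protected region
    last_bp: int     # approximate end of the protected region

    def protected_bp(self):
        """Length of the protected region, counting both end positions."""
        return self.last_bp - self.first_bp + 1

    def expected_bp(self):
        """Length expected if each finger covers BP_PER_FINGER base pairs."""
        return len(self.fingers) * BP_PER_FINGER


amino_terminal_set = FingerSet((1, 2, 3, 4, 5), 75, 92)
carboxyl_terminal_set = FingerSet((7, 8, 9), 46, 58)

for finger_set in (amino_terminal_set, carboxyl_terminal_set):
    print(finger_set.fingers, "protected:", finger_set.protected_bp(),
          "bp; expected:", finger_set.expected_bp(), "bp")

# Base pairs lying between the two broad protected regions.
gap_bp = amino_terminal_set.first_bp - carboxyl_terminal_set.last_bp - 1
print("gap between the two sets:", gap_bp, "bp")
```

Each protected region is a few base pairs longer than three per finger would predict, which is not surprising given that the footprint boundaries are approximate. The gap between the two sets is far larger than a single three-base pair step, consistent with finger 6 and its short linkers leaving the major groove.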
Some evidence for this model comes from analysis of the DNAase I footprinting results derived from the same deletion mutants used in the hydroxyl radical footprinting experiments discussed above (81). Deletion into the ninth zinc finger of TFIIIA results in a hydroxyl radical footprint only slightly different from the footprint of the intact protein. In contrast, the DNAase I footprint is significantly changed: the 5' end (base pairs 45 to 63) is no longer protected. Further deletion of zinc fingers 8, 7, and 6 does not significantly change the DNAase I footprint. Deletion into the fifth zinc finger results in the complete loss of DNAase I footprinting activity under the conditions used. These results can be interpreted in terms of the model above as follows. Deletion into the ninth zinc finger disrupts the multidomain unit consisting of zinc fingers 7, 8, and 9. DNAase I is then able to compete effectively with this end of the protein for binding to the DNA, so that only zinc fingers 1-5 are tightly bound. Further deletion does not have any effect until the other multidomain unit (zinc fingers 1-5) is reached, at which point the overall affinity for the DNA is significantly reduced. The similarity of the breakpoints deduced from these DNAase I footprinting experiments and those deduced from the analysis of the TFIIIA sequence provides some support for the model discussed above.

Figure 5 A proposed structure for the TFIIIA-DNA complex. The structure shows zinc fingers 1-5 and 7-9 lying in the major groove, as shown in Figure 3, whereas zinc finger 6 lies across the surface of the DNA due to the passage of the short linkers into and out of this domain. The DNA must bend as shown to fit this sort of structure into the hydroxyl radical footprint of TFIIIA on DNA (81). Alpha carbons and phosphate positions are shown and the individual zinc finger domains are numbered. The solid lines in the bent region of the DNA correspond to six phosphate groups.

SUMMARY

Many of the hypotheses based on the initial observation that TFIIIA has an approximately periodic structure, with invariant pairs of cysteines and histidines apparently capable of coordinating metal ions, are clearly correct. A startling number of other cDNA clones encode proteins that contain one or more sequences that match the zinc finger consensus, revealing that zinc finger proteins represent perhaps the largest class of DNA binding proteins in eukaryotes and that zinc finger protein-controlled gene expression may be a fundamental aspect of development as well as other processes. A great deal of progress has been made in elucidating the structure of single zinc finger domains. From knowledge of these structures, plausible and testable models can be developed for the complexes between zinc finger proteins and their DNA binding sites. Clearly, one of the most important challenges remaining in this area involves testing and extending these models. Structural data on such protein-nucleic acid complexes derived from NMR or crystallographic studies are tremendously valuable in this regard. Finally, an additional fundamental question is raised by the observation that this family and other important nucleic acid binding proteins contain zinc ions bound in small structural domains. Is zinc binding merely a useful structural strategy for generating domains involved in macromolecular interactions, or are zinc concentration fluctuations used in some manner to regulate gene expression? The biophysical data available to date certainly do not rule out this intriguing possibility.

ACKNOWLEDGMENTS

I thank the members of my research group for many useful discussions. Financial support for some of the work described herein has come from the National Institutes of Health (GM-38230), the National Science Foundation (DMB-8858069), The Camille and Henry Dreyfus Foundation, and Eli Lilly and Company.

Literature Cited

1. Aboul-ela, F., Varani, G., Walker, G. T., Tinoco, I. Jr. 1988. Nucleic Acids Res. 16: 3559
2. Baldarelli, R. M., Mahoney, P. A., Salas, F., Gustavson, E., Boyer, P. D., et al. 1988. Dev. Biol. 125: 85
3. Bazett-Jones, D. P., Brown, M. L. 1989. Mol. Cell. Biol. 9: 336
4. Beato, M. 1989. Cell 56: 335
5. Bellefroid, E. J., Lecocq, P. J., Benhida, A., Poncelet, D. A., Belayew, A., Martial, J. A. 1989. DNA 8: 377
6. Berg, J. M. 1986. Science 232: 485
7. Berg, J. M. 1988. Proc. Natl. Acad. Sci. USA 85: 99
8. Berg, J. M. 1989. Cell 57: 1065
9. Berg, J. M. 1989. Prog. Inorg. Chem. 37: 143
10. Boulay, J. L., Dennefeld, C., Alberga, A. 1987. Nature 330: 395
11. Brown, R. S., Argos, P. 1986. Nature 324: 215
12. Brown, R. S., Sander, C., Argos, P. 1985. FEBS Lett. 186: 271
13. Chang, T.-H., Clark, M. W., Lustig, A. J., Cusick, M. E., Abelson, J. 1988. Mol. Cell. Biol. 8: 2379
14. Chavrier, P., Lemaire, P., Revelant, O., Bravo, R., Charnay, P. 1988. Mol. Cell. Biol. 8: 1319
15. Chavrier, P., Zerial, M., Lemaire, P., Almendral, J., Bravo, R., Charnay, P. 1988. EMBO J. 7: 29
16. Chowdhury, K., Deutsch, U., Gruss, P. 1987. Cell 48: 771
17. Chowdhury, K., Dressler, G., Breier, G., Deutsch, U., Gruss, P. 1988. EMBO J. 7: 1345
18. Chowdhury, K., Rodhewohl, H., Gruss, P. 1988. Nucleic Acids Res. 16: 9995
19. Christy, B., Lau, L. F., Nathans, D. 1988. Proc. Natl. Acad. Sci. USA 85: 7857
20. Corwin, D. T. Jr., Fikar, R., Koch, S. A. 1987. Inorg. Chem. 26: 2079
21. Corwin, D. T. Jr., Gruff, E. S., Koch, S. A. 1988. Inorg. Chim. Acta 151: 5
22. Culp, J. S., Webster, L. C., Friedman, D. J., Smith, C. L., Huang, W.-J., et al. 1988. Proc. Natl. Acad. Sci. USA 85: 6450
23. Diakun, G. P., Fairall, L., Klug, A. 1986. Nature 324: 698
24. Engelke, D. R., Ng, S.-Y., Shastry, B. S., Roeder, R. G. 1980. Cell 19: 717
25. Evans, R. M. 1988. Science 240: 889
26. Evans, R. M., Hollenberg, S. M. 1988. Cell 52: 1
27. Fairall, L., Martin, S., Rhodes, D. 1989. EMBO J. 8: 1809
28. Fairall, L., Rhodes, D., Klug, A. 1986. J. Mol. Biol. 192: 577
29. Frankel, A. D., Berg, J. M., Pabo, C. O. 1987. Proc. Natl. Acad. Sci. USA 84: 4841
30. Frankel, A. D., Bredt, D. S., Pabo, C. O. 1988. Science 240: 70
31. Freedman, L. P., Luisi, B. F., Korszun, Z. R., Basavappa, R., Sigler, P. B., Yamamoto, K. R. 1988. Nature 334: 543
32. Gibson, T. J., Postma, J. P. M., Brown, R. S., Argos, P. 1988. Protein Eng. 2: 209
33. Gidoni, D., Dynan, W. S., Tjian, R. 1984. Nature 312: 409
34. Giedroc, D. P., Keating, K. M., Williams, K. R., Coleman, J. E. 1987. Biochemistry 26: 5251
35. Ginsberg, A. M., King, B. O., Roeder, R. G. 1984. Cell 39: 479
36. Gorelick, R. J., Henderson, L. E., Hansen, J. P., Rein, A. 1988. Proc. Natl. Acad. Sci. USA 85: 8420
37. Gottesfeld, J. M., Blanco, J., Tenant, L. L. 1987. Nature 329: 460
38. Green, L. M., Berg, J. M. 1989. Proc. Natl. Acad. Sci. USA 86: 4047
39. Hanas, J. S., Hazuda, D. J., Bogenhagen, D. F., Wu, F. H.-Y., Wu, C.-W. 1983. J. Biol. Chem. 258: 14120
40. Hartshorne, T. A., Blumberg, H., Young, E. T. 1986. Nature 320: 283
41. Johnston, M. 1987. Nature 328: 353
42. Kadonaga, J. T., Carner, C., Masiarz, F. R., Tjian, R. T. 1987. Cell 51: 1079
43. Kinzler, K. W., Ruppert, J. M., Bigner, S. H., Vogelstein, B. 1988. Nature 332: 371
44. Klug, A., Rhodes, D. 1987. Trends Biochem. Sci. 12: 464
45. Knochel, W., Poting, A., Koster, M., El Baradi, T., Nietfeld, W., Bouwmeester, T., Pieler, T. 1989. Proc. Natl. Acad. Sci. USA 86: 6097
46. Koster, M., Pieler, T., Poting, A., Knochel, W. 1988. EMBO J. 7: 1735
47. Lee, M. S., Gippert, G. P., Soman, K. V., Case, D. A., Wright, P. E. 1989. Science 245: 635
48. Lemaire, P., Revelant, O., Bravo, R., Charnay, P. 1988. Proc. Natl. Acad. Sci. USA 85: 4691
49. McCall, M., Brown, T., Hunter, W. N., Kennard, O. 1986. Nature 322: 661
50. Milbrandt, J. 1987. Science 238: 797
51. Miller, J., McLachlan, A. D., Klug, A. 1985. EMBO J. 4: 1609
52. Morishita, K., Parker, D. S., Mucenski, M. L., Jenkins, N. A., Copeland, N. G., Ihle, J. N. 1988. Cell 54: 831
53. Moses, K., Ellis, M. C., Rubin, G. M. 1989. Nature 340: 531
54. Murphy, N. B., Pays, A., Tabibi, P., Coquelet, H., Guyaux, M., et al. 1987. J. Mol. Biol. 195: 855
55. Nagai, K., Nakaseko, Y., Nasmyth, K., Rhodes, D. 1988. Nature 332: 284
56. Nietfeld, W., El-Baradi, T., Mentzel, H., Pieler, T., Koster, M., et al. 1989. J. Mol. Biol. 208: 639
57. Page, D. C., Mosher, R., Simpson, E. M., Fisher, E. M. C., Mardon, G., et al. 1987. Cell 51: 1091
58. Pan, T., Coleman, J. E. 1989. Proc. Natl. Acad. Sci. USA 86: 3145
59. Pannuti, A., Lanfrancone, L., Pascucci, A., Pelied, P. G., La Mantia, G., Lania, L. 1988. Nucleic Acids Res. 16: 4227
60. Parkhurst, S. M., Harrison, D. A., Remington, M. P., Spana, C., Kelley, R. L., et al. 1988. Genes Dev. 2: 1205
61. Parraga, G., Horvath, S. J., Eisen, A., Taylor, W. E., Hood, L., et al. 1988. Science 241: 1489
62. Pays, E., Murphy, N. B. 1987. J. Mol. Biol. 197: 147
63. Picard, B., Wignez, M. 1979. Proc. Natl. Acad. Sci. USA 76: 241
64. Rhodes, D., Klug, A. 1986. Cell 46: 123
65. Rosenberg, U. B., Schroder, C., Preiss, A., Kienlin, A., Cote, S., et al. 1986. Nature 319: 336
66. Ruis i Altaba, A., Perry-O'Keefe, H., Melton, D. A. 1987. EMBO J. 6: 3065
67. Ruppert, J. M., Kinzler, K. W., Wong, A., Bigner, S. H., Kao, F. T., et al. 1988. Mol. Cell. Biol. 8: 3104
68. Schiff, L. A., Nibert, M. L., Co, M. S., Brown, E. G., Fields, B. F. 1988. Mol. Cell. Biol. 8: 273
69. Schneider-Gädicke, A., Beer-Romero, P., Brown, L. G., Nussbaum, R., Page, D. C. 1989. Cell 57: 1247
70. Schroth, G. P., Cook, G. R., Bradbury, E. M., Gottesfeld, J. M. 1989. Nature 340: 487
71. Schuh, R., Aicher, W., Gaul, U., Cote, S., Preiss, A., et al. 1986. Cell 47: 1025
72. Smith, D. R., Jackson, I. J., Brown, D. D. 1984. Cell 37: 645
73. Stanojevic, D., Hoey, T., Levine, M. 1989. Nature 341: 331
74. Stillman, D. J., Bankier, A. T., Seddon, A., Groenhout, E. G., Nasmyth, K. A. 1988. EMBO J. 7: 485
75. Sukhatme, V. P., Cao, X., Chang, L. C., Tsai-Morris, C. H., Stemenkovich, D., et al. 1988. Cell 53: 37
76. Tautz, D., Lehman, R., Schnürch, R., Schuh, R., Seifert, E., et al. 1987. Nature 327: 383
77. Treisman, J., Desplan, C. 1989. Nature 341: 335
78. Tsai-Morris, C. H., Cao, X., Sukhatme, V. P. 1988. Nucleic Acids Res. 16: 8835
79. Tso, J. Y., Van den Berg, D. J., Korn, L. J. 1986. Nucleic Acids Res. 14: 2187
80. Vincent, A., Colot, H. V., Rosbash, M. 1985. J. Mol. Biol. 186: 149
81. Vrana, K. E., Churchill, M. E. A., Tullius, T. D., Brown, D. D. 1988. Mol. Cell. Biol. 8: 1684
