Cell, Vol. 14, 641-653, July 1978, Copyright © 1978 by MIT
Organization of Coding and Intervening Sequences in the Chicken Ovalbumin Split Gene

J. L. Mandel, R. Breathnach, P. Gerlinger, M. Le Meur, F. Gannon and P. Chambon

Laboratoire de Genetique Moleculaire des Eucaryotes du CNRS et U 44 de l’INSERM, Faculte de Medecine, Strasbourg, France

Summary

The interruptions in the chicken ovalbumin gene which were reported previously (Breathnach, Mandel and Chambon, 1977) are shown to be due to the presence of intervening sequences which separate the messenger-coding sequences. We present evidence for an additional interruption of the gene, which, together with those reported earlier and by Garapin et al. (1978b), makes a total of six intervening sequences. All of these intervening sequences are located in the DNA region that corresponds to the part of the ov mRNA which codes for amino acids. The seven coding fragments of the split ovalbumin gene are arranged in the same order and relative orientation as in the ovalbumin double-stranded cDNA. All the sequences coding for ov mRNA are contained in a chromosomal DNA region of 6000 bp, which is more than 3 times longer than ov mRNA. The general organization of the ovalbumin split gene is discussed.

Introduction

We have previously shown (Breathnach et al., 1977) that the ovalbumin gene is split in chicken DNA, regardless of the transcriptional status of the tissue investigated [for a summary of other examples of split genes discovered recently, see the Introduction to Garapin et al. (1978b)]. Our results allowed us to define three regions of the ovalbumin mRNA (ov mRNA), I, II and III (Figures 1a and 1b; Breathnach et al., 1977), which are encoded by DNA sequences present in four Eco RI fragments of the chromosomal DNA. We demonstrated that regions I and III, corresponding to the 3’ and 5’ regions of ov mRNA, respectively, were coded for by sequences present in Eco RI fragments “a” and “b,” respectively. We then assumed that region II was coded for by DNA sequences contained in fragments “c” and “d,” but were unable to determine the distribution within these two fragments of the sequences coding for region II. Neither did we map the relative positions and transcriptional orientations of the coding sequences carried on the four Eco RI fragments. We were thus unable to distinguish between two possible models for the organization of the ovalbumin gene (Breathnach et al., 1977).

This paper describes experiments which demonstrate that Eco RI fragments “c” and “d” each code for the entire region II due to the existence of different alleles of the ovalbumin gene. In addition, we show that there is a further intervening sequence in the chromosomal DNA sequences which code for this region of ov mRNA. We also demonstrate that the various coding sequences of the ovalbumin split gene are arranged in the same order and relative orientation in the chromosomal DNA as in ovalbumin double-stranded cDNA (ov ds-cDNA). All these results, together with those presented by Garapin et al. (1978b), lead to a unique model for the structural organization of the ovalbumin split gene, which is discussed below.

Results

Origin, Location and Orientation of Ovalbumin Eco RI Fragments “c” and “d”

We have shown previously (Breathnach et al., 1977) that four fragments (a, b, c and d; see Figure 3, lane 1) capable of hybridizing to a double-stranded ovalbumin cDNA probe (ov ds-cDNA, Hhaov; see Figure 1b) are present in an Eco RI digest of chicken DNA. Eco RI fragment “a” [9.2 kilobase pairs (kb)] contains sequences (located to the right of the Hae III site of the ov ds-cDNA) which code for the 3’ half of the mRNA (region I; Figures 1a and 1b; Breathnach et al., 1977). Eco RI fragment “b” contains sequences to the left of the Pst I site which code for the 5’ quarter of the mRNA (region III; Figures 1a and 1b). We assumed from the location of the interruptions which were detected in the ovalbumin mRNA coding sequence that one or both of the remaining Eco RI fragments “c” and “d” [1750 and 1250 base pairs (bp), respectively] contains the sequences which map between the Hae III and Pst I sites of the ov ds-cDNA (region II; Figures 1a and 1b).
Our assumption was confirmed by the finding that the Eco RI fragments “c” and “d” hybridize to the probe Hae-Pst (Figure 1c and Figure 2, lane 4), but not to the probe Hae A, which hybridizes only to the Eco RI fragment “a” (Figure 2, lane 2). Since it has been shown previously that the probe Pst B does not hybridize to the fragments “c” and “d” (Breathnach et al., 1977), these fragments must contain the sequences of the ds-cDNA which lie between the Hae III and Pst I sites of the ov ds-cDNA (Figure 1b). The probes Hae A (Figure 2, lane 3) and Hae-Pst (Figure 2, lane 5), however, hybridize to the same 4.6 kb Hind III fragment “a” of the cellular DNA. These results indicate that the 4.6 kb Hind III fragment “a” contains the ovalbumin gene


Figures 1a-1c. Maps of Ovalbumin mRNA and of the Cloned Double-Stranded cDNA

Ovalbumin mRNA (a) is aligned with the corresponding sequences in the Hhaov fragment which contains the cloned ovalbumin ds-cDNA (b) (see also legend to Figure 2 in Breathnach et al., 1977). Restriction enzyme sites were mapped either by conventional methods or after end-labeling of the Hhaov, as described by Smith and Birnstiel (1976) (for Hinf I and Sau 3AI sites). Positions of these sites are given with respect to the 3’ end of the mRNA, excluding the poly(A) tail. For enzymes with multiple sites, each site is given a number, starting from the 3’ end. Eco, Hinf, Hph and Sau denote Eco RI, Hinf I, Hph I and Sau 3AI, respectively. The arrows indicate the locations of the two interruptions in the mRNA coding sequence which define regions I, II and III (see the text; also Breathnach et al., 1977). Hhaov fragments which were nick-translated and used as specific hybridization probes are shown in (c).

Figure 1d. Map of Restriction Sites around the Sequences Coding for Regions I and II of the ov mRNA

Locations of restriction enzyme sites Eco 1 and 2, Pst 1 and 2, and Hind 1 and 2 are from Breathnach et al. (1977). The remaining sites and the coding regions (heavy lines) were positioned as described in the text. Hind and Pst denote Hind III and Pst I.

Figure 1e. The Eco RI and Hind III Fragments Corresponding to the Map in (d)

The sizes of these fragments (in kb) for Eco RI “a,” “c” and “d” are 9.2, 1.75 and 1.25, respectively; for Hind “a,” the size is 4.6. Some of these values have been slightly modified from our previous results (Breathnach et al., 1977) to take into account results of digestion of these fragments with other restriction nucleases.

sequences present in Eco RI fragment “a” and the gene sequences present in either or both of fragments “c” and “d.” In addition, these latter sequences must be located to the left of the Eco RI site 2, that is, outside of Eco RI fragment “a,” but within the 2.0 kb region defined by the sites Hind 2 and Eco 2 of Figure 1d. To determine which Eco RI fragments are contained within the 4.6 kb Hind III fragment “a,” cellular DNA was digested with Hind III and electrophoresed on a 0.8% agarose gel, and bands corresponding to different size classes of DNA were excised. DNA was extracted, cut with Eco RI, reelectrophoresed and analyzed by the blotting method of Southern (1975). Unexpectedly, we found that Eco RI digestion of the band containing the 4.6 kb Hind III fragment “a” (Figure 3, lane 2) yields both Eco RI fragments “c” and “d,” in addition to the 2.6 kb Eco 2-Hind 1 fragment (Figure 1d and Figure 3, lane 3, a’) predicted from our earlier results (Breathnach et al., 1977). Since the sum of these three fragments (5.6 kb) is significantly larger than the 4.6 kb starting Hind III fragment “a,” the possibility exists that the 4.6 kb fragment could contain either Eco RI fragment “c” or “d,” reflecting the existence of two different alleles of the ovalbumin gene. This interpretation was confirmed by the observation that the DNA of some chickens (either laying hens or non-) lacks either the fragment “c” or “d,” while preserving the normal Hind III digestion pattern (Figure 4). Thus Eco RI digestion of the DNA of heterozygotes


Figure 2. Detection of Fragments Containing the Sequences Complementary to Regions I and II of the mRNA

Erythrocyte DNA was digested with Eco RI (lanes 2 and 4) or with Hind III (lanes 3 and 5), electrophoresed on a 1% agarose gel, transferred to nitrocellulose filters (see Experimental Procedures) and hybridized to the 32P-labeled probes Hae A (lanes 1-3) or Hae-Pst (lanes 4 and 5) (see Figure 1c). The probes were prepared as described in Experimental Procedures. Internal hybridization markers (for molecular weight determination) are shown in lane 1. Their sizes (in kb) are (a) 13.5, (b) 6.3, (c) 6.5, (e) 2.4 (for details, see Figure 1 in Breathnach et al., 1977). The sizes of the Eco RI and Hind III fragments are given in the legend to Figure 1.

yields both Eco RI fragments “c” and “d,” while the DNA of homozygotes yields either fragment “c” or “d.” From these results, we conclude that the Hind III 4.6 kb fragment “a” (Figure 1e) contains ovalbumin sequences which lie adjacent to each other in the ovalbumin messenger, and that there are two alleles for the ovalbumin gene which differ by the presence or absence of an Eco RI site in the Eco 2-Hind 2 segment (Figure 1d). We have previously shown that the orientation of transcription of the ovalbumin-coding sequences within the Eco RI fragment “a” is from Eco RI site 2 toward the site Hind 1 (see Figure 1; Garapin et al., 1978a). To determine whether the orientation of the sequences coding for ov mRNA in fragment “c” (and “d”) is similar, we have separated these fragments by reverse-phase chromatography (RPC) and mapped the positions of different restriction enzyme sites. Pst I digestion of the 1.75 kb Eco RI fragment “c” yields a 1.65 kb fragment (see Figure 4, lane 5 in Breathnach et al., 1977). Since there is

Figure 3. Identification of Eco RI Fragments Present in Hind III Fragment “a”

150 µg of erythrocyte DNA were digested with Hind III and electrophoresed on a 0.8% agarose gel (17 cm long, 4 mm thick) with molecular weight markers run in parallel. After electrophoresis for 16 hr at 40 V, 1 cm wide strips were excised, and the DNA was extracted (see Experimental Procedures for details). Half of each DNA fraction was digested with Eco RI in the presence of an internal adenovirus type 2 DNA marker to monitor the reaction. Undigested and Eco RI-digested DNA from each strip was electrophoresed side by side on a 1% agarose gel, and hybridized to 32P-labeled Hhaov after transfer to nitrocellulose filters as described in Experimental Procedures. Only the relevant part of the filter is shown here. Lane 1: Eco RI digest of erythrocyte DNA. Lane 2: purified Hind III fragment “a.” Lane 3: Eco RI digest of purified Hind III fragment “a.” The sizes of the fragments (in kb) are (a’) 2.6, (c) 1.75, (d) 1.25.

a Pst I site (Pst 2) at about 400 bp from the Hind 2 site (Figure 1d), this places an Eco RI site (Eco 3) at 100 bp to the left of the Pst 2 site (Figure 1d). The other Eco RI site would then be located at 1.75 kb to the right of Eco 3, and is thus identical to Eco RI site 2 (Figure 1d). The 1.75 kb Eco RI fragment “c” is cut by Hae III into a nonhybridizing region of 600 bp and a 1.15 kb hybridizing fragment (Figure 5, lane 5). The latter is cut by Pst I to yield a 1.05 kb fragment (Figure 5, lane 3). This unambiguously maps the Hae III site in the Eco RI fragment “c” (Figure 1d). The 1.25 kb Eco RI fragment “d” is not cut by Pst I (see Figure 4, lane 5 in Breathnach et al., 1977), but is cut by Hae III to yield a 650 bp hybridizing fragment (Figure 5, lane 2). The length of the


Figure 4. Hind III and Eco RI Digestion Patterns of Ovalbumin Gene Sequences in DNA from Different Chickens

Erythrocyte DNA from three chickens was digested with Hind III (lanes 1, 3, 5) or Eco RI (lanes 2, 4, 6), electrophoresed in a 1% agarose gel and analyzed by hybridization with 32P-labeled Hhaov after transfer to nitrocellulose filters. Lanes 1 and 2: DNA from chicken 1. Lanes 3 and 4: DNA from chicken 2. Lanes 5 and 6: DNA from chicken 3. Lane 7: molecular weight markers; see the legend to Figure 2.

region which does not hybridize (600 bp) is the same as that for the Eco RI fragment “c.” This indicates that the restriction enzyme sites Eco 2 and Hae III are common to both fragments “c” and “d.” The other extremity of Eco RI fragment “d,” Eco 3’, is then situated as indicated on the map in Figure 1d, and Eco 3’ is the site which is absent in Eco RI fragment “c.” We have demonstrated previously that the ov mRNA coding sequences contained in Eco RI fragment “a” (region I, Figure 1a) are interrupted close to the Hae III site and do not include it (Breathnach et al., 1977). The Hae III site must therefore be located toward the 3’ end of the coding sequence present in Eco RI fragments “c” and “d.” Since the above results demonstrate that the bulk of the ov mRNA coding sequences of fragments “c” and “d” are located to the left of the Hae III site (Figure 1d), it follows that the orientation of transcription of the coding regions in fragments “c” and “d” is from left to right, as it is for those regions present in Eco RI fragment “a” (see above and Figure 1).
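The two-allele interpretation given earlier in this section can be sketched as a small calculation. The following is only an illustration under stated assumptions: the fragment sizes are those quoted in the text and in the legend to Figure 3, while the function names and the 0.3 kb sizing tolerance are hypothetical choices made for the example.

```python
# Fragment sizes (kb) as quoted in the text and the legend to Figure 3.
HIND_A = 4.6                          # Hind III fragment "a"
ECO2_HIND1 = 2.6                      # Eco 2-Hind 1 piece (band a')
ALLELE_BAND = {"c": 1.75, "d": 1.25}  # Eco RI fragment carried by each allele


def eco_bands(genotype):
    """Hybridizing Eco RI bands expected from Hind III fragment "a".

    genotype is a pair of allele names, e.g. ("c", "d") for a heterozygote.
    """
    bands = {ECO2_HIND1}
    bands.update(ALLELE_BAND[allele] for allele in genotype)
    return sorted(bands, reverse=True)


def fits_single_fragment(bands, tolerance=0.3):
    """True if all bands could derive from one 4.6 kb molecule.

    The tolerance (kb) is an assumed allowance for gel sizing error.
    """
    return sum(bands) <= HIND_A + tolerance


for genotype in [("c", "c"), ("c", "d"), ("d", "d")]:
    bands = eco_bands(genotype)
    print(genotype, bands, fits_single_fragment(bands))
```

Only the heterozygote pattern (2.6, 1.75 and 1.25 kb, summing to 5.6 kb) exceeds the 4.6 kb parent fragment, which is the observation that prompted the allele hypothesis.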

Figure 5. Mapping of the Hae III and Xba I Sites in Eco RI Fragments “c” and “d”

Eco RI fragments “c” and “d,” purified by reverse-phase chromatography (see Experimental Procedures), were digested with various restriction enzymes and hybridized to the probe Hhaov after electrophoresis on a 1.8% agarose gel and transfer to nitrocellulose filters. Lanes 1, 2 and 8: Eco RI fragment “d.” Lane 1, undigested; lane 2, digested with Hae III (band size 650 bp); lane 8, digested with Xba I [band sizes (in bp) are (a) 1000, (c) 250]. Lanes 3, 4, 5 and 7: Eco RI fragment “c.” Lane 3, digested with Hae III and Pst I (band size 1050 bp); lane 4, undigested; lane 5, digested with Hae III (band size 1150 bp); lane 7, digested with Xba I [band sizes (in bp) are (a) 1000, (b) 750]. Lane 6: molecular weight markers (Hae III digest of Hhaov) [band sizes (in bp) are (a) 1420, (b) 710, (c) 240].

An Additional Intervening Sequence Is Present in Eco RI Fragments “c” and “d”

Evidence for an additional intervening sequence in the ovalbumin gene was provided by digesting the purified Eco RI fragments “c” and “d” with Xba I restriction enzyme. Eco RI fragment “c” yields two fragments of 1000 and 750 bp in length, which hybridize with equal intensity (Figure 5, lane 7), whereas the Eco RI fragment “d” yields the same 1000 bp fragment and a fragment of 250 bp (Figure 5, lane 8). These results locate this Xba I site at 1000 bp to the left of Eco RI site 2 and at 750 bp to the right of Eco RI site 3, and thus at about 400 bp to the left of the Hae III site (see Figure 1d). Since there is no Xba I site in this region of the ov ds-cDNA (Figure 1b), and since the ovalbumin coding


region of Eco RI fragment “c” appears to be equally distributed on both sides of the Xba I site, we conclude that the coding sequences present in the Eco RI fragments “c” and “d” are split at least once, as shown in Figure 1.
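The band sizes quoted for the Hae III, Pst I and Xba I digests of fragments “c” and “d” can be cross-checked against a single set of site coordinates. The sketch below is a hypothetical reconstruction: the positions are inferred here from the fragment lengths given in the text (measured leftward from Eco RI site 2), and the names `SITES`, `FRAGMENTS` and `digest` are introduced only for this illustration.

```python
# Site positions in bp, measured leftward from Eco RI site 2 (position 0).
# These coordinates are inferred from the band sizes quoted in the text;
# they are illustrative assumptions, not published map values.
SITES = {
    "Eco 2": 0,
    "Hae III": 600,
    "Xba I": 1000,
    "Eco 3'": 1250,  # present only in the allele that gives fragment "d"
    "Pst 2": 1650,
    "Eco 3": 1750,
}

# Each Eco RI fragment as a (start, end) interval on the same axis.
FRAGMENTS = {
    "c": (SITES["Eco 2"], SITES["Eco 3"]),
    "d": (SITES["Eco 2"], SITES["Eco 3'"]),
}

# Cut positions of the secondary enzymes within this region.
ENZYME_SITES = {
    "Hae III": [SITES["Hae III"]],
    "Xba I": [SITES["Xba I"]],
    "Pst I": [SITES["Pst 2"]],
}


def digest(fragment, enzymes):
    """Return fragment sizes (bp, largest first) after cutting with enzymes."""
    start, end = FRAGMENTS[fragment]
    cuts = sorted(
        pos
        for enzyme in enzymes
        for pos in ENZYME_SITES[enzyme]
        if start < pos < end
    )
    edges = [start] + cuts + [end]
    return sorted((b - a for a, b in zip(edges, edges[1:])), reverse=True)


print(digest("c", ["Xba I"]))             # fragment "c" cut by Xba I
print(digest("d", ["Xba I"]))             # fragment "d" cut by Xba I
print(digest("c", ["Hae III", "Pst I"]))  # double digest of fragment "c"
```

With these assumed coordinates, every digest described in this section (for example 1000 and 750 bp for “c” with Xba I, or an uncut 1250 bp fragment “d” with Pst I) falls out of one consistent map.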

An 18 kb Bam HI Fragment Contains All of the Coding Sequences of the Ovalbumin Gene

We have shown above that the sequences which code for the first 1400 nucleotides of ov mRNA starting from the 3’ end (regions I and II; see Figure 1a) are all found in a DNA segment between the site Hind 1 and Hind 2 (see Figures 1d and 8f). The next problem was to locate, with respect to this segment, the sequences coding for region III of the ov mRNA (see Figure 1a). We thus looked for a restriction enzyme which would produce a large digestion fragment hybridizing to probes which correspond to both the 3’ and the 5’ moieties of ovalbumin mRNA. We found that Bam HI cut oviduct (Figure 6) or erythrocyte (Figures 9a and 9b) DNA to give two hybridizing fragments 18 and 2.4 kb long. The 18 kb band hybridizes equally well (Figure 6, lanes 2 and 4) to probes specific for the 3’ and 5’ halves of ovalbumin mRNA (Hae A and Hae B, respectively; Figure 1c), and contains all of the ovalbumin split gene as shown below. To localize the Bam HI sites with respect to the known Eco RI or Hind III sites in the split ovalbumin gene (see Breathnach et al., 1977), we have performed double digests of cellular DNA with Bam HI and either Eco RI or Hind III. The 9.2 kb Eco RI fragment “a” is cut by Bam HI, yielding a 4.7 kb hybridizing fragment (Figure 7, lane 2, a’); thus there is a Bam HI site (Bam 1) at 2.3 kb beyond the sequence which corresponds to the 3’ end of the ovalbumin mRNA (Figure 8). This places the other Bam site (Bam 2) at 13 kb from the Eco RI site 2, in the 5’ direction (Figure 8f). Eco RI fragments “b,”

Figure 6. Detection of Ovalbumin Gene Sequences in Oviduct DNA Cleaved with Bam HI

Oviduct DNA was digested with Bam HI, electrophoresed on a 0.7% agarose gel, transferred to a nitrocellulose filter and hybridized to the 32P-labeled probe Hae A (lanes 1 and 2) or Hae B (lanes 3 and 4). Lanes 1 and 3: internal markers; see the legend to Figure 2. Lanes 2 and 4: Bam HI digest [band sizes (in kb) are (a) 18.0, (b) 2.4].

Figure 7. Identification of the Eco RI or Hind III Fragments Present in Bam HI Fragment “a”

DNA digests were separated on a 1% agarose gel and hybridized to the 32P-labeled probe Hhaov after transfer to nitrocellulose filters. Lane 1: internal markers; see the legend to Figure 2. Lanes 2 and 3: a Bam HI digest of chicken DNA was redigested with Eco RI (lane 2) or Hind III (lane 3) [band sizes (in kb) are lane 2: (a’) 4.7, (b) 2.35, (c) 1.75, (d) 1.25; lane 3: (a) 4.6, (b) 3.2]. Lanes 4 and 5: digest of erythrocyte DNA with Eco RI (lane 4) or Hind III (lane 5); sizes of the fragments are as in the legend to Figures 8b and 8c. Lanes 6 and 7: Bam HI fragment “a,” purified by reverse-phase chromatography (see Figure 9), was digested with Eco RI (lane 6) or Hind III (lane 7). The sizes of the fragments are the same as for lanes 2 and 3, respectively.
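The placement of the two Bam HI sites relative to Eco RI site 2 follows from simple arithmetic on the fragment sizes given in the text and in the legends above. The sketch below makes that arithmetic explicit; it assumes that the 4.7 kb hybridizing piece of Eco RI fragment “a” runs from Eco RI site 2 to Bam 1, and the function name `bam_site_distances` is hypothetical.

```python
def bam_site_distances(bam_fragment_kb, eco_a_kb, eco_a_after_bam_kb):
    """Derive distances (kb) of the Bam HI sites from Eco RI site 2.

    Assumes the hybridizing piece of Eco RI fragment "a" left after Bam HI
    digestion spans Eco RI site 2 to Bam 1.
    """
    eco2_to_bam1 = eco_a_after_bam_kb
    # The rest of the large Bam HI fragment lies on the 5' side of Eco 2.
    eco2_to_bam2 = bam_fragment_kb - eco2_to_bam1
    # Portion of Eco RI fragment "a" lying beyond Bam 1.
    eco_a_beyond_bam1 = eco_a_kb - eco_a_after_bam_kb
    return {
        "Eco 2 to Bam 1": round(eco2_to_bam1, 2),
        "Eco 2 to Bam 2": round(eco2_to_bam2, 2),
        "Eco RI a beyond Bam 1": round(eco_a_beyond_bam1, 2),
    }


distances = bam_site_distances(18.0, 9.2, 4.7)
print(distances)
```

The computed Eco 2 to Bam 2 distance of 13.3 kb agrees, within gel sizing error, with the value of about 13 kb stated in the text.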


Figure 8. Organization of the Sequences Coding for ov mRNA

(a) Scale (in kb) for (b-g). (b-e) Definition of restriction enzyme fragments hybridizing to the Hhaov probe. The sizes of these fragments (in kb) for (b) Eco RI “a,” “b” and “c” are 9.2, 2.35 and 1.75, respectively; for (c) Hind III “a” and “b” sizes are 4.6 and 3.2, respectively; for (d) Bam HI “a” the size is 18.0; for (e) Kpn I “a” and “b” sizes are 11.5 and 7.0, respectively. (f) Map of restriction sites around the sequences which code for regions I and II of the ov mRNA. Coding sequences are represented by heavy lines. Bam and Kpn denote Bam HI and Kpn I. (g) Map of restriction sites around the sequences which code for region III of the mRNA. Locations of the coding sequences, as represented by heavy lines, are taken from Garapin et al. (1978b). (h) Map of ovalbumin mRNA. Positions of the restriction sites of ov ds-cDNA are given as in the legend to Figure 1.

“c” and “d,” and Hind III fragments “a” and “b,” are not cut by Bam HI (Figure 7, lanes 2 and 3). For further restriction enzyme analysis, the 18 kb Bam HI fragment was partially purified and separated from the 2.4 kb hybridizing fragment by reverse-phase chromatography (Figures 9a and 9b). The peak containing the 18 kb fragment, which was well separated from the 2.4 kb fragment, was digested with Eco RI or Hind III. This treatment yielded the three intact Eco RI fragments “b,” “c” and “d” and the Bam HI digestion product of Eco RI “a” (Figure 7, lane 6), or the two intact Hind III fragments “a” and “b” (Figure 7, lane 7), respectively. When individual fractions of the RPC column were digested with Eco RI throughout the 18 kb Bam HI fragment peak, the four Eco RI bands appeared simultaneously, giving no indication of heterogeneity (not shown). Unless there are two Bam HI fragments of similar size and behavior in reverse-phase chromatography, the sequence coding for region III of the ov mRNA (contained in Eco RI fragment “b”) must be present in the 18 kb Bam HI fragment, together with the sequences which code for the rest of the mRNA. This conclusion is also supported by the results of similar experiments carried out on the 18 kb Bam HI fragment partially purified by gel electrophoresis (not shown).

All the Ovalbumin Coding Sequences Are Found in a 6 kb DNA Segment in the Same Order and Orientation as in the mRNA

Mapping of the Kpn I and Xba I sites has permitted the localization of region III (see above) in the 18 kb Bam HI fragment. Digestion of cellular DNA with Kpn I yields two fragments which hybridize to the ds-cDNA Hhaov probes (see Figure 1c). A 7 kb fragment (Kpn I fragment “b”; Figure 8e) hybridizes with the probe Hae A (Figure 10, lane 4), while the 11 kb fragment

Figure 9. Reverse-Phase Chromatography of a Bam HI Digest of Chicken DNA

5 mg of Bam HI-digested chicken DNA were processed using a 120 cm x 0.7 cm column as described in Experimental Procedures. (A) Absorbance profile of column eluate. The approximate positions of elution of Bam HI fragments “a” and “b” (see Figure 6) are indicated by the arrows (see below). The elution gradient is represented as a straight line. (B) Detection of Bam HI fragments “a” and “b” in the column eluate. Aliquots of fractions were electrophoresed on agarose gels, transferred to nitrocellulose filters and hybridized to 32P-labeled Hhaov. Fragment “a” elutes within fractions 150-160, and “b” elutes within fractions 115-127. Eco RI fragments “a” (9.2 kb) and “b” (2.35 kb) are also shown (II), as is fragment Bam HI “a” (18 kb) (I) from an unfractionated erythrocyte DNA Bam HI digest. Numbers represent fraction number as shown in (A).

(Kpn I fragment “a”) hybridizes with the Pst B probe (Figure 10, lane 6). These two fragments are contained almost entirely within the 18 kb Bam HI fragment, since very similar fragments are also obtained by Kpn I digestion of RPC-purified 18 kb Bam HI fragment (Figure 11, lanes 1 and 2). The sum of the lengths of the Kpn I fragments is not significantly different from that of the 18 kb Bam HI fragment. Thus the two Kpn I fragments “a” and “b” are either contiguous or separated by a small Kpn I fragment which contains little or no ovalbumin sequence, and would thus be inactive in hybridization reactions. Judging from the size of the hybridizing fragments obtained by an Eco RI/Kpn I double digestion (Figure 11, lane 4), Eco RI fragment “a” is cut at one site (Kpn 1) located very close to the Bam HI site 1 (Figure 8f). A second Kpn I site must lie 7 kb away, close to the Hind III site 2 (Figure 8f). In the 2.4 kb Eco RI fragment “b” defined by the Eco 3 and 4 sites (see Figure 8g), there must be a Kpn I site at about 200 bp from one end (Figure 11, compare lanes 4 and 8). Since the corresponding 3.2 kb Hind III fragment defined by Hind III sites 2 and 3 (see Figure 8g) is not cut by Kpn I (Figure 11, lane 5, b), this unambiguously places Kpn I site 2 very close to the Hind III site 2 (Figure 8g). [We have termed these Eco RI, Kpn I and Hind III sites Eco 3 and 4, Kpn 2 and Hind 2, since we demonstrate below that they are identical to those mapped above (see Figure 8f).] Since the 11 kb Kpn I fragment “a” hybridizes to the Pst B probe (see above) which is specific for region III (located between Hind 2 and Eco 4; see above), the last Kpn I site 3 should be located as shown in Figure 8g. The above results show that the sequences coding for region III of ov mRNA are found close to one extremity of the 11 kb Kpn I fragment “a.” In this region, there are one Hind III site and one Kpn I site very close together, and an Eco RI site within 200 bp to their right (Figure 8g). The same sites are found in very similar relative positions at the 5’ extremity (with respect to transcription) of the other Kpn I fragment “b” (Figure 8f). Since all three enzymes cut the cellular DNA infrequently, our data strongly suggest that the Kpn I fragments “a” and “b” are contiguous. This was proved by digestion with Xba I endonuclease. Digestion of the RPC-purified 18 kb Bam HI DNA fragment with Xba I yields three hybridizing fragments “a,” “b” and “c” of 3.2, 2.75 and 2.5 kb (Figure 12, lane 7). Two Xba I sites are present in Eco RI fragment “a” (Figure 12, lane 2). The position of these two sites (Xba 1 and 2; Figure 13c) has been determined by the sizes of the two hybridizing fragments (1.5 and 4.2 kb) obtained from Eco RI fragment “a.” One of these sites must be located in the coding sequences present in Eco RI fragment “a,” and corresponds to the only Xba I site present in the ov ds-cDNA (see Figure 1b). Thus the

Figure 10. Detection of Ovalbumin Gene Sequences in Erythrocyte DNA Cleaved with Kpn I

Erythrocyte DNA was digested with Kpn I, electrophoresed on a 0.7% agarose gel, transferred to a nitrocellulose filter and hybridized to either 32P-labeled Hhaov (lane 2), Hae A (lane 4) or Pst B (lane 6) (see Figure 1c). Lanes 1, 3 and 5: internal markers; see the legend to Figure 2. Lanes 2, 4 and 6: Kpn I digest. Fragment sizes (in kb) are (a) 11.5, (b) 7.0.

3.2 kb Xba I fragment “a” of the 18 kb Bam HI fragment is located between sites Bam 1 and Xba 2 (Figure 13). One Xba I site is located in Eco RI fragment “c” or “d” at 750 bp to the right of Eco RI site 3 (see above and Figure 1d). Thus there are 2.5 kb between Xba I sites 2 and 3 (Figure 13c), a measurement which corresponds to the 2.5 kb Xba I fragment “c” of the 18 kb Bam HI fragment. It follows that the 2.75 kb Xba I fragment “b” of the 18 kb Bam HI fragment must contain the ov mRNA sequences corresponding to region III of the ovalbumin gene (Figures 13b and 13f). Xba I fragment “b” of the 18 kb Bam HI fragment is cut by Kpn I into two hybridizing fragments of 1.85 and 0.90 kb (Figure 12, lanes 7 and 8). There is an Xba I site at 300 bp from one end of Eco RI fragment “b” (Figure 12, lanes 1 and 3). Together with the information that the Xba I fragment “b” is cut by Kpn I to give two hybridizing fragments of 1.85 and 0.9 kb (see above), this allows positioning of the Xba I sites around Eco RI fragment “b” (defined by Eco RI sites 3 and 4) as shown in Figure 13d. One of these Xba I sites therefore appears to

Figure 11. Location of the Kpn I Sites around Ovalbumin Gene Sequences in Chicken DNA

DNA digests were electrophoresed on a 1% agarose gel and hybridized to the 32P-Hhaov probe. Lane 1: Kpn I digest of total DNA; for fragment sizes, see Figure 10. Lane 2: Kpn I digest of the Bam HI fragment “a” purified by reverse-phase chromatography; fragment sizes (in kb) are (a) 11.0, (b) 6.7. Lanes 3 and 7: internal markers; see the legend to Figure 2. Lanes 4, 5 and 6: double digests of cellular DNA with Kpn I and Eco RI (lane 4), Hind III (lane 5) or Pst I (lane 6); fragment sizes (in kb) are lane 4: (a’) 5.0, (b’) 2.15; lane 5: see the legend to Figure 8c; lane 6: (a) 4.5. Lane 8: Eco RI digest of total DNA; for fragment sizes, see the legend to Figure 8b.
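The argument that Kpn I fragments “a” and “b” together account for the 18 kb Bam HI fragment rests on comparing summed fragment lengths with the parent length. A minimal sketch of that comparison follows; the sizes are those given in the legends to Figures 10 and 11, whereas the function names and the 1.0 kb tolerance are assumptions made for the illustration.

```python
def unaccounted_length(fragments_kb, parent_kb):
    """Parent length minus the summed fragment lengths (kb)."""
    return round(parent_kb - sum(fragments_kb), 2)


def plausibly_contiguous(fragments_kb, parent_kb, tolerance_kb=1.0):
    """True if the fragments account for the parent within the tolerance.

    The tolerance is an assumed allowance for gel sizing error; a small
    non-hybridizing fragment below this size would not be detected here.
    """
    return abs(unaccounted_length(fragments_kb, parent_kb)) <= tolerance_kb


BAM_A = 18.0
kpn_total_dna = [11.5, 7.0]   # Kpn I digest of total DNA (Figure 10)
kpn_purified = [11.0, 6.7]    # Kpn I digest of purified Bam HI "a" (Figure 11)

print(unaccounted_length(kpn_total_dna, BAM_A),
      plausibly_contiguous(kpn_total_dna, BAM_A))
print(unaccounted_length(kpn_purified, BAM_A),
      plausibly_contiguous(kpn_purified, BAM_A))
```

A check of this kind cannot by itself exclude a short intervening Kpn I fragment, which is why the text turns to the Xba I digests for proof of contiguity.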

be identical to the Xba I site 3, which was previously located at 750 bp to the right of Eco RI site 3. In fact, we must postulate such an identity to account for the production of the 1.85 and 0.9 kb hybridizing fragments obtained by digesting the Xba I fragment “b” with Kpn I. The general orientation of the coding sequences present in Eco RI fragment “b” was established as follows. Sequence analysis (R. Breathnach and C. Benoist, unpublished data) has shown that the Pst I site present in the ov ds-cDNA shares 3 nucleotides of its recognition sequence with a Pvu II site (see Figure 1b). Digestion of the RPC-purified 2.4 kb Eco RI fragment “b” with Pst I, Pvu II or a combination of both yields, in all three cases, indistinguishable fragments of 2 kb (not shown). This result establishes that the Pst I site 3 previously mapped in Eco

Figure 12. Mapping of Xba I Sites within the Eco RI Fragments “a,” “b,” “c” and “d” and the Bam HI Fragment “a”

Fragments were purified by reverse-phase chromatography (see Experimental Procedures and Figure 9 for Bam HI fragment “a”) and digested with Xba I; the digests were analyzed as described in the legend to Figure 4. Lane 1: Eco RI digest of cellular DNA (for fragment sizes, see the legend to Figure 8b). Lanes 2-5: Xba I digest of Eco RI fragments “a,” “b,” “c” and “d,” respectively; fragment sizes (in kb) are lane 2: (a) 4.2, (b) 1.5; lane 3: (a) 2.0; lane 4: (a) 1.0, (b) 0.75; lane 5: (a) 1.0. Lane 6: internal markers; see the legend to Figure 2. Lane 7: Xba I digest of the Bam HI fragment “a”; fragment sizes (in kb) are (a) 3.2, (b) 2.75, (c) 2.5. Lane 8: Xba I plus Kpn I digest of the Bam HI fragment “a”; fragment sizes (in kb) are (a) 3.2, (c) 2.5, (d) 1.85, (e) 0.9. Lane 9: internal markers; band sizes (in kb) are (a) 1.42, (b) 0.71 (see the legend to Figure 5).
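The Xba I, Kpn I and Eco RI fragment sizes listed in this legend and in the text constrain the relative positions of the sites. One way to see that they are mutually consistent is to place the sites on a single axis and compare each predicted distance with the observed band. The coordinates below are hypothetical values chosen here to satisfy the stated distances (Eco RI site 2 at 0, positive toward the 3’ end); the site label “Xba 4” and the helper `mismatches` are likewise assumptions for the illustration.

```python
# Assumed site coordinates in kb (Eco RI site 2 = 0, positive toward 3' end).
# Chosen to match distances stated in the text; not published map values.
SITES = {
    "Bam 1": 4.7,
    "Xba 2": 1.5,
    "Eco 2": 0.0,
    "Xba 3": -1.0,
    "Eco 3": -1.75,
    "Kpn 2": -1.9,
    "Xba 4": -3.75,
    "Eco 4": -4.1,
}

# (left site, right site, observed fragment size in kb)
OBSERVED = [
    ("Xba 2", "Bam 1", 3.2),   # Xba I fragment "a" of Bam HI "a"
    ("Xba 3", "Xba 2", 2.5),   # Xba I fragment "c"
    ("Xba 4", "Xba 3", 2.75),  # Xba I fragment "b"
    ("Kpn 2", "Xba 3", 0.9),   # Xba I "b" cut by Kpn I
    ("Xba 4", "Kpn 2", 1.85),  # Xba I "b" cut by Kpn I
    ("Xba 4", "Eco 3", 2.0),   # Xba I digest of Eco RI "b"
    ("Eco 4", "Eco 3", 2.35),  # Eco RI fragment "b"
]


def mismatches(sites, observed, tolerance_kb=0.05):
    """Return the observations the assumed map fails to reproduce."""
    failed = []
    for left, right, size_kb in observed:
        predicted = sites[right] - sites[left]
        if abs(predicted - size_kb) > tolerance_kb:
            failed.append((left, right, size_kb, round(predicted, 2)))
    return failed


print(mismatches(SITES, OBSERVED))
```

An empty result means that a single linear arrangement reproduces every listed fragment, which is the sense in which the Xba I data support contiguity of the two Kpn I fragments.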

RI fragment “b” (Breathnach et al., 1977) is the same as the Pst I site present in ov ds-cDNA (see Figures 8g, 13e and 130. Since almost all the sequences which hybridize with Eco RI fragment “b” are on the 5’side of the Pst I site of the dscDNA (see Figure 13e; Breathnach et al., 1977), they are in the same orientation with respect to transcription as the other Eco RI fragments “a” and “c”(see Figures 1% and 13e). Discussion Our present knowledge (Breathnach et al., 1977; Garapin et al., 1978a, 1978b) of the structural organization of the ovalbumin split gene is summarized in Figure 13e, in the form of a restriction

map of the sequences in and surrounding the split ovalbumin gene. We use the terms “coding” and “intervening” (Tilghman et al., 1978a) sequences to describe the regions of the chromosomal DNA which code for the ovalbumin messenger RNA and the regions which separate them, respectively [exons and introns of Gilbert (1978)]. The sequences which code for the ovalbumin messenger are interrupted in six places by intervening sequences ranging in length from 0.2-l .6 kb. It has been demonstrated by electron microscopic studies of hybrids between ov mRNA and relevant cloned Eco RI DNA fragments that there are no further major interruptions in the coding sequences contained in Eco RI fragments “a” and “b” (Garapin et al., 1978a, 1978b). The absence of additional interruptions of the coding sequences between positions 180 and 1040 of the ov mRNA( Figures la and lb) was also confirmed by the finding that the Hinf I-Sau 3AI and Hph I-Sau 3AI restriction fragments of chromosomal DNA have the same lengths as the corresponding fragments of ov ds-cDNA (our unpublished results). We cannot exclude at present the possibility that there are further interruptions in Eco RI fragment “c,” the cloning of which is under way. The existence of two different alleles for the ovalbumin gene is indicated by the absence of either Eco RI fragment “c” or “d” in some chickens. A similar observation has been made by Weinstock et al. (1978). It is interesting to note that our results demonstrate that the variation responsible for this polymorphism lies in an intervening region, and does not seem to affect the expression of the ovalbumin gene. A similar variation in another intervening sequence is also suggested by results presented by Garapin et al. (197813). Split genes seem to be a common occurrence in the eucaryotic kingdom (for references, see the Introduction to Garapin et al., 1978b). 
In the case of the ovalbumin gene, all six interruptions are located in the protein-coding sequence, since the coding region of ov mRNA begins at ~85 nucleotides from the 5’ end and extends to within ~600 nucleotides of the 3’ poly(A) sequence (G. Brownlee, personal communication). As pointed out by Garapin et al. (1978b), however, our results do not provide evidence for a 150-200 nucleotide long “leader” sequence type of arrangement at the 5’ end of ov mRNA, as is found for some adenovirus and SV40 mRNAs. It is clear that the principle of colinearity between DNA-coding sequences of the gene and the amino acid sequence of the polypeptide product, established by work on procaryotes, has been transgressed in eucaryotic systems. The split gene arrangement poses some evident problems for mRNA

Figure 13. Location of Xba I Sites around Ovalbumin mRNA Coding Sequences and General Organization of the Ovalbumin Gene

(a) Scale (in kb) for (b-e). (b) Definition of the Eco RI and Xba I fragments obtained from Bam HI fragment “a” which hybridize to the probe Hhaov. (c) Map of Xba I sites around the sequences which code for regions I and II of ov mRNA. (d) Map of Xba I sites around the sequences which code for region III of ov mRNA. (e) General organization of the ovalbumin gene and correspondence with the map of ovalbumin mRNA (f). Coding sequences are represented by the heavy lines. Dotted lines linking (e) and (f) indicate the correspondence between restriction sites in the cellular DNA and in ov ds-cDNA. Continuous lines linking (e) and (f) show the correspondence between the boundaries of regions I, II and III in the mRNA and the corresponding coding sequences in the cellular DNA.

synthesis with no known counterparts in procaryotic cells (the fact that no split genes have been demonstrated in procaryotes does not mean, of course, that they do not exist). New models have been proposed (Berget, Moore and Sharp, 1977; Klessig, 1977; for other references, see Chambon, 1977) to account for the synthesis of a single continuous mRNA from a split gene. The polymerase may transcribe only the messenger-coding regions, jumping over the looped-out intervening sequences. Alternatively, the messenger-coding sequences may be transcribed separately and joined later in an intermolecular ligation event. Finally, the primary transcript may be a precursor of the mature mRNA, and may contain sequences of both the messenger-coding regions and the intervening sequences. Excision of the latter by some type of splicing event would then lead to mature messenger. The precursor to globin mRNA (for references, see Chambon, 1977), in fact, has been shown to contain a full transcript of the globin intervening sequence (Tilghman et al., 1978b), thus demonstrating the validity of the last

model at least for the globin case. The fact that the messenger-coding sequences of the ovalbumin gene are arranged in the same order and relative orientation as in the double-stranded cDNA suggests that this model of transcription is also applicable to the case of the ovalbumin gene. No precursor to ovalbumin mRNA has been detected to date (McKnight and Schimke, 1974). It is interesting to note that the map shown in Figure 13 predicts that the minimum size of such a precursor may be 6000 nucleotides, which is more than 3 times the length of mature ov mRNA. The processing of ov mRNA precursor to mature mRNA could be facilitated by, for example, secondary structures in the precursor RNA at the extremities of the intervening sequence regions. The existence of intervening sequences could thus supply a rationale for the phenomenon of hn RNAs (for references, see Chambon, 1977). It should be pointed out, however, that the amount of DNA present in intervening sequences is still too low to account fully for the unexpected extra DNA in higher eucaryotic cells. Assuming that, on the average, the


relative amount of intervening versus coding sequence would be the same in all chicken structural genes as in the ovalbumin gene, the ovalbumin messenger precursor should be about 3-fold longer than the sum of the coding plus intervening sequences to account for the excess of DNA over that needed to code for the number of products estimated genetically.

In no case did we find a different arrangement of the ovalbumin split gene when comparing the chromosomal DNA of cells in which the gene is expressed (for example, hen oviduct) with that of cells in which the gene is not transcribed (for example, erythrocyte). Intervening sequences are therefore not used to switch transcription of the ovalbumin gene on or off in various cell types. In addition, the immediate environment of the ovalbumin gene (as judged from studies of the 18 kb Bam HI and the 9.2 kb Eco RI fragments) is the same in oviduct and erythrocyte cells, indicating that translocation of the entire ovalbumin split gene is probably not involved in the mechanisms leading to its expression during differentiation.

Nothing is known about the possible origin of the intervening sequences. They may be descendants of some putative eucaryotic insertion elements (for which circumstantial evidence exists; for references, see Bukhari, Shapiro and Adhya, 1977) inserted into the genome at some stage of evolution to fulfill an unknown function, or they may simply represent sequences which happened to separate regions of the genome which nature chose to link at the level of the messenger to yield a new protein. Possible evolutionary benefits of the split gene arrangement have been discussed by Gilbert (1978). What appears at first sight to be a waste of energy could, in fact, correspond to an increased rate of evolution for eucaryotic organisms.
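The precursor-splicing model and the size estimates given earlier in this Discussion can be illustrated with a short sketch. It is not a description of the actual processing mechanism: `splice` simply removes marked spans from a toy string, and the size arithmetic uses the 6000 bp region and the six intervening sequences of 0.2 to 1.6 kb quoted in the text. The mature mRNA length is not stated in this section, so `MRNA_LENGTH_NT = 1900` is an assumed placeholder chosen only to be compatible with "more than 3 times".

```python
from typing import Dict, List, Tuple


def splice(primary_transcript: str, intervening: List[Tuple[int, int]]) -> str:
    """Excise intervening spans (0-based, end-exclusive) and join the
    remaining coding segments in their original order."""
    mature = []
    cursor = 0
    for start, end in sorted(intervening):
        if start < cursor or start >= end or end > len(primary_transcript):
            raise ValueError("spans must be non-empty, in range and non-overlapping")
        mature.append(primary_transcript[cursor:start])  # coding segment
        cursor = end                                     # skip intervening span
    mature.append(primary_transcript[cursor:])           # last coding segment
    return "".join(mature)


# Toy precursor: upper case = coding, lower case = intervening (hypothetical).
TOY_PRECURSOR = "AAAiiiiBBBjjCC"
TOY_SPANS = [(3, 7), (10, 12)]

GENE_REGION_BP = 6000     # from the text
N_INTERVENING = 6         # from the text
MRNA_LENGTH_NT = 1900     # ASSUMPTION: placeholder, not given in this section


def size_summary(region_bp: int, mrna_nt: int, n_intervening: int) -> Dict[str, float]:
    """Ratio of minimum precursor size to mRNA, and mean intervening length,
    assuming the precursor spans the whole region and nothing else."""
    intervening_total = region_bp - mrna_nt
    return {
        "precursor_to_mrna_ratio": region_bp / mrna_nt,
        "intervening_total_bp": intervening_total,
        "mean_intervening_bp": intervening_total / n_intervening,
    }


if __name__ == "__main__":
    print(splice(TOY_PRECURSOR, TOY_SPANS))
    print(size_summary(GENE_REGION_BP, MRNA_LENGTH_NT, N_INTERVENING))
```

Under the assumed mRNA length, the mean intervening sequence length falls inside the 0.2 to 1.6 kb range reported above, which is the only consistency check this sketch offers.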
When the three-dimensional structure of the ovalbumin protein becomes available, it will be interesting to determine whether the various coding sequences of the ovalbumin gene code for different domains of the protein.

Whatever the origin of the intervening sequences, they do not appear to be repeated in the chicken genome. Using 32P-labeled cloned Eco RI fragments “a” and “b” as probes, we did not find any evidence for the repetition of the intervening sequences elsewhere in the genome (our unpublished results). We cannot, however, rule out at present repetitions of small portions of these sequences, nor the existence of small procaryotic-like insertion elements in the intervening sequences. In contrast, it is interesting to note that some amino acid-coding regions of the ovalbumin gene must be repeated elsewhere in the chicken genome, since digestion of chicken DNA with several restriction endonucleases yields a weakly hybridizing fragment which is not accounted for by the map shown in Figure 13e [2.4 kb Bam HI fragment “b,” see above; Pst I fragment “b” and Hind III fragment “c,” Breathnach et al. (1977); and a 3.5 kb Eco RI fragment barely visible in Figure 1 of Breathnach et al. (1977)]. Preliminary studies on these fragments indicate that at least the fragments Pst I “b” and the Hind III fragment “c” contain sequences which are complementary to the middle part of ov mRNA (defined by the Hinf I sites 2 and 3 of the ov ds-cDNA; see Figure 1b). In any case, mapping of restriction sites in these fragments shows conclusively that they cannot contain intact ovalbumin genes (our unpublished results). Whether these repeats have any relationship to the phenomenon of pseudogenes (Ford, 1978) remains to be elucidated.

Whatever the possible origin and role of intervening sequences may have been in evolution, they could have important roles in a cascade-type regulation of gene expression at the transcriptional and post-transcriptional levels, especially if they were transcribed. In this respect, it is interesting to note that all six intervening sequences are clustered in the region coding for the 5’ moiety of the ov mRNA. They may contain binding sites for positive or negative control elements, which could have a role in regulating transcription at the initiation level (by long-range interaction along DNA; for reference to a procaryotic example, see Kolata, 1977) or at the elongation level, each intervening sequence adding a control element which might be similar to bacterial attenuators (Lee and Yanofsky, 1977). Since the expression of the ovalbumin gene is under hormonal control in hen oviduct (for references, see Palmiter, 1975), it is certainly possible that hormone-protein complexes may act as transcriptional regulators.
Such a multiplicity of transcriptional controls could be postulated to explain the lag phase which occurs during estradiol stimulation, but which is not observed when the ovalbumin gene is turned on by progesterone administration (McKnight, Pennequin and Schimke, 1975; Palmiter et al., 1976). If intervening sequences were transcribed together with the coding sequences to yield an ovalbumin mRNA precursor, their transcripts may be involved in several possible mechanisms which could regulate mRNA maturation. Again, hormone-protein complexes could be involved, either directly by binding to the intervening sequence transcripts, or indirectly by inducing the synthesis of a “specific” splicing enzyme. The genes for conalbumin, lysozyme, ovomucoid and ovoglobulin are under the same hormonal control in the chicken oviduct. Detailed comparison of the structures of these genes should provide clues to the roles, if any, played by intervening sequences in the control of gene expression. We also hope to


determine whether the egg white protein genes are clustered on a single haploid chromosome, and whether there are some common features in the organization of these genes which could account for their hormonal regulation.

It is not impossible that the longer intervening sequences themselves may contain genes, which might even overlap with part of the ovalbumin gene sequences, although in contrast to viruses (Contreras et al., 1977; Sanger et al., 1977), there seems to be little advantage to overlapping genes in systems where the genome size is not limited by the capsid dimensions. In an attempt to answer this point and some of the questions raised above, we are currently sequencing the intervening sequences of Eco RI fragments “a” and “b,” and cloning the 18 kb Bam HI fragment, which contains all the ovalbumin split gene and extensive flanking sequences at its 5’ and 3’ ends. We hope to clarify the mechanisms involved in the control of expression of the ovalbumin gene by using this fragment and some of the new techniques which have been developed to study transcription in test systems such as Xenopus oocytes. We are also currently investigating the structure of the ovalbumin gene in the duck, ostrich and tortoise to see whether the intervening sequences have been as conserved during evolution as the coding sequence. Indeed, such conservation would argue strongly in favor of a functional role for intervening sequences. Finally, we believe that the possibility that intervening sequences could be involved in the higher order folding and domain structure of chromatin (for references, see Chambon, 1977) is improbable in view of their multiplicity and frequent interspersion with the coding sequences.

Experimental Procedures

Extraction of DNA from Agarose Gel Slices
Gel slices were subjected to repeated freezing and thawing before incubation at 37°C with a buffer containing 20 mM Tris-HCl (pH 7.5), 1 mM EDTA, 0.4 M NaCl for at least 3 hr. Depending upon the size of the slice, the incubation was carried out in either plastic tips or 10 ml syringes plugged with glass wool. Fluid was collected by centrifugation or by pressure. After phenol-chloroform extraction, the DNA was freed of ethidium bromide and concentrated by several extractions with n-butanol, and ethanol-precipitated.

Preparation of Specific Probes
Hae A, Hae-Pst or Pst B fragments from a double digest of Hhaov with Hae III and Pst I were separated by electrophoresis on a 3.5% polyacrylamide gel. DNA was eluted from the gel slices as described by Maxam and Gilbert (1977), and was further purified on a small DEAE-cellulose column. Elution was performed with 1 M NaCl in 10 mM Tris-HCl (pH 7.9), 10 mM EDTA. The DNA was concentrated by extraction with n-butanol and ethanol-precipitated twice. The specific activity attained by nick translation of the purified fragments (8 x 10⁷ cpm/µg) was only one third the specific activity which we routinely obtain with Hhaov DNA.

Restriction Endonucleases
Eco RI was prepared according to the procedure of Sümegi et al. (1977) and Hinf I according to an unpublished method of S. Zain et al. Bam HI, Hind III, Hph I and Sau 3AI were gifts from Drs. P. Humphries, J. Sümegi, V. Pirotta and J. Sussenbach, respectively. All other enzymes were obtained from Biolabs. Other materials were as described by Breathnach et al. (1977).

Filter Hybridization
Probes were nick-translated and hybridized to nitrocellulose-bound DNA samples as described by Breathnach et al. (1977). Transfers of DNA from agarose gels to filters were carried out by a modification of the method of Southern (1975).

Reverse-Phase Chromatography
50 mg of chicken erythrocyte DNA were digested with Eco RI, treated with proteinase K, phenol-extracted and ethanol-precipitated. The DNA was dissolved in 10 mM Tris-acetate (pH 7.4) containing 1.6 M sodium acetate, and loaded onto a 120 x 2 cm column (Waters Associates) containing RPC5 resin. The resin was prepared (Pearson et al., 1971) by coating polychlorofluoroethylene powder (Voltalef, Kuhlman, France) with methyltrialkyl ammonium chloride (Adogen 464; Arnaud, France). The elution was performed as described by Hardies and Wells (1976) with an 8 liter linear gradient of degassed sodium acetate (1.6-1.8 M) in 10 mM Tris-acetate (pH 7.4). A Waters Associates model 6000 A pump was used to obtain a 5 ml/min flow rate with a maximum pressure of 50 psi. 12 ml fractions were collected, and aliquots containing 5 µg of DNA were ethanol-precipitated. The DNA was centrifuged and washed with 80% ethanol, and an aliquot was run on a 1% agarose gel, transferred to nitrocellulose filters by the Southern blotting method and hybridized to 32P-labeled Hhaov. The 1.25 kb Eco RI fragment “d” eluted at ~1.680 M sodium acetate, and the 2.35 kb fragment “b” at ~1.665 M. The 1.75 kb fragment “c” and the 9.2 kb fragment “a” overlapped at ~1.670-1.675 M, with fragment “c” eluting ahead. To obtain fractions containing only one of these four fragments, pooled fractions of the column were rechromatographed on a second RPC5 column (120 x 0.7 cm) developed in the same manner with a 1.5 liter linear gradient. Fractions containing only one of the four Eco RI fragments with very little hybridizing contamination were precipitated for further analysis.

Chromatography of Bam HI digests of chicken DNA was carried out essentially as described above, using a 120 x 0.7 cm column for 5 mg of digested DNA.

Acknowledgments

We thank Miss C. Chanal and Mr. P. Hickel for excellent technical assistance. We are grateful to Drs. J. Sümegi, P. Humphries, V. Pirotta and J. Sussenbach for generous gifts of restriction enzymes. This work was supported by grants to P. C. from the INSERM, the CNRS and the Fondation pour la Recherche Médicale Française. R. B. was supported by an EMBO long-term fellowship. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

Received March 31, 1978

References

Berget, S. M., Moore, C. and Sharp, P. A. (1977). Spliced segments at the 5’ terminus of adenovirus-2 late mRNA. Proc. Nat. Acad. Sci. USA 74, 3171-3175.

Breathnach, R., Mandel, J. L. and Chambon, P. (1977). Ovalbumin gene is split in chicken DNA. Nature 270, 314-319.

Bukhari, A. I., Shapiro, J. A. and Adhya, S. L., eds. (1977). DNA Insertion Elements, Plasmids and Episomes (New York: Cold Spring Harbor Laboratory Press).

Chambon, P. (1977). Molecular biology of eukaryotic genome is coming of age. Cold Spring Harbor Symp. Quant. Biol. 42, 1211-1235.

Contreras, R., Rogiers, R., Van de Voorde, A. and Fiers, W. (1977). Overlapping of the VP2-VP3 gene and the VP1 gene in the SV40 genome. Cell 12, 529-536.

Ford, P. (1978). Pseudogene structure in 5S RNA genes. Nature 271, 205-206.

Garapin, A. C., Le Pennec, J. P., Roskam, W., Perrin, F., Cami, B., Krust, A., Breathnach, R., Chambon, P. and Kourilsky, P. (1978a). Isolation by molecular cloning of a fragment of the split ovalbumin gene. Nature 273, 349-354.

Garapin, A. C., Cami, B., Roskam, W., Kourilsky, P., Le Pennec, J. P., Perrin, F., Gerlinger, P., Cochet, M. and Chambon, P. (1978b). Electron microscopy and restriction enzyme mapping reveal additional intervening sequences in the chicken ovalbumin split gene. Cell 14, 629-639.

Gilbert, W. (1978). Why genes in pieces? Nature 271, 501.

Hardies, S. L. and Wells, R. D. (1976). Preparative fractionation of DNA restriction fragments by reverse phase chromatography. Proc. Nat. Acad. Sci. USA 73, 3117-3121.

Klessig, D. F. (1977). Two adenovirus mRNAs have a common 5’ terminal leader sequence encoded at least 10 kb upstream from their main coding regions. Cell 12, 9-21.

Kolata, G. B. (1977). Bacterial genetics, action at a distance on DNA. Science 198, 41-42.

Lee, F. and Yanofsky, C. (1977). Transcription termination at the trp operon attenuators of Escherichia coli and Salmonella typhimurium: RNA secondary structure and regulation of termination. Proc. Nat. Acad. Sci. USA 74, 4365-4369.

McKnight, G. S. and Schimke, R. T. (1974). Ovalbumin messenger RNA. Evidence that the initial product of transcription is the same size as polysomal ovalbumin messenger. Proc. Nat. Acad. Sci. USA 71, 4327-4331.

McKnight, G. S., Pennequin, P. and Schimke, R. T. (1975). Induction of ovalbumin mRNA sequences by estrogen and progesterone in chick oviduct as measured by hybridisation to complementary DNA. J. Biol. Chem. 250, 8105-8110.

Maxam, A. M. and Gilbert, W. (1977). A new method for sequencing DNA. Proc. Nat. Acad. Sci. USA 74, 560-564.

Palmiter, R. D. (1975). Quantitation of parameters that determine the rate of ovalbumin synthesis. Cell 4, 189-197.

Palmiter, R. D., Moore, P. B., Mulvihill, E. R. and Emtage, S. (1976). A significant lag in the induction of ovalbumin messenger RNA by steroid hormones: a receptor translocation hypothesis. Cell 8, 557-572.

Pearson, R. L., Weiss, J. S. and Kelmers, A. D. (1971). Improved separation of transfer RNAs on polychlorotrifluoroethylene-supported reverse phase chromatography columns. Biochim. Biophys. Acta 288, 770-774.

Sanger, F., Air, G. M., Barrell, B. G., Brown, N. L., Coulson, A. R., Fiddes, J. C., Hutchinson, C. A., Slocombe, P. M. and Smith, M. (1977). Nucleotide sequence of bacteriophage φX174 DNA. Nature 265, 687-695.

Smith, H. O. and Birnstiel, M. L. (1976). A simple method for DNA restriction site mapping. Nucl. Acids Res. 3, 2387-2398.

Southern, E. M. (1975). Detection of specific sequences among DNA fragments separated by gel electrophoresis. J. Mol. Biol. 98, 503-518.

Sümegi, J., Breedveld, D., Hossenlopp, P. and Chambon, P. (1977). A rapid procedure for purification of Eco RI endonuclease. Biochem. Biophys. Res. Commun. 76, 78-85.

Tilghman, S., Tiemeier, D., Seidman, J., Peterlin, B., Sullivan, M., Maizel, J. V. and Leder, P. (1978a). Intervening sequence of DNA identified in the structural portion of a mouse β-globin gene. Proc. Nat. Acad. Sci. USA 75, 725-729.

Tilghman, S., Curtis, P., Tiemeier, D., Leder, P. and Weissmann, C. (1978b). The intervening sequence of a mouse β-globin gene is transcribed within the 15S β-globin mRNA precursor. Proc. Nat. Acad. Sci. USA 75, 1309-1313.

Weinstock, R., Sweet, R., Weiss, M., Cedar, H. and Axel, R. (1978). Intragenic DNA spacers interrupt the ovalbumin gene. Proc. Nat. Acad. Sci. USA 75, 1299-1303.

Note Added in Proof

Comparison of the DNA sequence of the 2.35 kb Eco RI fragment “b” (R. Breathnach, C. Benoist and K. O’Hare, manuscript in preparation) with the sequence of ovalbumin mRNA (L. McReynolds, B. W. O’Malley, A. D. Nisbet, J. E. Fothergill, D. Givol, S. Fields, M. Robertson and G. G. Brownlee, Nature, in press) indicates that the DNA sequence coding for the first 45 nucleotides of the RNA is not present in this fragment. There is, therefore, at the 5’ end of ov mRNA, a short leader sequence which could not be detected by electron microscopy (Garapin et al., 1978b) and which is located in its entirety outside the amino acid coding region.
