J. Mol. Biol. (1978) 121, 113-137

Nucleotide Sequence of the Promoter-Operator Region of the Tryptophan Operon of Escherichia coli

G. N. BENNETT, M. E. SCHWEINGRUBER†, K. D. BROWN‡, C. SQUIRES§ AND C. YANOFSKY

Department of Biological Sciences, Stanford University, Stanford, Calif. 94305, U.S.A.

(Received 22 June 1977, and in revised form 5 January 1978)

The nucleotide sequence of the region preceding the transcription initiation site of the tryptophan (trp) operon of Escherichia coli was determined by RNA and DNA sequencing techniques. RNA complementary to this region was synthesized in vivo or in vitro on φ80trp transducing phage DNA and isolated by hybridization to the DNA of other transducing phage which contain only the pre-initiation region in common. The RNA was sequenced by analysis of complete RNase A and RNase T1 digestion products and by overlapping these oligonucleotides with partial digestion products generated by RNase T1 and carboxymethylated RNase A. DNA sequencing of 5' end-labelled restriction fragments containing the pre-initiation region was carried out using the hydrazine/dimethyl sulphate procedure of Maxam & Gilbert (1977). Genetic and biochemical studies indicate that the region analyzed contains the trp promoter and the trp operator. The DNA segment implicated in operator function exhibits twofold rotational symmetry and immediately precedes the transcription start-site. Sequence similarities between the trp promoter region and other promoters are discussed.

1. Introduction

Initiation of transcription on the tryptophan (trp) operon of Escherichia coli is regulated by repressor-operator interaction (Cohen & Jacob, 1959; Rose et al., 1973; McGeoch et al., 1973; Bennett et al., 1976). Studies in vitro have suggested that the mechanism of Trp repressor action is the exclusion of RNA polymerase from a common binding site on the operon (Squires et al., 1975). To define the sites of RNA polymerase and Trp repressor action we have determined the nucleotide sequence of the region of the E. coli trp operon preceding the site of transcription initiation.
This region was recognized by reference to the previously determined 5' terminus of trp messenger RNA of E. coli (Squires et al., 1976). In the following paper we present the nucleotide sequence of the corresponding region of Salmonella typhimurium (Bennett et al., 1978). Since it was known from other studies that the Trp repressors and trp operators of E. coli and S. typhimurium interact identically whether homologous or heterologous combinations are examined (Manson & Yanofsky, 1976), we thought that the identification of conserved and non-conserved sequences in the trp promoter-operator regions of the two organisms would be informative. In the third paper of this series (also following) we have analyzed the regions of the operon essential for RNA polymerase interaction (Brown et al., 1978). In the fourth paper of the series (Bennett & Yanofsky, 1978) we present studies of operator-constitutive and deletion mutants of the trp operon which allow us to locate the operator sequence and thereby demonstrate that RNA polymerase and Trp repressor do interact with the same region of the operon.

In this paper we present the nucleotide sequence of the 116 base-pairs preceding the transcription initiation site of the trp operon of E. coli. The sequence was determined from analyses of the read-through trp RNA transcript produced by transcription initiated at the phage promoters of trp transducing phage (Franklin, 1971; Imamoto & Tani, 1972; Bennett et al., 1976), and by sequence analyses (Maxam & Gilbert, 1977) of DNA fragments containing the region of interest.

† Present address: Department of Microbiology, University of Bern, Bern, Switzerland.
‡ Present address: School of Biological Sciences, University of Sydney, N.S.W. 2006, Australia.
§ Present address: Department of Biological Sciences, Dartmouth College, Hanover, N.H. 03755, U.S.A.

0022-2836/78/1212-1337 $02.00/0 © 1978 Academic Press Inc. (London) Ltd.
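The twofold rotational symmetry attributed to the operator segment in the summary can be checked mechanically: a perfectly symmetric DNA segment is identical to its own reverse complement. The following is a minimal sketch of such a check, not the procedure used in this work; the function names are illustrative, and the test sequence is the EcoRI recognition site rather than any part of the trp sequence.

```python
# Illustrative sketch: score the twofold rotational (dyad) symmetry of a DNA segment.
# The function names are hypothetical helpers, not taken from the paper.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def dyad_symmetry(seq):
    """Return (matching positions, length) for seq against its reverse complement.

    A segment with perfect twofold rotational symmetry reads the same
    5'->3' on both strands, so every position matches.
    """
    seq = seq.upper()
    partner = reverse_complement(seq)
    matches = sum(1 for a, b in zip(seq, partner) if a == b)
    return matches, len(seq)


if __name__ == "__main__":
    # GAATTC is the EcoRI recognition site, a perfectly symmetric hexamer.
    print(dyad_symmetry("GAATTC"))
    # A segment with one changed base-pair is only partly symmetric.
    print(dyad_symmetry("GAATCC"))
```

A real operator is usually only partly symmetric, which is why the sketch reports the number of matching positions rather than a yes/no answer.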

2. Materials and Methods

(a) Reagents and preparations

Materials were obtained from the following sources: Cellogel strips from Kalex, Manhasset, N.Y.; acrylamide, bisacrylamide, N,N,N',N'-tetramethylethylenediamine, agarose and ammonium persulphate from Biorad Laboratories, Richmond, Calif.; [32P]phosphate and (α-32P)-labelled nucleoside triphosphates from New England Nuclear Corporation, Boston, Mass.; B-6 nitrocellulose membrane filters from Schleicher-Schuell and Co., Keene, N.H.; RP/R or X/R X-ray film, diethylpyrocarbonate, and hydrazine from Eastman Kodak, Rochester, N.Y.; dimethyl sulphate from Matheson, Coleman and Bell, Norwood, Ohio; Tris base and urea from Research Plus Laboratories, Denville, N.J.; restriction endonuclease EcoRI from Miles Laboratories, Inc., Elkhart, Ind.; restriction endonucleases HpaI, HpaII, HhaI and HincII from New England BioLabs, Beverly, Mass. HinfI was a gift from F. Lee and K. Bertrand. One unit of a restriction enzyme is defined as the amount of enzyme required to degrade 1.0 µg of lambda phage DNA in 1 h at 37°C. E. coli alkaline phosphatase (BAPF, Worthington Biochemical, Freehold, N.J.) or calf intestinal alkaline phosphatase (Boehringer-Mannheim, Indianapolis, Ind.) was used. One unit of phosphatase liberates 1 µmol of p-nitrophenol from p-nitrophenylphosphate/min at 25°C (Garen & Levinthal, 1960). The S-30 extracts were prepared by H. Zalkin and L. J. Kern. Bacteriophage T4 polynucleotide kinase was provided by A. Maxam. [γ-32P]ATP (approx. 1000 Ci/mmol) was prepared by a modification (Maxam & Gilbert, 1977) of the method of Glynn & Chappell (1964). Plasmid DNA was prepared by the method of Hershfield et al. (1974). DNA template was prepared as described by Rose et al. (1973). Strand separations of phage DNA were carried out as described by Shapiro et al. (1969). Other materials were prepared, or purchased, as described by Squires et al. (1976).
(b) Bacterial strains, phages and plasmids

Strain W3110 trpE9829 (Yanofsky et al., 1971) was used for phage infection experiments in which promoter-operator RNA was labelled in vivo. The structures of the trp transducing phages used in this study are shown in Fig. 1. Strains bearing plasmids pVH151 or pVH153 (Helinski et al., 1977) were used as sources of ColE1 trp and miniColE1 trp plasmid DNA, respectively.

(c) Preparation of 32P-labelled promoter-operator RNA in vivo

Strain W3110 trpE9829 was infected with phage φ80trpEDCBA190 in the presence of 32PO4. The procedure used in the labelling and RNA extraction was that described by Squires et al. (1976). The 32P-labelled RNA was subjected to 3 successive hybridizations in order to isolate the RNA segment complementary to the trp promoter-operator region. These hybridizations were first to the l-strand of φ80trpΔLD102, second to the l-strand of λtrpED1, and third to the l-strand of λφ80trpE. The hybridization procedure has been described previously (Squires et al., 1976).

FIG. 1. Genetic constitution of trp transducing phages. The simplified nomenclature of each phage follows its full description.
φ80trpP+O+L+E+D+C+B+A+190 Δ[tonB] = (φ80trpEDCBA190)
φ80trpP+O+ΔLD102C+B+ Δ[trpA905tonB] = (φ80trpΔLD102)
φ80trpP+O+ΔLC1415B+ Δ[trpA905tonB] = (φ80trpΔLC1415)
φ80trpP+O+ΔLC145-2B+ Δ[trpA905tonB] = (φ80trpΔLC145-2)
λtrpP+O+L+E+D+ = (λtrpED1)
i^λ h^φ80 trpP+O+L+E+ Δ[trpDCBAtonB] = (λφ80trpE)
i^λ h^φ80 trpL+E+D+46 Δ[trpCBAtonB] = (λφ80trpED46)
The φ80trp phages exhibit read-through transcription initiated at the phage gene N promoter (Franklin, 1971). Broken lines indicate phage DNA, solid lines designate bacterial DNA, and bars denote deletions in and beyond the trp operon. The leftward fusion points of bacterial and phage segments are located about 112 base-pairs to the left of the trp operon transcription-initiation site in φ80trpΔLD102 and approximately 17 ± 2 base-pairs to the left of the transcription-initiation site in λφ80trpED46. The phage-bacterial fusion points of the other phages with trpP+O+ are not known but they are to the left of the φ80trpΔLD102 fusion point. Phages with deletion fusion points in the trp leader region are as follows: φ80trpΔLC145-2 contains only the first base-pair of the trp leader region, and φ80trpΔLD102 and φ80trpΔLC1415 contain the first 25 and 38 base-pairs, respectively (Squires et al., 1976). All phages shown above have a functional trp operator and promoter except λφ80trpED46 (Franklin, 1971).

(d) Preparation of promoter-operator RNA in vitro

The reaction mixture was essentially the same as that described by Zalkin et al.
(1974) and contained 44 mM-Tris-acetate (pH 8.2), 12.8 mM-magnesium acetate, 7.4 mM-calcium acetate, 27 mM-ammonium acetate, 55 mM-potassium acetate, 1.4 mM-dithiothreitol, 21 mM-sodium phosphoenolpyruvate, 0.1 mM-tryptophan, 0.22 mM of the other 19 amino acids, 80 µg tRNA/ml, 27 µg pyridoxine-HCl/ml, 27 µg NADP/ml, 27 µg FAD/ml, 27 µg calcium leucovorin/ml, 11 µg p-aminobenzoic acid/ml, and 40 µg phage DNA template/ml. The nucleoside triphosphate concentrations were as follows: the (α-32P)-labelled nucleoside triphosphate was at 40 to 50 µM and the other nucleoside triphosphates were at 0.55 mM. The S-30 preparation (Zubay et al., 1970; Zalkin et al., 1974) was approx. 30% of the total volume of the reaction. The reaction was incubated for 1 h at 34°C, then 10 µl of diethylpyrocarbonate, 250 µg of carrier tRNA and 0.1 vol. 1 M-sodium acetate (pH 4.5) were added and the mixture extracted twice with phenol saturated with 0.1 M-sodium acetate (pH 4.5). The RNA was recovered by precipitation with ethanol, dissolved in 10 mM-Tris·HCl (pH 7.3), 0.33 M-KCl, 1 mM-EDTA (Tris/K/EDTA buffer) and filtered through B-6 membrane filters. After filtration, calcium chloride and magnesium chloride were added to a concn of 12.5 mM and the material digested with iodoacetate-treated DNase I (Zimmerman & Sandeen, 1966) at a concn of 20 µg/ml at 37°C for 20 min. DNase I was inactivated and removed by treatment with phenol at pH 4.5 (as above) and the RNA precipitated with ethanol, collected, dried in vacuo, and redissolved in the hybridization solution.

Hybridizations were routinely carried out in Tris/K/EDTA buffer (16 h, 55°C) with or without 20 to 40% formamide (24 h, 37°C), with single-stranded DNA at a concn of 50 to 100 µg/ml. DNA-RNA hybrids were collected on B-6 membrane filters and treated with RNase T1. The RNase T1 was then inactivated by iodoacetate treatment. Details of these procedures have been described by Squires et al. (1976). RNA was eluted from the filter by heating the filter plus 250 µg carrier tRNA in 1.5 ml 5 mM-Tris·HCl (pH 7.3), 1 mM-EDTA for 10 min at 85 to 95°C. The liquid was removed, chilled, and CaCl2 and MgCl2 were added to concentrations of 12.5 mM. The solution was treated with DNase I (20 µg/ml) for 20 min at 37°C. One-tenth vol. of 1 M-sodium acetate (pH 4.5) was added and the solution extracted with phenol saturated with 0.1 M-sodium acetate (pH 4.5). The RNA was then precipitated with ethanol. After drying in vacuo, the RNA was taken up in buffer for another hybridization. The final product was washed with 75% ethanol after precipitation with ethanol.

(e) Gel electrophoresis of RNA

Electrophoresis was carried out as described by Bertrand et al. (1976). The 10% polyacrylamide slab gel (25 cm long) contained 7 M-urea, 90 mM-Tris-borate (pH 8.3) and 2.5 mM-EDTA. The running buffer did not contain urea. Electrophoresis (200 V) was carried out for 4 to 6 h. RNA bands were located by autoradiography and eluted by incubation in 1.5 ml Tris/K/EDTA buffer containing 250 µg carrier tRNA for 10 to 16 h at 37°C. The solution containing the RNA was then extracted with phenol (pH 4.5). The RNA was precipitated with ethanol, washed with 75% ethanol and dried.

(f) RNA fingerprinting and sequence analysis procedures

The 2-dimensional fingerprinting system (Brownlee & Sanger, 1969; Barrell, 1971) was used to separate the complete RNase T1 or RNase A digestion products. The conditions of RNase T1 and RNase A digestion, electrophoresis on Cellogel strips at pH 3.5, transfer to PEI-cellulose thin-layer plates (Southern, 1974), and homochromatography in the 2nd dimension using Homomix C-10 (Barrell, 1971) have been described by Squires et al. (1976). After the radioactive spots were located by autoradiography, the thin-layer sheets were washed free of urea with water and the spots cut out with a soldering iron. Spots were eluted by wetting the PEI-plate with ethanol and scraping the PEI-cellulose into a small plastic tip plugged with Whatman 3MM paper. Ethanol (2 ml) was washed through the tip and then the labelled material was eluted by adding 0.1 ml of 30% triethylammonium carbonate and centrifuging the liquid through the tip into a small glass tube. The eluted product was subjected to several cycles of water addition (0.1 ml) and drying in a desiccator to remove all of the volatile buffer. The spots were then taken up in water and lyophilized on a paraffin sheet.

RNase A and RNase T1 secondary digestions of RNA eluted from homochromatograms were carried out as described by Barrell (1971). RNase U2 was used at a concn of 10 units/ml and digestions were incubated for 6 to 12 h at 37°C. Alkaline hydrolyses were carried out with 0.5 M-NaOH for 16 h at 37°C. RNase A, RNase T1 and RNase U2 digests were fractionated on DEAE-paper in pyridine acetate (pH 3.5). Products of the alkaline degradations were separated by electrophoresis on Whatman 3MM paper in pyridine acetate (pH 3.5). The products were identified by their positions relative to known nucleotides and oligonucleotides (Barrell, 1971; Brownlee, 1972; Squires et al., 1976).

(g) Partial digestion of promoter-operator RNA with CM-RNase A and RNase T1

Partial digestion of in vivo labelled RNA was carried out with CM-RNase A, carboxymethylated on the ε-amino group of lysine at position 41, as described by Heinrikson (1966). Separation of partial digestion products and analysis of the oligonucleotides were as described by Squires et al. (1976). For each CM-RNase A partial product one half was digested with RNase A and the other half with RNase T1 and the complete digestion products were separated by electrophoresis on DEAE-paper in 7% formic acid. Each of the products of the RNase A and RNase T1 digestion was eluted and digested with the


other enzyme. These secondary digestion products were identified by their mobility on DEAE-paper at pH 3.5 (Barrell, 1971; Brownlee, 1972; Squires et al., 1976).

RNase T1 partial digestions were performed on labelled RNA plus 100 µg carrier tRNA with 0.05 unit of enzyme in 5 µl of 10 mM-Tris·HCl (pH 7.5), 10 mM-MgCl2. Incubation was at 0°C for 15 to 50 min. The products were separated by the standard RNase T1 fingerprint procedure (Squires et al., 1976) with the following modifications: (i) electrophoresis at pH 3.5 (pyridine/acetate) was performed for 35 min at 5000 V and the transfer was to two 20 cm × 20 cm PEI thin-layer plates by the method of Southern (1974); (ii) homochromatography mix C-2 (Barrell, 1971) (a 2-min hydrolysate of 5% yeast RNA) was used in the 2nd dimension. Oligonucleotides were located by autoradiography and eluted as described above for complete digestion products. The radioactive oligonucleotides were treated with RNase T1 and in some cases separately with RNase A, and the products fingerprinted in the standard PEI system (Squires et al., 1976). The products were identified as specific RNase T1 or RNase A oligonucleotides by their position on the plate or by subsequent analysis.

(h) Isolation of unlabelled DNA restriction fragments

DNA of plasmid pVH151 (550 µg) was digested with EcoRI (1000 units) and HpaI (1000 units) in 10 mM-Tris (pH 7.4), 10 mM-MgCl2, 50 mM-KCl, 1 mM-dithiothreitol and bovine serum albumin (60 µg/ml) in a vol. of 0.5 ml for 2 h at 37°C. The fragments were separated by electrophoresis on 0.8% agarose gel (Selker et al., 1977). The bands containing DNA of molecular weight 2.5 × 10^6 and 1.6 × 10^6 (Fig. 8) were excised and the DNA was recovered from the gel by extrusion by high-speed centrifugation (Brown et al., 1978). The small DNA fragment shown in Fig. 9(b) was isolated by polyacrylamide gel electrophoresis of a combined HpaII and EcoRI digestion of plasmid pVH153. The preparation of this fragment is described elsewhere (Brown et al., 1978). The DNA fragment was eluted from the gel by the technique described by Maxam & Gilbert (1977).

(i) End-labelling of restriction fragments

The procedure described by Maxam & Gilbert (1977) was used for phosphatase treatment and phage T4 polynucleotide kinase labelling of the 5' ends of DNA fragments.

(j) Digestion and isolation of labelled restriction fragments

Labelled DNA was dissolved and digested with 20 to 60 units of the appropriate restriction enzyme in a total vol. of 0.1 ml of 10 mM-Tris·HCl (pH 7.4), 10 mM-MgCl2, 25 mM-KCl, 1 mM-dithiothreitol containing 60 µg bovine serum albumin/ml for 2 h at 37°C. The mixture was heated for 5 min at 65°C and then chilled on ice. Ten µl of dye solution (50% glycerol, 0.05% bromophenol blue, 0.05% xylene cyanol FF) was then added. The sample was analyzed by electrophoresis on a 7% polyacrylamide gel (25 cm) in 90 mM-Tris-borate (pH 8.3), 1 mM-EDTA buffer (Tris/borate/EDTA, see Maniatis et al., 1975a), for 4 to 6 h at 200 V. The bands were located by autoradiography, excised and eluted in the presence of 50 µg carrier tRNA in 0.5 M-ammonium acetate, 0.01 M-magnesium acetate, 0.1% sodium dodecyl sulphate, 0.1 mM-EDTA as described by Maxam & Gilbert (1977). The eluate was then extracted with phenol saturated with 0.01 M-Tris·HCl (pH 7.4), 1 mM-EDTA, and then extracted with ether. The DNA was then precipitated with ethanol. After collecting the pellet by centrifugation, it was reprecipitated, washed with 95% ethanol, and dried.

(k) DNA sequence analysis

The partial reaction system of Maxam & Gilbert (1977) was used. In particular, the 4 reaction mixtures and workups were: A, strong cleavage at adenine, weak guanine cleavage (HCl treatment); G, the alternative 1 M-piperidine cleavage reaction; C, the hydrazine reaction, carried out in the presence of 1 M-NaCl; T, the standard hydrazine reaction. A 1 M-piperidine treatment was used for C and T cleavages. The sequencing gel measured


2.5 mm × 37 cm × 26 cm and contained 20% acrylamide in 7 M-urea, 50 mM-Tris-borate (pH 8.3), 1 mM-EDTA, 0.67 mg ammonium persulphate/ml. Electrophoresis was carried out at 600 V for 9 to 50 h. After electrophoresis the gel was wrapped with Saran Wrap and exposed to Kodak X/R X-ray film at −20°C. Details of the DNA sequencing procedure are given in Maxam & Gilbert (1977).

(l) Preparation of the DNA fragment labelled at the HincII site

The HpaII-HpaII fragment illustrated in Fig. 9(b) (approx. 5 µg) was digested with HincII and phosphatase under the following conditions: the DNA fragment in 0.1 ml of 10 mM-Tris·HCl (pH 7.4), 10 mM-MgCl2, 25 mM-KCl, 1 mM-dithiothreitol, 60 µg bovine serum albumin/ml was digested for 1 h at 37°C with 12 units of HincII endonuclease. Phosphatase (0.05 unit) was then added and the reaction incubated at 37°C for an additional 1 h. EDTA was then added to 20 mM and the solution diluted to 0.5 ml with 10 mM-Tris·HCl (pH 7.4), 1 mM-EDTA. It was then extracted with phenol and precipitated with ethanol as described above in the section on end-labelling of restriction fragments. The mixture of DNA fragments resulting from the HincII digestion was labelled at the 5' ends as described above. After precipitation and drying of the 5'-terminally labelled DNA fragment mixture, it was taken up and digested with HinfI endonuclease as described in an earlier section. A 7% Tris-borate/EDTA polyacrylamide gel (Maniatis et al., 1975a) was used to separate the digestion products. The HinfI-HincII 250 base-pair fragment (Fig. 9(b)) was located by autoradiography, eluted, and prepared for sequencing as stated above.
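The way a sequence is read from the four lanes described under DNA sequence analysis can be sketched as follows. This is an illustrative reading rule only, assuming the usual lane specificities of the chemical method (the A lane shows strong A and weak G bands, the G lane shows G, the C lane shows C only, and the standard hydrazine lane shows both C and T); the function names and the example ladder are hypothetical, not data from this work.

```python
# Illustrative sketch of reading a chemical-cleavage sequencing ladder.
# Lane names follow the text: "A" (A strong, G weak), "G", "C" (hydrazine
# plus 1 M-NaCl, C only) and "T" (standard hydrazine, assumed here to give
# bands at both C and T). All names below are hypothetical helpers.

VALID_LANES = {"A", "G", "C", "T"}


def call_base(lanes):
    """Assign a base to one ladder position from the set of lanes showing a band."""
    lanes = set(lanes)
    if not lanes or lanes - VALID_LANES:
        raise ValueError(f"cannot interpret lanes: {sorted(lanes)}")
    if "G" in lanes:      # G lane band (often with a weak band in the A lane)
        return "G"
    if "A" in lanes:      # band in the A lane only
        return "A"
    if "C" in lanes:      # band in the C lane (and normally in the T lane too)
        return "C"
    return "T"            # band in the standard hydrazine lane only


def read_ladder(bands):
    """Read bands from the fastest-migrating fragment upward.

    With a 5' end-labelled fragment the shortest products run fastest, so
    reading up the gel gives the sequence 5'->3' from the labelled end.
    """
    return "".join(call_base(lanes) for lanes in bands)


if __name__ == "__main__":
    # Hypothetical ladder, one set of lanes per band position.
    ladder = [{"A"}, {"G", "A"}, {"C", "T"}, {"T"}]
    print(read_ladder(ladder))
```

The order of the tests in `call_base` matters: a guanine gives a weak band in the A lane as well, so the G lane has to be consulted first.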

3. Results

(a) RNA sequence analysis

(i) Preparation of trp promoter-operator RNA

Read-through transcription from the N operon of various φ80trp P+O+ transducing phage (Fig. 1) was used to generate RNA complementary to the trp promoter-operator region. The RNA was labelled with 32P either in vivo during infection of cells with φ80trp phages (Squires et al., 1976) or by transcribing the isolated φ80trp phage DNA in vitro in a cell-free transcription-translation system (Zubay et al., 1970; Zalkin et al., 1974). The segment of the read-through transcript complementary to the trp promoter-operator region was isolated by successive hybridization to l-strand DNA from appropriate φ80 and λtrp transducing phages (see Materials and Methods).

The 5' end of the trp promoter-operator RNA was specified by the phage-bacterial fusion point in φ80trpΔLD102 (Fig. 1) which has the shortest bacterial sequence to the left of the trp promoter of the phages in our collection. It was used to set the leftward extent of the segment of RNA obtained, whether it was employed as template DNA or as the source of DNA for hybridization. To specify the 3' end of the trp promoter-operator RNA, we used trp internal deletion mutants which had leftward deletion endpoints within the trp leader region (Bertrand et al., 1976).

trp promoter-operator RNA of three types was prepared in vitro (Fig. 2). In each case the RNA was labelled separately with each of the four (α-32P)-labelled nucleoside triphosphates. Type 1: trp promoter-operator RNA prepared using φ80trpΔLD102 DNA as template and isolated by hybridization to λtrpED1 and φ80trpΔLC145-2 DNAs. The 3' end of this RNA is defined by the 145-2 deletion endpoint in the trp leader region. Electrophoresis of this RNA on a 7 M-urea/polyacrylamide gel is shown in Figure 3. The dark band was estimated to be approximately 110 to 125 nucleotides in length based on its mobility relative to dye markers (Maniatis et al., 1975a) and to

[Fig. 2: diagram of the RNase T1 oligonucleotides of type 1, type 2 and type 3 trp promoter-operator RNA, written in order for each type.]

FIG. 2. The RNase T1 oligonucleotides present in the 3 types of in vitro-prepared trp promoter-operator RNA are shown in the order in which they occur. The positions occupied by oligonucleotide t1 (G) are omitted from the diagram. RNase T1 oligonucleotides spanning the phage-bacterial fusion point, t14a, and the endpoint of the trpΔLD102 deletion, t6a, and trpΔLC145 deletion, t6b, are underlined.

various trp leader RNA fragments (Bertrand et al., 1977; Brown et al., 1978). Type 2: trp promoter-operator RNA prepared using φ80trpΔLD102 DNA as template and isolated via hybridizations to λtrpED1 and φ80trpΔLC1415 DNAs. The 3' end of this RNA is established by the trpΔLD102 deletion endpoint in the leader region. Upon electrophoresis as in Figure 3, its size was estimated to be 150 ± 10 nucleotides. Type 3: this RNA was prepared using DNA of φ80trpΔLC145-2 as template and hybridization to λtrpED1 and φ80trpΔLD102 DNAs. This RNA differed from type 1 RNA only at its termini (Fig. 2).
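The logic that fixes the ends of the three RNA types is an intersection: the RNA recovered is the part of the template transcript that is also present in every DNA used for hybridization. The sketch below illustrates this with positions numbered relative to the transcription start. The fusion point at about 112 base-pairs and the leader endpoints (1, 25 and 38 base-pairs) are taken from the legend to Fig. 1; every other coordinate is an assumed placeholder, chosen only to respect the gene order, and phage sequences in the read-through transcript are ignored.

```python
# Illustrative sketch: the isolated RNA as the intersection of the trp DNA
# carried by the template and by each hybridization partner.
# Coordinates are relative to the transcription start (+1 = first leader base).
# Values marked "assumed" are placeholders, not data from the paper.


def span(start, end):
    """Set of positions start..end inclusive (there is no position 0)."""
    return {p for p in range(start, end + 1) if p != 0}


FAR_LEFT = -500   # assumed: fusion points known only to lie left of -112
D_END = 3000      # assumed: distal end of trpD
DISTAL = 7000     # assumed: distal end of the retained trp DNA

# Bacterial trp DNA carried by each phage (names written in ASCII).
PHAGES = {
    # first 25 leader base-pairs, deletion ending in trpD (2500 assumed)
    "phi80trp-dLD102": span(-112, 25) | span(2500, DISTAL),
    # first leader base-pair only, deletion ending in trpC (4000 assumed)
    "phi80trp-dLC145-2": span(FAR_LEFT, 1) | span(4000, DISTAL),
    # first 38 leader base-pairs, deletion ending in trpC (4000 assumed)
    "phi80trp-dLC1415": span(FAR_LEFT, 38) | span(4000, DISTAL),
    # promoter-operator, leader, trpE and trpD
    "lambda-trpED1": span(FAR_LEFT, D_END),
}


def isolated_rna(template, *hybridized_to):
    """Positions transcribed from the template and retained by every hybridization."""
    region = set(PHAGES[template])
    for phage in hybridized_to:
        region &= PHAGES[phage]
    return region


def describe(region):
    """Return (leftmost position, rightmost position, length in nucleotides)."""
    return min(region), max(region), len(region)


if __name__ == "__main__":
    type1 = isolated_rna("phi80trp-dLD102", "lambda-trpED1", "phi80trp-dLC145-2")
    type2 = isolated_rna("phi80trp-dLD102", "lambda-trpED1", "phi80trp-dLC1415")
    type3 = isolated_rna("phi80trp-dLC145-2", "lambda-trpED1", "phi80trp-dLD102")
    for name, rna in (("type 1", type1), ("type 2", type2), ("type 3", type3)):
        print(name, describe(rna))
```

Under these assumptions the type 1 and type 3 segments cover the same positions, and the type 2 segment extends 24 nucleotides further into the leader. The lengths follow from the nominal fusion points and are only expected to agree approximately with the gel estimates.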

(ii) Analysis of RNase T1 and RNase A digests of trp promoter-operator RNA

trp promoter-operator RNA of each type was digested with RNase T1 and RNase A and fingerprinted. Examples of RNase T1 fingerprints of type 1 RNA are shown in Figure 4. The RNase A fingerprints of the same sample are shown in Figure 5. An RNase T1 fingerprint of type 2 RNA is shown in Figure 6. Several RNase T1 oligonucleotides, t14, t18, t11, and the trpΔLD102 fusion oligonucleotide t6a correspond to the 5' portion of the trp leader RNA sequence (Squires et al., 1976). RNase A and RNase T1 fingerprints representative of type 3 trp promoter-operator RNA are displayed in Figure 7. RNase T1 oligonucleotides present in type 3 RNA which are not present in type 1 RNA are t16 and t6b, while t14a and t8a are absent (Fig. 2). The major difference in the RNase A fingerprint of type 3 RNA compared to that of type 1 RNA (Fig. 5) is the appearance of Psa.

The analysis of each RNase T1 oligonucleotide was carried out by digestions with alkali, RNase U2 and RNase A as appropriate. The data obtained and nearest-neighbour information permitted the determination of the sequence of most of the oligonucleotides (Table 1). From the information obtained by RNase T1, alkali digestion, and nearest-neighbour data, the sequence of each of the RNase A oligonucleotides was also determined (Table 2).
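The relation between a sequence, its complete digestion products and the nearest-neighbour information can be illustrated with a short simulation. It assumes the usual specificities (RNase T1 cleaves after G, RNase A after C and U) and the usual nearest-neighbour rule that the α-phosphate of a labelled nucleotide ends up on the 3' side of the residue preceding it. The function names are hypothetical, and the test sequence is an illustrative stretch assembled from oligonucleotides listed in Table 1, not a quoted result.

```python
# Illustrative sketch of complete RNase digestion and nearest-neighbour labelling.
# Function names are hypothetical; the example RNA is for illustration only.


def digest(rna, cut_after):
    """Cleave rna on the 3' side of every residue in cut_after.

    Returns (oligonucleotide, nearest_neighbour) pairs, where the nearest
    neighbour is the residue that followed the oligonucleotide in the RNA
    (None for the 3'-terminal product).
    """
    products = []
    start = 0
    for i, base in enumerate(rna):
        if base in cut_after:
            neighbour = rna[i + 1] if i + 1 < len(rna) else None
            products.append((rna[start:i + 1], neighbour))
            start = i + 1
    if start < len(rna):
        products.append((rna[start:], None))
    return products


def rnase_t1(rna):
    """RNase T1 cleaves after G."""
    return digest(rna, "G")


def rnase_a(rna):
    """RNase A cleaves after the pyrimidines C and U."""
    return digest(rna, "CU")


def notation(products):
    """Write products in the style of Table 1, e.g. CAAG[U]."""
    return [f"{oligo}[{nn}]" if nn else oligo for oligo, nn in products]


def is_labelled(oligo, neighbour, labelled_ntp):
    """True if the product carries 32P when RNA is made with one (alpha-32P)NTP.

    Each phosphate in a product comes from the residue on its 3' side, so the
    product is radioactive if the labelled nucleotide occurs at any position
    after the first, or is the nearest neighbour.
    """
    return labelled_ntp in oligo[1:] or neighbour == labelled_ntp


if __name__ == "__main__":
    rna = "CAAGUUCACGUAAAAAGG"
    print(notation(rnase_t1(rna)))
    print(notation(rnase_a("UUCACG")))
    print(is_labelled("CAAG", "U", "C"))
```

In this model C-A-A-G[U] is labelled by GTP, ATP and UTP but not by CTP, since its only C is the 5'-terminal residue and that phosphate stays with the preceding oligonucleotide.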


FIG. 3. Polyacrylamide gel electrophoresis of trp promoter-operator RNA. The RNA (type 1) isolated as described in Materials and Methods was electrophoresed on 10% Tris-borate/EDTA/polyacrylamide slab gels (Maniatis et al., 1975a) containing 7 M-urea. The autoradiographs of 2 separate gels are shown. The left gel is [α-32P]GTP-labelled RNA and the right is [α-32P]UTP-labelled RNA. The origin is marked by O and the position of the xylene cyanol FF dye marker is indicated by the X. The GTP gel was electrophoresed for a longer time than the UTP gel. In each case a dark band appears with a mobility of approximately 0.58 relative to the xylene cyanol FF. Size estimates based on mobilities relative to dye markers (Maniatis et al., 1975a), and comparisons with mobilities of various trp mRNA leader segments (Bertrand et al., 1977; Brown et al., 1978) indicated that the dark band, the trp promoter-operator RNA of type 1, was 110 to 125 nucleotides long.

FIG. 4. RNase T1 fingerprints of trp promoter-operator type 1 RNA. The 2-dimensional separation was generated by electrophoresis at pH 3.5 on Cellogel from left to right and by homochromatography (mix C-10; Barrell, 1971) on PEI-cellulose thin-layers from bottom to top. The mobilities of various oligonucleotides in this system have been discussed previously (Squires et al., 1976). The oligonucleotides are numbered as in Table 1. (a) An RNase T1 fingerprint of the [α-32P]GTP-labelled gel band of Fig. 3. (b) An RNase T1 fingerprint of the [α-32P]UTP-labelled gel band of Fig. 3.


FIG. 5. RNase A fingerprints of trp promoter-operator type 1 RNA. The 2-dimensional fingerprint was generated as described in the legend to Fig. 4. The oligonucleotides are numbered as in Table 2. (a) RNase A fingerprint of type 1 RNA eluted from the polyacrylamide gel (Fig. 3) labelled with [α-32P]GTP. (b) RNase A fingerprint of type 1 RNA eluted from the polyacrylamide gel shown in Fig. 3, in which the RNA was labelled with [α-32P]UTP.


FIG. 6. RNase T1 fingerprint of trp promoter-operator type 2 RNA. The same fingerprinting system as in Fig. 4 was used. The RNase T1 oligonucleotides are numbered as in Table 1. t14 also contains t14a (see Fig. 4 and Table 1) which migrates at the same position. The method of isolation of the RNA is given in Materials and Methods. The RNA sample was labelled with [α-32P]GTP.

(iii) Identification of the 5' and 3'-terminal oligonucleotides of trp promoter-operator RNA of the three types

The ends of trp promoter-operator RNA could be identified by comparing the RNase T1 and RNase A oligonucleotides contained in RNA of the three types described above. The 3' end of type 2 RNA could be identified easily by the presence of oligonucleotides characteristic of the trp leader region (Fig. 6). RNA initiated at the trp promoter in strain trpΔLD102 and isolated by a hybridization procedure similar to that used here contains the first 25 nucleotides of the trp leader region and includes the RNase T1 fusion oligonucleotide A-C-G, numbered t6a in Figure 6 (Squires et al., 1976). Type 1 RNA (Fig. 3), isolated by hybridization to DNA of φ80trpΔLC145-2, a trp transducing phage which contains less of the trp leader region than φ80trpΔLD102 (Bertrand et al., 1976), did not contain the RNase T1 oligonucleotide t6a. Since the remaining oligonucleotides of type 1 and type 2 were identical and the size difference between the two RNAs can be accounted for by the absence of RNA corresponding to the trp leader region of trpΔLD102, it was concluded that the approximately 110 to 125 nucleotides of type 1 RNA were 5' to the trp leader sequence. The RNase T1 oligonucleotide which overlaps the initiation site of trp mRNA synthesis would be required to contain the 5' terminus of the trp leader, AAG with nearest-neighbour U (Squires et al., 1976). The only RNase T1 oligonucleotide of type 1 or type 2 RNA which fits this sequence is t8a, C-A-A-G[U] (Table 1). This alignment is

TABLE 1
Analysis of RNase T1 oligonucleotides

For RNA labelled with GTP, ATP, CTP and UTP the table lists the products identified upon digestion with NaOH, RNase A and RNase U2, together with the sequence deduced. Only the oligonucleotide numbers and deduced sequences are legible in this copy; the nearest neighbour is given in square brackets and letters refer to the footnotes.

Oligonucleotide no.   Sequence deduced            Footnote
1                     G[G], G[A], G[C], G[U]
3                     CG[G]                       a
4                     AG[G]                       b
4a                    UUG[A]
5                     CUG[U]                      c
6                     CCG[A]                      d
6a                    ACG[A]                      e
6b                    GAG[C]                      e
8a                    CAAG[U]                     f
9                     UACG[C]
10                    U(UC)UG[G]                  g
11                    UAUCG[A]
12                    AAAUG[A]
13                    UUUUUUG[C]                  h
14                    UUCACG[U]                   i
14a                   CUUGAG[G]                   i
15                    ACAAUG[U]
16                    GUCAAG[G]                   j
17                    AACUAG[U]
18                    UAAAAAG[G]                  k
19                    UUAACUAG[U]
20a                   CACUCCCG[U]                 l
22                    CAAAUAUUCUG[A]              m
23                    ACAUCAUAACG[G]
24                    ACAAUUAAUCAUCG[A]

3MM paper in pyridine/acetate (pH 3.5). Products of RNase A and RNase U2 were identified b y their mobilities following electrophoresis on D E A E - p a p e r in pyridine/acetate (pH 3.5). I n addition to those listed, oligonucleotides of sequence C-A-A-C-C-G,U-G and A-U-G were usually observed in low yield on RNase T1 fingerprints when ~8Otrp,~LDl02 was used as template for the preparation of the labelled RNA. Isolation of trp promoter-operator RI~A by hybridization did not result in a completely homogeneous product especially when the R N A was not purified on a polyacrylamide gel. The non-specific R N A in the preparation gave a background of small oligonucleotides in the RNase T1 and RNase A fingerprints (Table 2). These are listed in the footnotes as they were observed in the analysis, b u t their presence did not interfere with the sequence determination of the oligonucleotides f,.om the major R N A species present, a On some fingerprints a faint C-G[G] was observed, b On some fingerprints a faint A-G[A] was observed, c On most CTP-labelled fingerprints a weak spot was also observed. I t gave U and G as products when analyzed b y alkaline hydrolysis, d Occasionally a weak C-C-G[C] was also observed, e tea and t6~ occurred at the same position on the fingerprint, tea was observed when ¢80trpzJLDl02 was used as a template, and the R N A contained the other oligonucleotides of the trp leader, a~ in type 2 R N A (Fig. 5). tsb was observed only when ~b80trp~LC145-2 was used as template, type 3 R N A {Fig. 7). 
~A weak A-A-C-G was also observed at this position on some fingerprints, r F r o m the estimation of relative molar yields of the oligonueleotides present on RNase T 1 fingerprints of in vivo labelled R N A and from the relative intensity of tlo on in vitro fingerprints (Figs 4 and 6) it was considered t h a t 2 copies of tl0 were present, h A comparison of the mobility of t13 with the series U,Gp (n = 2 to 6) on PEI.cellulose homochromatography, and on extended electrophoresis on the pyridine/acetate (pH 3.5), DEAE-eullose system (Lee e~ al., 1976) indicated t h a t tls was UsGp or UeGp. Subsequent DNA sequence analysis showed t h a t the length of this tract was 6. l t~4 and t~4~ were superimposed on the RNase T~ fingerprint. W h e n ~80trp,dLDl02 was used as template and the isolated R N A contained the 5' portion of the leader (type 2 RNA) b o t h t14 and t14~ were present (Fig. 6). I f the R N A isolation was b y a hybridization to ~80trpLJLC145-2 (type 1 RNA), tl~ was absent and only t~4a was observed (Fig. 4). t~4a is not observed when ~80~rp~LC145-2 or ¢80trp~LC1415 was used as template (Fig. 7). J tie was observed when ¢80trp~JLC145-2 (type 3 RNA) or ¢80trp.tLC1415 was used as template (Fig. 7). k The length of the AnG oligonucleotide was previously determined (Squires el al., 1976). z A comparison of the mobility of the RNase U2 digestion product of t2o~ with pyrimidine-rich RNase T1 oligonucleotides on PEI-eellulose homochromatography indicated a length of 6 for the RNase U2 product, m The long RNase U~ product of t22 had the mobility of tlo upon electrophoresis in the pyridine/acetate (pH 3.5) DEAE-celtulose system. H y p h e n s have been omitted from sequences for clarity.

The analysis of RNase T1 oligonueleotides was carried out on in v/tro labelled p r o m o t e r - o p e r a t o r RNA. Digestion of the RNase TI oligonueleotides from in vivo labelled R N A with RNase A gave similar results. Products of alkaline hydrolysis were identified from their electrophoretic mobilities on W h a t m a n

TABLE 2

Analysis of RNase A oligonucleotides

[Table body not recoverable from the extracted text. Columns: oligonucleotide no.; sequence deduced; and, for RNA labelled with GTP, ATP, CTP and UTP in turn, the products identified upon digestion with NaOH and RNase T1.]

The products of alkaline hydrolysis were identified by their electrophoretic mobilities on Whatman 3MM paper in pyridine/acetate (pH 3.5). The products of RNase T1 digestion of the RNase A oligonucleotides were identified by their electrophoretic mobilities on DEAE-paper in pyridine/acetate (pH 3.5).
a G-C[U] was observed in some RNase A fingerprints.
b G-A-C[G] was only observed when φ80trpΔLD102 was used as template and the isolated RNA contained the other oligonucleotides of the trp leader, type 2 RNA. It arises from the trpΔLD102 fusion point in the trp leader (Fig. 2).
c A faint A-U[G] was observed on some fingerprints.
d G-A-G-C[U] was the only isomer found when φ80trpΔLC145-2 (type 3 RNA) or φ80trpΔLC1415 was used as a template. Both G-A-G-C[U] and A-G-G-C[G] were observed when φ80trpΔLD102 was used as template (type 1 and type 2 RNA) (Fig. 5).
e P9a was only observed when φ80trpΔLC145-2 (type 3 RNA) or φ80trpΔLC1415 DNA was used as template (Fig. 7).
f Evidence of a G-A-U was also observed on several fingerprints.
g Some fingerprints showed G-U labelled weakly by ATP or CTP.
h P20, a large oligonucleotide from the trp leader, was present only on fingerprints which gave the other characteristic oligonucleotides of the 5' end of trp mRNA. It was generally present in low yield on the fingerprint, and it was not completely analyzed. The complete sequence data for this oligonucleotide have been reported previously (Squires et al., 1976).

FIG. 7. Fingerprints of trp promoter-operator RNA of type 3. The fingerprinting system was that described in Fig. 4. The type 3 RNA was isolated as described in Materials and Methods. All oligonucleotides are numbered as in Tables 1 and 2. (a) RNase T1 fingerprint of type 3 RNA labelled with [α-32P]CTP. (b) RNase A fingerprint of type 3 RNA labelled with [α-32P]CTP.

also supported by the sequence of P14, A-A-G-U[U]. The identity of the trp promoter-operator RNase T1 oligonucleotides adjacent to those of the trp leader was demonstrated when type 2 RNA was hybridized to DNA of the phage λφ80trpED46. The resultant RNA was approximately 50 nucleotides long and contained three RNase T1 oligonucleotides (t17, t19 and t9) in addition to t8a from the initiation site and those from the 5' end of the trp leader (Bennett et al., 1976). These RNase T1 oligonucleotides (t17, t19, t9 and t8a) were found to be at the 3' end of type 1 RNA by analysis of partial RNase T1 products (Figs 2 and 8). Since the RNA of type 1 contained t8a but the type 3 RNA did not, the endpoint of the deletion trpΔLC145-2 was within this four-base-pair region, C-A-A-G. RNA of type 3, in which the 3' fusion oligonucleotide corresponding to this region would be present, showed the RNase T1 product t6b, not found in type 1 or type 2 RNA. The sequence of t6b, C-A-G[C], shows that the 5' portion of the sequence, C-A, remains, and this result defines the endpoint of this deletion as being after position 1 of the trp leader region. That the oligonucleotide t6b is from the 3' and not the 5' end of the molecule (type 3 RNA) was shown by a comparison of the fingerprints of type 3 RNA and trp promoter RNA isolated as was type 3 RNA but with φ80trpΔLC1415 DNA as the template. From the endpoints of these deletions (Fig. 1) these two isolated RNAs would be expected to differ only at the 3' end. A fingerprint of this material did not contain t6b, but contained all other oligonucleotides of type 3 RNA plus oligonucleotides from the trp leader, including t8a (Bennett et al., 1976). These observations allowed us to conclude that t8a was the 3'-terminal RNase T1 oligonucleotide of type 1 RNA and t6b was the 3'-terminal oligonucleotide of type 3 RNA. The analysis of partial RNase T1 digestion products also confirmed this conclusion (see below).
The 5' oligonucleotides of the trp promoter-operator RNA were identified by comparing the RNase T1 and RNase A fingerprints of type 1 and type 3 RNA. When φ80trpΔLD102 DNA is used as template a 5'-terminal oligonucleotide should be isolated which overlaps the endpoint of the bacterial sequence with that of phage. In those phage which have the phage-bacterial fusion point further to the left of the trp promoter than in φ80trpΔLD102, an oligonucleotide corresponding solely to bacterial sequences would be isolated by hybridization to φ80trpΔLD102 DNA. Thus, the sequences at the 3' end of the 5'-terminal oligonucleotide of type 1 and type 3 RNA should be the same but the sequence should differ at the 5' end of the oligonucleotide. The RNase T1 fingerprints of these two RNAs (Figs 4 and 7) show the difference at the 3' end of the molecule (t8a versus t6b), and the additional alteration of t14a in the case of type 1 RNA, C-U-U-C-A-G[G], and t16 in the case of type 3 RNA, C-U-C-A-A-G[G]. A difference is also observed in the RNase A fingerprints. Type 1 RNA contains P9, A-G-G-C[G], while type 3 RNA has P9a, A-A-G-G-C[G], instead. These RNase A oligonucleotides overlap the 3' terminus of the corresponding RNase T1 oligonucleotides, t14a and t16, respectively. The sequence of the two 5' ends of the RNAs was concluded to be

type 1 RNA 5' C-U-U-C-A-G-G-C-G
type 3 RNA 5' C-U-C-A-A-G-G-C-G

with the phage-bacterial fusion point in φ80trpΔLD102 at the position indicated by

5' CUCAAGGCGCACUCCCGUUCUGGAUAAUGUUUUUUGCGCCGACAUCAUAACGGUUCUGGCAAAUAUUCUGAAAUGAGCUGUUGACAAUUAAUCAUCGAACUAGUUAACUAGUACGCAAGUUCACGUAAAAAGGGUAUCGA 3'

FIG. 8. The sequence of trp promoter-operator RNA (hyphens omitted for clarity). The positions of the oligonucleotides of Tables 1 and 2 are shown above the sequence with [ ] representing the 3' nearest-neighbour of each oligonucleotide. The sequence is numbered with +1 being the first transcribed nucleotide of trp mRNA (Squires et al., 1976). Below the sequence is shown the portion spanned by various products of partial digestion. In each case the partial product is labelled to indicate its origin, either CM-RNase A (CM) or RNase T1 (PT), the experiment number in which it was obtained (for example 124-B or 13) and the individual spot number assigned to it on the initial 2-dimensional separation of the partial digest. Many additional partial digestion products were analyzed which were similar or identical in sequence to those illustrated. The sequence shown is that of the type 3 RNA at the 5' end with t16 and P9a to show the complete bacterial sequence of this region. At the 3' end the oligonucleotides derived from the trp leader region are shown but the sequence of the trpΔLD102 fusion oligonucleotide t6a, A-C-G, corresponding to nucleotides 24-25 is not shown. t6b (Table 1), the fusion oligonucleotide of deletion strain trpΔLC145 (type 3 RNA), corresponds to positions -1 to +2.

the arrow. This information also allowed the ordering of the RNase T1 oligonucleotides t16-t1-t3 at the 5' end of the RNA as a first step in the complete ordering process.
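The comparison of the two 5' ends can be reproduced with a small in-silico digest. The sketch below is only an illustration under simple assumptions: RNase T1 is modelled as cutting 3' to every G and RNase A as cutting 3' to every C or U, and the function and variable names are hypothetical. It takes the two nine-nucleotide 5' ends given in the text and recovers t14a versus t16 and P9 versus P9a.

```python
def digest(rna, cut_after):
    """Cut an RNA string 3' to every base listed in cut_after."""
    fragments, start = [], 0
    for i, base in enumerate(rna):
        if base in cut_after:
            fragments.append(rna[start:i + 1])
            start = i + 1
    if start < len(rna):
        # Trailing piece: an artefact of truncating the sequence,
        # not a complete digestion product.
        fragments.append(rna[start:])
    return fragments


def rnase_t1(rna):
    return digest(rna, "G")    # assumed specificity: 3' to G


def rnase_a(rna):
    return digest(rna, "CU")   # assumed specificity: 3' to pyrimidines


TYPE1_5P = "CUUCAGGCG"  # 5' end of type 1 RNA (from the text)
TYPE3_5P = "CUCAAGGCG"  # 5' end of type 3 RNA (from the text)

if __name__ == "__main__":
    for name, rna in (("type 1", TYPE1_5P), ("type 3", TYPE3_5P)):
        print(name, "T1:", rnase_t1(rna), "A:", rnase_a(rna))
```

The first RNase T1 fragment differs between the two RNAs (CUUCAG versus CUCAAG), and the only multi-base RNase A fragment differs in the same way (AGGC versus AAGGC), which is the overlap the text relies on.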

(iv) Partial digestion products of trp promoter-operator RNA

In order to reconstruct the complete sequence of the trp promoter-operator RNA, partial digestions were carried out with CM-RNase A and RNase T1 as described in Materials and Methods. RNA prepared in vivo was subjected to partial digestion with CM-RNase A. The products were analyzed as described in Materials and Methods. The positions of CM-RNase A partial products are shown in the overall sequence in Figure 8. Many of the partial products were observed in several different experiments; some spanned the same region of the sequence but had slightly different termini. Partial digestions of trp promoter-operator RNA were also carried out with RNase T1. The products obtained were useful in deducing the sequences near the 5' and 3' ends of the molecule. At the 3' end of the RNA of type 1, a partial product containing the two terminal RNase T1 oligonucleotides t9 and t8a was frequently observed. The analogous product, containing t9 and t6b, was identified in a partial RNase T1 digest of type 3 RNA. The series of RNase T1 partial products PT-19-10, PT-19-12 and PT-19-2 (Fig. 8), along with several CM-RNase A products from this region, allowed the determination of the sequence of the first 33 nucleotides preceding the transcription start-site (Bennett et al., 1976). Essentially the same sequence information was obtained by analysis of the product of hybridization of type 2 RNA to λφ80trpED46 DNA (Bennett et al., 1976). A similar series of partial RNase T1 products was observed originating from the 5' end, with PT-11-2 and PT-11-1 (Fig. 8) extending the 5'-terminal sequence given above to include t20a and t10. This alignment was supported by CM-RNase A product CM-125-B-2 (Fig. 8).
Since RNase A oligonucleotide P18, G-U[C] (Table 2), was present in quite a low yield compared to G-U[U], it was thought that the connection between t20a and t10, G-U, indicated the sequence of t10 at this position to be U-U-C-U-G. This allowed the sequence block from -116 to -95 to be deduced (Fig. 8). Other sections of the sequence were also deduced from information provided by both types of partial products. The region from -52 to -32 is ordered by three CM-RNase A products of differing lengths with their 5' termini at nucleotide -52. These data order the RNase T1 oligonucleotides t22, t12, t2, t5 and t4a. The partial RNase T1 products PT-13-5 and PT-13-15 add support to the sequences deduced for this block and, when the complete sequence of t22 is included, the sequence covered by this block extends from -57 to -32 (Fig. 8). Another segment can be placed together between nucleotides -91 and -57. This sequence is based on the CM-RNase A products which have their 5' end at -91 (Fig. 8) and which extend at least into the U stretch of t13 (CM-124-B-12). The longest of these partial products links t15, t13, t3 and t6. The sequence of the 3' section of this block is based on the CM-RNase A products (CM-124-A-41 and CM-124-B-4) with 3' endpoints at -57, which link t23, t1, t10 and t1. The exact sequence of t10 of this block, U-U-C-U-G, is based on the G-G immediately 5' to it, and on the sole nearest-neighbour, U, of G-G-U (Table 2). The partial RNase T1 product PT-11-16 links the two sections of this block by spanning t3, t6, t23 and t10 to give the order t15-t13-t3-t6-t23-t1-t10-t1.
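The block assignments above can be checked against the final sequence by digesting it computationally. The following is a minimal sketch, assuming that RNase T1 cuts 3' to every G and using the RNA sequence of Figure 8 (type 3 5' end, leader truncated at +24); the helper names are hypothetical. It lists each complete-digest fragment with the coordinate of its first nucleotide and its 3' nearest-neighbour.

```python
# RNA sequence of Figure 8: 116 nucleotides upstream of +1, then +1 to +24.
UPSTREAM = ("CUCAAGGCGCACUCCCGUUCUGGAUAAUGUUUUUUGCGCC"
            "GACAUCAUAACGGUUCUGGCAAAUAUUCUGAAAUGAGCUG"
            "UUGACAAUUAAUCAUCGAACUAGUUAACUAGUACGC")
LEADER = "AAGUUCACGUAAAAAGGGUAUCGA"
RNA = UPSTREAM + LEADER


def position(index):
    """Map a 0-based string index to the paper's numbering (no position 0)."""
    offset = index - len(UPSTREAM)
    return offset if offset < 0 else offset + 1


def rnase_t1_map(rna):
    """Complete RNase T1 digest as (start position, fragment, nearest-neighbour)."""
    products, start = [], 0
    for i, base in enumerate(rna):
        if base == "G":
            neighbour = rna[i + 1] if i + 1 < len(rna) else ""
            products.append((position(start), rna[start:i + 1], neighbour))
            start = i + 1
    return products


if __name__ == "__main__":
    for start, fragment, neighbour in rnase_t1_map(RNA):
        print(f"{start:+5d}  {fragment}[{neighbour}]")
```

In this digest U-U-C-U-G appears twice, starting at -99 and at -63, which agrees with the two copies of t10 inferred from molar yields, and the fragment C-A-A-A-U-A-U-U-C-U-G[A] starts at -57, the position given for the 5' end of t22.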


The sequence of the above block leaves only t24 (containing a 5'-terminal A-C) as the oligonucleotide which is at the 3' end of CM-125-B-38, spanning positions -52 to -32 (Fig. 8). The sequence from the initiation site can now be extended back to the 5' end of t22 at position -57. The block extending from t15 at the 5' end to a 3' terminus of G-G-C (-93 to -57) may be placed adjacent to t22 since the nearest-neighbour of P12, G-G-C, is A, which requires an RNase T1 oligonucleotide with a 5'-terminal C-A as the 3' neighbour of CM-124-B-4. t22 is the only RNase T1 oligonucleotide containing a 5' C-A which has not been assigned a 5' neighbour. This orders all of the RNase T1 oligonucleotides; the only gap remaining in the sequence is between PT-11-1 and CM-124-B-12. The two RNase T1 oligonucleotides t10 and t15 are evidently adjacent since the only G-containing RNase A product not already included in the preceding blocks, P17, is G-G-A-U[A]; it fits the 3' end of t10, PyG[G], onto the 5' terminus of t15, A-U-A. This completes the sequence illustrated in Figure 8. The sequence of the 5' portion of the leader is also shown since oligonucleotides of the trp leader region are present on fingerprints (Fig. 5) and are included in Tables 1 and 2. The sequence of the 5' end of the trp leader region was described previously (Squires et al., 1976).

Although the simplest version of the RNA sequence can be deduced from the data presented, it is difficult to exclude the presence of additional small oligonucleotides in the sequence. This possibility is more likely for stretches high in alternating G-Cs such as at positions -111 to -107 and -81 to -76. To confirm the sequence deduced from RNA sequencing and to eliminate the possibility that certain small oligonucleotides were omitted, we have also determined the nucleotide sequence of this region by direct DNA sequence analysis using the procedure of Maxam & Gilbert (1977).
(b) DNA sequence analysis

(i) Restriction enzyme sites in the trp promoter-operator region

In the DNA sequence predicted from the RNA sequence in Figure 8, there are recognition sites for several restriction endonucleases: HpaI, G-T-T-A-A-C (Garfin & Goodman, 1974) at positions -14 to -9; HindII or HincII, G-T-Py-Pu-A-C (Kelly & Smith, 1970; Landy et al., 1974) at positions -37 to -32; AluI, A-G-C-T (Roberts et al., 1976a) at positions -41 to -38; and HhaI, G-C-G-C (Roberts et al., 1976b) at positions -81 to -78 and -110 to -107. Restriction maps of a plasmid containing a segment of the trp operon, pVH151, and of a small restriction fragment containing the trp promoter-operator region are shown in Figure 9. Details of the restriction mapping are discussed elsewhere (Brown et al., 1978). Cleavage sites of all the enzymes mentioned above were found at the positions predicted from the RNA sequence.

(ii) Restriction fragments used in DNA sequencing

The DNA sequencing procedure of Maxam & Gilbert (1977) can be used to determine the sequence of approximately 100 nucleotides from any 5'-labelled end. Suitable positions for end-labelling for sequencing the trp promoter-operator region are the ends of fragments formed by cleavage with the enzymes HpaI and HincII. The direction of sequence analysis from these sites is indicated in Figure 9(b) by the arrows below the restriction map. The EcoRI-HpaI 1.6 × 10^6 and HpaI 2.5 × 10^6 Mr fragments (Fig. 9) were isolated and labelled at their 5' ends with [γ-32P]ATP and T4 polynucleotide kinase.
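The site predictions listed in section (i) can be reproduced by scanning the upper DNA strand for each recognition sequence. The sketch below is an illustration only: it uses the 140-base upper strand of Figure 10, writes the degenerate HincII site as a regular expression, and uses hypothetical helper names. One point it makes visible is that the HpaI hexamer G-T-T-A-A-C is itself a special case of G-T-Py-Pu-A-C, so a plain pattern search reports the HpaI site under HincII as well.

```python
import re

# Upper DNA strand of Figure 10: 116 bp upstream of +1, then +1 to +24.
UPSTREAM = ("CTCAAGGCGCACTCCCGTTCTGGATAATGTTTTTTGCGCC"
            "GACATCATAACGGTTCTGGCAAATATTCTGAAATGAGCTG"
            "TTGACAATTAATCATCGAACTAGTTAACTAGTACGC")
LEADER = "AAGTTCACGTAAAAAGGGTATCGA"
DNA = UPSTREAM + LEADER

# Recognition sequences quoted in the text, as regular expressions.
SITES = {
    "HpaI": "GTTAAC",
    "HincII": "GT[CT][AG]AC",   # G-T-Py-Pu-A-C
    "AluI": "AGCT",
    "HhaI": "GCGC",
}


def position(index):
    """Map a 0-based string index to the paper's numbering (no position 0)."""
    offset = index - len(UPSTREAM)
    return offset if offset < 0 else offset + 1


def find_sites(dna, sites):
    """Return {enzyme: [(first position, last position), ...]}."""
    hits = {}
    for enzyme, pattern in sites.items():
        # A lookahead lets overlapping occurrences be reported too.
        hits[enzyme] = [
            (position(m.start(1)), position(m.end(1) - 1))
            for m in re.finditer(f"(?=({pattern}))", dna)
        ]
    return hits


if __name__ == "__main__":
    for enzyme, spans in find_sites(DNA, SITES).items():
        print(enzyme, spans)
```

The HhaI site at +61 mentioned later in the text lies outside the 140 bases used here, so it is not found by this scan.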

FIG. 9. Restriction map of plasmid pVH151 and the region around the trp promoter-operator. (a) The ColE1 vector, shown as a broad line, has a molecular weight of 4.2 × 10^6 (Bazaral & Helinski, 1968). Estimates of the molecular weights of the EcoRI-HpaI (1.6 × 10^6), HpaI-HindIII (1.3 × 10^6), HindIII (1.1 × 10^6), HindIII-HpaI (0.1 × 10^6) and HpaI-EcoRI (0.6 × 10^6) fragments were based on their electrophoretic mobilities on agarose gels relative to DNA fragments of known size from bacteriophage φ80 and other plasmids (Selker et al., 1977; Brown et al., 1978). An asterisk marks the HpaI site which is protected by RNA polymerase and Trp repressor. (b) A restriction map of the HpaII DNA fragment which contains the RNA polymerase and Trp repressor protected HpaI site (at base-pair position -11 and indicated with an asterisk). The numbering is in base-pairs from the initiation site (+1) of trp mRNA transcription (Squires et al., 1976). Numbers in the transcribed region are positive and in the region preceding the mRNA start-site they are negative. The position of the centre of each restriction enzyme recognition site is shown by an arrow. Details of the mapping procedures are described in an accompanying paper (Brown et al., 1978). The arrows below the restriction map indicate the direction and location of the DNA sequencing runs (Maxam & Gilbert, 1977) referred to in the text.

Restriction fragments containing only a single labelled end were obtained from each by digestion with the Haemophilus haemolyticus enzyme, HhaI. This cleavage generates a fragment of approximately 70 base-pairs which extends from the labelled HpaI end at position -12 of the EcoRI-HpaI 1.6 × 10^6 Mr fragment to the HhaI site at -78 (Fig. 9). A sequencing gel of this fragment gave information on the region from -13 to -66. This, combined with the unique sequence cleaved by HpaI, G-T-T-A-A-C (Garfin & Goodman, 1974), confirmed the RNA sequence in the region from -9 to -66. Digestion of the end-labelled HpaI 2.5 × 10^6 Mr fragment with HhaI produced a fragment labelled at the HpaI end (position -11); its unlabelled end was at the HhaI site at +61 in the trp leader region. A DNA sequencing gel of this fragment gave the


sequence of 34 nucleotides (-10 to +24) from the HpaI site and confirmed the previously reported sequence of the 11 base-pairs preceding the transcription start-site (Bennett et al., 1976) and the 5' portion of the trp leader region (Squires et al., 1976). To obtain DNA sequence information on the region from -67 to -116, we used a HinfI-HincII fragment (Fig. 9(b)) labelled at position -35, at the end generated by HincII (see Materials and Methods). A sequencing gel of this fragment provided sequence information for the region from -52 to -116. The overall DNA sequence of the segment analyzed is shown in Figure 10. It is in agreement with the RNA sequence data.

4. Discussion

The nucleotide sequence of 116 residues preceding the site of transcription initiation in the tryptophan operon of E. coli was determined by RNA and DNA sequencing techniques. Read-through transcripts originating at the phage gene N promoter of φ80trp transducing phages contain a segment which corresponds to the trp promoter-operator region. This RNA segment was isolated by selective hybridization and sequenced using conventional techniques. The RNA sequence deduced (Fig. 8) predicts the existence of several restriction enzyme cleavage sites within the trp promoter-operator region. One of these, the HpaI site just preceding the transcription-initiation point, is protected against HpaI endonuclease digestion by bound RNA polymerase or Trp repressor (Bennett et al., 1976). This observation permitted us to recognize restriction fragments bearing the trp promoter-operator region (Brown et al., 1978). A restriction map (Fig. 9) of the trp promoter-operator region was prepared (Brown et al., 1978) and appropriate restriction fragments were employed in DNA sequence analyses of this region using the method of Maxam & Gilbert (1977). The results obtained confirmed the RNA sequence data and extended the sequenced region, as shown in Figure 10.
Several lines of evidence indicate that the sequence in Figure 10 contains the trp promoter and operator. First, the transducing phage (φ80trpΔLD102) (Fig. 1) used to define the left-hand bacterial endpoint of the region sequenced exhibits normal trp promoter and operator functions in vivo and in vitro (Rose et al., 1973; Zalkin et al., 1974). Secondly, analyses of regions essential for trp promoter function (Brown et al., 1978) suggest that no specific sequence preceding -78 or beyond +1 (Bennett et al., 1976) is required for transcription initiation. Thirdly, polymerase binds to and protects from nuclease attack the region from approximately -39 to +20 (Brown et al., 1978). Finally, a phage which does not have trp promoter function, λtrpED46 (Fig. 1; Franklin, 1971), has a phage-bacterial fusion such that it contains only 17±2 base-pairs of bacterial origin preceding the normal transcription-initiation site (Bennett et al., 1976). These observations and the results of transcription studies in vitro with restriction fragments (Brown et al., 1978) indicate that the left-hand end of the trp promoter region of E. coli is between -78 and -39.

The trp operator appears to be located within the 20-base-pair symmetrical region immediately preceding the transcription start-site. Trp repressor protects the HpaI site in this region (at -14 to -9) against nuclease attack (Bennett et al., 1976). In addition, the positions of base-pair alterations in operator-constitutive mutants are on either side of the HpaI site (Bennett & Yanofsky, 1978). Finally, a deletion which replaces the region beyond +1 by a foreign sequence does not affect operator function (Bennett et al., 1976).
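Twofold rotational symmetry in a DNA segment means that the upper strand reads the same as the reverse complement of itself. The sketch below scores this for one window of the Figure 10 sequence. The choice of window is an assumption made here for illustration: the 20 base-pairs from -21 to -2, which are centred on the middle of the HpaI site. The exact extent of the symmetry bars drawn in Figure 10 is not recoverable from the extracted text, and the helper names are hypothetical.

```python
# Upper DNA strand upstream of +1 (positions -116 to -1), from Figure 10.
UPSTREAM = ("CTCAAGGCGCACTCCCGTTCTGGATAATGTTTTTTGCGCC"
            "GACATCATAACGGTTCTGGCAAATATTCTGAAATGAGCTG"
            "TTGACAATTAATCATCGAACTAGTTAACTAGTACGC")

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def window(first, last):
    """Upper-strand bases from position first to last (both negative)."""
    n = len(UPSTREAM)
    return UPSTREAM[n + first:n + last + 1]


def dyad_pairs(segment):
    """For each pair of bases equidistant from the centre, report whether
    one is the complement of the other (True for a symmetric pair)."""
    n = len(segment)
    return [COMPLEMENT[segment[i]] == segment[n - 1 - i] for i in range(n // 2)]


# Assumed operator window: -21 to -2, centred between -12 and -11.
OPERATOR = window(-21, -2)

if __name__ == "__main__":
    pairs = dyad_pairs(OPERATOR)
    print(OPERATOR)
    print(f"{sum(pairs)} of {len(pairs)} pairs are symmetric")
```

For this window nine of the ten base pairs on either side of the centre are related by the twofold axis; the single exception is the pair at -19 and -4.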

-116 to -47
5' CTCAAGGCGCACTCCCGTTCTGGATAATGTTTTTTGCGCCGACATCATAACGGTTCTGGCAAATATTCTG 3'
3' GAGTTCCGCGTGAGGGCAAGACCTATTACAAAAAACGCGGCTGTAGTATTGCCAAGACCGTTTATAAGAC 5'

-46 to +24
5' AAATGAGCTGTTGACAATTAATCATCGAACTAGTTAACTAGTACGCAAGTTCACGTAAAAAGGGTATCGA 3'
3' TTTACTCGACAACTGTTAATTAGTAGCTTGATCAATTGATCATGCGTTCAAGTGCATTTTTCCCATAGCT 5'

Restriction sites labelled in the figure: HhaI, HhaI, AluI, HincII, HpaI.

FIG. 10. The DNA sequence of the trp promoter-operator region. Base-pairs are numbered from the trp mRNA transcription initiation site designated +1. The centres of the hyphenated symmetries are indicated by symbols between the strands. The nucleotide pairs involved in each symmetry are denoted by similar bars above and below the sequence. The cleavage points of restriction enzymes on the upper DNA strand are indicated by arrows. Hyphens have been omitted for clarity.

The DNA sequences of a number of promoters have been compared and two regions show similarities: (i) the region from position -12 to -6 (Schaller et al., 1975; Pribnow, 1975a,b) and (ii) the region around position -35 (Maniatis et al., 1975b; Takanami et al., 1976; Sekiya et al., 1976; Gilbert, 1976). In the trp promoter (Fig. 10) the nucleotides at only three of the seven positions from -12 to -6 are homologous with the ideal heptamer, T-A-T-Pu-A-T-G, considered by Pribnow (1975b). Perhaps the dual function of this segment, i.e. its essential role in both operator and promoter function, severely limits acceptable sequences. The sequence of the trp promoter in the region -32 to -38 is similar to the relatively strong, cyclic-AMP independent promoters λPL and λPR (Maniatis et al., 1975b), and the promoter recognized by E. coli RNA polymerase on simian virus 40 DNA (Dhar et al., 1974). The manner in which RNA polymerase interacts with the DNA of the trp promoter is discussed further in the following paper (Bennett et al., 1978).
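The count of three matches out of seven can be checked directly. The following sketch compares the upper-strand bases at -12 to -6 of the Figure 10 sequence with the ideal heptamer T-A-T-Pu-A-T-G, treating Pu as either A or G; the helper names are hypothetical.

```python
# Upper DNA strand upstream of +1 (positions -116 to -1), from Figure 10.
UPSTREAM = ("CTCAAGGCGCACTCCCGTTCTGGATAATGTTTTTTGCGCC"
            "GACATCATAACGGTTCTGGCAAATATTCTGAAATGAGCTG"
            "TTGACAATTAATCATCGAACTAGTTAACTAGTACGC")

# Ideal heptamer T-A-T-Pu-A-T-G; each entry lists the bases allowed there.
IDEAL_HEPTAMER = ["T", "A", "T", "AG", "A", "T", "G"]


def window(first, last):
    """Upper-strand bases from position first to last (both negative)."""
    n = len(UPSTREAM)
    return UPSTREAM[n + first:n + last + 1]


def heptamer_matches(segment, ideal=IDEAL_HEPTAMER):
    """Per-position agreement between a 7-base segment and the ideal heptamer."""
    return [base in allowed for base, allowed in zip(segment, ideal)]


TRP_HEPTAMER = window(-12, -6)

if __name__ == "__main__":
    matches = heptamer_matches(TRP_HEPTAMER)
    print(TRP_HEPTAMER, matches, sum(matches))
```

The trp segment is T-A-A-C-T-A-G, which agrees with the ideal heptamer only at positions -12, -11 and -6.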

We thank A. Maxam and W. Gilbert for a detailed protocol on the DNA sequencing method and instruction on its use. We acknowledge the assistance of Miriam Bonner, Virginia Horn and Joan Rowe. We thank V. Hershfield and D. Helinski for trp plasmid-bearing strains. The study was supported by grants from the National Science Foundation (PCM73-06774), the U.S. Public Health Service (GM09738) and the American Heart Association. One of us (G. N. B.) is a U.S.P.H.S. postdoctoral fellow and another author (M. E. S.) was a postdoctoral fellow supported by the Swiss National Foundation. A third author (K. D. B.) was a Career Investigator Visiting Scientist of the American Heart Association and was an awardee of the Australian Research Grants Committee. A fourth author (C. Y.) is a Career Investigator of the American Heart Association. These studies were performed using standard microbiological procedures which conform to the National Institutes of Health guidelines.

REFERENCES

Barrell, B. G. (1971). In Procedures in Nucleic Acid Research (Cantoni, G. L. & Davies, D. R., eds), vol. 2, pp. 751-779, Harper and Row, New York.
Bazaral, M. & Helinski, D. R. (1968). J. Mol. Biol. 36, 185-194.
Bennett, G. N. & Yanofsky, C. (1978). J. Mol. Biol. 121, 179-192.
Bennett, G. N., Schweingruber, M. E., Brown, K. D., Squires, C. & Yanofsky, C. (1976). Proc. Nat. Acad. Sci., U.S.A. 73, 2351-2355.
Bennett, G. N., Brown, K. D. & Yanofsky, C. (1978). J. Mol. Biol. 121, 139-152.
Bertrand, K., Squires, C. & Yanofsky, C. (1976). J. Mol. Biol. 103, 319-337.
Bertrand, K., Korn, L. J., Lee, F. & Yanofsky, C. (1977). J. Mol. Biol. 117, 227-247.
Brown, K. D., Bennett, G. N., Lee, F., Schweingruber, M. E. & Yanofsky, C. (1978).
Brownlee, G. G. (1972). Determination of Sequences in RNA, American Elsevier Publishing Co., New York.
Brownlee, G. & Sanger, F. (1969). Eur. J. Biochem. 11, 395-399.
Cohen, G. & Jacob, F. (1959). C. R. H. Acad. Sci. 248, 3490.
Dhar, R., Weissman, S. M., Zain, B. S., Pan, J. & Lewis, A. M. Jr (1974). Nucleic Acids Res. 1, 595-614.
Franklin, N. (1971). In The Bacteriophage Lambda (Hershey, A. D., ed.), pp. 621-638, Cold Spring Harbor Laboratories, Cold Spring Harbor, New York.
Garen, A. & Levinthal, C. (1960). Biochim. Biophys. Acta, 38, 470.
Garfin, D. E. & Goodman, H. M. (1974). Biochem. Biophys. Res. Commun. 59, 108-116.
Gilbert, W. (1976). In RNA Polymerase (Losick, R. & Chamberlin, M., eds), pp. 193-205, Cold Spring Harbor Laboratories, Cold Spring Harbor, New York.
Glynn, I. M. & Chappell, J. B. (1964). Biochem. J. 90, 147-149.
Heinrikson, R. (1966). J. Biol. Chem. 241, 1393-1405.
Helinski, D. R., Hershfield, V., Figurski, D. & Meyer, R. J. (1977). In 10th Miles International Symposium: Impact of Recombinant Molecules in Science and Society (Beers, R. F. & Bassett, E. G., eds), pp. 151-165, Raven Press, New York.
Hershfield, V., Boyer, H. W., Yanofsky, C., Lovett, M. A. & Helinski, D. R. (1974). Proc. Nat. Acad. Sci., U.S.A. 71, 3455-3459.
Imamoto, F. & Tani, S. (1972). Nature New Biol. 240, 172-175.
Kelly, T. J. & Smith, H. O. (1970). J. Mol. Biol. 51, 393-409.
Landy, A., Ruedisueli, E., Robinson, L., Foeller, C. & Ross, W. (1974). Biochemistry, 13, 2134-2142.
Lee, F., Squires, C. L., Squires, C. & Yanofsky, C. (1976). J. Mol. Biol. 103, 383-393.
Maniatis, T., Jeffrey, A. & van de Sande, H. (1975a). Biochemistry, 14, 3787-3794.
Maniatis, T., Ptashne, M., Backman, K., Kleid, D., Flashman, S., Jeffrey, A. & Maurer, R. (1975b). Cell, 5, 109-113.
Manson, M. D. & Yanofsky, C. (1976). J. Bacteriol. 126, 679-689.
Maxam, A. & Gilbert, W. (1977). Proc. Nat. Acad. Sci., U.S.A. 74, 560-564.
McGeoch, D., McGeoch, J. & Morse, D. (1973). Nature New Biol. 245, 137-140.
Pribnow, D. (1975a). Proc. Nat. Acad. Sci., U.S.A. 72, 784-788.
Pribnow, D. (1975b). J. Mol. Biol. 99, 419-443.
Roberts, R. J., Myers, P. A., Morrison, A. & Murray, K. (1976a). J. Mol. Biol. 102, 157-165.
Roberts, R. J., Myers, P. A., Morrison, A. & Murray, K. (1976b). J. Mol. Biol. 103, 199-208.
Rose, J. K., Squires, C. L., Yanofsky, C., Yang, H-L. & Zubay, G. (1973). Nature New Biol. 245, 133-137.
Schaller, H., Gray, C. & Herrmann, K. (1975). Proc. Nat. Acad. Sci., U.S.A. 72, 737-741.
Sekiya, T., Gait, M. J., Noris, K., Ramamoorthy, B. & Khorana, H. G. (1976). J. Biol. Chem. 251, 4481-4489.
Selker, E., Brown, K. & Yanofsky, C. (1977). J. Bacteriol. 129, 388-394.
Shapiro, J., MacHattie, L., Eron, L., Ihler, G., Ippen, K. & Beckwith, J. (1969). Nature (London), 224, 768-774.
Southern, E. M. (1974). Anal. Biochem. 62, 317-318.
Squires, C. L., Lee, F. D. & Yanofsky, C. (1975). J. Mol. Biol. 92, 93-111.
Squires, C., Lee, F., Bertrand, K., Squires, C. L., Bronson, M. J. & Yanofsky, C. (1976). J. Mol. Biol. 103, 351-381.
Takanami, M., Sugimoto, K., Sugisaki, H. & Okamoto, T. (1976). Nature (London), 260, 297-302.
Yanofsky, C., Horn, V., Bonner, M. & Stasiowski, S. (1971). Genetics, 69, 409-433.
Zalkin, H., Yanofsky, C. & Squires, C. L. (1974). J. Biol. Chem. 249, 465-475.
Zimmerman, S. G. & Sandeen, G. (1966). Anal. Biochem. 14, 269-277.
Zubay, G., Chambers, D. A. & Cheong, L. C. (1970). In The Lactose Operon (Beckwith, J. R. & Zipser, D., eds), pp. 375-391, Cold Spring Harbor Laboratories, Cold Spring Harbor, New York.
