Cell, Vol. 63, 417-424, October 19, 1990, Copyright © 1990 by Cell Press

A Self-Splicing Group I Intron in the DNA Polymerase Gene of Bacillus subtilis Bacteriophage SPO1

Heidi Goodrich-Blair,* Vincenzo Scarlato,†‡ Jonatha M. Gott,*§ Ming-Qun Xu,* and David A. Shub*

* Department of Biological Sciences and Center for Molecular Genetics, State University of New York, Albany, Albany, New York 12222

† Department of Biology, University of California, San Diego, La Jolla, California 92093, and International Institute of Genetics and Biophysics, Naples, Italy

Summary

We report a self-splicing intron in bacteriophage SPO1, whose host is the gram-positive Bacillus subtilis. The intron contains all the conserved features of primary sequence and secondary structure previously described for the group IA introns of eukaryotic organelles and the gram-negative bacteriophage T4. The SPO1 intron contains an open reading frame of 522 nucleotides. As in the T4 introns, this open reading frame begins in a region that is looped out of the secondary structure, but ends in a highly conserved region of the intron core. The exons encode SPO1 DNA polymerase, which is highly similar to E. coli DNA polymerase I. The demonstration of self-splicing introns in viruses of both gram-positive and gram-negative eubacteria lends further evidence for their early origin in evolution.

Introduction

The discovery of a self-splicing group I intron in the td (thymidylate synthase) gene of bacteriophage T4 (Chu et al., 1984; Ehrenman et al., 1986) was entirely unexpected. Previously, mRNA splicing had been observed only in eukaryotes. In addition, introns capable of self-splicing were restricted to genes of mitochondria, chloroplasts, and rRNA genes of eukaryotic protists (reviewed in Waring and Davies, 1984; Cech, 1988). The origin of the td intron was, therefore, uncertain. Michel and Dujon (1986) pointed out the close resemblance of parts of the open reading frame (ORF) of the td intron to ORFs in introns of fungal mitochondria. This, they suggested, raised the possibility that T4 may have obtained its intron by horizontal gene transfer. Indeed, the discovery that the td intronic ORF encodes a site-specific DNA endonuclease and is capable of promoting the transfer of the intron to "homing sites" in genetic crosses (Quirk et al., 1989a) lends credence to this idea.

‡ Present address: Sclavo Research Center, Via Fiorentina 1, 53100 Siena, Italy.

§ Present address: Department of Chemistry and Biochemistry, University of Colorado, Boulder, Colorado 80309.

If, on the other hand, the td intron is a molecular fossil, a remnant of an earlier time when many prokaryotic genes had introns (see, for example, Darnell and Doolittle, 1986), then additional examples should exist among contemporary prokaryotes. In addition to td, there are at least two other introns in the T4 genome: in the nrdB (ribonucleoside diphosphate reductase, small subunit) and sunY genes (Gott et al., 1986; Sjöberg et al., 1986; Shub et al., 1988). Like td, these introns self-splice via the group I pathway (Gott et al., 1986; Xu and Shub, 1989). All three of these introns have the typical secondary structure and conserved sequences common to subgroup IA (Shub et al., 1988; reviewed in Cech, 1988). The amino acid sequences of the ORFs in each of the introns are not obviously related, but the high degree of similarity of the nucleotide sequences in their core structures is consistent with a common origin for the three T4 introns.

Aside from T4 and its close relatives (Pedersen-Lane and Belfort, 1987; Quirk et al., 1989b; S. Eddy and L. Gold, personal communication), there is no evidence for introns in any of the eubacteria or their viruses. The same method used to detect the two additional introns in T4 failed to reveal the existence of self-splicing introns in Escherichia coli (Gott et al., 1986). In addition, sequence analysis of eubacterial and bacteriophage genes has given no other indication of the existence of introns. tRNA and rRNA genes of archaebacteria have introns (Kaine et al., 1983; Daniels et al., 1985; Kjems and Garrett, 1985), but these are not closely related to either of the self-splicing intron groups or to the mRNA introns of nuclear genes.

Based on the assumption that introns may have persisted in T4 due to selection, we have chosen to seek additional introns in another bacteriophage, whose structure and mode of replication resemble those of T4, but whose host range does not overlap T4's.
It was hoped that comparison of the structures of independently derived introns, and identification of the genes in which they reside, might give additional clues to their origin and possible function.

Bacteriophage SPO1 infects the gram-positive Bacillus subtilis. No virus isolated on a gram-positive host has ever been propagated in gram-negative cells, and vice versa (see, for example, Jones and Sneath, 1970), so it is unlikely that T4 and SPO1 have recently inhabited a common host. The strategies of infection of the two viruses are remarkably similar, however. Their genome sizes are among the largest of the bacteriophages, and their tails have a contractile sheath and complex baseplate. One of the bases in their DNAs has been completely replaced by a structural analog (hydroxymethylcytosine [HMC] for cytosine in T4, hydroxymethyluracil [HMU] for thymine in SPO1), for which the phage brings in biosynthetic genes. The ability to distinguish phage DNA from that of its host is probably responsible for the extreme virulence of these viruses: both of them stop host mRNA and DNA syntheses soon after infection (T4 even degrades the host DNA to mononucleotides), permitting them to divert the host's metabolism exclusively to the production of progeny virions. Gene expression in both bacteriophages is regulated positively, at least in part, by phage-encoded σ factors. Recent reviews by Mosig and Eiserling (1988) and Stewart (1988) provide complete overviews of T4 and SPO1, respectively.

Preliminary characterization of RNA that could be labeled with GTP in vitro provided evidence that SPO1 might have a self-splicing group I intron (Goodrich et al., 1989). We report here the cloning, in vitro transcription, and complete sequence of a self-splicing group I intron in the SPO1 gene for DNA polymerase.

Figure 1. In Vitro [α-32P]GTP Labeling of SPO1 RNA
RNAs isolated from B. subtilis before and at various times after infection with SPO1 were deproteinized and incubated with [α-32P]GTP under self-splicing conditions. Samples were separated by electrophoresis on a 4% acrylamide-8 M urea gel, and labeled species were visualized by autoradiography. Lane 1, uninfected B. subtilis; lane 2, 3 min after infection by SPO1; lane 3, 10 min; lane 4, 20 min; lane M, 123 bp ladder DNA size markers.

Results

A Self-Splicing Group I Intron in SPO1

The group I splicing reaction is initiated by nucleophilic attack by the 3' OH of guanosine at the 5' splice site, resulting in addition of a noncoded G at the 5' end of the excised intron (Cech et al., 1981; Zaug and Cech, 1982). When this reaction is performed in vitro, with precursor RNA and [α-32P]GTP as the source of guanosine, only the intron is isotopically labeled. Indeed, the two additional introns in T4 pre-mRNA were detected by this method (Gott et al., 1986). RNA was extracted from cells infected with SPO1 at 3, 10, and 20 min after infection, corresponding to the

Figure 2. In Vitro Transcription of Cloned SPO1 DNA
(A) Schematic diagram of cloned SPO1 DNA in plasmid pHaEP1. The T7 promoter is indicated by an open arrow. Downward arrows show locations of the 5' and 3' splice sites. The box represents a 174 codon ORF identified by DNA sequencing (see Figure 3). Enzymes SnaBI and HindIII were used for linearization of template DNA. (B) In vitro transcriptions were carried out using T7 RNA polymerase in the presence of [α-32P]UTP. Transcription products were separated on a 5% acrylamide-8 M urea gel and visualized by autoradiography. Species resulting from exon ligation (EI-EII) and from cleavage at the 5' splice site alone (INTRON-EII) are present in addition to an intron species of the approximate size seen in Figure 1.

early, middle, and late transcription classes (Gage and Geiduschek, 1971), and exposed to [α-32P]GTP under conditions that promote the splicing of the T4 introns. Electrophoresis in a polyacrylamide gel under denaturing conditions followed by autoradiography revealed a single labeled species in the 10 min RNA that comigrated with a DNA marker of 861 nucleotides (Figure 1). Similar labeled species could not be detected in RNA extracted from uninfected cells, or at 3 or 20 min after infection.

When end-labeled RNA was used as a probe for Southern blot hybridization of SPO1 DNA, the putative intron was localized within the 8.8 kb EcoRI-9 restriction fragment. Dot hybridization with plasmid DNA containing subfragments of EcoRI-9 further localized the labeled RNA to a 2.8 kb EcoRI-PvuII fragment (Goodrich et al., 1989). When this EcoRI-PvuII fragment was cloned downstream of a promoter for phage T7 RNA polymerase, transcription products from both HindIII- and SnaBI-truncated DNA displayed species consistent with self-splicing (Figure 2). For example, in the reaction using HindIII-truncated DNA, comparison with denatured DNA size standards gave sizes of 0.9 and 1.1 kb for the intron and ligated exon (EI-EII) species, respectively, in good agreement with a primary transcript of 1.9 kb. Species corresponding to precursor RNA, the intron-3' exon intermediate (poorly resolved in reactions with HindIII-truncated DNA), ligated exon, and intron can be seen in both reactions. Thus, the entire self-splicing intron is contained on the EcoRI-SnaBI subfragment, whose sequence (determined by the dideoxy chain termination method) is shown in Figure 3.

Figure 3. Sequence of DNA Cloned into pHaEP1
[Nucleotide sequence with the inferred amino acid translation of the exons and of the intron ORF.]
The sequence of the 1156 bp EcoRI-SnaBI portion of pHaEP1 is shown. Numbers refer to nucleotides beginning at the EcoRI site. Arrows indicate the 5' and 3' splice sites of the 882 nucleotide intron. After splicing, the exons comprise a continuous ORF. An ORF of 174 amino acids is completely contained within the boundaries of the intron. A potential B. subtilis ribosome initiation sequence is underlined. The nucleotide sequence data reported will appear in the EMBL, GenBank, and DDBJ nucleotide sequence data bases.

The Nucleotide Sequence of the Intron

RNAs corresponding to the primary transcript of HindIII-truncated template and the corresponding ligated exon species (as in Figure 2, but transcribed in the absence of isotopic label) were isolated from a gel, and the sequence near the putative splice site was determined by reverse transcription (Figure 4). The sequence of precursor RNA is identical to that of the DNA, but 882 nucleotides have been removed from the ligated exon species, joining residues 135 and 1018. As is the case for other group I introns, the 5' splice site occurs after a U and the 3' splice site after a G (Cech, 1988). Interestingly, like the T4 introns, the SPO1 intron has an ORF contained entirely within its boundaries (Figure 3).
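The junction coordinates reported here fix the intron length by simple arithmetic, and the same coordinates predict the sizes of the species seen in a splicing reaction. The following is a minimal sketch, assuming 1-based numbering from the EcoRI site as in Figure 3. The function names are illustrative and do not come from the paper, and the product sizes ignore any vector-derived sequence at the ends of a real T7 transcript.

```python
def intron_length(last_exon1: int, first_exon2: int) -> int:
    """Number of residues strictly between the two residues joined by splicing."""
    if first_exon2 <= last_exon1:
        raise ValueError("exon 2 must start downstream of exon 1")
    return first_exon2 - last_exon1 - 1


def splice_products(insert_len: int, last_exon1: int, first_exon2: int) -> dict:
    """Sizes (nt) of the species expected from a group I splicing reaction.

    ASSUMPTION: the transcript spans residues 1..insert_len only; vector
    sequence carried by a real run-off transcript is ignored.
    """
    exon1 = last_exon1
    exon2 = insert_len - first_exon2 + 1
    intron = intron_length(last_exon1, first_exon2)
    return {
        "precursor": exon1 + intron + exon2,
        # Guanosine attack at the 5' splice site adds one noncoded G
        # to the 5' end of the intron, so intron species are 1 nt longer.
        "intron_exon2": 1 + intron + exon2,
        "intron": 1 + intron,
        "ligated_exons": exon1 + exon2,
    }


if __name__ == "__main__":
    # Residues 135 and 1018 are joined in the ligated exon RNA.
    print(intron_length(135, 1018))
    # EcoRI-SnaBI portion of the insert (1156 bp in Figure 3).
    print(splice_products(1156, 135, 1018))
    # A 522 nucleotide ORF corresponds to this many codons.
    print(522 // 3)
```

With these inputs the intron length comes out as 882, matching the value determined by RNA sequencing, and 522 nucleotides correspond to the 174 codon ORF.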

Figure 4. Sequence of Splice Junction
HindIII-truncated pHaEP1 DNA was transcribed with T7 RNA polymerase. Pre-mRNA (A) and ligated exon species (B) were isolated from a 5% acrylamide-8 M urea gel. An oligonucleotide complementary to sequences in exon II was used for reverse transcriptase extension. Products of sequencing were separated on an 8% acrylamide-8 M urea gel. In (A) and (B) the sequences are identical (3' to 5') until nucleotide 1018 (indicated by an arrow). In ligated exon RNA this nucleotide is joined to nucleotide 135, defining the limits of an 882 nucleotide intron.

An AUG codon preceded by a long (10 nucleotide) sequence complementary to the 3' end of B. subtilis 16S RNA implies that this 174 codon ORF can be efficiently translated in vivo (Hager and Rabinowitz, 1985).

Secondary Structure of the Intron

Group I introns have a common core structure, comprising both local and long-range helical pairing regions, with highly conserved primary sequence elements present at defined locations (Michel et al., 1982; Davies et al., 1982). The SPO1 intron can be folded into a structure that contains all of the elements common to group I introns (Figure 5). The common base-paired regions (P1-P9) as well as conserved sequence elements (P, Q, R, S) are all present. The R and S sequences (which tend to be the most highly conserved) match the consensus at 12/13 and 12/12 positions, respectively (Cech, 1988). The intron contains both the extra nucleotides (between the R sequence and 3' portion of P3) and the variations in the conserved primary sequence elements that are typical of subgroup IA (Michel et al., 1982; Cech, 1988). The 5' splice site is located in P1, after a U residue that pairs with a G, as is the case in most other group I introns. In addition, the intron 3'-terminal G is followed by three residues capable of Watson-Crick base pairing (P10) with residues near the intron 5' terminus (Figure 5). Thus, the 3' portion of P1 fulfills the conditions for a putative "internal guide sequence," aligning the 3' and 5' splice sites for exon ligation (Davies et al., 1982).

The position of the ORF within the secondary structure is also analogous to the T4 introns, beginning in a loop but overlapping at its 3' end with a highly conserved structural element (Shub et al., 1988). The SPO1 intron ORF begins within the loop of P8, but its last three codons comprise the 3' portion of P8. Upon reaching the UGA termination codon, a translating ribosome would disrupt both the P7 and P8 structural elements, with obvious deleterious effects on the conformation required for splicing (Shub et al., 1987; Gott et al., 1988). Translation of the T4 intron ORFs is prevented by formation of a helix between upstream sequences and the ORF ribosome initiation sites (Gott et al., 1988). No corresponding translational modulation can be inferred from the sequence surrounding the SPO1 intron ORF, and it will be of interest to determine whether the intact intron also serves as mRNA for this protein.

Two features of the proposed secondary structure are unusual. First, although group IA introns typically contain additional nucleotides between the 5' portion of P7 and the 3' portion of P3, the extra nucleotides can usually be folded into one or two (as for the T4 introns) stable stem and loop structures. This region of the SPO1 intron contains 65 nucleotides, only 25 of which can be organized into a continuous base-paired stem with a single loop. Although it is likely that the remainder of this region is also highly structured, we cannot predict an obvious secondary structure from the primary sequence.

The second unusual feature of the structure lies between the 5' portion of P3 and P4. In most introns this region comprises a few (typically three or four) nucleotides. The SPO1 intron, however, contains 69 nucleotides in this interval, most of which can be folded into three thermodynamically stable stems and loops (Figure 5). Only three other group I introns, none of them from group IA, contain extra nucleotides in this region. In all of these cases, the extra residues can be folded into two highly stable stem-loop arrangements (Trinkl and Wolf, 1986; Cummings et al., 1989).

Since there is no evidence, either from mutants or phylogenetic comparison, concerning these two unusual regions of the SPO1 intron, the structures drawn for them in Figure 5 must be considered tentative. Indeed, other potential base pairing interactions are consistent with alternative structures for the regions, including one in which the two regions interact with each other.
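Both the helices of the proposed secondary structure and the complementarity between the ORF's ribosome initiation sequence and 16S RNA come down to antiparallel base pairing. A minimal sketch of such a check follows. The 10 nucleotide stretch AAGGAGGUGA is read from the sequence upstream of the intron ORF's AUG in Figure 3; treating it as the underlined initiation sequence is an assumption. The 16S 3'-terminal sequence used here is likewise an assumption taken from general knowledge rather than from this paper, and the helper names are hypothetical.

```python
# Watson-Crick pairs, plus the G-U wobble pair tolerated in RNA helices
# (for example, the U at the 5' splice site pairs with a G in P1).
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def reverse_complement(rna: str) -> str:
    """Return the Watson-Crick reverse complement of an RNA string (5'->3')."""
    comp = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(comp[base] for base in reversed(rna))


def can_pair(strand: str, partner: str, allow_wobble: bool = True) -> bool:
    """True if two equal-length 5'->3' strands form an uninterrupted antiparallel helix."""
    if len(strand) != len(partner):
        return False
    allowed = WATSON_CRICK | WOBBLE if allow_wobble else WATSON_CRICK
    return all((a, b) in allowed for a, b in zip(strand, reversed(partner)))


# Stretch preceding the intron ORF start codon, written as RNA (from Figure 3).
SD_CANDIDATE = "AAGGAGGUGA"
# ASSUMPTION: 3'-terminal residues of B. subtilis 16S rRNA, written 5'->3'.
ASSUMED_16S_3PRIME_END = "GAUCACCUCCUUUCUA"

if __name__ == "__main__":
    target = reverse_complement(SD_CANDIDATE)
    print(target)
    print(target in ASSUMED_16S_3PRIME_END)
    print(can_pair(SD_CANDIDATE, target, allow_wobble=False))
```

The same `can_pair` helper could be applied to candidate stems in the two unusual regions, although finding the most stable fold requires an energy model that this sketch does not attempt.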
Figure 5. Secondary Structure of the Intron
The proposed secondary structure of the SPO1 gene 31 intron is shown. Bold arrows indicate the 5' and 3' splice sites, determined by RNA sequencing of the splice junction (Figure 4). Lowercase letters denote exon sequences. Phylogenetically conserved secondary structure elements are labeled with circles. These include the P4 pairing of parts of the P and Q sequences and the P7 pairing of parts of the R and S sequences. The solid bars identify nucleotides that could form P10, bringing the 5' and 3' splice sites into close proximity. Stop codons of ORFs are boxed.

The Intron Is in the SPO1 Gene Encoding DNA Polymerase

A search of protein sequence data bases with the sequence analysis program FASTA (Pearson and Lipman, 1988) revealed a significant similarity (40% identity) between the inferred amino acid sequence of the exons and a region in the carboxy-terminal half of E. coli DNA polymerase I (Figure 6). DNA polymerase of SPO1 is encoded by gene 31 (De Antoni et al., 1985), and marker rescue recombination has shown that part of gene 31 is contained on restriction fragment EcoRI-23, which is adjacent to EcoRI-9 (Curran and Stewart, 1985). We have sequenced the DNA that surrounds the exon sequences presented here (Scarlato et al., unpublished data). One continuous ORF, including all of EcoRI-23, terminates just beyond the HindIII site of EcoRI-9 and encodes a protein of 924 amino acids, in good agreement with the apparent molecular size of 100-105 kd calculated from the electrophoretic mobility of the product of gene 31 (De Antoni et al., 1985).
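Two of the numbers in this section can be checked with back-of-the-envelope code. The sketch below assumes an average residue mass of about 110 Da (a common rule of thumb, not a value from this paper) to compare the 924 amino acid ORF with the 100-105 kd gel estimate. It also shows how a percent identity such as the 40% reported by FASTA can be computed from an aligned pair. The toy alignment is hypothetical, and counting identities over aligned (non-gap) columns is one convention among several, not necessarily the one FASTA uses.

```python
AVG_RESIDUE_MASS_DA = 110.0  # ASSUMPTION: rule-of-thumb average, not from the paper


def approx_mass_kd(n_residues: int) -> float:
    """Rough protein mass in kilodaltons from residue count."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not columns:
        return 0.0
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # 924 residues from the continuous gene 31 ORF.
    print(round(approx_mass_kd(924), 2))
    # Hypothetical toy alignment, for illustration only.
    print(percent_identity("ACDE-FGH", "ACDQWFGK"))
```

Under the 110 Da assumption, 924 residues give roughly 102 kd, which falls inside the reported 100-105 kd range.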

Discussion


A Novel Intron in Bacteriophage SPO1

Introns in eubacterial DNA have been limited, until now, to the three related group IA introns of bacteriophage T4. It has been impossible to determine whether these introns are relics of an ancient gene structure or examples of a relatively recent horizontal gene transfer. The SPO1 intron resembles the T4 introns in overall organization (e.g., it belongs to group IA and has an ORF that overlaps the conserved core) but has no discernible resemblance to the T4 sequences, either in the nucleotides of its catalytic core (besides the universally conserved sequences) or in the amino acids of its ORF. Recent acquisition of an intron in the unrelated B. subtilis bacteriophage SPO1 would require an independent transfer, presumably from a different eukaryote. Although not definitive, our data are at least consistent with a simpler interpretation: group I introns were already present approximately 1500 million years ago, in the common ancestors of both gram-positive and gram-negative eubacteria (Ochman and Wilson, 1987).

One hypothesis concerning the origin and evolutionary function of introns imputes a role for introns as sites of exchange of protein folding domains by recombination, the "exon shuffling" hypothesis (Gilbert, 1978; Gilbert et al., 1986). The SPO1 intron is inserted into the phage DNA polymerase gene in a region of high amino acid sequence similarity to E. coli DNA polymerase I. If the structures of these proteins are homologous, the site of intron insertion would separate two folding domains (β sheet 8 and α helix J in the DNA binding domain of the E. coli protein; Ollis et al., 1985), consistent with the exon shuffling hypothesis.

We searched for introns in SPO1 because of the great similarity in structure and mode of infection between the HMU phages of B. subtilis (typified by SPO1) and the HMC phages of E. coli (typified by T4). The surprising finding that SPO1 has an intron is consistent with the notion that these structures persist in bacteriophage genes due to some selective value.

Figure 6. Amino Acid Sequence Homology
[Alignment of the gp31 sequence spanning exon 1 and exon 2 with POL I.]
The inferred amino acid sequence of ligated exons (gp31) was searched with FASTA for homology to other proteins. A 40% identity was found with a sequence near the carboxyl terminus of E. coli DNA polymerase (POL I). Two dots represent identity; one dot indicates a conservative amino acid change.

Bacteriophage Introns Occur in Genes Affecting DNA Synthesis

Of the three T4 introns, two (nrdB and td) occur in genes of the same biosynthetic pathway, the conversion of ribonucleotides to deoxyribonucleotides. The function of the third T4 gene that contains an intron (sunY) is not known. In this context, it was astonishing that SPO1 not only has a self-splicing group I intron, but that the intron is also in a gene, DNA polymerase, that is involved in DNA replication.

It is striking that besides being in the same pathway, both the nrdB and td genes specify the two steps in this pathway that consume reducing equivalents. A consequence of inefficient intron removal, therefore, would be a reduction in the levels of these two enzymes and the concomitant reduction of pool sizes of deoxyribonucleotides. It is noteworthy that in most organisms ribonucleoside diphosphate reductase activity is regulated by feedback inhibition by dATP (Eriksson and Sjöberg, 1989). An exception to this rule is bacteriophage T4, whose ribonucleotide reductase is not sensitive to dATP (Berglund, 1972). The presence of introns in the nrdB and td genes, therefore, presents an opportunity to control the flux of metabolites through the pathway by regulating the rate of enzyme synthesis rather than enzyme activity.

Reduced levels of SPO1 DNA polymerase, caused by inefficient splicing, would result in expanded pools of dNTPs. Assuming that B. subtilis ribonucleoside diphosphate reductase is sensitive to feedback inhibition, this would also lead to a lower rate of conversion of ribonucleotides to deoxyribonucleotides, with concomitant sparing of reducing equivalents.

A Model for Regulation of Splicing of Group I Introns

Virulent DNA bacteriophages encode enzymes for DNA synthesis that may not be sensitive to the same regulatory signals as their cellular counterparts. Since the synthesis of DNA precursors consumes both reducing equivalents and nucleotides required for synthesis of RNA and protein, production of more DNA than can be packaged could reduce the size of the burst under poor nutritional conditions. We suggest that splicing of bacteriophage introns may be regulated by interaction with a molecule that is a general indicator of the redox (or other nutritional) status of the infected cell.

This scheme would be consistent with another unexplained property of a phage DNA polymerase. As far as we know, phage T7 has no group I introns. However, T7 DNA polymerase is only active in a 1:1 complex with a host protein, thioredoxin (Modrich and Richardson, 1975). Remarkably, formation of the active complex requires the reduced (and not the oxidized) conformation of E. coli thioredoxin (Huber et al., 1986). Thus, T7 may have evolved an independent method of coupling DNA synthesis to the redox state of the infected cell.

Regulatory mechanisms used by prokaryotic viruses might also be present in eukaryotic organelles, which are themselves of prokaryotic origin. It is striking that group I "self-splicing" introns have been found in all chloroplasts of land plants investigated, as well as in the cyanelle of Cyanophora paradoxa, but these have been incapable of in vitro self-splicing (reviewed in Cech, 1988; Evrard et al., 1988). It would not be surprising to us if splicing in these organelles, which are so intimately concerned with redox balance, were tightly regulated in vivo. In chloroplasts, there are numerous examples of protein enzymes that are activated through interaction with reduced thioredoxin (reviewed in Buchanan, 1986; Knaff, 1989). It will be interesting to see whether interaction with redox indicators can similarly regulate the activity of ribozymes.

Experimental Procedures

Strains and Plasmids

B. subtilis 168 (trp-) was used for propagation of wild-type SPO1 bacteriophage. Bacteria were maintained as spores on potato dextrose agar (Difco), 42 g/liter plus 0.02 mg/ml trp, at room temperature. For infections, bacteria were grown at 37°C in NY broth (8 g of nutrient broth [Difco] and 5 g of yeast extract [Difco] per liter [pH 7.2]) supplemented with 5 mM MgSO4 and 0.02 mM MnCl2 after autoclaving. Plasmid pJK9 (provided by E. P. Geiduschek) contains the entire 6.8 kb EcoRI-9 restriction fragment of SPO1. A 2.8 kb EcoRI-PvuII fragment was subcloned into the polylinker of pUC8 (Goodrich et al., 1989) and was subsequently transferred as an EcoRI-PstI fragment to pBSM13+ (Stratagene) to create pHaEP1.

RNA Isolation and Intron Labeling

B. subtilis was grown at 37°C in M9S medium (Bolle et al., 1968) modified with 2 × 10^-5 M MnCl2 and 0.01 mg/ml L-tryptophan to a density of 1-2 × 10^8 cells per ml. Cells were infected with SPO1 at a multiplicity of approximately 5. Aliquots were removed before and at various times after infection and placed on ice. RNA was isolated and incubated with [α-32P]GTP (3000 Ci/mmol) as previously described (Gott et al., 1986).

In Vitro Transcription

pHaEP1 DNA was linearized with restriction endonucleases, and 1 µg was transcribed with T7 RNA polymerase (Stratagene) and 400 µM of each NTP in 25 µl. Reaction conditions were as previously described (Gott et al., 1986). Isotopically labeled RNA was prepared by lowering [UTP] to 20 µM and adding 20 µCi of [α-32P]UTP (600 Ci/mmol).

DNA Sequencing

Cloned SPO1 DNA was sequenced in both directions by the dideoxy chain termination method (Sanger et al., 1977; Chen and Seeburg, 1985). The sequence was checked in its entirety by determining the sequence of single-stranded DNA obtained by asymmetrically amplifying SPO1 DNA with the polymerase chain reaction in vitro (Gyllensten and Erlich, 1988). DNA polymerase purified from Thermus aquaticus was obtained from US Biochemical Corp. and was used according to their specifications, with 1:100 primer ratios.

RNA Purification and Sequencing

Products from in vitro transcriptions lacking isotopic label were separated on a 5% acrylamide-8 M urea gel, and RNA species were visualized by staining with ethidium bromide. Species corresponding to premessage (unspliced) and ligated exons were cut from the gel, frozen at -60°C for 15 min, and crushed with a glass rod. The gel fragments were incubated at room temperature overnight in 0.9 ml of 10 mM Tris-HCl (pH 7.5), 1 mM EDTA, 30 mM NaCl, and 1% distilled phenol (Peebles et al., 1979). RNA was recovered by two ethanol precipitations and resuspended in H2O treated with diethyl pyrocarbonate. RNA isolated in this way was used as a template for sequencing using an end-labeled oligonucleotide (5'-dCTAACTTGAGAGTAGTC-3', complementary to residues 1133-1148) and AMV reverse transcriptase as described by Belfort et al. (1985).

Acknowledgments

We thank Larry Gold for stimulating discussions and especially for reminding us that T7 DNA polymerase contains thioredoxin, William A. Goodrich for Figure 5, and Carole Keith for Figure 3 and careful preparation of the manuscript. V. S. gratefully acknowledges E. Peter Geiduschek for support (National Science Foundation grant PCM8317847; National Institutes of Health grant GM15880) and encouragement during the course of the work, and A. Cascino for exquisite hospitality. H. G.-B. was supported, in part, by a Burke Graduate Fellowship. Work in the laboratory of D. A. S. was supported by grants from the National Science Foundation (DMB6609066) and the National Institutes of Health (GM37746).

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 USC Section 1734 solely to indicate this fact.

Received July 3, 1990

References

Belfort, M., Pedersen-Lane, J., West, D., Ehrenman, K., Maley, G., Chu, F., and Maley, F. (1985). Processing of the intron-containing thymidylate synthase (td) gene of phage T4 is at the RNA level. Cell 41, 375-382.

Berglund, O. (1972). Ribonucleoside diphosphate reductase induced by bacteriophage T4. II. Allosteric regulation of substrate specificity and catalytic activity. J. Biol. Chem. 247, 7276-7281.

Bolle, A., Epstein, R. H., Salser, W., and Geiduschek, E. P. (1968). Transcription during bacteriophage T4 development: synthesis and relative stability of early and late RNA. J. Mol. Biol. 31, 325-348.

Buchanan, B. B. (1986). The ferredoxin/thioredoxin system. In Thioredoxin and Glutaredoxin Systems: Structure and Function, A. Holmgren, C.-I. Brändén, H. Jörnvall, and B.-M. Sjöberg, eds. (New York: Raven Press), pp. 233-242.

Cech, T. R. (1988). Conserved sequences and structures of group I introns: building an active site for RNA catalysis-a review. Gene 73, 259-271.

Cech, T. R., Zaug, A. J., and Grabowski, P. J. (1981). In vitro splicing of the ribosomal RNA precursor of Tetrahymena: involvement of a guanosine nucleotide in the excision of the intervening sequence. Cell 27, 487-496.

Chen, E. Y., and Seeburg, P. H. (1985). Supercoiling sequencing: a fast and simple method for sequencing plasmid DNA. DNA 4, 165-170.

Chu, F. K., Maley, G. F., Maley, F., and Belfort, M. (1984). An intervening sequence in the thymidylate synthase gene of bacteriophage T4. Proc. Natl. Acad. Sci. USA 81, 3149-3153.

Cummings, D. J., Michel, F., and McNally, K. L. (1989). DNA sequence analysis of the 24.5 kilobase pair cytochrome oxidase subunit I mitochondrial gene from Podospora anserina: a gene with sixteen introns. Curr. Genet. 16, 381-406.

Curran, J. F., and Stewart, C. R. (1985). Cloning and mapping of the SPO1 genome. Virology 142, 78-97.

Daniels, C. J., Gupta, R., and Doolittle, W. F. (1985). Transcription and excision of a large intron in the tRNATrp gene of an archaebacterium, Halobacterium volcanii. J. Biol. Chem. 260, 3132-3134.

Darnell, J. E., and Doolittle, W. F. (1986). Speculations on the early course of evolution. Proc. Natl. Acad. Sci. USA 83, 1271-1275.

Davies, R. W., Waring, R. B., Ray, J. A., Brown, T. A., and Scazzocchio, C. (1982). Making ends meet: a model for RNA splicing in fungal mitochondria. Nature 300, 719-724.

De Antoni, G. L., Besso, N. E., Zanassi, G. E., Sarachu, A. N., and Grau, O. (1985). Bacteriophage SPO1 DNA polymerase and the activity of viral gene 31. Virology 143, 16-22.

Ehrenman, K., Pedersen-Lane, J., West, D., Herman, R., Maley, F., and Belfort, M. (1986). Processing of phage T4 td-encoded RNA is analogous to the eukaryotic group I splicing pathway. Proc. Natl. Acad. Sci. USA 83, 5875-5879.

Eriksson, S., and Sjöberg, B.-M. (1989). Ribonucleotide reductase. In Allosteric Enzymes, G. Hervé, ed. (Boca Raton, Florida: CRC Press), pp. 189-215.

Evrard, J.-L., Kuntz, M., Straus, N. A., and Weil, J.-H. (1988). A class-I intron in a cyanelle tRNA gene from Cyanophora paradoxa: phylogenetic relationship between cyanelles and plant chloroplasts. Gene 71, 115-122.

Gage, L. P., and Geiduschek, E. P. (1971). RNA synthesis during bacteriophage SPO1 development: six classes of SPO1 RNA. J. Mol. Biol. 57, 279-300.

Gilbert, W. (1978). Why genes in pieces? Nature 271, 501.

Gilbert, W., Marchionni, M., and McKnight, G. (1986). On the antiquity of introns. Cell 46, 151-153.

Goodrich, H. A., Gott, J. M., Xu, M.-Q., Scarlato, V., and Shub, D. A. (1989). A group I intron in Bacillus subtilis bacteriophage SPO1. In Molecular Biology of RNA, T. R. Cech, ed. (New York: Alan R. Liss), pp. 59-66.

Gott, J. M., Shub, D. A., and Belfort, M. (1986). Multiple self-splicing introns in bacteriophage T4: evidence from autocatalytic GTP labeling of RNA in vitro. Cell 47, 81-87.

Gott, J. M., Zeeh, A., Bell-Pedersen, D., Ehrenman, K., Belfort, M., and Shub, D. A. (1988). Genes within genes: independent expression of phage T4 intron open reading frames and the genes in which they reside. Genes Dev. 2, 1791-1799.

Gyllensten, U., and Erlich, H. A. (1988). Generation of single stranded DNA by the polymerase chain reaction and its application to direct sequencing of the HLA-DQA locus. Proc. Natl. Acad. Sci. USA 85, 7652-7656.

Hager, P. W., and Rabinowitz, J. C. (1985). Translational specificity in Bacillus subtilis. In The Molecular Biology of the Bacilli, Vol. 2, D. A. Dubnau, ed. (Orlando, Florida: Academic Press), pp. 1-32.

Huber, H. E., Russel, M., Model, P., and Richardson, C. C. (1986). Interaction of mutant thioredoxins of Escherichia coli with the gene 5 protein of phage T7. J. Biol. Chem. 261, 15006-15012.

Jones, D., and Sneath, P. H. A. (1970). Genetic transfer and bacterial taxonomy. Bacteriol. Rev. 34, 40-81.

Kaine, B. P., Gupta, R., and Woese, C. R. (1983). Putative introns in tRNA genes of prokaryotes. Proc. Natl. Acad. Sci. USA 80, 3309-3312.

Kjems, J., and Garrett, R. A. (1985). An intron in the 23S ribosomal RNA gene of the archaebacterium Desulfurococcus mobilis. Nature 318, 675-677.

Knaff, D. B. (1989). The regulatory role of thioredoxin in chloroplasts. Trends Biochem. Sci. 14, 433-434.

Michel, F., and Dujon, B. (1986). Genetic exchanges between bacteriophage T4 and filamentous fungi? Cell 46, 323.

Michel, F., Jacquier, A., and Dujon, B. (1982). Comparison of fungal mitochondrial introns reveals extensive homologies in RNA secondary structure. Biochimie 64, 867-881.

Modrich, P., and Richardson, C. C. (1975). Bacteriophage T7 deoxyribonucleic acid replication in vitro. J. Biol. Chem. 250, 5515-5522.

Mosig, G., and Eiserling, F. (1988). Phage T4 structure and metabolism. In The Bacteriophages, Vol. 2, R. Calendar, ed. (New York: Plenum Publishing Corp.), pp. 521-606.

Ochman, H., and Wilson, A. C. (1987). Evolution in bacteria: evidence for a universal substitution rate in cellular genomes. J. Mol. Evol. 26, 74-86.

Ollis, D. L., Brick, P., Hamlin, R., Xuong, N. G., and Steitz, T. A. (1985). Structure of large fragment of Escherichia coli DNA polymerase I complexed with dTMP. Nature 313, 762-766.

Pearson, W. R., and Lipman, D. J. (1988). Improved tools for biological sequence comparison. Proc. Natl. Acad. Sci. USA 85, 2444-2448.

Pedersen-Lane, J., and Belfort, M. (1987). Variable occurrence of the nrdB intron in the T-even phages suggests intron mobility. Science 237, 182-184.

Peebles, C. L., Ogden, R. C., Knapp, G., and Abelson, J. (1979). Splicing of yeast tRNA precursors: a two-stage reaction. Cell 18, 27-35.

Quirk, S. M., Bell-Pedersen, D., and Belfort, M. (1989a). Intron mobility in the T-even phages: high frequency inheritance of group I introns promoted by intron open reading frames. Cell 56, 455-465.

Quirk, S. M., Bell-Pedersen, D., Tomaschewski, J., Rüger, W., and Belfort, M. (1989b). The inconsistent distribution of introns in the T-even phages indicates recent genetic exchanges. Nucl. Acids Res. 17, 301-315.

Sanger, F., Nicklen, S., and Coulson, A. R. (1977). DNA sequencing with chain termination inhibitors. Proc. Natl. Acad. Sci. USA 74, 5463-5467.

Shub, D. A., Xu, M.-Q., Gott, J. M., Zeeh, A., and Wilson, L. D. (1987). A family of autocatalytic group I introns in bacteriophage T4. Cold Spring Harbor Symp. Quant. Biol. 52, 193-200.

Shub, D. A., Gott, J. M., Xu, M.-Q., Lang, B. F., Michel, F., Tomaschewski, J., Pedersen-Lane, J., and Belfort, M. (1988). Structural conservation between three homologous introns of phage T4 and the group I introns of eukaryotes. Proc. Natl. Acad. Sci. USA 85, 1151-1155.

Sjöberg, B.-M., Hahne, S., Mathews, C. Z., Mathews, C., Rand, K. N., and Gait, M. J. (1986). The bacteriophage T4 gene for the small subunit of ribonucleotide reductase contains an intron. EMBO J. 5, 2031-2036.

Stewart, C. (1988). Bacteriophage SPO1. In The Bacteriophages, Vol. 1, R. Calendar, ed. (New York: Plenum Publishing Corp.), pp. 477-515.

Trinkl, H., and Wolf, K. (1986). The mosaic cox1 gene in the mitochondrial genome of Schizosaccharomyces pombe: minimal structural requirements and evolution of group I introns. Gene 45, 289-297.

Waring, R. B., and Davies, R. W. (1984). Assessment of a model for intron RNA secondary structure relevant to RNA self-splicing-a review. Gene 28, 277-291.

Xu, M.-Q., and Shub, D. A. (1989). The catalytic core of the sunY intron of bacteriophage T4. Gene 82, 77-82.

Zaug, A. J., and Cech, T. R. (1982). The intervening sequence excised from the ribosomal RNA precursor in nuclei of Tetrahymena contains a 5′-terminal guanosine residue not encoded by the DNA. Nucl. Acids Res. 10, 2823-2838.

GenBank Accession Number

The accession number for the sequence reported in this paper is M37686.
