Cell, Vol. 63, 417-424, October 19, 1990, Copyright © 1990 by Cell Press
A Self-Splicing Group I Intron in the DNA Polymerase Gene of Bacillus subtilis Bacteriophage SPO1

Heidi Goodrich-Blair,* Vincenzo Scarlato,†‡ Jonatha M. Gott,*§ Ming-Qun Xu,* and David A. Shub*
* Department of Biological Sciences and Center for Molecular Genetics, State University of New York, Albany, Albany, New York 12222
† Department of Biology, University of California, San Diego, La Jolla, California 92093, and International Institute of Genetics and Biophysics, Naples, Italy
Summary

We report a self-splicing intron in bacteriophage SPO1, whose host is the gram-positive Bacillus subtilis. The intron contains all the conserved features of primary sequence and secondary structure previously described for the group IA introns of eukaryotic organelles and the gram-negative bacteriophage T4. The SPO1 intron contains an open reading frame of 522 nucleotides. As in the T4 introns, this open reading frame begins in a region that is looped out of the secondary structure, but ends in a highly conserved region of the intron core. The exons encode SPO1 DNA polymerase, which is highly similar to E. coli DNA polymerase I. The demonstration of self-splicing introns in viruses of both gram-positive and gram-negative eubacteria lends further evidence for their early origin in evolution.

Introduction

The discovery of a self-splicing group I intron in the td (thymidylate synthase) gene of bacteriophage T4 (Chu et al., 1984; Ehrenman et al., 1986) was entirely unexpected. Previously, mRNA splicing had been observed only in eukaryotes. In addition, introns capable of self-splicing were restricted to genes of mitochondria, chloroplasts, and rRNA genes of eukaryotic protists (reviewed in Waring and Davies, 1984; Cech, 1988). The origin of the td intron was, therefore, uncertain. Michel and Dujon (1986) pointed out the close resemblance of parts of the open reading frame (ORF) of the td intron to ORFs in introns of fungal mitochondria. This, they suggested, raised the possibility that T4 may have obtained its intron by horizontal gene transfer. Indeed, the discovery that the td intronic ORF encodes a site-specific DNA endonuclease and is capable of promoting the transfer of the intron to "homing sites" in genetic crosses (Quirk et al., 1989a) lends credence to this idea.

‡ Present address: Sclavo Research Center, Via Fiorentina 1, 53100 Siena, Italy.
§ Present address: Department of Chemistry and Biochemistry, University of Colorado, Boulder, Colorado 80309.

If, on the other hand, the td intron is a molecular fossil, a remnant of an earlier time when many prokaryotic genes had introns (see, for example, Darnell and Doolittle, 1986), then additional examples should exist among contemporary prokaryotes. In addition to td, there are at least two other introns in the T4 genome: in the nrdB (ribonucleoside diphosphate reductase, small subunit) and sunY genes (Gott et al., 1986; Sjöberg et al., 1986; Shub et al., 1988). Like td, these introns self-splice via the group I pathway (Gott et al., 1986; Xu and Shub, 1989). All three of these introns have the typical secondary structure and conserved sequences common to subgroup IA (Shub et al., 1988; reviewed in Cech, 1988). The amino acid sequences of the ORFs in each of the introns are not obviously related, but the high degree of similarity of the nucleotide sequences in their core structures is consistent with a common origin for the three T4 introns.

Aside from T4 and its close relatives (Pedersen-Lane and Belfort, 1987; Quirk et al., 1989b; S. Eddy and L. Gold, personal communication), there is no evidence for introns in any of the eubacteria or their viruses. The same method used to detect the two additional introns in T4 failed to reveal the existence of self-splicing introns in Escherichia coli (Gott et al., 1986). In addition, sequence analysis of eubacterial and bacteriophage genes has given no other indication of the existence of introns. tRNA and rRNA genes of archaebacteria have introns (Kaine et al., 1983; Daniels et al., 1985; Kjems and Garrett, 1985), but these are not closely related to either of the self-splicing intron groups or to the mRNA introns of nuclear genes.

Based on the assumption that introns may have persisted in T4 due to selection, we have chosen to seek additional introns in another bacteriophage, whose structure and mode of replication resemble those of T4, but whose host range does not overlap T4's.
It was hoped that comparison of the structures of independently derived introns, and identification of the genes in which they reside, might give additional clues to their origin and possible function.

Bacteriophage SPO1 infects the gram-positive Bacillus subtilis. No virus isolated on a gram-positive host has ever been propagated in gram-negative cells, and vice versa (see, for example, Jones and Sneath, 1970), so it is unlikely that T4 and SPO1 have recently inhabited a common host. The strategies of infection of the two viruses are remarkably similar, however. Their genome sizes are among the largest of the bacteriophages, and their tails have a contractile sheath and complex baseplate. One of the bases in their DNAs has been completely replaced by a structural analog (hydroxymethylcytosine [HMC] for cytosine in T4, hydroxymethyluracil [HMU] for thymine in SPO1), for which the phage brings in biosynthetic genes. The ability to distinguish phage DNA from that of its host is probably responsible for the extreme virulence of these viruses: both of them stop host mRNA and DNA syntheses soon after infection (T4 even degrades the host DNA to mononucleotides), permitting them to divert the host's metabolism exclusively to the production of progeny virions. Gene expression in both bacteriophages is regulated positively, at least in part, by phage-encoded σ factors. Recent reviews by Mosig and Eiserling (1988) and Stewart (1988) provide complete overviews of T4 and SPO1, respectively.

Preliminary characterization of RNA that could be labeled with GTP in vitro provided evidence that SPO1 might have a self-splicing group I intron (Goodrich et al., 1989). We report here the cloning, in vitro transcription, and complete sequence of a self-splicing group I intron in the SPO1 gene for DNA polymerase.

Figure 1. In Vitro [α-³²P]GTP Labeling of SPO1 RNA
RNAs isolated from B. subtilis before and at various times after infection with SPO1 were deproteinized and incubated with [α-³²P]GTP under self-splicing conditions. Samples were separated by electrophoresis on a 4% acrylamide-8 M urea gel, and labeled species were visualized by autoradiography. Lane 1, uninfected B. subtilis; lane 2, 3 min after infection by SPO1; lane 3, 10 min; lane 4, 20 min; lane M, 123 bp ladder DNA size markers (the 492 and 369 bp markers are indicated).

Results

A Self-Splicing Group I Intron in SPO1

The group I splicing reaction is initiated by nucleophilic attack by the 3' OH of guanosine at the 5' splice site, resulting in addition of a noncoded G at the 5' end of the excised intron (Cech et al., 1981; Zaug and Cech, 1982). When this reaction is performed in vitro, with precursor RNA and [α-³²P]GTP as the source of guanosine, only the intron is isotopically labeled. Indeed, the two additional introns in T4 pre-mRNA were detected by this method (Gott et al., 1986). RNA was extracted from cells infected with SPO1 at 3, 10, and 20 min after infection - corresponding to the
Figure 2. In Vitro Transcription of Cloned SPO1 DNA
(A) Schematic diagram of cloned SPO1 DNA in plasmid pHaEP1. The T7 promoter is indicated by an open arrow. Downward arrows show locations of the 5' and 3' splice sites. The box represents a 174 codon ORF identified by DNA sequencing (see Figure 3). Enzymes SnaBI and HindIII were used for linearization of template DNA.
(B) In vitro transcriptions were carried out using T7 RNA polymerase in the presence of [α-³²P]UTP. Transcription products were separated on a 5% acrylamide-8 M urea gel and visualized by autoradiography. Species resulting from exon ligation (EI-EII) and from cleavage at the 5' splice site alone (INTRON-EII) are present in addition to an intron species of the approximate size seen in Figure 1.
early, middle, and late transcription classes (Gage and Geiduschek, 1971) - and exposed to [α-³²P]GTP under conditions that promote the splicing of the T4 introns. Electrophoresis in a polyacrylamide gel under denaturing conditions followed by autoradiography revealed a single labeled species in the 10 min RNA that comigrated with a DNA marker of 881 nucleotides (Figure 1). Similar labeled species could not be detected in RNA extracted from uninfected cells, or at 3 or 20 min after infection.

When end-labeled RNA was used as a probe for Southern blot hybridization of SPO1 DNA, the putative intron was localized within the 8.8 kb EcoRI-9 restriction fragment. Dot hybridization with plasmid DNA containing subfragments of EcoRI-9 further localized the labeled RNA to a 2.8 kb EcoRI-PvuII fragment (Goodrich et al., 1989). When this EcoRI-PvuII fragment was cloned downstream of a promoter for phage T7 RNA polymerase, transcription products from both HindIII- and SnaBI-truncated DNA displayed species consistent with self-splicing (Figure 2). For example, in the reaction using HindIII-truncated DNA, comparison with denatured DNA size standards gave sizes of 0.9 and 1.1 kb for the intron and ligated exon (EI-EII) species, respectively, in good agreement with a primary transcript of 1.9 kb. Species corresponding to precursor RNA, the intron-3' exon intermediate (poorly resolved in reactions with HindIII-truncated DNA), ligated exon, and intron can be seen in both reactions. Thus, the entire self-splicing intron is contained on the EcoRI-SnaBI subfragment, whose sequence (determined by the dideoxy chain termination method) is shown in Figure 3.

Figure 3. Sequence of DNA Cloned into pHaEP1
[Figure 3 image: nucleotide sequence with the inferred amino acid translations of the exons and of the intron ORF; the sequence is not reproduced here.]
The sequence of the 1156 bp EcoRI-SnaBI portion of pHaEP1 is shown. Numbers refer to nucleotides beginning at the EcoRI site. Arrows indicate the 5' and 3' splice sites of the 882 nucleotide intron. After splicing, the exons comprise a continuous ORF. An ORF of 174 amino acids is completely contained within the boundaries of the intron. A potential B. subtilis ribosome initiation sequence is underlined. The nucleotide sequence data reported will appear in the EMBL, GenBank, and DDBJ nucleotide sequence data bases.
The Nucleotide Sequence of the Intron

RNAs corresponding to the primary transcript of HindIII-truncated template and the corresponding ligated exon species (as in Figure 2, but transcribed in the absence of isotopic label) were isolated from a gel, and the sequence near the putative splice site was determined by reverse transcription (Figure 4). The sequence of precursor RNA is identical to that of the DNA, but 882 nucleotides have been removed from the ligated exon species, joining residues 135 and 1018. As is the case for other group I introns, the 5' splice site occurs after a U and the 3' splice site after a G (Cech, 1988). Interestingly, like the T4 introns, the SPO1 intron has an ORF contained entirely within its boundaries (Figure 3).
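The junction coordinates quoted above can be checked with simple arithmetic. The following is a minimal sketch, not part of the original analysis: the function and constant names are hypothetical, and the numbers are taken from the text (residues 135 and 1018 joined by splicing, an intron ORF of 522 nucleotides, and one noncoded G added to the excised intron by the group I mechanism).

```python
def intron_length(last_exon1_residue: int, first_exon2_residue: int) -> int:
    """Count the residues lying strictly between the two joined exon residues."""
    return first_exon2_residue - last_exon1_residue - 1


# Values quoted in the text (numbering starts at the EcoRI site).
LAST_EXON_I_RESIDUE = 135
FIRST_EXON_II_RESIDUE = 1018
INTRON_ORF_NT = 522

length = intron_length(LAST_EXON_I_RESIDUE, FIRST_EXON_II_RESIDUE)

# The excised intron carries one extra, noncoded G at its 5' end.
excised_rna_length = length + 1

# An ORF length that is a multiple of three corresponds to whole codons.
codons, leftover = divmod(INTRON_ORF_NT, 3)

print(f"intron: {length} nt, excised RNA: {excised_rna_length} nt")
print(f"intron ORF: {codons} codons, {leftover} nt left over")
```

Under these assumptions the script reproduces the 882 nucleotide intron and the 174 codon ORF stated in the text and in the legend to Figure 2.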
Figure 4. Sequence of Splice Junction
HindIII-truncated pHaEP1 DNA was transcribed with T7 RNA polymerase. Pre-mRNA (A) and ligated exon species (B) were isolated from a 5% acrylamide-8 M urea gel. An oligonucleotide complementary to sequences in exon II was used for reverse transcriptase extension. Products of sequencing were separated on an 8% acrylamide-8 M urea gel (lanes A, G, C, and U in each panel). In (A) and (B) the sequences are identical (3' to 5') until nucleotide 1018 (indicated by an arrow). In ligated exon RNA this nucleotide is joined to nucleotide 135, defining the limits of an 882 nucleotide intron.
An AUG codon preceded by a long (10 nucleotide) sequence complementary to the 3' end of B. subtilis 16S RNA implies that this 174 codon ORF can be efficiently translated in vivo (Hager and Rabinowitz, 1985).

Secondary Structure of the Intron

Group I introns have a common core structure, comprising both local and long-range helical pairing regions, with highly conserved primary sequence elements present at defined locations (Michel et al., 1982; Davies et al., 1982). The SPO1 intron can be folded into a structure that contains all of the elements common to group I introns (Figure 5). The common base-paired regions (P1-P9) as well as conserved sequence elements (P, Q, R, S) are all present. The R and S sequences (which tend to be the most highly conserved) match the consensus at 12/13 and 12/12 positions, respectively (Cech, 1988). The intron contains both the extra nucleotides (between the R sequence and the 3' portion of P3) and the variations in the conserved primary sequence elements that are typical of subgroup IA (Michel et al., 1982; Cech, 1988).

The 5' splice site is located in P1, after a U residue that pairs with a G, as is the case in most other group I introns. In addition, the intron 3'-terminal G is followed by three residues capable of Watson-Crick base pairing (P10) with residues near the intron 5' terminus (Figure 5). Thus, the 3' portion of P1 fulfills the conditions for a putative "internal guide sequence," aligning the 3' and 5' splice sites for exon ligation (Davies et al., 1982).

The position of the ORF within the secondary structure is also analogous to that in the T4 introns: it begins in a loop but overlaps at its 3' end with a highly conserved structural element (Shub et al., 1988). The SPO1 intron ORF begins within the loop of P8, but its last three codons comprise the 3' portion of P8.
Upon reaching the UGA termination codon, a translating ribosome would disrupt both the P7 and P8 structural elements, with obvious deleterious effects on the conformation required for splicing (Shub et al., 1987; Gott et al., 1988). Translation of the T4 intron ORFs is prevented by formation of a helix between upstream sequences and the ORF ribosome initiation sites (Gott et al., 1988). No corresponding translational modulation can be inferred from the sequence surrounding the SPO1 intron ORF, and it will be of interest to determine whether the intact intron also serves as mRNA for this protein.

Two features of the proposed secondary structure are unusual. First, although group IA introns typically contain additional nucleotides between the 5' portion of P7 and the 3' portion of P3, the extra nucleotides can usually be folded into one or two (as for the T4 introns) stable stem and loop structures. This region of the SPO1 intron contains 65 nucleotides, only 25 of which can be organized into a continuous base-paired stem with a single loop. Although it is likely that the remainder of this region is also highly structured, we cannot predict an obvious secondary structure from the primary sequence.

The second unusual feature of the structure lies between the 5' portion of P3 and P4. In most introns this region comprises a few (typically three or four) nucleotides. The SPO1 intron, however, contains 69 nucleotides in this interval, most of which can be folded into three thermodynamically stable stems and loops (Figure 5). Only three other group I introns, none of them from group IA, contain extra nucleotides in this region. In all of these cases, the extra residues can be folded into two highly stable stem-loop arrangements (Trinkl and Wolf, 1986; Cummings et al., 1989).

Since there is no evidence, either from mutants or phylogenetic comparison, concerning these two unusual regions of the SPO1 intron, the structures drawn for them in Figure 5 must be considered tentative. Indeed, other potential base pairing interactions are consistent with alternative structures for the regions, including one in which the two regions interact with each other.

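The ribosome initiation sequence noted at the start of this section (a 10 nucleotide stretch complementary to the 3' end of 16S RNA) can be located computationally. The sketch below is illustrative only and rests on two assumptions that should be checked against primary sources: the 16S 3'-terminal sequence `RRNA_3P_END` is written from the commonly quoted B. subtilis terminus and is not given in this paper, and `UPSTREAM` is the stretch preceding the intron ORF's ATG as it could be read from the extracted Figure 3 text. All names are hypothetical.

```python
def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(dna))


def longest_shared_run(a: str, b: str) -> str:
    """Longest contiguous substring present in both strings (O(len(a)*len(b)))."""
    best = ""
    previous = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        current = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                current[j] = previous[j - 1] + 1
                if current[j] > len(best):
                    best = a[i - current[j]:i]
        previous = current
    return best


# ASSUMPTION: 3' end of B. subtilis 16S RNA, 5'->3', in the DNA alphabet.
RRNA_3P_END = "GATCACCTCCTTTCT"
# ASSUMPTION: sequence upstream of the intron ORF start codon, 5'->3'.
UPSTREAM = "TGAGATATGTATAAAACATAAGGAGGTGAAGAAT"

# A stretch of mRNA pairs with the rRNA where it matches the rRNA's reverse complement.
site = longest_shared_run(UPSTREAM, reverse_complement(RRNA_3P_END))
print(site, len(site))
```

With these inputs the longest complementary stretch is 10 nucleotides long, in line with the length quoted in the text. A longest-common-substring search ignores G-U pairs and bulges, so it is a conservative stand-in for a real duplex-stability calculation.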
The Intron Is in the SPO1 Gene Encoding DNA Polymerase

A search of protein sequence data bases with the sequence analysis program FASTA (Pearson and Lipman, 1988) revealed a significant similarity (40% identity) between the inferred amino acid sequence of the exons and a region in the carboxy-terminal half of E. coli DNA polymerase I (Figure 6). DNA polymerase of SPO1 is encoded by gene 31 (De Antoni et al., 1985), and marker rescue recombination has shown that part of gene 31 is contained on restriction fragment EcoRI-23, which is adjacent to EcoRI-9 (Curran and Stewart, 1985). We have sequenced the DNA that surrounds the exon sequences presented here (Scarlato et al., unpublished data). One continuous ORF, including all of EcoRI-23, terminates just beyond the HindIII site of EcoRI-9 and encodes a protein of 924 amino acids, in good agreement with the apparent molecular size of 100-105 kd calculated from the electrophoretic mobility of the product of gene 31 (De Antoni et al., 1985).

Figure 5. Secondary Structure of the Intron
The proposed secondary structure of the SPO1 gene 31 intron is shown. Bold arrows indicate the 5' and 3' splice sites, determined by RNA sequencing of the splice junction (Figure 4). Lowercase letters denote exon sequences. Phylogenetically conserved secondary structure elements are labeled with circles. These include the P4 pairing of parts of the P and Q sequences and the P7 pairing of parts of the R and S sequences. The solid bars identify nucleotides that could form P10, bringing the 5' and 3' splice sites into close proximity. Stop codons of ORFs are boxed.
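The agreement between the 924 amino acid ORF and the 100-105 kd apparent size can be illustrated with a back-of-the-envelope estimate. This sketch assumes an average residue mass of about 110 daltons, a common rule of thumb that is not taken from this paper; an exact value would require the actual amino acid composition. The names are hypothetical.

```python
# ASSUMPTION: rule-of-thumb average mass of one amino acid residue, in daltons.
AVERAGE_RESIDUE_MASS_DA = 110.0


def approximate_mass_kd(residue_count: int) -> float:
    """Rough protein mass in kilodaltons from its length in residues."""
    return residue_count * AVERAGE_RESIDUE_MASS_DA / 1000.0


GENE_31_RESIDUES = 924  # ORF length reported in the text

mass_kd = approximate_mass_kd(GENE_31_RESIDUES)
print(f"estimated mass: {mass_kd:.1f} kd")
```

Under this assumption the estimate falls inside the 100-105 kd range reported from electrophoretic mobility.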
Discussion
A Novel Intron in Bacteriophage SPO1

Introns in eubacterial DNA have been limited, until now, to the three related group IA introns of bacteriophage T4. It has been impossible to determine whether these introns are relics of an ancient gene structure or examples of a relatively recent horizontal gene transfer. The SPO1 intron resembles the T4 introns in overall organization (e.g., it belongs to group IA and has an ORF that overlaps the conserved core) but has no discernible resemblance to the T4 sequences, either in the nucleotides of its catalytic core (besides the universally conserved sequences) or in the amino acids of its ORF. Recent acquisition of an intron in the unrelated B. subtilis bacteriophage SPO1 would require an independent transfer, presumably from a different eukaryote. Although not definitive, our data are at least consistent with a simpler interpretation: group I introns were already present approximately 1500 million years ago, in the common ancestors of both gram-positive and gram-negative eubacteria (Ochman and Wilson, 1987).

One hypothesis concerning the origin and evolutionary function of introns imputes a role for introns as sites of exchange of protein folding domains by recombination, the "exon shuffling" hypothesis (Gilbert, 1978; Gilbert et al., 1986). The SPO1 intron is inserted into the phage DNA polymerase gene in a region of high amino acid sequence similarity to E. coli DNA polymerase I. If the structures of these proteins are homologous, the site of intron insertion would separate two folding domains (β sheet 8 and α helix J in the DNA binding domain of the E. coli protein; Ollis et al., 1985), consistent with the exon shuffling hypothesis.

We searched for introns in SPO1 because of the great similarity in structure and mode of infection between the HMU phages of B. subtilis (typified by SPO1) and the HMC phages of E. coli (typified by T4). The surprising finding that SPO1 has an intron is consistent with the notion that these structures persist in bacteriophage genes due to some selective value.

Figure 6. Amino Acid Sequence Homology
[Figure 6 image: alignment of the gp31 exon 1 and exon 2 sequence (residues 1-50 and beyond) with E. coli DNA polymerase I (POL I, residues 630-710); the alignment is not reproduced here.]
The inferred amino acid sequence of ligated exons (gp31) was searched with FASTA for homology to other proteins. A 40% identity was found with a sequence near the carboxyl terminus of E. coli DNA polymerase I (POL I). Two dots represent identity; one dot indicates a conservative amino acid change.
Bacteriophage Introns Occur in Genes Affecting DNA Synthesis

Of the three T4 introns, two (nrdB and td) occur in genes of the same biosynthetic pathway, the conversion of ribonucleotides to deoxyribonucleotides. The function of the third T4 gene that contains an intron (sunY) is not known. In this context, it was astonishing that SPO1 not only has a self-splicing group I intron, but that the intron is also in a gene, DNA polymerase, that is involved in DNA replication.

It is striking that besides being in the same pathway, the nrdB and td genes specify the two steps in this pathway that consume reducing equivalents. A consequence of inefficient intron removal, therefore, would be a reduction in the levels of these two enzymes and a concomitant reduction of pool sizes of deoxyribonucleotides. It is noteworthy that in most organisms ribonucleoside diphosphate reductase activity is regulated by feedback inhibition by dATP (Eriksson and Sjöberg, 1989). An exception to this rule is bacteriophage T4, whose ribonucleotide reductase is not sensitive to dATP (Berglund, 1972). The presence of introns in the nrdB and td genes, therefore, presents an opportunity to control the flux of metabolites through the pathway by regulating the rate of enzyme synthesis rather than enzyme activity.

Reduced levels of SPO1 DNA polymerase, caused by inefficient splicing, would result in expanded pools of dNTPs. Assuming that B. subtilis ribonucleoside diphosphate reductase is sensitive to feedback inhibition, this would also lead to a lower rate of conversion of ribonucleotides to deoxyribonucleotides, with concomitant sparing of reducing equivalents.
A Model for Regulation of Splicing of Group I Introns

Virulent DNA bacteriophages encode enzymes for DNA synthesis that may not be sensitive to the same regulatory signals as their cellular counterparts. Since the synthesis of DNA precursors consumes both reducing equivalents and nucleotides required for synthesis of RNA and protein, production of more DNA than can be packaged could reduce the size of the burst under poor nutritional conditions. We suggest that splicing of bacteriophage introns may be regulated by interaction with a molecule that is a general indicator of the redox (or other nutritional) status of the infected cell.

This scheme would be consistent with another unexplained property of a phage DNA polymerase. As far as we know, phage T7 has no group I introns. However, T7 DNA polymerase is only active in a 1:1 complex with a host protein, thioredoxin (Modrich and Richardson, 1975). Remarkably, formation of the active complex requires the reduced (and not the oxidized) conformation of E. coli thioredoxin (Huber et al., 1986). Thus, T7 may have evolved an independent method of coupling DNA synthesis to the redox state of the infected cell.

Regulatory mechanisms used by prokaryotic viruses might also be present in eukaryotic organelles, which are themselves of prokaryotic origin. It is striking that group I "self-splicing" introns have been found in all chloroplasts of land plants investigated, as well as in the cyanelle of Cyanophora paradoxa, but these have been incapable of in vitro self-splicing (reviewed in Cech, 1988; Evrard et al., 1988). It would not be surprising to us if splicing in these organelles, which are so intimately concerned with redox balance, were tightly regulated in vivo. In chloroplasts, there are numerous examples of protein enzymes that are activated through interaction with reduced thioredoxin (reviewed in Buchanan, 1986; Knaff, 1989). It will be interesting to see whether interaction with redox indicators can similarly regulate the activity of ribozymes.

Experimental Procedures

Strains and Plasmids

B. subtilis 168 (trp-) was used for propagation of wild-type SPO1 bacteriophage. Bacteria were maintained as spores on potato dextrose agar (Difco), 42 g/liter plus 0.02 mg/ml trp, at room temperature. For infections, bacteria were grown at 37°C in NY broth (8 g of nutrient broth [Difco] and 5 g of yeast extract [Difco] per liter [pH 7.2]) supplemented with 5 mM MgSO₄ and 0.02 mM MnCl₂ after autoclaving. Plasmid pJK9 (provided by E. P. Geiduschek) contains the entire 6.8 kb EcoRI-9 restriction fragment of SPO1. A 2.8 kb EcoRI-PvuII fragment was subcloned into the polylinker of pUC8 (Goodrich et al., 1989) and was subsequently transferred as an EcoRI-PstI fragment to pBSM13+ (Stratagene) to create pHaEP1.

RNA Isolation and Intron Labeling

B. subtilis was grown at 37°C in M9S medium (Bolle et al., 1968) modified with 2 × 10⁻⁵ M MnCl₂ and 0.01 mg/ml L-tryptophan to a density of 1-2 × 10⁸ cells per ml. Cells were infected with SPO1 at a multiplicity of approximately 5. Aliquots were removed before and at various times after infection and placed on ice. RNA was isolated and incubated with [α-³²P]GTP (3000 Ci/mmol) as previously described (Gott et al., 1986).

In Vitro Transcription

pHaEP1 DNA was linearized with restriction endonucleases, and 1 µg was transcribed with T7 RNA polymerase (Stratagene) and 400 µM of each NTP in 25 µl. Reaction conditions were as previously described (Gott et al., 1986). Isotopically labeled RNA was prepared by lowering [UTP] to 20 µM and adding 20 µCi of [α-³²P]UTP (600 Ci/mmol).

DNA Sequencing

Cloned SPO1 DNA was sequenced in both directions by the dideoxy chain termination method (Sanger et al., 1977; Chen and Seeburg, 1985). The sequence was checked in its entirety by determining the sequence of single-stranded DNA obtained by asymmetrically amplifying SPO1 DNA with the polymerase chain reaction in vitro (Gyllensten and Erlich, 1986). DNA polymerase purified from Thermus aquaticus was obtained from US Biochemical Corp. and was used according to their specifications, with 1:100 primer ratios.

RNA Purification and Sequencing

Products from in vitro transcriptions lacking isotopic label were separated on a 5% acrylamide-8 M urea gel, and RNA species were visualized by staining with ethidium bromide. Species corresponding to premessage (unspliced) and ligated exons were cut from the gel, frozen at -60°C for 15 min, and crushed with a glass rod. The gel fragments were incubated at room temperature overnight in 0.9 ml of 10 mM Tris-HCl (pH 7.5), 1 mM EDTA, 30 mM NaCl, and 1% distilled phenol (Peebles et al., 1979). RNA was recovered by two ethanol precipitations and resuspended in H₂O treated with diethyl pyrocarbonate. RNA isolated in this way was used as a template for sequencing using an end-labeled oligonucleotide (5'-dCTAACTTGAGAGTAGTC-3', complementary to residues 1133-1148) and AMV reverse transcriptase as described by Belfort et al. (1985).

Acknowledgments

We thank Larry Gold for stimulating discussions and especially for reminding us that T7 DNA polymerase contains thioredoxin, William A. Goodrich for Figure 5, and Carole Keith for Figure 3 and careful preparation of the manuscript. V. S. gratefully acknowledges E. Peter Geiduschek for support (National Science Foundation grant PCM8317847; National Institutes of Health grant GM15880) and encouragement during the course of the work, and A. Cascino for exquisite hospitality. H. G.-B. was supported, in part, by a Burke Graduate Fellowship. Work in the laboratory of D. A. S. was supported by grants from the National Science Foundation (DMB6609066) and the National Institutes of Health (GM37746).

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 USC Section 1734 solely to indicate this fact.

Received July 3, 1990

References
Belfort, M., Pedersen-Lane, J., West, D., Ehrenman, K., Maley, G., Chu, F., and Maley, F. (1985). Processing of the intron-containing thymidylate synthase (td) gene of phage T4 is at the RNA level. Cell 41, 375-382.
Berglund, O. (1972). Ribonucleoside diphosphate reductase induced by bacteriophage T4. II. Allosteric regulation of substrate specificity and catalytic activity. J. Biol. Chem. 247, 7276-7281.
Bolle, A., Epstein, R. H., Salser, W., and Geiduschek, E. P. (1968). Transcription during bacteriophage T4 development: synthesis and relative stability of early and late RNA. J. Mol. Biol. 31, 325-348.
Buchanan, B. B. (1986). The ferredoxin/thioredoxin system. In Thioredoxin and Glutaredoxin Systems: Structure and Function, A. Holmgren, C.-I. Brändén, H. Jörnvall, and B.-M. Sjöberg, eds. (New York: Raven Press), pp. 233-242.
Cech, T. R. (1988). Conserved sequences and structures of group I introns: building an active site for RNA catalysis-a review. Gene 73, 259-271.
Cech, T. R., Zaug, A. J., and Grabowski, P. J. (1981). In vitro splicing of the ribosomal RNA precursor of Tetrahymena: involvement of a guanosine nucleotide in the excision of the intervening sequence. Cell 27, 487-496.
Chen, E. Y., and Seeburg, P. H. (1985). Supercoiling sequencing: a fast and simple method for sequencing plasmid DNA. DNA 4, 165-170.
Chu, F. K., Maley, G. F., Maley, F., and Belfort, M. (1984). An intervening sequence in the thymidylate synthase gene of bacteriophage T4. Proc. Natl. Acad. Sci. USA 81, 3149-3153.
Cummings, D. J., Michel, F., and McNally, K. L. (1989). DNA sequence analysis of the 24.5 kilobase pair cytochrome oxidase subunit I mitochondrial gene from Podospora anserina: a gene with sixteen introns. Curr. Genet. 16, 381-406.
Curran, J. F., and Stewart, C. R. (1985). Cloning and mapping of the SPO1 genome. Virology 142, 78-97.
Daniels, C. J., Gupta, R., and Doolittle, W. F. (1985). Transcription and excision of a large intron in the tRNATrp gene of an archaebacterium, Halobacterium volcanii. J. Biol. Chem. 260, 3132-3134.
Darnell, J. E., and Doolittle, W. F. (1986). Speculations on the early course of evolution. Proc. Natl. Acad. Sci. USA 83, 1271-1275.
Davies, R. W., Waring, R. B., Ray, J. A., Brown, T. A., and Scazzocchio, C. (1982). Making ends meet: a model for RNA splicing in fungal mitochondria. Nature 300, 719-724.
De Antoni, G. L., Besso, N. E., Zanassi, G. E., Sarachu, A. N., and Grau, O. (1985). Bacteriophage SPO1 DNA polymerase and the activity of viral gene 31. Virology 143, 16-22.
Ehrenman, K., Pedersen-Lane, J., West, D., Herman, R., Maley, F., and Belfort, M. (1986). Processing of phage T4 td-encoded RNA is analogous to the eukaryotic group I splicing pathway. Proc. Natl. Acad. Sci. USA 83, 5875-5879.
Eriksson, S., and Sjöberg, B.-M. (1989). Ribonucleotide reductase. In Allosteric Enzymes, G. Hervé, ed. (Boca Raton, Florida: CRC Press), pp. 189-215.
Evrard, J.-L., Kuntz, M., Straus, N. A., and Weil, J.-H. (1988). A class-I intron in a cyanelle tRNA gene from Cyanophora paradoxa: phylogenetic relationship between cyanelles and plant chloroplasts. Gene 71, 115-122.
Gage, L. P., and Geiduschek, E. P. (1971). RNA synthesis during bacteriophage SPO1 development: six classes of SPO1 RNA. J. Mol. Biol. 57, 279-300.
Gilbert, W. (1978). Why genes in pieces? Nature 271, 501.
Gilbert, W., Marchionni, M., and McKnight, G. (1986). On the antiquity of introns. Cell 46, 151-153.
Goodrich, H. A., Gott, J. M., Xu, M.-Q., Scarlato, V., and Shub, D. A. (1989). A group I intron in Bacillus subtilis bacteriophage SPO1. In Molecular Biology of RNA, T. R. Cech, ed. (New York: Alan R. Liss), pp. 59-66.
Gott, J. M., Shub, D. A., and Belfort, M. (1986). Multiple self-splicing introns in bacteriophage T4: evidence from autocatalytic GTP labeling of RNA in vitro. Cell 47, 81-87.
Gott, J. M., Zeeh, A., Bell-Pedersen, D., Ehrenman, K., Belfort, M., and Shub, D. A. (1988). Genes within genes: independent expression of phage T4 intron open reading frames and the genes in which they reside. Genes Dev. 2, 1791-1799.
Gyllensten, U., and Erlich, H. A. (1988). Generation of single-stranded DNA by the polymerase chain reaction and its application to direct sequencing of the HLA-DQA locus. Proc. Natl. Acad. Sci. USA 85, 7652-7656.
Hager, P. W., and Rabinowitz, J. C. (1985). Translational specificity in Bacillus subtilis. In The Molecular Biology of the Bacilli, Vol. 2, D. A. Dubnau, ed. (Orlando, Florida: Academic Press), pp. 1-32.
Huber, H. E., Russel, M., Model, P., and Richardson, C. C. (1986). Interaction of mutant thioredoxins of Escherichia coli with the gene 5 protein of phage T7. J. Biol. Chem. 261, 15006-15012.
Jones, D., and Sneath, P. H. A. (1970). Genetic transfer and bacterial taxonomy. Bacteriol. Rev. 34, 40-81.
Kaine, B. P., Gupta, R., and Woese, C. R. (1983). Putative introns in tRNA genes of prokaryotes. Proc. Natl. Acad. Sci. USA 80, 3309-3312.
Kjems, J., and Garrett, R. A. (1985). An intron in the 23S ribosomal RNA gene of the archaebacterium Desulfurococcus mobilis. Nature 318, 675-677.
Knaff, D. B. (1989). The regulatory role of thioredoxin in chloroplasts. Trends Biochem. Sci. 14, 433-434.
Michel, F., and Dujon, B. (1986). Genetic exchanges between bacteriophage T4 and filamentous fungi? Cell 46, 323.
Michel, F., Jacquier, A., and Dujon, B. (1982). Comparison of fungal mitochondrial introns reveals extensive homologies in RNA secondary structure. Biochimie 64, 867-881.
Modrich, P., and Richardson, C. C. (1975). Bacteriophage T7 deoxyribonucleic acid replication in vitro. J. Biol. Chem. 250, 5515-5522.
Mosig, G., and Eiserling, F. (1988). Phage T4 structure and metabolism. In The Bacteriophages, Vol. 2, R. Calendar, ed. (New York: Plenum Publishing Corp.), pp. 521-606.
Ochman, H., and Wilson, A. C. (1987). Evolution in bacteria: evidence for a universal substitution rate in cellular genomes. J. Mol. Evol. 26, 74-86.
Ollis, D. L., Brick, P., Hamlin, R., Xuong, N. G., and Steitz, T. A. (1985). Structure of large fragment of Escherichia coli DNA polymerase I complexed with dTMP. Nature 313, 762-766.
Pearson, W. R., and Lipman, D. J. (1988). Improved tools for biological sequence comparison. Proc. Natl. Acad. Sci. USA 85, 2444-2448.
Pedersen-Lane, J., and Belfort, M. (1987). Variable occurrence of the nrdB intron in the T-even phages suggests intron mobility. Science 237, 182-184.
Peebles, C. L., Ogden, R. C., Knapp, G., and Abelson, J. (1979). Splicing of yeast tRNA precursors: a two-stage reaction. Cell 18, 27-35.
Quirk, S. M., Bell-Pedersen, D., and Belfort, M. (1989a). Intron mobility in the T-even phages: high frequency inheritance of group I introns promoted by intron open reading frames. Cell 56, 455-465.
Quirk, S. M., Bell-Pedersen, D., Tomaschewski, J., Rüger, W., and Belfort, M. (1989b). The inconsistent distribution of introns in the T-even phages indicates recent genetic exchanges. Nucl. Acids Res. 17, 301-315.
Sanger, F., Nicklen, S., and Coulson, A. R. (1977). DNA sequencing with chain termination inhibitors. Proc. Natl. Acad. Sci. USA 74, 5463-5467.
Shub, D. A., Xu, M.-Q., Gott, J. M., Zeeh, A., and Wilson, L. D. (1987). A family of autocatalytic group I introns in bacteriophage T4. Cold Spring Harbor Symp. Quant. Biol. 52, 193-200.
Shub, D. A., Gott, J. M., Xu, M.-Q., Lang, B. F., Michel, F., Tomaschewski, J., Pedersen-Lane, J., and Belfort, M. (1988). Structural conservation between three homologous introns of phage T4 and the group I introns of eukaryotes. Proc. Natl. Acad. Sci. USA 85, 1151-1155.
Sjöberg, B.-M., Hahne, S., Mathews, C. Z., Mathews, C., Rand, K. N., and Gait, M. J. (1986). The bacteriophage T4 gene for the small subunit of ribonucleotide reductase contains an intron. EMBO J. 5, 2031-2036.
Stewart, C. (1988). Bacteriophage SPO1. In The Bacteriophages, Vol. 1, R. Calendar, ed. (New York: Plenum Publishing Corp.), pp. 477-515.
Trinkl, H., and Wolf, K. (1986). The mosaic cox1 gene in the mitochondrial genome of Schizosaccharomyces pombe: minimal structural requirements and evolution of group I introns. Gene 45, 289-297.
Waring, R. B., and Davies, R. W. (1984). Assessment of a model for intron RNA secondary structure relevant to RNA self-splicing-a review. Gene 28, 277-291.
Xu, M.-Q., and Shub, D. A. (1989). The catalytic core of the sunY intron of bacteriophage T4. Gene 82, 77-82.
Zaug, A. J., and Cech, T. R. (1982). The intervening sequence excised from the ribosomal RNA precursor in nuclei of Tetrahymena contains a 5′-terminal guanosine residue not encoded by the DNA. Nucl. Acids Res. 10, 2823-2838.

GenBank Accession Number
The accession number for the sequence reported in this paper is M37686.