Gene, 118 (1992) 109-113
© 1992 Elsevier Science Publishers B.V. All rights reserved. 0378-1119/92/$05.00

The DNA polymerase-encoding gene of Bacillus subtilis bacteriophage SPO1

(Recombinant DNA; replication; amino acid sequence; sequence homology; self-splicing group-I intron)

Vincenzo Scarlato and Silvana Gargano

SUMMARY

The bacteriophage SPO1 DNA polymerase-encoding gene, which contains a self-splicing intron, has been sequenced and its amino acid (aa) sequence has been deduced. The aa sequence of SPO1 DNA polymerase shows a high degree of similarity with that of DNA polymerase I from Escherichia coli (PolI). Alignment of the sequences of PolI and of the φ29 and SPO1 DNA polymerases indicates that the aa residues that have been implicated in 3'→5' exonuclease activities are conserved.

Correspondence to: Dr. V. Scarlato at his permanent address: Immunobiological Research Institute Siena (I.R.I.S.), Via Fiorentina 1, 53100 Siena, Italy. Tel. (39-577)293239; Fax (39-577)293561.

Abbreviations: aa, amino acid(s); bp, base pair(s); dNTP, deoxynucleoside triphosphate; gp, gene product(s); HMU, hydroxymethyluracil; kb, kilobase(s) or 1000 bp; nt, nucleotide(s); ORF, open reading frame; PolI, E. coli DNA polymerase I; PolIk, Klenow (large) fragment of E. coli DNA polymerase I; RBS, ribosome-binding site(s); SDS, sodium dodecyl sulfate; ::, novel junction.

INTRODUCTION

During Bacillus subtilis bacteriophage SPO1 development, successive phage-coded modifications of the host transcriptional apparatus control three phases of gene expression: early, middle, and late. This temporal pattern of SPO1 transcription is controlled by three regulatory genes (genes 28, 33 and 34), whose products are RNA polymerase-binding proteins, and by a fourth gene (gene 27) that codes for a protein involved in replication (Heintz and Shub, 1982; Greene et al., 1982; Geiduschek and Ito, 1982). Upon infection, early expression starts within 1 min, middle expression starts within 3-4 min (at 37°C), whereas late expression starts at the onset of DNA replication, which occurs at 10 min. Phage SPO1 DNA synthesis depends on the activity of nine viral gene products (Okubo et al., 1972). Mutants in genes 21-23 and 27-32 fail to synthesize phage DNA under nonpermissive conditions. The gene 28 protein (gp28) is a member of the σ-family of transcriptional initiation factors (Gribskov and Burgess, 1986). Middle transcription is driven by gp28 attached to the RNA polymerase core. SPO1 DNA contains HMU in place of thymine, and the products of genes 23 and 29 probably represent enzymes involved in HMU biosynthesis (Glassberg et al., 1977a,b). It has been proposed that the products of genes 21 and 32 could be involved in the initiation step of SPO1 DNA replication, whereas gp22, gp30 and gp31 would be required for elongation (Glassberg et al., 1977a,b). Sequence analysis of gene 30 revealed similarity with the exonuclease of bacteriophage T4, gp46, which is involved in DNA replication, and with the replication protein P of phage λ (Scarlato and Sayre, 1992). Biochemical studies have provided evidence that the bacteriophage SPO1 DNA polymerase is coded for by gene 31 (De Antoni et al., 1985). This DNA polymerase is unusual in that it is the first DNA polymerase whose gene was found to contain a self-splicing group I intron (Goodrich-Blair et al., 1990; Goodrich et al., 1989).

The aim of the present study was to determine the nt and the aa sequences of SPO1 DNA polymerase, and to compare the aa sequence with those of other DNA polymerases.

Fig. 1. Physical location of SPO1 DNA polymerase-encoding gene 31. (Upper part) EcoRI restriction map of the approx. 14?-kb SPO1 genome. (Lower part) Expansion of the SPO1 region coding for gene 31. Open boxes represent gene 31 exons and the shaded area the intron. Transcription is rightward. Locations of the mapped middle promoters PM 8-10 are indicated by upward arrows. The 1632-bp EcoRI-23 fragment was purified from a λ recombinant phage (provided by E.P. Geiduschek, University of California at San Diego, La Jolla, CA), and cloned into the plasmid vector pUC18. The 2800-bp EcoRI-PvuII fragment, adjacent to the EcoRI-23 fragment, was cloned in pUC8 as described (Goodrich et al., 1989; Goodrich-Blair et al., 1990). A set of subclones was generated by using suitable internal restriction sites, or by using BAL 31 nuclease digestions. Bacterial strains used for propagation and isolation of plasmid DNA were E. coli HB101 and DH5.

EXPERIMENTAL AND DISCUSSION

(a) The nt and predicted aa sequences of SPO1 DNA polymerase

Marker rescue of an amber mutation in gene 31 was observed upon recombination with a plasmid containing the restriction fragment EcoRI-23 (Curran and Stewart, 1985). This DNA fragment (Fig. 1) is adjacent to the larger SPO1 genomic fragment EcoRI-9 in which a self-splicing group-I intron was identified (Goodrich-Blair et al., 1990; Goodrich et al., 1989). Analysis of the nt sequence of EcoRI-23, and of the flanking restriction fragments, shows an ORF that starts just downstream from the middle promoter PM8 (Greene et al., 1984; Scarlato et al., 1991) and terminates within the unspliced intron (Goodrich-Blair et al., 1990). A separate 522-bp ORF lies within the intron. Ligation of the spliced message brings together two exons that constitute the DNA polymerase gene (Fig. 1). The nt sequence of the SPO1 DNA polymerase gene and the flanking sequences are shown in Fig. 2. The calculated Mr of the 924-aa sequence predicted for the SPO1 DNA polymerase is 106808, which is in agreement with the reported size (105 kDa) estimated by SDS-polyacrylamide gel electrophoresis (De Antoni et al., 1985). This protein contains 23% acidic, 16% basic, and 35% hydrophobic aa residues. The start codon ATG for the SPO1 DNA polymerase gene is preceded by a region complementary to the 3' terminus of B. subtilis 16S rRNA (Moran et al., 1982). A similar RBS has been found to overlap the stop codon of the gene 31 ORF. Eight bp downstream from this RBS there is a possible start codon for an unidentified ORF.

(b) Comparison of aa sequences between phage SPO1 DNA polymerase and PolI of Escherichia coli

A search of the Genetics Computer Group data base (Devereux et al., 1984) showed the SPO1 DNA polymerase to be strongly homologous to PolI. The aa sequence alignment of SPO1 DNA polymerase and PolI (Fig. 3) showed that 53.04% of aa residues are conserved (28.73% identity). The majority of the identical residues (total of 36.8%) were found in the C-terminal portion of the two proteins. In this region, residues Lys758, Tyr766 and His881 of PolI were identified as being implicated in dNTP-binding (Joyce and Steitz, 1987; Pandey et al., 1987). These three residues are conserved in SPO1 DNA polymerase (Fig. 3) as well as in bacteriophages T5 and T7 DNA polymerases (Leavitt and Ito, 1989). Comparison between a number of DNA polymerase aa sequences has suggested that there may be several distinct evolutionary groups of DNA polymerases. Based on primary structural similarity, the SPO1 DNA polymerase belongs to a group that includes PolI, bacte-

Fig. 2. The nt and aa sequences of the DNA polymerase gene. Sequences from bp 1-90 and from 1715-1989 were taken from Greene et al. (1984) and Goodrich-Blair et al. (1990), respectively. These sequences and the sequence of the EcoRI-23 fragment were assembled to yield the sequence of gene 31. The -10 and -35 DNA regions of the PM8 middle promoter (see Fig. 1) are underlined. RBS are overlined. The junction of exon 1 and exon 2 is indicated by double colons (::). Both strands were sequenced by the method of Sanger et al. (1977) either using PolIk (Boehringer Mannheim Biochemicals), or Sequenase (US Biochemical Corp., Cleveland, OH). In the cases in which the generated subclones were too long or the sequence resolution was poor, the DNA sequencing was carried out with synthetic oligodeoxynucleotides as primers (Applied Biosystems Synthesizer). DNA manipulations were performed by standard methods (Maniatis et al., 1982). These sequence data will appear in the GenBank Nucleotide Sequence Database under accession No. M84415.
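Section (a) quotes a calculated Mr and the percentages of acidic, basic and hydrophobic residues for the deduced 924-aa sequence. The following is a minimal sketch of how such figures can be derived from any deduced aa sequence; it is not the authors' procedure. The residue classes (acidic D/E, basic K/R/H, hydrophobic A/V/L/I/M/F/W/P) are an assumption, since the text does not state which residues were counted in each class, and the function names and the example peptide are hypothetical.

```python
from collections import Counter

# Average masses (Da) of amino acid residues, i.e. the free amino acid minus one water.
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # one water is added back for the free N- and C-termini

# Assumed residue classes; the paper does not define them.
RESIDUE_CLASSES = {
    "acidic": set("DE"),
    "basic": set("KRH"),
    "hydrophobic": set("AVLIMFWP"),
}


def molecular_mass(protein: str) -> float:
    """Return the average molecular mass (Da) of an unmodified polypeptide."""
    protein = protein.upper()
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in protein) + WATER_MASS


def class_percentages(protein: str) -> dict:
    """Return the percentage of residues falling in each assumed class."""
    protein = protein.upper()
    counts = Counter(protein)
    total = len(protein)
    return {
        name: 100.0 * sum(counts[aa] for aa in members) / total
        for name, members in RESIDUE_CLASSES.items()
    }


if __name__ == "__main__":
    peptide = "MKDELAVR"  # hypothetical 8-aa peptide, not taken from gene 31
    print(round(molecular_mass(peptide), 2))
    print(class_percentages(peptide))
```

Applied to the full deduced sequence (available under the accession number given in the Fig. 2 legend), the same two functions would give values comparable with those quoted in section (a), subject to the class definitions assumed here.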

Fig. 3. Alignment of the aa sequences of SPO1 DNA polymerase and PolI.
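Section (b) reports the percentage of conserved and of identical residues in the alignment of SPO1 DNA polymerase with PolI. The sketch below shows one way such percentages can be counted from a gapped pairwise alignment. It makes several assumptions that the text does not confirm: columns containing a gap are excluded from the denominator, "conserved" means identical or belonging to the same similarity group, and the similarity groups listed are illustrative only. The function name and the example alignment are hypothetical.

```python
# Illustrative similarity groups (an assumption; the paper does not list its criteria).
SIMILAR_GROUPS = ["ILVM", "FYW", "KRH", "DE", "NQ", "ST", "AG"]
GROUP_OF = {aa: index for index, group in enumerate(SIMILAR_GROUPS) for aa in group}


def alignment_stats(seq_a: str, seq_b: str, gap: str = "-") -> dict:
    """Count identical and conserved columns in two gapped, aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")

    compared = identical = similar = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # gapped columns are not scored
        compared += 1
        if a == b:
            identical += 1
        elif a in GROUP_OF and GROUP_OF.get(a) == GROUP_OF.get(b):
            similar += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")

    return {
        "compared": compared,
        "identity": 100.0 * identical / compared,
        "conserved": 100.0 * (identical + similar) / compared,
    }


if __name__ == "__main__":
    # Hypothetical 8-column alignment, not taken from Fig. 3.
    print(alignment_stats("MKDE-LAV", "MRDQALGV"))
```

Different choices of denominator (for example, the length of the shorter protein) or of similarity groups change the percentages, so values obtained this way are only comparable when the same conventions are used.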
