Cell, Vol. 62, 629-637, August 24, 1990, Copyright © 1990 by Cell Press

The IN Protein of Moloney Murine Leukemia Virus Processes the Viral DNA Ends and Accomplishes Their Integration In Vitro

Robert Craigie, Tamio Fujiwara, and Frederic Bushman
Laboratory of Molecular Biology
National Institute of Diabetes and Digestive and Kidney Diseases
Bethesda, Maryland 20892

Retroviral DNA integration involves a coordinated set of DNA cutting and joining reactions. We find that the IN protein of Moloney murine leukemia virus (MoMLV) is the only viral protein required to accomplish these reactions in vitro. IN protein has a site-specific nuclease activity that cleaves 2 nucleotides from the sequence present at the 3' ends of MoMLV DNA made by reverse transcription. This reaction generates the recessed 3' ends that are the normal precursors for integration. IN protein also possesses the integration activity that joins these recessed 3' ends of the viral DNA to a staggered cut, made by IN protein, in the target DNA. Short duplex oligonucleotides, corresponding to the ends of MoMLV DNA, serve as the viral DNA substrate for both the cleavage and integration reactions; there are no special requirements for the DNA that acts as the target for integration. The reaction products are detected by a direct physical assay.

Introduction

After infection of a sensitive cell, the retroviral RNA genome is reverse transcribed to make a DNA copy. Integration of this viral DNA into a chromosome of the host cell is necessary for normal viral replication. Transcription of the integrated DNA produces viral RNA species that function as the template for translation of viral proteins or as the genome of progeny virions. Assembled virions bud from the cell membrane, and their entry into another cell completes the viral replication cycle. For a recent review of retroviral replication, see Varmus and Brown (1989).

Genetic studies have identified two classes of mutations that directly affect integration.
Mutations mapping near the 3' end of the pol gene of Moloney murine leukemia virus (MoMLV) abolish integration, even though reverse transcription proceeds normally to make the viral DNA (Donehower and Varmus, 1984; Schwartzberg et al., 1984); this region of the MoMLV pol gene encodes the 46 kd IN protein, which is generated by proteolytic processing of a polyprotein precursor (Tanese et al., 1986). The other class of mutations that impair integration are changes in the DNA sequence present at each end of unintegrated MoMLV DNA (Colicelli and Goff, 1985, 1988). These findings are consistent with the view that an interaction between IN protein and the ends of the viral DNA is important for integration. Similar results have been obtained with other retroviruses (for recent reviews of retroviral DNA integration see Skalka, 1988; Varmus and Brown, 1989; Grandgenett and Mumm, 1990).

The endogenous viral DNA made by reverse transcription exists as part of a large nucleoprotein complex, derived from the viral core (Brown et al., 1987; Bowerman et al., 1989). The ends of the viral DNA isolated from these complexes are of two forms: some are blunt and correspond to the expected full-length product of reverse transcription, and others are recessed by 2 bases at the 3' end (Fujiwara and Mizuuchi, 1988; Brown et al., 1989). The MoMLV IN function is implicated in generating these recessed 3' ends because several mutant viruses defective in IN make only the blunt-ended viral DNA (Roth et al., 1989; Brown et al., 1989).

The endogenous viral DNA can be integrated into a target DNA in vitro, and the necessary protein factors copurify with the complex (Brown et al., 1987; Bowerman et al., 1989). Analysis of the structure of the MoMLV integration intermediate, made in this cell-free system, has demonstrated that the viral DNA species with recessed 3' ends is the precursor for integration (Fujiwara and Mizuuchi, 1988; Brown et al., 1989). To form the intermediate, these recessed 3' ends of the viral DNA are joined to the 5' ends of a double-strand cut made in the target DNA; 4 bp staggered cleavage of the target DNA, with a 5' protrusion, is inferred from the characteristic 4 bp duplication of target DNA sequence at the site of MoMLV integration. The 5' ends of the viral DNA remain unjoined in the intermediate, and a DNA repair step is necessary to complete the integration process.

* Present address: Shionogi Institute for Medical Science, Osaka 566, Japan.
As a first step toward defining the protein factors needed for retroviral DNA integration, and analyzing their biochemical activities, we previously developed an in vitro reaction system for integration of an exogenously added mini-MoMLV DNA with recessed 3' ends that mimic those of the authentic integration precursor (Fujiwara and Craigie, 1989). The mini-MoMLV DNA also carried antibiotic resistance markers, enabling the integration products with phage lambda target DNA to be detected by selection in E. coli. Detergent-disrupted MoMLV virions provided the necessary viral proteins. Here we demonstrate by a direct physical assay that the IN protein of MoMLV is the only viral protein necessary both to generate the recessed 3' ends of MoMLV DNA and to accomplish their integration in vitro.

Results

Expression and Partial Purification of MoMLV IN Protein

We have expressed the MoMLV IN protein in Spodoptera frugiperda (Sf9) cells infected with a recombinant baculovirus that contains the entire coding sequence of MoMLV IN protein under the transcriptional regulation of the Autographa californica nuclear polyhedrosis virus (AcMNPV) viral polyhedrin promoter. Cells infected with the recombinant virus express high levels of IN protein (Figure 1). The identity of the expressed protein was confirmed by its mobility in polyacrylamide gel electrophoresis, Western blotting analysis, and N-terminal sequence analysis (data not shown). The N-terminal methionine is retained and precedes the Ile residue that marks the expected N-terminus of the viral protein made by proteolytic cleavage (Copeland et al., 1985).

As a functional assay for putative integration activity of the expressed protein, we first employed the previously described in vitro assay for integration of mini-MoMLV DNA (Fujiwara and Craigie, 1989). The DNA sequences at the ends of the linear mini-MoMLV DNA mimic the ends of unintegrated MoMLV DNA isolated from the cytoplasm of MoMLV-infected cells, and the 3' ends are recessed by 2 nucleotides, as in the authentic precursor for integration. Integration of mini-MoMLV DNA into phage lambda target DNA is detected by in vitro packaging of the reaction products, infecting E. coli, and selecting for the antibiotic resistance markers present on the mini-MoMLV DNA. Integration activity was indeed detected in extracts made from Sf9 cells infected with the recombinant baculovirus; sequencing of the junctions between mini-MoMLV and target DNA of several integration products revealed the expected 4 bp duplication of target DNA sequence at the site of integration, confirming the fidelity of the reaction (data not shown).

Preliminary experiments also revealed that most of the IN protein was present in the insoluble fraction of cell lysates, although some integration activity was found in the soluble as well as in the insoluble fraction. A considerable increase in integration activity was obtained by solubilizing the initially insoluble protein fraction in the presence of urea; the protein remained soluble and active after subsequent removal of the urea by dialysis. The partially purified preparation of IN protein used for the experiments described here was prepared from the insoluble fraction (Figure 1, lane c), which is already considerably enriched for IN protein, by solubilization in the presence of urea and chromatography on a Superose 12 column equilibrated with the solubilization buffer. Fractions containing IN protein were pooled and extensively dialyzed to remove the urea. We estimate that IN comprises at least 50% of the total protein in the Superose 12 peak (Figure 1, lane d).

[Figure 1: Coomassie-stained SDS-polyacrylamide gel, lanes a-d; molecular weight standards at 200k, 97k, 68k, 43k, and 29k]

Figure 1. Expression and Partial Purification of MoMLV IN Protein

SDS-polyacrylamide gel electrophoresis of the insoluble protein fraction from uninfected Sf9 cells (lane a), Sf9 cells infected with wild-type AcMNPV (lane b), and Sf9 cells infected with recombinant AcMNPV expressing MoMLV IN protein (lane c). The insoluble protein fraction shown in lane c was solubilized in the presence of urea and applied to a Superose 12 column. Peak fractions containing IN protein were pooled (lane d). The migration positions of molecular weight standards are indicated. The gel was stained with Coomassie blue.

IN Protein Correctly Cleaves the 3' Ends of MoMLV DNA

The MoMLV IN function is known to be essential for formation of the recessed ends of the MoMLV DNA precursor for integration in vivo (Roth et al., 1989). We therefore tested whether IN protein is able to process blunt-ended MoMLV DNA termini, corresponding to the product of reverse transcription, to generate the correct recessed 3' ends. As the substrate for cleavage, we synthesized a double-stranded DNA, LTR1, corresponding to the first 18 bp of one end of unintegrated MoMLV DNA (Figure 2A); the first 13 bp of the other end of authentic unintegrated MoMLV DNA are identical, and mutations beyond position 13 from the LTR ends do not affect integration in vivo (Roth et al., 1989, cited as unpublished data). The DNA strand of LTR1 that would be recessed by 2 nucleotides at its 3' end in the integration precursor was labeled with 32P at its 5' end. This blunt-ended LTR substrate was incubated with IN protein under appropriate reaction conditions, and the products were electrophoresed in a denaturing polyacrylamide gel together with standards generated by chemical cleavage reactions (Figure 2B).

In the presence of IN protein and Mn2+ (Figure 2B, lane d), cleavage resulted in a major product cut at position 2 from the 3' end of the labeled strand; the slightly slower mobility of the IN cleavage products, relative to the standards made by chemical cleavage, is indicative of 3'-OH termini. A less prominent cleavage also occurred at position 1 to give a product shorter by 1 nucleotide. The origin of the reaction products longer than 18 nucleotides (Figure 2B, lane d) is discussed below. Cleavage was not observed in the absence of IN protein (lane c) or when EDTA was substituted for Mn2+ (lane e). Substitution of Mg2+ for Mn2+ abolished cleavage at position 2, but weak cleavage was still observed at position 1 (lane f). Incubation of the unannealed labeled strand with IN protein in the presence of Mn2+ resulted in weak cleavage at position 1, but not at position 2 (lane g), suggesting that the weak cleavage at position 1 of the normal double-strand substrate in the presence of Mn2+ may

Biochemical Activities of MoMLV IN Protein 631

[Figure 2, panel A: substrate sequences; panels B-D: autoradiograms of denaturing gels, lanes a-g for LTR1 (B) and lanes a-d for LTR2 (C) and LTR3 (D)]

LTR1
5'-AATGAAAGACCCCACCTG-3'
3'-TTACTTTCTGGGGTGGAC-5'•

LTR2
5'-TATGAAAGACCCCACCTG-3'
3'-ATACTTTCTGGGGTGGAC-5'•

LTR3
5'-AAATGAAAGACCCCACCTG-3'
3'-TTTACTTTCTGGGGTGGAC-5'•

Figure 2. Cleavage of MoMLV DNA Termini by IN Protein

(A) DNA substrates for cleavage reactions with IN protein. LTR1 corresponds to the 5' edge of the MoMLV U3 sequence. Bases that differ from this wild-type sequence in LTR2 and LTR3 are indicated by bold letters. Solid circles indicate the position of the 32P label.
(B) Cleavage of LTR1 by IN protein. The reaction products were electrophoresed in a 20% denaturing polyacrylamide gel and visualized by autoradiography. Lanes a and b, chemical cleavage standards: A+G and T+C, respectively. Reaction omitting IN protein (lane c), standard reaction (lane d), reaction containing 0.1 mM EDTA instead of 3 mM Mn2+ (lane e), reaction containing 3 mM Mg2+ instead of 3 mM Mn2+ (lane f), reaction containing only the labeled DNA strand of LTR1 (lane g). The mass of DNA included in the reaction with single-stranded DNA substrate (lane g) was adjusted to equal that of the standard reaction with duplex DNA. The major cleavage product in the presence of Mn2+, indicated by an arrow, results from removal of 2 nucleotides from the 3' end of the labeled DNA strand in LTR1.
(C and D) Reactions with LTR2 (C) and reactions with LTR3 (D). The labeling of the lanes in (C) and (D) is the same as in (B).

be the result of incomplete annealing or partial denaturation during the incubation. No cleavage was detected when control reactions were carried out with identically prepared protein fractions from uninfected Sf9 cells or Sf9 cells infected with wild-type AcMNPV virus (data not shown).

We also tested two double-stranded oligonucleotides with altered bases as substrates for cleavage by MoMLV IN protein (Figures 2C and 2D). One (LTR2) terminated ...CATA-3', and the other (LTR3) terminated ...CATTT-3'. LTR2 was processed identically to the wild-type end, with predominant cleavage at position 2 (Figure 2C, lane d).
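The major cleavage event described above can be illustrated with a minimal Python sketch. It assumes a deliberately simple rule that is not stated as such in the text: the cut follows the last CA dinucleotide found near the 3' end of the labeled strand. The function names and the `max_trim` window are hypothetical, and the minor cleavage products seen on the gels are not modeled.

```python
# Hypothetical helpers; the top-strand sequences are those of LTR1-LTR3 (Figure 2A).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def process_3prime_end(strand: str, max_trim: int = 3):
    """Trim everything 3' of the terminal CA dinucleotide.

    Assumption: the cut follows the last CA found within the final
    max_trim + 2 nucleotides. Returns (processed, removed), or None
    when no CA lies in that window.
    """
    window_start = max(0, len(strand) - (max_trim + 2))
    idx = strand.rfind("CA", window_start)
    if idx == -1:
        return None
    cut = idx + 2
    return strand[:cut], strand[cut:]


TOP_STRANDS = {
    "LTR1": "AATGAAAGACCCCACCTG",
    "LTR2": "TATGAAAGACCCCACCTG",
    "LTR3": "AAATGAAAGACCCCACCTG",
}

for name, top in TOP_STRANDS.items():
    labeled = reverse_complement(top)  # the 5'-labeled strand of the duplex
    processed, removed = process_3prime_end(labeled)
    print(name, labeled, "->", processed, "removed:", removed)
```

Under this rule LTR1 and LTR2 each lose 2 nucleotides and LTR3 loses 3, so all three substrates end in CA-3' after processing.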

Figure 3. Expected Products of Normal Integration with LTR1

(A) A pair of LTR1 molecules are brought together by IN protein (stippled). 5' ends of DNA strands are indicated by circles and the labeled 5' end in LTR1 is shown as a solid circle.
(B) IN protein cleaves 2 nucleotides from the 3' end of the labeled strand of each LTR1 molecule to make the recessed precursor ends for integration.
(C) IN protein makes a 4 bp staggered cut in another LTR1 molecule, which acts as the target DNA, and joins the recessed 3' ends (shown in [B]) to the protruding 5' ends of this cut.
(D) Melting of the 4 bp of target DNA between the sites of cleavage generates two products; the hydrogen-bonded single strands in these products may also separate, depending on their Tm and environment. Analysis of these reaction products by denaturing gel electrophoresis is expected to reveal labeled DNA strands of heterogeneous size that are longer than LTR1, together with smaller fragments of heterogeneous length. The heterogeneity in fragment size arises because integration events can occur at many different locations in the target DNA sequence. Although, for simplicity, cleavage at the 3' ends of LTR1 is shown as a concerted reaction involving a pair of LTR1 ends, our experiments do not test this idea. However, concerted integration of a pair of LTR1 ends, after cleavage by IN protein, has been demonstrated in our experiments.
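The strand lengths expected from the scheme in Figure 3 can be enumerated with a short Python sketch. It is only a bookkeeping aid under stated assumptions: `strand_transfer` and the `cut` parameter are hypothetical names, every internal position of the target strand is treated as an equally possible site, and only one target strand is followed. The resulting lengths are therefore bounds rather than predictions of band intensities.

```python
def strand_transfer(viral_strand: str, target_strand: str, cut: int):
    """Join the viral 3' end to the target 5' end created at `cut`.

    `cut` is the number of target nucleotides left on the 5' side of the
    break. Returns (joined, released): the viral strand extended by the
    3' part of the target strand, and the 5' part of the target strand,
    whose new 3' end stays unjoined.
    """
    if not 0 < cut < len(target_strand):
        raise ValueError("cut must fall inside the target strand")
    return viral_strand + target_strand[cut:], target_strand[:cut]


PROCESSED_VIRAL = "CAGGTGGGGTCTTTCA"  # labeled LTR1 strand after loss of TT
TARGET = "AATGAAAGACCCCACCTG"         # one strand of another LTR1 molecule

joined_lengths = []
released_lengths = []
for cut in range(1, len(TARGET)):
    joined, released = strand_transfer(PROCESSED_VIRAL, TARGET, cut)
    joined_lengths.append(len(joined))
    released_lengths.append(len(released))

print("joined strand lengths:", min(joined_lengths), "to", max(joined_lengths))
print("released fragment lengths:", min(released_lengths), "to", max(released_lengths))
```

When the label sits on the viral strand, the labeled species are the joined strands; when the label sits on the strand that acts as target, only the released fragments carry it, and these are always shorter than the 18 nucleotide substrate.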

LTR3 exhibited cleavage at positions 1 and 3 from the 3' end of the labeled strand, with weaker cleavage at position 2 (Figure 2D, lane d).

Integration of MoMLV DNA Ends by IN Protein

Products longer than the original substrate DNA were observed in reactions with LTR1 and IN protein (Figure 2B, lane d). Are these products generated by the normal MoMLV integration reaction with LTR1 acting as both the viral DNA ends and the target DNA (depicted in Figure 3)? The experiments described below address this question. We shall refer to the joining of recessed 3' viral DNA ends to the 5' ends of a cut made in the target DNA as integration, although strictly this DNA strand transfer reaction generates an integration intermediate that requires a subsequent DNA repair step to complete the integration process.

The series of longer products generated in a reaction with LTR1 (see Figure 2B, lane d) is more evident with a


[Figure 4, panel A: substrate sequences; panels B-E: autoradiograms of denaturing gels, lanes a-f for LTR1 (B) and lanes a-d for LTR4 (C), LTR5 (D), and LTR6 (E)]

LTR1
5'-AATGAAAGACCCCACCTG-3'
3'-TTACTTTCTGGGGTGGAC-5'•

LTR4
5'-AATGAAAGACCCCACCTG-3'
3'-ACTTTCTGGGGTGGAC-5'•

LTR5
5'-AACGAAAGACCCCACCTG-3'
3'-TTGCTTTCTGGGGTGGAC-5'•

LTR6
•5'-AATGAAAGACCCCACCTG-3'
3'-TTACTTTCTGGGGTGGAC-5'

Figure 4. Integration of MoMLV DNA Ends by IN Protein

(A) DNA substrates for reactions with IN protein. Labeling is the same as in Figure 2A.
(B) Integration of LTR1 by IN protein. The reactions are the same as in Figure 2B, but a longer exposure of the autoradiogram is shown. Lanes a and b, chemical cleavage standards: A+G and T+C, respectively. Lane c, reaction omitting IN protein; lane d, standard reaction. Other labeling is as in Figure 2B.
(C-E) Reactions with LTR4 (C), reactions with LTR5 (D), and reactions with LTR6 (E). The labeling of the lanes in (C), (D), and (E) is the same as in (B). The single band that migrates slightly more slowly than the original substrate in lane d of (D) and (E) may be artifactual since a band of similar mobility is occasionally seen in the products of chemical cleavage and control reactions.

longer exposure of the autoradiogram (Figure 4B, lane d). Heterogeneous fragments of shorter length are also observed. These shorter fragments are an expected product of the integration reaction, because the 3' ends of the cut made in target DNA remain unjoined in the integration product (see Figure 3). Neither the higher molecular weight series of products nor the shorter products were observed when the Superose 12 IN protein fraction was replaced by the corresponding protein fraction prepared from uninfected Sf9 cells or from Sf9 cells infected with wild-type AcMNPV virus (data not shown). Further evidence for the involvement of IN protein came from the observation that under a variety of conditions that reduced the efficiency of correct cleavage, the quantity of both the higher molecular weight series of products and the shorter products decreased in parallel (data not shown).

Analysis of the reaction products made with several different LTR substrates strengthened the view that integration occurs in our reactions. The wild-type substrate (LTR1) gives a series of higher molecular weight oligonucleotides, which abruptly terminates at a mobility corresponding to about 30 nucleotides (Figure 4B, lane d). The irregular spacing of the bands indicates that they do not correspond to a unique sequence terminating at different nucleotide positions. A similar pattern of bands is also observed with a substrate (LTR4) with recessed 3' ends, corresponding to the cleavage product of the full-length substrate (LTR1) by IN protein (Figure 4C, lane d). However, this series of bands is absent when the substrate carries an A/T to G/C mutation at position 3 (LTR5, Figure 4D), or when the other strand of the wild-type substrate is labeled at its 5' end (LTR6, Figure 4E). Although the longer products are not made in the reaction with LTR6, heterogeneous shorter products are observed.

All of the above results are fully consistent with the scheme depicted in Figure 3. The products generated by LTR4 differ slightly from the corresponding series with the wild-type substrate, because one of the potential target DNA strands is 2 nucleotides shorter at its 3' end. The base pair altered in LTR5, which does not make the higher molecular weight products, is known to be critical for integration (Colicelli and Goff, 1985, 1988). Finally, because integration results from joining the recessed 3' ends to the 5' ends of a cut made in the target DNA, the labeled products of integration with substrate LTR6 should all be shorter than the initial length (see Figure 3), as is observed.
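The substrate comparison above amounts to a two-part rule, which the following Python sketch spells out. It rests on assumptions introduced here for illustration: a substrate is treated as competent when a CA dinucleotide lies within the last 5 nucleotides of the strand that supplies the viral 3' end, and longer labeled products are expected only when that same strand carries the 5' label. The dictionary layout and function names are hypothetical.

```python
# Each entry: (strand supplying the viral 3' end, written 5' to 3';
#              True if that strand carries the 5' label)
SUBSTRATES = {
    "LTR1": ("CAGGTGGGGTCTTTCATT", True),
    "LTR4": ("CAGGTGGGGTCTTTCA", True),     # recessed end already present
    "LTR5": ("CAGGTGGGGTCTTTCGTT", True),   # A/T to G/C change at position 3
    "LTR6": ("CAGGTGGGGTCTTTCATT", False),  # label on the other strand
}


def has_terminal_ca(strand: str, window: int = 5) -> bool:
    """Assumed competence test: a CA within the last `window` nucleotides."""
    return "CA" in strand[-window:]


def expects_longer_labeled_products(strand: str, labeled: bool) -> bool:
    """Longer labeled strands need a usable CA end on the labeled strand."""
    return has_terminal_ca(strand) and labeled


for name, (strand, labeled) in SUBSTRATES.items():
    print(name, expects_longer_labeled_products(strand, labeled))
```

The sketch returns True for LTR1 and LTR4 and False for LTR5 and LTR6, which matches the pattern of bands described for Figure 4; it says nothing about the shorter labeled fragments that LTR6 still produces.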
To confirm that the higher molecular weight products are generated by the MoMLV integration reaction, we determined the nucleotide sequence of a population of these products made by incubation of the wild-type substrate LTR1 with MoMLV IN protein. The reaction products were first electrophoresed in a denaturing polyacrylamide gel, and the portion of the gel containing the upper two-thirds of the higher molecular weight ladder was excised. The DNA was purified from the gel, subjected to base-specific chemical cleavage reactions, and then electrophoresed in a second sequencing gel (Figure 5). The result shows that the sequence of these products is homogeneous from the 5' end of the labeled strand up to the A nucleotide at position 3 from the 3' end that would become joined to target DNA sequence in the product of a normal integration reaction, the terminal pair of T nucleotides having been lost. As predicted, the sequence beyond this A nucleotide becomes heterogeneous because integration occurs at multiple locations in the target DNA sequence.

Concerted Integration of a Pair of MoMLV Ends by IN Protein Generates the Normal 4 bp Target Sequence Duplication

The data presented above demonstrate that double-stranded oligonucleotides corresponding to the ends of MoMLV DNA are correctly cleaved by IN protein and that the resulting recessed 3' ends are joined by IN protein to


[Figure 5: autoradiogram of a sequencing gel, lanes a-d]

Figure 5. DNA Sequence of the Labeled Strand in a Population of Integration Products Made with LTR1

If IN protein integrates LTR1 DNA molecules into one another by the normal integration pathway, as depicted in Figure 3, the longer DNA strands in the integration products should all have a unique sequence from their labeled 5' end up to the CA-3' position that becomes joined to target DNA sequence in the normal reaction, the terminal pair of T nucleotides having been lost. The sequence that follows the CA should be heterogeneous because integration can occur at multiple locations in the target DNA. To test whether our integration products meet these criteria, a population of integration products made with LTR1 was purified from a denaturing polyacrylamide gel and subjected to base-specific chemical cleavage reactions. The products were then electrophoresed in a second sequencing gel. Lanes a-d, chemical cleavage reactions: G, A+G, T+C, and C, respectively. The integration product that remains uncleaved after the chemical cleavage reactions is labeled IP. As predicted, the sequence is unique from the 5' end and matches that of LTR1, but becomes heterogeneous after the CA position that is expected to be joined to the target DNA, indicated by the arrow.

[Figure 6: schematic, panels A-D]

Figure 6. Expected Product of Normal Integration of LTR1 Ends into a Circular Target DNA

The reaction is as shown in Figure 3, except that a circular DNA molecule serves as the target DNA for integration.
(A) Synapsis of a pair of LTR1 molecules.
(B) Two nucleotides are cleaved from the 3' end of each LTR1 molecule.
(C) The recessed 3' ends of LTR1 are joined to the 5' ends of a staggered cut made in the target DNA.
(D) Melting of the 4 bp between the sites of target DNA cleavage generates a linear product with 4 bp single-strand gaps and 2 base overhangs at the junctions between LTR1 and the target DNA. Labeling is the same as in Figure 3.

heterogeneous locations in the target DNA sequence. In the normal MoMLV DNA integration reaction, concerted attack by a pair of MoMLV DNA ends joins each 3' end of the MoMLV DNA to the protruding 5' ends of a 4 bp staggered cut made in the target DNA. Repair of the resulting intermediate generates the characteristic 4 bp target sequence duplication at the site of integration. To test whether correct pairwise integration of MoMLV DNA ends occurs in our reactions, we analyzed the products generated when a 400 bp circular DNA was included as a target DNA in a reaction with the wild-type LTR1 substrate. Integration of a pair of MoMLV DNA ends is expected to generate a linearized target DNA, flanked by a pair of MoMLV DNA ends, with the characteristic single-strand junctions of the integration intermediate (Figure 6). The products of a reaction with this circular target DNA were electrophoresed in a polyacrylamide gel, and a band consistent with the expected mobility was observed (Figure 7, lane c). This product was not observed when substrate LTR5 was substituted for LTR1 in the reaction (lane d), in the absence of the circular target DNA (lane e), or when IN protein was omitted (lane f).

The results of partial digestion of the reaction products with mung bean nuclease before electrophoresis are also consistent with concerted attack of a pair of MoMLV DNA ends on the circular target DNA; this nuclease treatment generated a new discrete labeled band intermediate in mobility between the full-length product and the position of the linearized target DNA (data not shown), as expected for the product of nucleolytic attack at one of the single-strand junctions of the structure depicted in Figure 6D. In addition to the linear product, a labeled species near the origin of the gel (Figure 7, lane c) comigrates with nicked circle target DNA and probably results from integration of a single MoMLV DNA end into a target DNA. Although this interpretation has not been rigorously tested, a labeled product comigrating with nicked circle DNA has been observed with different sizes of circular target DNA and under different conditions of gel electrophoresis (data not shown).

To check for the presence of the correct target site duplication, the linear product was cloned, and the junctions of several isolates were sequenced. To facilitate cloning,

the reaction was carried out with LTR7, which differs from LTR1 in that a BamHI restriction site is located after position 18 of the MoMLV LTR sequence. The linear product was purified from a polyacrylamide gel and amplified by the polymerase chain reaction after treatment with DNA polymerase I to repair the putative single-strand gaps in the structure (see Figure 6). The amplified product was cut with BamHI and cloned into pUC19. As expected, the cloned fragments consisted of linearized target DNA flanked by MoMLV DNA ends. Nine pairs of junctions between the viral DNA ends and the target DNA were sequenced (Figure 8). Eight of the nine pairs of junctions precisely matched those expected for normal integration of MoMLV DNA: the terminal TT-3' nucleotides were lost and the MoMLV DNA ends were joined to the target DNA after the CA-3' nucleotides; 4 bp of target DNA sequence were directly repeated at the site of integration with no other modification of the target DNA sequence; and each of the integration events occurred at a different location in the target DNA. The remaining pair of junctions differed only in that 2 additional bases were present, one adjacent to each copy of a putative 4 bp target sequence duplication; we suspect that this aberrant structure was generated from a normal integration intermediate during the repair step with DNA polymerase I. These results show that MoMLV DNA ends, located on separate molecules, can be integrated by IN protein to give the normal integration product.

[Figure 7: autoradiogram of a native polyacrylamide gel, lanes a-f; pBR322 MspI size markers at 622, 527, 404, and 309]

Figure 7. Products Made in an Integration Reaction Containing LTR1 and a Circular Target DNA

A reaction was carried out with LTR1 and IN protein in the presence of an unlabeled 400 bp circular target DNA. The reaction products were deproteinized and analyzed in a native 5% polyacrylamide gel. Lane a, pBR322 DNA digested with MspI and labeled with 32P; lane b, linearized target DNA labeled with 32P; lane c, products of the complete integration reaction; lane d, reaction containing LTR5 instead of LTR1; lane e, the circular target DNA was omitted from the reaction; lane f, IN protein was omitted from the reaction. The arrow indicates the position of the linear integration product. The sizes of the pBR322 MspI fragments are shown.

[Figure 8: target DNA sequences flanking each pair of integrated LTR7 ends]

CATTTTCGGC TTCGCGACGC CATCCAGCCT GGCGAGAAGC ACTTATGACT
CGGCGAGGAC ACGCGAGGCT
GGACCGCTTT CGCGGGCATC CAGGCCATGC
CTTTCGCTGG
GCCTCGCGTC AAGCAGGCCA GACTGTCTTC CATCCCGATG ATGCTGTCCA (:“f:GGCCACC
CTGGTCCCGC

Figure 8. Presence of a 4 bp Target Sequence Duplication in Integration Products

The linear product of an integration reaction with LTR7, IN protein, and the 400 bp circular target DNA was amplified by polymerase chain reaction and cloned into pUC19 (LTR7 is identical to LTR1, except for addition of a BamHI site after position 18 from the LTR end to facilitate cloning). Nine cloned products were analyzed. The structure of each corresponded to that shown in Figure 6D after repair of the single-strand gaps and overhangs. Each line shows the target DNA sequence flanking a pair of integrated LTR7 ends. In all cases the terminal 2 T nucleotides are missing from the integrated LTR7 ends. Eight of the products exhibited the 4 bp target sequence duplication (shown in bold), characteristic of MoMLV DNA integration. One product (bottom line) contained additional bases adjacent to a putative 4 bp duplication of target DNA sequence; the sequence of this region prior to the insertion event was 5'-CTGGTCCCGCCACC-3'. This aberrant structure may result from errors during the repair step with DNA polymerase I.
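How a 4 bp staggered cut in a circular target leads to a 4 bp direct repeat at both junctions can be traced with a small Python sketch. It follows only the top strand of the repaired product and makes several assumptions that are not part of the experiments: the function name and `site` parameter are hypothetical, the 20 nucleotide target is a made-up sequence rather than the 400 bp circle, the viral end is the processed LTR1 strand rather than LTR7, and repair is taken to be error-free.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def integrate_pair_into_circle(target: str, site: int, viral_end: str,
                               stagger: int = 4) -> str:
    """Top strand of the repaired linear product.

    `target` is the top strand of a circular DNA, `site` the position of
    the top-strand cut, and `stagger` the offset of the bottom-strand cut
    (5' protrusion). `viral_end` is the processed viral strand ending in
    CA-3'. Gap repair copies the staggered bases, duplicating them.
    """
    if not 0 <= site < len(target):
        raise ValueError("site must index a position in the target")
    if stagger > len(target):
        raise ValueError("stagger cannot exceed the target length")
    linear = target[site:] + target[:site]   # open the circle at the cut
    repaired = linear + linear[:stagger]     # fill-in repeats the overhang
    return viral_end + repaired + reverse_complement(viral_end)


VIRAL_END = "CAGGTGGGGTCTTTCA"      # processed LTR1 strand, ends in CA-3'
TARGET = "GATTACAGGCTTAACCGTAG"     # hypothetical 20 nt circle

product = integrate_pair_into_circle(TARGET, 7, VIRAL_END)
flank = product[len(VIRAL_END):-len(VIRAL_END)]
print("left junction repeat: ", flank[:4])
print("right junction repeat:", flank[-4:])
print("product length:", len(product))
```

The target portion of the product begins and ends with the same 4 nucleotides, whatever site is chosen, and the product is longer than the target by the duplicated bases plus the two viral ends.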

Discussion

We have demonstrated that the IN protein of MoMLV is the only viral protein required to generate the recessed 3' ends of MoMLV DNA and accomplish their subsequent integration in vitro. Although we have no evidence of a requirement for proteins other than IN, the possibility that contaminating insect cell proteins play a stimulatory role in the cleavage or integration reactions has not been strictly excluded.

Site-Specific Nuclease Activity of MoMLV IN Protein

We conclude that the MoMLV IN protein has a site-specific nuclease activity that cleaves 2 nucleotides from the ends of MoMLV DNA to expose the 3'-hydroxyl group of the A nucleotide that is to become joined to the target DNA in the integration product. The IN protein of avian myeloblastosis virus also has such an activity (Katzman et al., 1989). While the avian IN protein requires Mg2+ for optimal specificity, the cleavage reaction with MoMLV IN protein requires Mn2+ as a divalent cation. A mutant end that is longer by 1 nucleotide is cleaved by MoMLV IN protein to remove 3 nucleotides from the 3' end, implying that the site of cleavage is directed by sequences internal to the very end of the LTR. This specificity is fully consistent with the in vivo observation that the correct 3' termini are generated by a number of MoMLV mutants with additional or altered bases to the 3' side of the A nucleotide that becomes joined to the target DNA in the integration product (Roth et al., 1989). However, cutting at a normal LTR terminal sequence located 10 bp internal to the very end of the MoMLV DNA is not observed in vivo (Roth et al., 1989). It will be of interest to determine whether this reflects an intrinsic requirement for IN nuclease to act near the ends of a DNA molecule; alternative possibilities include repair of such internal cleavage by DNA ligase or constraints on the proximity of viral DNA ends and IN protein molecules within the nucleoprotein complex.

Integration Activity of MoMLV IN Protein

The results presented here demonstrate that, in addition to its site-specific nuclease activity, MoMLV IN protein accomplishes concerted integration of a pair of cleavage products with recessed 3' ends to generate the normal integration intermediate. IN protein is therefore responsible for the 4 bp staggered cleavage of the target DNA and joining of the recessed 3' ends of the viral DNA to the 5' ends of the cut target DNA. We have used initially blunt-ended MoMLV DNA (LTR1), which is cleaved by IN protein itself prior to integration, and a substrate (LTR4) with the recessed end initially present. The similar overall integration efficiency with these two substrates argues against a requirement for tight coupling of cleavage at the LTR termini and integration in our reaction system. The possibility that the integration products with the recessed end substrate (LTR4) result from cleavage internal to the CA-3' nucleotides, followed by coupled integration of the cleavage product, is excluded; the 3' ends of LTR4 are joined to the target DNA at the normal position, following the CA-3' nucleotides (unpublished data).

The specificity of the divalent cation requirements for both the MoMLV cleavage and integration reactions requires further investigation.
Although we do not observe significant cleavage of oligonucleotide substrates with Mg2+ as compared with Mn2+, the recessed ends of the endogenous viral DNA within core particles integrate efficiently in vitro with Mg2+ as the divalent cation (Brown et al., 1987; Fujiwara and Mizuuchi, 1988). Integration of mini-MoMLV DNA into phage lambda target DNA, monitored by a lambda packaging assay, is apparently stimulated by inclusion of an extract of uninfected NIH 3T3 cells in the reaction mixture. This result is obtained with either detergent-disrupted MoMLV virions as the source of the viral protein needed for integration (Fujiwara and Craigie, 1989), or with the partially purified preparation of IN protein used for the experiments described here (unpublished data). We have found that a similar stimulatory effect can be obtained by inclusion of one of several small basic proteins, including ribonuclease A, in the reaction mixture instead of cell extract (unpublished data). We note that none of the above factors are added to the present reaction system, nor does their addition result in any obvious stimulation (unpublished data). Their mechanism of stimulation in the lambda packaging assay system, which differs significantly from the reaction conditions reported here, remains to be determined. This stimulatory mechanism need not be relevant to the integration process in vivo.

Role of Large Nucleoprotein Complexes in Retroviral DNA Integration

Our results demonstrate that the MoMLV IN protein is the only viral protein needed to bring together a pair of MoMLV DNA ends and correctly integrate them into a target DNA. However, MoMLV DNA isolated from cells after infection and reverse transcription forms part of a large nucleoprotein complex, and the protein factors required for integration of this endogenous viral DNA in vitro copurify with the complex (Brown et al., 1987; Bowerman et al., 1989); retroviruses have clearly evolved a tight nucleoprotein organization that localizes the machinery needed for these steps of the replication cycle. This complex must contain the IN protein, since it is competent for integration, and the presence of the viral capsid protein has also been demonstrated by immunoprecipitation (Bowerman et al., 1989). While it is possible that the immediate nucleoprotein precursor for integration contains only the viral DNA and IN protein, it is quite likely that other viral proteins also remain associated with the viral DNA up until the integration step.

Assembly of the viral DNA in a large nucleoprotein complex may serve several functions. Viral proteins other than IN may be required to ensure that IN remains associated with the viral DNA until it is able to integrate into the host genome. We note that some of our reaction products probably result from integration of a single viral end into a target DNA; the normal nucleoprotein organization may help to reduce the frequency of such events. It may also play a role in transportation of the viral DNA from the cytoplasm, where reverse transcription occurs, to the nucleus. Between the steps of reverse transcription and integration into the host genome, the viral DNA can potentially integrate into itself.
Although such autointegration is known to occur in vivo (Shoemaker et al., 1980, 1981), it is tempting to speculate that constraints imposed by the structure of the nucleoprotein complex may help to reduce its frequency.

MoMLV IN Protein Functionally Resembles Transposase Proteins

The DNA cutting and joining steps of retroviral DNA integration (Fujiwara and Mizuuchi, 1988; Brown et al., 1989) are remarkably similar to those involved in DNA transposition of the prokaryotic transposons Mu (Mizuuchi, 1984; Craigie and Mizuuchi, 1985, 1987) and Tn10 (Benjamin and Kleckner, 1989). The transposase proteins encoded by these transposons each have a site-specific nuclease activity that cleaves at the 3’ ends of the transposon DNA and an activity that joins the exposed 3’ ends of the mobile DNA element to the 5’ ends of the staggered cut made by the same protein in the target DNA. The finding that a retroviral IN protein possesses the same two activities reinforces the view that these reactions belong to a class of DNA rearrangement reactions that share an essentially similar biochemical mechanism.

Experimental Procedures

Miscellaneous Enzymes and Methods

Commercially available enzymes were used as recommended by the supplier. Polynucleotide kinase was purchased from Pharmacia, restriction enzymes and DNA ligase from New England Biolabs, and Taq polymerase from Promega. Molecular weight protein standards were purchased from Bethesda Research Laboratories. Standard methods involving DNA manipulation are found in Sambrook et al. (1989).

Expression of MoMLV IN Protein

The 1.2 kb XmnI-ScaI restriction fragment of the MoMLV proviral clone pNCA (Colicelli and Goff, 1988), which contains most of the IN coding sequence (including the termination codon), was cloned into the BamHI site of the baculovirus expression vector pAc373 (Summers and Smith, 1987); synthetic oligonucleotide linkers reconstructed the N-terminus of the IN coding sequence and added translation initiation signals. The linker bridging the BamHI site of pAc373 and the XmnI site within the IN coding region has the sequence 5’-GATCCTATAAATATG- - --GAACA-3’, where the dashes represent the IN coding sequence between the translation initiation codon and the XmnI site. The ScaI restriction site to the 3’ side of the pol termination codon was bridged to pAc373 with a BamHI linker. The resulting plasmid (pMK556) was recombined with wild-type AcMNPV by cotransformation of Sf9 cells, and recombinants were plaque purified by standard procedures (Summers and Smith, 1987). One purified recombinant baculovirus, 556-3, was used for expression of MoMLV IN protein.

Partial Purification of MoMLV IN Protein

Growth of Sf9 cells as monolayers and infection with recombinant virus were carried out essentially as described by Summers and Smith (1987). To provide the source of IN protein, a 150 cm2 flask of Sf9 cells was infected with clone 556-3. After incubation for 3 days at 28°C, the cells were harvested. All procedures were carried out between 0°C and 4°C. The cells were first chilled, the medium was removed, and the monolayer of cells was gently washed with 20 ml of 20 mM HEPES (pH 7.6), 150 mM K glutamate.
This buffer was removed, and the cells were lysed by addition of 5 ml of 100 mM K glutamate, 20 mM HEPES (pH 7.6), 5 mM MgAc2, 1 mM dithiothreitol, 0.5% Nonidet P40, with gentle pipetting. After incubation on ice for 5 min, the lysate was centrifuged at 13,000 × g for 10 min. The pellet was retained and resuspended in 5 ml of 0.4 M K glutamate, 20 mM HEPES (pH 7.6), 1 mM EDTA, 1 mM dithiothreitol, 0.5% (w/v) Nonidet P40 by disruption with a glass-glass homogenizer. The suspension was incubated on ice for 10 min and then centrifuged at 13,000 × g for 10 min. The pellet was resuspended in 1 ml of the same buffer, and the centrifugation was repeated. The resulting pellet was resuspended in 1 ml of 0.4 M K glutamate, 0.1% (w/v) Nonidet P40 in HEDG buffer (20 mM HEPES [pH 7.6], 1 mM EDTA, 1 mM dithiothreitol, 20% [w/v] glycerol). At this stage the sample was frozen in liquid nitrogen and stored at -70°C. SDS-polyacrylamide gel electrophoresis of this fraction is shown in lane c of Figure 1.

The sample was thawed and centrifuged at 13,000 × g for 10 min, and the pellet was resuspended in 400 µl of HEDG buffer containing 4 M urea, 0.4 M K glutamate, and 0.1% (w/v) Nonidet P40. The protein was incubated on ice for 15 min and then centrifuged at 13,000 × g for 10 min. Two hundred and fifty microliters of the supernatant was then loaded onto a Superose 12 column (HR 10/30, Pharmacia) equilibrated with the same solubilization buffer. The column was run at a flow rate of 0.1 ml/min. Fractions containing IN protein were pooled and extensively dialyzed against HEDG buffer containing 0.4 M K glutamate and 0.1% (w/v) Nonidet P40, frozen in liquid nitrogen, and stored at -70°C.

During trial experiments integration activity was monitored by the biological assay for integration of mini-MoMLV DNA (Fujiwara and Craigie, 1989). Subsequently, fractions containing IN protein were identified by Coomassie blue staining of SDS-polyacrylamide gels.
IN protein concentration was measured by Coomassie blue staining, relative to bovine serum albumin standards, after polyacrylamide gel electrophoresis.

Oligonucleotide LTR Substrates

Substrate LTR1, corresponding to the U3 end of MoMLV DNA, was made by annealing the synthetic oligonucleotide sequence 5’-AATGAAAGACCCCACCTG-3’ with its complement. LTR2 through LTR6 differ from LTR1 as shown in Figures 2A and 4A. The location of the 32P label is also noted in these figures. LTR7 was formed by annealing the oligonucleotide sequence 5’-AATGAAAGACCCCACCTGGATCCATAAC-3’ with its complement. Oligonucleotides were purified by electrophoresis in a DNA sequencing gel before labeling and annealing. One hundred nanograms of the oligonucleotide strand to be labeled with 32P at its 5’ end was phosphorylated by T4 polynucleotide kinase in the presence of 100 µCi of [γ-32P]ATP (specific activity, 3000 Ci/mmol) in a reaction volume of 15 µl. EDTA was added to a final concentration of 25 mM, and the polynucleotide kinase was inactivated by heating at 85°C for 15 min. NaCl was added to a final concentration of 0.1 M, together with an excess (400 ng) of the unlabeled complementary strand, in a total volume of 40 µl. The mixture was heated to 80°C, and the DNA was annealed by slow cooling. Unincorporated nucleotide was then removed by passage through a G25 Sephadex Quick Spin Column (Boehringer).

Preparation of Circular Target DNA

Circular target DNA was prepared by ligation of the 0.4 kb EcoRI fragment of pLMFl24 (Fisher et al., 1986). This DNA is the AluI restriction fragment of pBR322 (positions 686-1089) with EcoRI cohesive ends. The EcoRI fragment was circularized by ligation in the presence of HU protein, which greatly increases the yield of monomer circle product (Hodges-Garcia et al., 1989). The ligated DNA was then extracted twice with phenol, once with chloroform, and precipitated with ethanol.

LTR Cleavage and Integration Reactions

The reaction conditions for cleavage of the 3’ ends of MoMLV DNA and integration are identical. Reactions (15 µl) contained 85 mM KCl, 20 mM MOPS (pH 7.2), 3 mM MnCl2, 10 mM dithiothreitol, 20% (w/v) glycerol, 100 µg/ml bovine serum albumin, 0.5 pmol of LTR substrate (excluding the excess unlabeled strand), and 1 pmol of MoMLV IN protein (these values exclude the 13 mM K glutamate and 0.003% Nonidet P40 contributed by the IN protein storage buffer). Reactions were incubated at 30°C for 1 hr.
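
The molar amounts quoted for the standard reaction can be converted to concentrations with simple arithmetic, sketched here in Python as a reader's aid. The variable names are hypothetical, and the volume of IN protein storage buffer carried into each reaction is not stated in the text; it is inferred here from the reported 13 mM K glutamate carry-over and the 0.4 M K glutamate storage buffer, so it should be treated as an estimate.

```python
# Illustrative arithmetic only; variable names are hypothetical.
REACTION_VOLUME_UL = 15.0

# Amounts stated for the standard reaction.
LTR_SUBSTRATE_PMOL = 0.5
IN_PROTEIN_PMOL = 1.0

# IN storage buffer composition and the stated carry-over into the reaction.
STORAGE_K_GLUTAMATE_MM = 400.0   # 0.4 M K glutamate
STORAGE_NP40_PERCENT = 0.1       # 0.1% (w/v) Nonidet P40
CARRYOVER_K_GLUTAMATE_MM = 13.0


def nanomolar(pmol: float, volume_ul: float) -> float:
    """pmol per microliter equals micromolar; multiply by 1000 for nanomolar."""
    return pmol / volume_ul * 1000.0


substrate_nM = nanomolar(LTR_SUBSTRATE_PMOL, REACTION_VOLUME_UL)
in_protein_nM = nanomolar(IN_PROTEIN_PMOL, REACTION_VOLUME_UL)

# Inferred fraction of the reaction volume made up of storage buffer.
storage_fraction = CARRYOVER_K_GLUTAMATE_MM / STORAGE_K_GLUTAMATE_MM
storage_volume_ul = storage_fraction * REACTION_VOLUME_UL
carryover_np40_percent = STORAGE_NP40_PERCENT * storage_fraction

print(f"LTR substrate: {substrate_nM:.1f} nM")
print(f"IN protein: {in_protein_nM:.1f} nM")
print(f"Inferred storage buffer volume: {storage_volume_ul:.2f} ul")
print(f"Nonidet P40 carried over: {carryover_np40_percent:.4f}%")
```

On this reading, IN protein is present at roughly twice the molar concentration of the LTR substrate, and the inferred Nonidet P40 carry-over rounds to the 0.003% stated above.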
For analysis of the reaction products in DNA sequencing gels, the reactions were stopped by addition of 15 µl of 95% formamide, 20 mM EDTA, 0.05% bromophenol blue, and 0.05% xylene cyanol; 1.5 µl of the sample was loaded after first heating at 95°C for 2 min. Preparative reactions for isolation of the integration products from sequencing gels were scaled up to 75 µl; the products of four such reactions were pooled and precipitated with ethanol, and the whole sample was loaded. Reactions including the 0.4 kb circular target DNA were performed exactly as described above, except that the reactions (15 µl) included 1 ng of the circular target DNA. Reactions were stopped by addition of 5 µl of 5 mg/ml pronase in 100 mM EDTA, 5 µl of 1 M NaCl, 1 µl of 10% SDS, and 24 µl of H2O. After incubation at 30°C for 1 hr, 5 µl of 3 M NaAc was added, and the mixture was extracted once with phenol and precipitated with ethanol. The entire sample was loaded onto a polyacrylamide gel.

Electrophoresis of Reaction Products

The products of reactions with oligonucleotide viral ends were analyzed by electrophoresis in 20% polyacrylamide (19:1, acrylamide:bis) sequencing gels in TBE buffer; chemical cleavage standards were generated by the method of Maxam and Gilbert (1977). The products of reactions with the circular target DNA were electrophoresed in a 5% polyacrylamide (37.5:1, acrylamide:bis) gel, in TBE buffer, for 90 min at 10 V/cm. Protein samples were electrophoresed in 10% SDS-polyacrylamide gels (Laemmli, 1970) and stained with Coomassie blue.

Amplification and Sequencing of the Integration Products with a Circular Target DNA

The products of an integration reaction with substrate LTR7 and the 0.4 kb circular target DNA were electrophoresed in a 5% polyacrylamide gel, as described above. LTR7 was labeled with 32P, as indicated for LTR1 in Figure 2A, in order to facilitate identification of the reaction product.
The linear product was excised from the gel, the DNA was isolated by the crush/soak method, and then treated with DNA polymerase I to repair the putative single-strand gaps in the integration product (see Figure 6). The DNA was then subjected to 30 cycles of polymerase chain reaction amplification using a Perkin Elmer Cetus Thermal Cycler with Taq polymerase (Promega). The primer sequence was 5’-GTTATGGATCCAGGTGGGGTCT-3’. The amplified product was digested with BamHI to generate cohesive ends, purified by gel electrophoresis, and cloned into the BamHI site of pUC19 using standard procedures. The DNA sequences of the junctions between LTR7 and target DNA in the cloned products were determined by sequencing double-stranded plasmid DNA by the dideoxy method with the Sequenase DNA polymerase (U.S. Biochemicals), as recommended by the supplier. The pUC sequencing primers were 5’-GTAAAACGACCGCCAGT-3’ (forward) and 5’-CAGGAAACAGCTATGAC-3’ (reverse).
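
The relationship between the amplification primer and substrate LTR7 can be checked with a short Python sketch. This is a reader's consistency check and not part of the published method: the function and variable names are hypothetical, and the LTR7 sequence is assumed to read 5’-AATGAAAGACCCCACCTGGATCCATAAC-3’ once the line break in the printed sequence is removed.

```python
# Illustrative check only; function and variable names are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


LTR7_TOP = "AATGAAAGACCCCACCTGGATCCATAAC"  # assumed full LTR7 sequence
PCR_PRIMER = "GTTATGGATCCAGGTGGGGTCT"      # primer sequence given in the text
BAMHI_SITE = "GGATCC"                      # BamHI recognition sequence

ltr7_bottom = reverse_complement(LTR7_TOP)

# The primer should match the 5' portion of the strand complementary to LTR7.
primer_matches_ltr7 = ltr7_bottom.startswith(PCR_PRIMER)
primer_has_bamhi_site = BAMHI_SITE in PCR_PRIMER
ltr7_has_bamhi_site = BAMHI_SITE in LTR7_TOP

print("LTR7 complementary strand:", ltr7_bottom)
print("primer matches LTR7 complement:", primer_matches_ltr7)
print("BamHI site in primer:", primer_has_bamhi_site)
```

Under these assumptions the primer corresponds to the first 22 nucleotides of the strand complementary to the LTR7 sequence and contains a BamHI recognition site, which fits the use of BamHI to generate cohesive ends for cloning.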

Acknowledgments

We thank S. Goff (Columbia University) for providing the MoMLV proviral clone pNCA. We also thank K. Adzuma, M. Gellert, K. Mizuuchi, and H. Nash for their comments on this work and reading the manuscript. This work was supported by the National Institutes of Health Intramural AIDS Targeted Antiviral Program. F. D. B. is a fellow of the Leukemia Society of America. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

Received May 11, 1990; revised June 14, 1990.

References

Benjamin, H. W., and Kleckner, N. (1989). Intramolecular transposition by Tn10. Cell 59, 373-383.

Bowerman, B., Brown, P. O., Bishop, J. M., and Varmus, H. E. (1989). A nucleoprotein complex mediates the integration of retroviral DNA. Genes Dev. 3, 469-478.

Brown, P. O., Bowerman, B., Varmus, H. E., and Bishop, J. M. (1987). Correct integration of retroviral DNA in vitro. Cell 49, 347-356.

Brown, P. O., Bowerman, B., Varmus, H. E., and Bishop, J. M. (1989). Retroviral integration: structure of the initial covalent product and its precursor, and a role for the viral IN protein. Proc. Natl. Acad. Sci. USA 86, 2525-2529.

Colicelli, J., and Goff, S. P. (1985). Mutants and pseudorevertants of Moloney murine leukemia virus with alterations at the integration site. Cell 42, 573-580.

Colicelli, J., and Goff, S. P. (1988). Sequence and spacing requirements of a retrovirus integration site. J. Mol. Biol. 199, 47-59.

Copeland, T. D., Gerard, G. F., Hixon, C. W., and Oroszlan, S. (1985). Amino- and carboxyl-terminal sequence of Moloney murine leukemia virus reverse transcriptase. Virology 143, 676-679.

Craigie, R., and Mizuuchi, K. (1985). Mechanism of transposition of bacteriophage Mu: structure of a transposition intermediate. Cell 41, 867-876.

Craigie, R., and Mizuuchi, K. (1987). Transposition of Mu DNA: joining of Mu to target DNA can be uncoupled from cleavage at the ends of Mu. Cell 51, 493-501.

Donehower, L. A., and Varmus, H. E. (1984). A mutant murine leukemia virus with a single missense codon in pol is defective in a function affecting integration. Proc. Natl. Acad. Sci. USA 81, 6461-6465.

Fisher, L. M., Barot, H. A., and Cullen, M. E. (1986). DNA gyrase complex with DNA: determinants for site-specific DNA breakage. EMBO J. 5, 1411-1418.

Fujiwara, T., and Craigie, R. (1989). Integration of mini-retroviral DNA: a cell-free reaction for biochemical analysis of retroviral integration. Proc. Natl. Acad. Sci. USA 86, 3065-3069.

Fujiwara, T., and Mizuuchi, K. (1988). Retroviral DNA integration: structure of an integration intermediate. Cell 54, 497-504.

Grandgenett, D. P., and Mumm, S. R. (1990). Unraveling retrovirus integration. Cell 60, 3-4.

Hodges-Garcia, Y., Hagerman, P. J., and Pettijohn, D. E. (1989). DNA ring closure mediated by protein HU. J. Biol. Chem. 264, 14621-14623.

Katzman, M., Katz, R. A., Skalka, A. M., and Leis, J. (1989). The avian retroviral integration protein cleaves the terminal sequences of linear viral DNA at the in vivo sites of integration. J. Virol. 63, 5319-5327.

Laemmli, U. K. (1970). Cleavage of structural proteins during the assembly of the head of bacteriophage T4. Nature 227, 680-685.

Maxam, A. M., and Gilbert, W. (1977). A new method for sequencing DNA. Proc. Natl. Acad. Sci. USA 74, 560-564.

Mizuuchi, K. (1984). Mechanism of transposition of bacteriophage Mu: polarity of the strand transfer reaction at the initiation of transposition. Cell 39, 395-404.

Roth, M. J., Schwartzberg, P. L., and Goff, S. P. (1989). Structure of the termini of DNA intermediates in the integration of retroviral DNA: dependence on IN function and terminal DNA sequence. Cell 58, 47-54.

Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989). Molecular Cloning: A Laboratory Manual (Cold Spring Harbor, New York: Cold Spring Harbor Laboratory).

Schwartzberg, P., Colicelli, J., and Goff, S. P. (1984). Construction and analysis of deletion mutations in the pol gene of Moloney murine leukemia virus: a new viral function required for productive infection. Cell 37, 1043-1052.

Shoemaker, C., Goff, S., Gilboa, E., Paskind, M., Mitra, S. W., and Baltimore, D. (1980). Structure of a cloned circular Moloney murine leukemia virus DNA molecule containing an inverted segment: implications for retrovirus integration. Proc. Natl. Acad. Sci. USA 77, 3932-3936.

Shoemaker, C., Hoffmann, J., Goff, S. P., and Baltimore, D. (1981). Intramolecular integration within Moloney murine leukemia virus DNA. J. Virol. 40, 164-172.

Skalka, A. M. (1988). Integrative recombination in retroviruses. In Genetic Recombination, R. Kucherlapati and G. R. Smith, eds. (Washington, D.C.: American Society for Microbiology Publications), pp. 701-724.

Summers, M. D., and Smith, G. E. (1987). A Manual of Methods for Baculovirus Vectors and Insect Cell Culture Procedures, Texas Agricultural Experiment Station Bulletin No. 1555.

Tanese, N., Roth, M. J., and Goff, S. P. (1986). Analysis of retroviral pol gene products with antisera raised against fusion proteins produced in Escherichia coli. J. Virol. 59, 328-340.

Varmus, H., and Brown, P. (1989). Retroviruses. In Mobile DNA, D. E. Berg and M. M. Howe, eds. (Washington, D.C.: American Society for Microbiology Publications), pp. 53-108.
