Gene 534 (2014) 222–228

Gene journal homepage: www.elsevier.com/locate/gene

Restriction enzyme cutting site distribution regularity for DNA looping technology

Ying Shang a,1, Nan Zhang a,1, Pengyu Zhu a, Yunbo Luo b, Kunlun Huang a,b,⁎, Wenying Tian b, Wentao Xu a,b,⁎

a Laboratory of Food Safety, College of Food Science and Nutritional Engineering, China Agricultural University, Beijing 100083, China
b The Supervision, Inspection and Testing Center of Genetically Modified Organisms, Ministry of Agriculture, Beijing 100083, China

Article info

Article history: Accepted 24 October 2013; Available online 6 November 2013

Keywords: Restriction enzyme; Restriction enzyme cutting sites; Distribution regularity; DNA looping technology; Cohesive end

Abstract

The restriction enzyme cutting site distribution regularity and looping conditions were studied systematically. We obtained the restriction enzyme cutting site distributions of 13 commonly used restriction enzymes in 5 model organism genomes through two novel self-compiled software programs. All of the average distances between two adjacent restriction sites fell sharply with increasing statistical intervals, and most fragments were 0–499 bp. A shorter DNA fragment resulted in a lower looping rate, which was also directly proportional to the DNA concentration. When the length was more than 500 bp, the concentration did not affect the looping rate. Therefore, the optimal known fragment was longer than 500 bp and did not contain the cutting sites of the restriction enzyme to be used for digestion. To bring the looping efficiency close to 100%, we recommend digesting the genome separately with 4–5 single cohesive end systems.

© 2013 Elsevier B.V. All rights reserved.

Abbreviations: IPCR, Inverse-PCR; Ct, threshold cycle.
⁎ Corresponding authors at: Laboratory of Food Safety, College of Food Science and Nutritional Engineering, China Agricultural University, Beijing 100083, China. Tel.: +86 10 6273 8793; fax: +86 10 6273 7786.
1 These authors contributed equally.
0378-1119/$ – see front matter © 2013 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.gene.2013.10.054

1. Introduction

In molecular biology research, the digestion of genomes with restriction enzymes is a very commonly used technique, particularly in genome walking; such techniques include Inverse-PCR (IPCR) (Ochman et al., 1988), T-linker PCR (Trinh et al., 2012a), Loop-linker PCR (Trinh et al., 2012b) and others (Bae and Sohn, 2010; Tsaftaris et al., 2010). These procedures always use a restriction enzyme first to digest the genome, and the digested fragments are then self-looped or ligated to adaptors with T4 ligase; this assists with the insertion of genes into plasmid vectors during gene cloning and protein expression experiments (Amanda et al., 2012; Cramer et al., 2012). The use of proper restriction enzyme cutting sites is always a key point in the techniques mentioned above. Many restriction enzymes require the binding of two copies of a recognition sequence for DNA cleavage, thereby introducing a loop in the DNA (Halford et al., 2004; Van den Broek et al., 2006). Meanwhile, digesting genomic DNA with a restriction enzyme is always the basic pre-treatment step before DNA looping, especially in IPCR. IPCR was the earliest technique for cloning flanking sequences; it is based on conventional PCR and is the most classical application of DNA looping technology after genome digestion.

During the past two decades, many researchers have tried to increase the success rate of IPCR, for example by combining it with long range PCR (Benkel and Fong, 1996) or nested PCR (David and Ignacio, 2001), or by adding 5% formamide to eliminate non-specific amplification (Li et al., 1998). Despite these efforts, the underlying problems of IPCR have not been fundamentally solved. The low success rate of IPCR is essentially attributable to two aspects: the choice of the restriction enzyme and the looping efficiency of the digested segments. For organisms that have been completely sequenced, it is easy to choose a restriction enzyme for genome digestion. However, the genomic DNA sequences of the vast majority of organisms or fragments of interest (Jin et al., 2008) remain unknown, and the choice of restriction enzyme is thus relatively difficult. With regard to looping efficiency, both the concentration and the length of the digested fragments affect self-looping. To obtain a flanking sequence by IPCR, the distance from a restriction site to both boundaries is a key factor: if the fragment is too long, polymerase amplification is restricted, and if it is too short, a different amplification product is generated that interferes with the sequence analysis (Drewell et al., 2002; Kun et al., 2010), further reducing IPCR efficiency.

At the genome level, the base composition and GC-content, the distribution of genomic density polymorphisms and the regularity of genome mutations (Conrad et al., 2011; Jiao et al., 2004; Meunier and Duret, 2004) have been studied in different organisms. Restriction digestion is a basic and key technique in molecular biology, and the general distribution regularity of restriction enzyme cutting sites is an important feature of the genome; however, there are no related reports about it.
To overcome the drawbacks mentioned above, which are also the main difficulties in DNA looping, we chose 13 commonly used restriction enzymes and 5 model organisms as targets and used a statistical method to analyze the distribution regularities of their restriction enzyme cutting sites theoretically. Two novel software programs were developed to support this analysis. DNA looping technology is commonly used in IPCR, a method that has been studied extensively and is easy to operate. The main application of IPCR is to obtain the flanking sequence of a known gene; therefore, we used IPCR for the flanking sequence of a genetically modified organism as the model for our DNA looping research. The looping rates of fragments with different lengths and concentrations were studied using real-time PCR. Because T4 ligases from different companies may differ in enzyme activity, their looping rates were also compared, and by analyzing the differences in the Ct (threshold cycle) values we were able to identify the optimal fragment conditions.

2. Materials and methods

2.1. Materials and software

2.1.1. DNAMAN Version 4.0

RE software I, a self-compiled software program that calculates the differences between adjacent data values, was used in this paper to compute the distances between two adjacent restriction sites. RE software II, a self-compiled software program with a customizable statistical range and the ability to count the values falling in each range, was used in this paper to tabulate the distribution of distances between two adjacent restriction sites. Genetically modified maize LY038 was kindly supplied by the Monsanto Far East Ltd. Beijing Representative Office.

2.2. Distribution of common restriction enzymes following the digestion of genomic DNA

Five target genomes, Oryza sativa (rice) (Annon, n.a.a), Escherichia coli O157:H7 EDL933 [GenBank no.
AE005174], Saccharomyces cerevisiae (baker's yeast) (Annon, n.a.b), Homo sapiens Chromosome 1 (Annon, n.a.c), and Arabidopsis thaliana (Annon, n.a.d), were downloaded from NCBI and saved in the .FASTA format. O. sativa is cultivated rice and is one of the most important staple foods for a majority of the world's population. From the end of the 1980s to the beginning of the 1990s, Toriyama et al. (1988) used the PEG induction method to successfully transform rice protoplasts and obtained cultivatable transgenic plants; research on genetically modified rice then progressed quickly. E. coli O157:H7 EDL933 is considered a genetically engineered microorganism and is widely used in genetic engineering. S. cerevisiae is a major model organism of great industrial importance. H. sapiens are the most advanced mammals, and Chromosome 1 is the longest of the 24 chromosomes (including chromosomes X and Y). A. thaliana is a small flowering plant of the mustard family and was the first plant to be completely sequenced. These model organisms cover prokaryotes, eukaryotes, plants, animals and common genetically modified materials. Therefore, these representative genomes were chosen as targets for the restriction enzyme cutting site analysis.

We chose 13 restriction enzymes that are commonly used in molecular experiments: BamH I, Bgl I, Bgl II, EcoR I, EcoR V, Hind III, Kpn I, Not I, Pst I, Sac I, Sal I, Xba I and Xho I (Guoqiang et al., 2012; Saxonov, 2012; Wang et al., 2012). We copied the target genome sequences into DNAMAN Version 4.0 and selected restriction enzymes to digest these target genomes; the digestion results were saved in the .txt format. We then ran the digestion results through RE software I, and the resulting distances between adjacent restriction sites were saved in .txt format. Next, we set a custom statistical interval and ran these distances through RE software II; the resulting distance distributions were saved in .txt format. In this way, the distributions for genomic DNA digestion with common restriction enzymes were obtained.
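The two processing steps described above (adjacent-site distances, then counts per statistical interval) can be illustrated with a short sketch. This is not the authors' RE software I or II, whose implementation is not given in the text; the function names and the example cut-site coordinates are hypothetical, and the 500 bp interval is the value the paper reports using.

```python
from collections import Counter


def adjacent_distances(cut_positions):
    """Distances between neighbouring cut sites (the role described for RE software I)."""
    ordered = sorted(cut_positions)
    return [b - a for a, b in zip(ordered, ordered[1:])]


def bin_distances(distances, interval=500):
    """Percentage of distances per interval (the role described for RE software II).

    Returns a dict keyed by (low, high) bounds in bp, e.g. (0, 499).
    """
    counts = Counter(d // interval for d in distances)
    total = len(distances)
    return {
        (k * interval, k * interval + interval - 1): 100.0 * n / total
        for k, n in sorted(counts.items())
    }


# Hypothetical cut-site coordinates in bp, invented for illustration only.
sites = [120, 450, 1300, 1350, 4200, 4700]
distances = adjacent_distances(sites)
for (low, high), pct in bin_distances(distances).items():
    print(f"{low}-{high} bp: {pct:.1f}%")
```

Changing the `interval` argument corresponds to the custom statistical interval mentioned in the text.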

2.3. Study of looping rates in ligase-mediated reactions

A series of sequences on the LY038 gene cassette were chosen with lengths of approximately 500 bp, 1000 bp, 1500 bp, 2000 bp, and 2500 bp; none of these sequences included the Sac I restriction site. We designed a pair of primers containing a Sac I restriction site on one end, ligated the amplified product into the pGEM-T easy vector, and obtained recombinant plasmids after transformation and blue-white selection. A Sac I restriction site is located at 109 bp of the pGEM-T easy vector, so Sac I digestion yielded recombinant segments of different lengths. Because the length of the pGEM-T easy vector can be as long as 3000 bp, we inserted a segment of 2000 bp between the two Hsp92 I restriction sites (17 bp and 1947 bp) and thereby obtained a new recombinant fragment of 3000 bp. After gel extraction and looping by T4 ligase, the looping rates of the different recombined fragments were confirmed through real-time PCR.

Recombinant fragments of different lengths were used as templates in the simulated ligation reaction of IPCR. The Inner primers were designed within the internal region of the fragment, and the Ligation primers were designed near the ligation boundary. The schematic is shown in Supplementary material 1 (Fig. S1). The amplification direction of the Ligation primers was towards the outside of the known sequence, in line with the direction of IPCR primers. The amplicons of the Inner primers were obtained whether or not the fragments were self-ligated, whereas the products of the Ligation primers were obtained only when the recombined fragments were successfully looped.

We applied comparative quantitative PCR to compare the Ct values of the Inner and Ligation primers and calculated the ΔΔCt of the test samples according to the Comparative Delta-delta Ct method (Thomas and Kenneth, 2008); the looping rates of recombined fragments with different lengths and conditions were obtained by comparison. The Inner and Ligation primers for the 500 bp recombined segment were N-1-F/R and L-1-F/L-500-R; for 1000 bp, N-2-F/R and L-2-F/L-1000-R; for 1500 bp, N-2-F/R and L-2-F/L-1500-R; for 2000 bp, N-2-F/R and L-2-F/L-2000-R; for 2500 bp, N-2-F/R and L-2-F/L-2500-R; and for 3000 bp, N-2-F/R and L-3000-F/R. All of the primers were designed with ABI Prism Primer Express Version 2.0 and synthesized by Sangon Biotech (Shanghai, China). The sequences are shown in Supplementary materials 2 (Table S1).

The real-time PCR system was composed as follows: 2.5× Real Master Mix buffer, 11.25 μL; primer 10 μmol/L, 0.25 μL; template DNA, 5.0 μL; and ddH2O, 8.25 μL. All of the amplifications were carried out on an ABI7500 with the following program: one step of 10 min at 95 °C; 40 cycles of 15 s at 95 °C, 30 s at 60 °C and 30 s at 72 °C; and then a melting curve (continuation mode): 15 s/95 °C, 60 s/60 °C, 30 s/95 °C, 15 s/60 °C. Each reaction was repeated three times, and each time the reaction was performed in triplicate.

Relative quantification PCR was adopted to test and verify the looping efficiency of fragments with different lengths; ABI software version 2.0.1 was applied to analyze the experimental data. First, a pre-experiment was undertaken to ensure that the amplification efficiencies of the Inner and Ligation primers were the same. We serially diluted the looped segments of different lengths, and the dilutions were used as the templates. Second, real-time PCR was performed using the Inner and Ligation primers separately, and then we calculated the Ct values and obtained standard curves.
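The text does not state how the amplification efficiency was derived from the standard curves. A common convention, assumed here, is to fit Ct against log10 of the template amount and compute E = 10^(−1/slope) − 1. The sketch below applies that convention; the dilution series and Ct values are hypothetical and are not data from this study.

```python
import math


def fit_slope(log10_amounts, ct_values):
    """Least-squares slope of Ct plotted against log10(template amount)."""
    n = len(log10_amounts)
    mean_x = sum(log10_amounts) / n
    mean_y = sum(ct_values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log10_amounts, ct_values))
    sxx = sum((x - mean_x) ** 2 for x in log10_amounts)
    return sxy / sxx


def amplification_efficiency(slope):
    """Efficiency in percent from a standard-curve slope: (10^(-1/slope) - 1) * 100."""
    return (10 ** (-1.0 / slope) - 1.0) * 100.0


# Hypothetical ten-fold dilution series (log10 of relative amount) and Ct values.
log_amounts = [0.0, 1.0, 2.0, 3.0]
inner_ct = [30.00, 26.68, 23.36, 20.04]
slope = fit_slope(log_amounts, inner_ct)
print(f"slope = {slope:.2f}, E = {amplification_efficiency(slope):.1f}%")
```

Running the same calculation for the Inner and Ligation primer pairs gives two efficiencies that can be compared before the ΔΔCt analysis is applied.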
Once the amplification efficiencies (E %) of the two primer pairs were found to be similar and within an acceptable range, the looping rates of fragments with different lengths were determined at appropriate template concentrations. Finally, we tested the effects of ligases from different manufacturers on the looping rate with the same PCR conditions and thermal cycle.

3. Results and discussion

3.1. The advantages of the two novel software programs

In this study, we first compiled two novel software programs to conduct restriction enzyme cutting site analysis and statistics; both programs were designed based on statistical principles (Supplementary material 3). The interfaces used to study the distributions of commonly used restriction enzymes in genome digestion and partial results are shown in Fig. 1. After analyzing restriction enzyme cutting sites with DNAMAN Version 4.0 (Lynnon Biosoft, USA), RE software I was used to obtain the distances between adjacent restriction enzyme cutting sites in one genome; then, based on these results, RE software II was used to summarize the distribution of the distances according to defined intervals. In this study, the statistical interval used with RE software II was 500 bp. Eukaryotic genomes are large, so the analysis of restriction enzyme cutting sites involves a heavy statistical workload that is prone to errors. The two software programs facilitate this analysis: they generate statistical results directly and avoid the large errors that arise from labor- and time-intensive manual counting. By adjusting the statistical interval, we can easily obtain the analysis results corresponding to a selected enzyme. This facilitates the choice of an appropriate restriction enzyme for a target fragment, presents a visual result and improves the efficiency of restriction digestion.

3.2. Regularity of the distribution of different restriction enzyme cutting sites in genomes

To determine the distributions of restriction enzymes commonly used in digesting genomic DNA, we selected several familiar and

commonly used restriction enzymes. Regular patterns were found by analyzing the genomic distributions of 13 enzymes with DNAMAN software: BamH I, Bgl I, Bgl II, EcoR I, EcoR V, Hind III, Kpn I, Not I, Pst I, Sac I, Sal I, Xba I and Xho I. The statistical range was 500 bp; the abscissa was the statistical range, and the vertical axis was the percentage.

3.2.1. Distribution results for different restriction enzymes following the digestion of one selected genome

The distribution of the 13 restriction enzyme cutting sites in one selected genome is discussed here. The statistical result for the genome of O. sativa is shown in Table 1, and details of the other four genomes are shown in Supplementary materials 4–9. With the expansion of the statistical interval, the percentage of the sections cut at adjacent restriction enzyme cutting sites decreased unevenly and slowly. The data show that when more enzyme cutting sites exist in the genome, more fragments lie within the 0–499 bp interval. The top five restriction enzymes generating the highest proportion (%) of fragments between 0–2500 bp following the digestion of O. sativa were Bgl I, Pst I, Sac I, Hind III and Bgl II; those for E. coli O157:H7 EDL933 were Bgl I, EcoR V, Pst I, Kpn I and EcoR I; those for S. cerevisiae were Hind III, EcoR V, EcoR I, Bgl II and Xba I; those for A. thaliana were Hind III, Bgl II, EcoR I, Xba I and EcoR V; and those for H. sapiens Chromosome 1 were Pst I, Hind III, Xba I, Bgl I and Sac I.

3.2.2. Distribution of one selected restriction enzyme following the digestion of different genomes

The restriction enzyme cutting sites of one selected enzyme in five genomes were analyzed to determine the suitable restriction enzyme to digest one species. The statistical range was 500 bp; the abscissa was the statistical range, and the vertical axis was the percentage. The result for BamH I is shown in Fig. 2, and the distribution details for the other 12 restriction enzymes are shown in Supplementary material 10. We observed that a longer genome contained more restriction enzyme cutting sites. The results showed the coverage and distribution of different restriction enzyme cutting sites, and we could conclude that Hind III, Bgl II, EcoR I, Pst I and EcoR V had comparably more restriction enzyme cutting

Fig. 1. Schematic diagram of the procedure for analyzing the restriction enzyme cutting site distribution and partial results.

Table 1 The distribution of restriction enzyme cutting sites in the genome of Oryza sativa.

| Restriction enzyme | Total number of enzyme cutting sections | Number of sections between 0–499 bp | Proportion of sections between 0–499 bp (%) | Proportion of sections between 0–2500 bp (%)a | Proportion of sections between 500–3000 bp (%) |
|---|---|---|---|---|---|
| Bgl I | 27322 | 10475 | 38.34 | 66.64 | 32.00 |
| Pst I | 23264 | 4605 | 19.79 | 56.84 | 42.85 |
| Sac I | 21819 | 5750 | 26.35 | 56.33 | 34.78 |
| Hind III | 22128 | 3791 | 17.13 | 54.11 | 43.09 |
| Bgl II | 20195 | 3545 | 17.55 | 51.39 | 40.21 |
| Xba I | 18292 | 3126 | 17.09 | 48.82 | 37.14 |
| EcoR I | 20325 | 3321 | 16.34 | 48.62 | 37.61 |
| Sal I | 13034 | 2620 | 20.10 | 44.17 | 28.16 |
| EcoR V | 15615 | 2223 | 14.24 | 43.07 | 33.94 |
| Xho I | 13245 | 2474 | 18.68 | 43.00 | 29.03 |
| BamH I | 12727 | 1773 | 13.93 | 38.50 | 29.37 |
| Kpn I | 8712 | 1193 | 13.69 | 31.20 | 21.44 |
| Not I | 3033 | 380 | 12.53 | 21.04 | 9.83 |
| The average value | | | 18.90 | 46.44 | 32.27 |

a Restriction enzymes are sorted in descending order according to the proportion of the sections between 0–2500 bp (%).
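For comparison with the observed proportions in Table 1, the theoretical spacing of a six-base-pair restriction site can be worked out under a simple random-sequence model. The sketch below is an illustrative baseline only and is not the authors' method: it assumes that the four bases occur independently with equal probability, so a given 6 bp site starts at any position with probability 4^−6 and the distance to the next site follows a geometric distribution. Real genomes deviate from this because of base composition and sequence structure.

```python
def expected_spacing(site_length, base_prob=0.25):
    """Mean distance (bp) between sites when each base matches with probability base_prob."""
    return 1.0 / (base_prob ** site_length)


def fraction_shorter_than(limit_bp, site_length, base_prob=0.25):
    """P(distance < limit_bp) when site occurrence per position is independent (geometric model)."""
    p = base_prob ** site_length
    return 1.0 - (1.0 - p) ** (limit_bp - 1)


print(expected_spacing(6))                       # mean spacing for a 6 bp site
print(100 * fraction_shorter_than(500, 6))       # % of fragments expected below 500 bp
print(100 * fraction_shorter_than(2501, 6))      # % of fragments expected at or below 2500 bp
```

Under these assumptions the mean spacing is 4096 bp, and the model's 0–499 bp share can be set against the observed column of Table 1 to see how far each enzyme departs from the random expectation.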

sites in the five genomes and that the distances between adjacent restriction enzyme cutting sites were small. BamH I, Kpn I, Sal I and Xho I had fewer restriction sites, and the distances between adjacent restriction enzyme cutting sites were larger. Not I had the fewest restriction sites and the greatest distances between adjacent restriction enzyme cutting sites. We obtained, for the first time, the cutting site distributions of 13 commonly used restriction enzymes in 5 model organisms, and they differed from each other. According to the analysis results, adjacent restriction enzyme cutting sites were mostly located within a range of 0–499 bp. The vast majority of distances, at least above 50%, were under 4^6 bp (4096 bp, the theoretical distance of a six-base-pair restriction site). For genomes of the same length, the distances between adjacent restriction enzyme cutting sites became smaller as the number of restriction enzyme cut sites grew. The percentage of adjacent restriction enzyme cutting site distances decreased as the statistical intervals increased. However, significant differences still existed when different genomes were cut by the same restriction enzyme or the same genome was cut by different restriction enzymes. These differences might be related to the base distribution of the genome. Because the GC-content varies among organisms, and the bases are unevenly distributed even

Fig. 2. The restriction enzyme cutting site distributions of BamH I on the genomes of the five model organisms.

along chromosomes in the same organism, the distribution of restriction enzyme cut sites may differ accordingly. The percentages of adjacent restriction enzyme cutting site distances that we calculated are average values; the distribution regularities were studied at the level of the whole genome and add to the available genomic information.

3.3. Distribution of isocaudomers following the digestion of different genomes

The distribution regularities discussed above were for single enzyme digestion reactions. Among the 13 restriction enzymes there were two pairs of isocaudomers: BamH I and Bgl II, and Sal I and Xho I. Isocaudomers are restriction endonucleases that recognize different sequences but produce the same cohesive ends in the digested DNA fragments. DNA fragments produced by two isocaudomers can therefore be ligated together (Liu and Chen, 2004) to form a loop. We analyzed the restriction enzyme cutting site distributions of the two pairs of isocaudomers mentioned above; the statistical result is shown in Table 2, and the details are in Supplementary materials 11–15. The percentage of fragments digested by isocaudomers was clearly higher than that in the single restriction enzyme digestion, especially for fragments of 0–2500 bp. Compared to single digestion, the average percentage of fragments in the 0–2500 bp range after isocaudomer digestion was more than 10 percentage points higher (16.03 for BamH I and Bgl II, 11.82 for Sal I and Xho I), and the largest difference for a single genome was 20.97. Hence, isocaudomer digestion is recommended over single restriction enzyme digestion. We refer to any digestion that produces the same cohesive end in the DNA fragments, such as isocaudomer digestion, as a single cohesive end system.

3.4. Looping rate under ligase-mediated reaction conditions

We used the Comparative Delta-delta Ct methodology with relative quantification PCR to obtain and analyze DNA looping rates. First, we undertook a pre-experiment to ensure that the amplification efficiencies of the Inner primers and Ligation primers were the same. When the amplification efficiencies (E %) were similar, we amplified the samples with the Inner and Ligation primers and then compared each sample with the control group (recombinant segment of 500 bp, 1 ng/μL concentration, T4 ligase from Promega) to analyze the results.

3.4.1. DNA fragments with different lengths and concentrations

The relative quantification method was used to analyze the looping rate and to compare recombined fragments of various lengths and concentrations; each reaction set was replicated three times. The Ct values of the Inner and Ligation primers during real-time PCR were recorded, and we calculated ΔΔCt values using the Comparative Delta-delta Ct method. The recombined fragment of 500 bp at a concentration of 1 ng/μL was used as the control group, and each sample was expressed as a fold change relative to the control group. The results are shown in Fig. 3 and Table 3. We concluded from the results that the looping rate of the 500 bp recombined template was significantly lower than that of the 1000 bp

Fig. 3. The looping results for DNA fragments with different lengths and different concentrations.

and longer segments. The rates for the 1000–3000 bp templates were comparatively high, and length did not have an obvious effect on the looping rate. For the 500 bp recombined template, the looping rate was lowest at a concentration of 1 ng/μL and increased as the concentration rose. When the template was longer than 500 bp, the looping rate was not obviously affected by the template concentration.
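The relative looping values reported in Table 3 can be reproduced from the Ct averages with the Comparative Delta-delta Ct calculation. The sketch below assumes ΔCt = Ct(Ligation) − Ct(Inner) and fold change = 2^(−ΔΔCt), which is consistent with the tabulated values up to rounding; the function names are illustrative and do not come from the paper or the ABI software.

```python
def delta_ct(inner_ct, ligation_ct):
    """ΔCt for one sample: Ligation-primer Ct minus Inner-primer Ct."""
    return ligation_ct - inner_ct


def relative_looping(sample, control):
    """Return (ΔΔCt, fold change versus control) for (inner_ct, ligation_ct) pairs."""
    ddct = delta_ct(*sample) - delta_ct(*control)
    return ddct, 2.0 ** (-ddct)


# Average Ct values taken from Table 3: (Inner curve, Ligation curve).
control_5_1 = (8.638, 25.773)   # 500 bp at 1 ng/μL, the control group
sample_5_5 = (5.957, 21.676)    # 500 bp at 5 ng/μL
sample_10_1 = (9.953, 9.299)    # 1000 bp at 1 ng/μL

for name, sample in [("5-5", sample_5_5), ("10-1", sample_10_1)]:
    ddct, fold = relative_looping(sample, control_5_1)
    print(f"{name}: ddCt = {ddct:.3f}, fold = {fold:.3f}")
```

Because the fold change is exponential in ΔΔCt, the roughly 17-cycle gap between the 500 bp control and the longer templates translates into relative values in the hundreds of thousands.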

3.4.2. Ligase from different companies

Four ligases from different companies (three foreign companies: Promega (P), NEB (N), Takara (T); one domestic company: Kangwei (K)) were chosen. We analyzed looping rates by relative quantification PCR using recombined templates of 500 bp and 1000 bp, each at concentrations of 1 ng/μL and 10 ng/μL; each group had three replicates. The Ct values of the Inner and Ligation primers were measured during real-time PCR, and ΔΔCt was calculated with the Comparative Delta-delta Ct method. The 500 bp recombined fragment at 1 ng/μL with the Promega ligase was used as the control group, and each sample was expressed as a fold change relative to the control group. The results are shown in Fig. 4 and Table 4. For the 500 bp recombined fragment at 1 ng/μL, the looping rate with T4 ligase from Kangwei was the highest, while that with NEB was the lowest. For the 500 bp recombined fragment at 10 ng/μL, the looping rate with T4 ligase from Kangwei was slightly higher than that of the others, while NEB was still the lowest. Although the ligases from different companies had slight effects on the looping rate when the length of the linear DNA was short, these effects would not cause significant changes. For lengths of

Table 2 The comparison of restriction enzyme cutting site distributions between single restriction enzymes and isocaudomers. Values are the proportion of sections (%).

| The species | The length of the fragments (bp) | BamH I | Bgl II | BamH I and Bgl II | D-valuea | Sal I | Xho I | Sal I and Xho I | D-value |
|---|---|---|---|---|---|---|---|---|---|
| Oryza sativa | 0–2500 | 38.50 | 51.39 | 67.44 | 16.05 | 44.17 | 43.00 | 62.65 | 18.48 |
| Escherichia coli O157:H7 EDL933 | 0–2500 | 29.97 | 32.46 | 48.80 | 16.34 | 24.58 | 8.73 | 32.20 | 7.62 |
| Saccharomyces cerevisiae | 0–2500 | 26.33 | 45.45 | 66.42 | 20.97 | 18.30 | 21.75 | 40.38 | 18.63 |
| Homo sapiens Chromosome 1 | 0–2500 | 34.09 | 48.39 | 64.02 | 15.63 | 3.74 | 15.73 | 18.78 | 3.05 |
| Arabidopsis thaliana | 0–2500 | 35.50 | 61.75 | 72.92 | 11.17 | 23.84 | 35.91 | 47.21 | 11.30 |
| The average value | | 32.88 | 47.89 | 63.92 | 16.03 | 22.93 | 25.02 | 40.24 | 11.82 |

a D-value is the difference in percentage between the two isocaudomers used together and the higher of the two single digestions, for the same fragment length range.
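The D-value defined in the footnote of Table 2 is a simple difference, and the combined digestion it refers to can be thought of as pooling the cut sites of both enzymes. The sketch below illustrates both ideas; it assumes that the combined distribution is obtained from the union of the two enzymes' cut positions, which the text does not state explicitly, and the function names and the example coordinates are hypothetical.

```python
def d_value(first_pct, second_pct, combined_pct):
    """Combined percentage minus the higher of the two single-enzyme percentages."""
    return combined_pct - max(first_pct, second_pct)


def merge_cut_sites(sites_a, sites_b):
    """Sorted union of two enzymes' cut positions, treated as one cohesive-end system."""
    return sorted(set(sites_a) | set(sites_b))


# Oryza sativa row of Table 2 (0–2500 bp): BamH I, Bgl II, and both together.
print(round(d_value(38.50, 51.39, 67.44), 2))

# Hypothetical cut positions (bp) for two enzymes, invented for illustration only.
print(merge_cut_sites([100, 5200], [2600, 5200, 9000]))
```

Pooling sites can only shorten fragments, which is why the combined proportion in the 0–2500 bp range is never below that of either enzyme alone.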

Table 3 Looping results for DNA fragments with different lengths and concentrations.

| Samplea | Inner curve average Ct value | Ligation curve average Ct value | Average ΔCt | Standard deviation of ΔCt | ΔΔCt | Compare with 5-1 |
|---|---|---|---|---|---|---|
| 5-1 | 8.638 | 25.773 | 17.135 | 0.184 | 0.000 | 1.000 |
| 5-5 | 5.957 | 21.676 | 15.719 | 0.021 | −1.416 | 2.668 |
| 5-10 | 5.133 | 19.870 | 14.737 | 0.080 | −2.398 | 5.271 |
| 10-1 | 9.953 | 9.299 | −0.654 | 0.045 | −17.789 | 226491.000 |
| 10-5 | 6.638 | 5.802 | −0.836 | 0.060 | −17.971 | 256914.900 |
| 10-10 | 5.612 | 4.803 | −0.809 | 0.074 | −17.944 | 252138.100 |
| 15-1 | 9.358 | 8.935 | −0.423 | 0.176 | −17.559 | 193030.200 |
| 15-5 | 6.585 | 5.840 | −0.745 | 0.059 | −17.880 | 241233.600 |
| 15-10 | 5.668 | 4.757 | −0.911 | 0.065 | −18.046 | 270665.300 |
| 20-1 | 9.964 | 9.412 | −0.552 | 0.022 | −17.688 | 211082.900 |
| 20-5 | 6.706 | 5.947 | −0.759 | 0.036 | −17.894 | 243596.000 |
| 20-10 | 5.579 | 4.791 | −0.788 | 0.019 | −17.924 | 248611.800 |
| 25-1 | 9.924 | 9.405 | −0.519 | 0.046 | −17.654 | 206283.400 |
| 25-5 | 6.627 | 5.902 | −0.725 | 0.088 | −17.860 | 237944.200 |
| 25-10 | 5.676 | 4.910 | −0.766 | 0.044 | −17.901 | 244743.600 |
| 30-1 | 9.794 | 9.206 | −0.587 | 0.154 | −17.723 | 216268.500 |
| 30-5 | 6.667 | 6.052 | −0.615 | 0.089 | −17.750 | 220452.600 |
| 30-10 | 5.857 | 5.025 | −0.832 | 0.028 | −17.967 | 256180.900 |

a Sample naming principle: the first number represents the length of the fragment/100 bp; the second number represents the looping concentration. 5-1 indicates that the length of the fragment is 500 bp and that the looping concentration is 1 ng/μL.

Table 4 Ligation results with ligases from different manufacturers.

| Samplea | Inner curve Ct average | Ligation curve Ct average | Average ΔCt | ΔCt standard deviation | ΔΔCt | Compare with 5-1-p |
|---|---|---|---|---|---|---|
| 5-1-p | 7.797 | 26.073 | 18.276 | 0.187 | 0.000 | 1.000 |
| 5-1-n | 9.007 | 27.414 | 18.406 | 0.095 | 0.130 | 0.914 |
| 5-1-k | 8.880 | 25.993 | 17.113 | 0.030 | −1.163 | 2.240 |
| 5-1-t | 9.866 | 27.767 | 17.901 | 0.250 | −0.375 | 1.297 |
| 5-10-p | 4.822 | 19.874 | 15.052 | 0.104 | −3.224 | 9.345 |
| 5-10-n | 4.891 | 20.716 | 15.825 | 0.290 | −2.452 | 5.470 |
| 5-10-k | 5.229 | 20.092 | 14.863 | 0.322 | −3.413 | 10.652 |
| 5-10-t | 5.149 | 20.287 | 15.138 | 0.132 | −3.138 | 8.805 |
| 10-1-p | 8.780 | 8.899 | 0.119 | 0.082 | −18.157 | 292351.400 |
| 10-1-n | 9.957 | 10.101 | 0.144 | 0.064 | −18.132 | 287226.200 |
| 10-1-k | 9.972 | 10.485 | 0.513 | 0.177 | −17.763 | 222471.600 |
| 10-1-t | 11.246 | 11.638 | 0.391 | 0.277 | −17.885 | 242042.600 |
| 10-10-p | 5.544 | 5.572 | 0.028 | 0.061 | −18.248 | 311387.100 |
| 10-10-n | 5.651 | 5.792 | 0.141 | 0.024 | −18.135 | 287857.300 |
| 10-10-k | 5.790 | 5.946 | 0.156 | 0.044 | −18.120 | 284930.900 |
| 10-10-t | 6.198 | 6.038 | −0.160 | 0.076 | −18.436 | 354709.300 |

a Sample naming principle: the first number represents the length of the fragment/100 bp; the second number represents the looping concentration; the third character indicates the short name of the company. 5-1-p means a 500 bp fragment looped at a concentration of 1 ng/μL with the ligase from Promega.

1000 bp and longer fragments, the looping rates of the T4 ligases from different manufacturers were not obviously different. Taken together, the results above show that fragments of 0–499 bp made up the largest share of the restriction digestion products; these short segments have low looping rates, which reduces the subsequent looping efficiency. Therefore, the optimal length of a looping fragment was longer than 500 bp. Because the common amplification length for DNA looping is 500–3000 bp, the longest looping segment we studied was 3000 bp. On average, nearly 30% of the fragments obtained after single restriction enzyme digestion fell within the 500–3000 bp range suitable for DNA looping. If a known fragment of 500 bp that does not contain the cutting sites of the restriction enzyme to be used next is reserved, the length of the

digested targets could be guaranteed longer than 500 bp. Therefore, the optimal choice is a restriction enzyme that digests the genome into fragments in the range of 0–2500 bp, because the percentage of fragments suitable for looping is then clearly increased. Based on the statistics, Hind III, EcoR I, Pst I, EcoR V and the isocaudomers Bgl II and BamH I are recommended as the preferred single cohesive end systems when digesting common species; the fragments digested by these enzymes lay in the range of 0–2500 bp more often than those of the others, and Bgl I and Xba I are possible alternative choices. Consequently, if 4–5 single cohesive end systems, especially isocaudomers, are chosen to digest the genome separately, the percentage of fragments in the 0–2500 bp range could reach above 98.90%, including the reserved 500 bp known sequence; the digested fragments are then suitable for DNA looping. The DNA looping rate was increased by at least 60% compared to the conventional single digestion method without reserved fragments.

4. Conclusions and perspectives

Fig. 4. The looping results for ligases from different manufacturers.

The choice of restriction enzyme and the length of the digested fragment were the two important aspects that affected DNA looping. Based on the theoretical analysis of the distribution of restriction enzyme cutting sites with two software programs, the DNA looping rate reached above 98.90% when we chose 4–5 proper single cohesive end systems and reserved a known fragment of no less than 500 bp. The two novel self-compiled software programs can be applied to count and analyze the cutting sites of other restriction enzymes of interest and offer useful biological information. Meanwhile, the data obtained in this research provide the distributions of the cutting sites of commonly used restriction enzymes and give a reference for choosing the restriction enzyme for DNA digestion and looping. The results offer guidance for experiments related to restriction enzymes and DNA looping. They suggest changes to methods involving restriction enzymes and DNA looping and provide a more bioinformatics-oriented basis for the digestion of other genomes in molecular biological research; for IPCR in particular, they can improve its performance in the amplification of flanking sequences.


Supplementary data to this article can be found online at http://dx.doi.org/10.1016/j.gene.2013.10.054.

Conflict of interest statement

The authors declare no conflicts of interest.

Acknowledgments

The study was funded by the National GMO Cultivation Major Project of New Varieties (No. 2013ZX08012-001 and No. 2014ZX08012-001).

References

Amanda, M.L., John, J.B., Nathan, C.C., Hal, S.A., 2012. Linking yeast Gcn5p catalytic function and gene regulation using a quantitative, graded dominant mutant approach. PLoS One 7, e36193.
Bae, J.H., Sohn, J.H., 2010. Template-blocking PCR: an advanced PCR technique for genome walking. Anal. Biochem. 398, 112–116.
Benkel, B.F., Fong, Y., 1996. Long range-inverse PCR (LR-IPCR): extending the useful range of inverse PCR. Genet. Anal-Biomol. E 13, 123–127.
Conrad, D.F., et al., 2011. Variation in genome-wide mutation rates within and between human families. Nat. Genet. 43, 712–714.
Cramer, F., Christensen, C.L., Poulsen, T.T., Badding, M.A., Dean, D.A., Poulsen, H.S., 2012. Insertion of a nuclear factor kappa B DNA nuclear-targeting sequence potentiates suicide gene therapy efficacy in lung cancer cell lines. Cancer Gene Ther. 19, 675–683.
David, Q., Ignacio, H.C., 2001. Transgenic DNA introgressed into traditional maize landraces in Oaxaca, Mexico. Nature 414, 541–543.
Drewell, R.A., Bae, E., Burr, J., Lewis, E.B., 2002. Transcription defines the embryonic domains of cis-regulatory activity at the Drosophila bithorax complex. Proc. Natl. Acad. Sci. U. S. A. 99, 16853–16858.
Guoqiang, Z., et al., 2012. A mimicking-of-DNA-methylation-patterns pipeline for overcoming the restriction barrier of bacteria. PLoS Genet. 8, e1002987.
Halford, S.E., Welsh, A.J., Szczelkun, M.D., 2004. Enzyme-mediated DNA looping. Annu. Rev. Biophys. Biomol. Struct. 33, 1–24.
Jiao, X.Y., Lü, M.D., Huang, J.F., Liang, L.J., Shi, J.S., 2004. Genomic determination of CR1 CD35 density polymorphism on erythrocytes of patients with gallbladder carcinoma. World J. Gastroenterol. 10, 3480–3484.
Jin, L., Lilin, W., Harvey, M., Matthew, H.K., Ross, B., Makrigiorgos, G.M., 2008. Replacing PCR with COLD-PCR enriches variant DNA sequences and redefines the sensitivity of genetic testing. Nat. Med. 14, 579–584.
Kun, Y., Xuelong, W., Chunxiu, L., Jinqing, C., 2010. Study on the isolation of the flanking sequences adjacent to transgenic T-DNA in Brassica napus genome by all improved inverse PCR method. J. Anhui Agric. Sci. 38, 5002–5005.
Li, Z.H., Liu, D.P., Wang, J., Guo, Z.C., Yin, W.X., Liang, C.C., 1998. Inversion and transposition of Tc1 transposon of C. elegans in mammalian cells. Somat. Cell Mol. Genet. 24, 363–369.
Liu, Z., Chen, Y.H., 2004. Design and construction of a recombinant epitope-peptide gene as a universal epitope vaccine strategy. J. Immunol. Methods 285, 93–97.
Meunier, J., Duret, L., 2004. Recombination drives the evolution of GC-content in the human genome. Mol. Biol. Evol. 21, 984–990.
Oryza sativa Genome Project Report. http://www.ncbi.nlm.nih.gov/sites/entrez?db=genomeprj&cmd=Retrieve&dopt=Overview&list_uids=9512.
Saccharomyces cerevisiae (baker's yeast) Genome Project Report. http://www.ncbi.nlm.nih.gov/sites/entrez?Db=genomeprj&cmd=ShowDetailView&TermToSearch=9518.
Human Genome Resources. http://www.ncbi.nlm.nih.gov/genome/guide/human/.
Arabidopsis thaliana (thale cress) Genome Project Report. http://www.ncbi.nlm.nih.gov/sites/entrez?Db=genomeprj&cmd=ShowDetailView&TermToSearch=9506.
Ochman, H., Gerber, A.S., Hartl, D.L., 1988. Genetic applications of an inverse polymerase chain reaction. Genetics 120, 621–623.
Saxonov, S., 2012. Methods and compositions for nucleic acid analysis. US Patent 20120316074.
Thomas, D.S., Kenneth, J.L., 2008. Analyzing real-time PCR data by the comparative CT method. Nat. Protoc. 3, 1101–1108.
Toriyama, K., Arimoto, Y., Uchimiya, H., Hinata, K., 1988. Transgenic rice plants after direct gene transfer into protoplasts. Nat. Biotechnol. 6, 1072–1074.
Trinh, Q., Xu, W.T., Shi, H., Luo, Y.B., Huang, K.L., 2012a. An A-T linker adapter polymerase chain reaction method for chromosome walking without restriction site cloning bias. Anal. Biochem. 425, 62–67.
Trinh, Q., Shi, H., Xu, W.T., Hao, J.R., Luo, Y.B., Huang, K.L., 2012b. Loop-linker PCR: an advanced PCR technique for genome walking. IUBMB Life 64, 841–845.
Tsaftaris, A., Pasentzis, K., Argiriou, A., 2010. Rolling circle amplification of genomic templates for inverse PCR (RCA-GIP): a method for 5′- and 3′-genome walking without anchoring. Biotechnol. Lett. 32, 157–161.
Van den Broek, B., Vanzi, F., Normanno, D., Pavone, F.S., Wuite, G.J., 2006. Real-time observation of DNA looping dynamics of Type IIE restriction enzymes NaeI and NarI. Nucleic Acids Res. 34, 167–174.
Wang, T., et al., 2012. Characterization of replication and conjugation of plasmid pWTY27 from a widely distributed Streptomyces species. BMC Microbiol. 7, 253.
