Mining Tissue-specific Contigs from Peanut (Arachis hypogaea L.) for Promoter Cloning by Deep Transcriptome Sequencing

1 State Key Laboratory for Biology of Plant Diseases and Insect Pests, Institute of Plant Protection, Chinese Academy of Agricultural Sciences, Beijing 100193, China
2 Department of Biology, Miami University, Oxford, OH 45056, USA
3 Department of Computer Science and Software Engineering, Miami University, Oxford, OH 45056, USA

*Corresponding author: Email, [email protected] (Received February 6, 2014; Accepted August 13, 2014)

Keywords: Arachis hypogaea • Digital gene expression profile • Root-specific transcript assembly contigs • Seed-specific transcript assembly contigs • Tissue-specific promoters.

Abbreviations: TPM, transcripts per million clean tags; TACs, transcript assembly contigs.

The raw transcriptome data described in this study have been deposited in the Sequence Read Archive (accession No. SRR1105785). The assembled contigs have been deposited in the Transcriptome Shotgun Assembly (accession No. GATE00000000.1).

Introduction

Peanut is an important oil and cash crop worldwide. World peanut production reached 39.46 million metric tons in 2013 (USDA 2013), with China, India and Nigeria as the leading producing countries. Yields are strongly affected by insects and by bacterial and fungal diseases (Vargas Gil et al. 2008). In particular, white grubs, the subterranean larvae of scarab beetles, are an important pest (Rogers et al. 2005, Li et al. 2012). They feed on peanut roots and pods, contributing significantly to seed yield loss. Genetic engineering is useful for incorporating agronomically valuable genes into popular commercialized cultivars. We have cloned a series of cry8-type genes from Bacillus thuringiensis that showed toxicity to white grubs (Yu et al. 2006, Shu et al. 2007, 2009). In particular, we successfully transferred the cry8Ea1 gene into peanut roots through Agrobacterium rhizogenes (Geng et al. 2013). To control these subterranean insects, root- and seed-specific promoters are needed to direct the tissue-specific expression of insecticidal genes in peanut and to avoid the undesirable side effects of constitutive promoters (Kasuga et al. 1999, Kabouw et al. 2012). However, several factors, including the large and complex tetraploid genome, insufficient transcriptome data and the lack of a sequenced genome, hinder the exploration of tissue-specific promoters in peanut.

Historically, several methods, such as subtractive hybridization (Zimmermann et al. 1980), suppression subtractive hybridization (Diatchenko et al. 1996), differential display reverse transcription PCR (Bauer et al. 1993) and cDNA representational difference analysis (Hubank and Schatz 1994), were widely used to analyze differences in gene expression. Subsequently, cDNA microarrays were used to monitor gene expression profile changes (Kononen et al. 1998, Nelson et al. 2004, Aya et al. 2014), but were limited by the fact that only known genes can be recognized by the microarray chips.

Plant Cell Physiol. 55(10): 1793–1801 (2014) doi:10.1093/pcp/pcu111, Advance Access publication on 16 September 2014, available online at www.pcp.oxfordjournals.org © The Author 2014. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists. All rights reserved. For permissions, please email: [email protected]

Downloaded from http://pcp.oxfordjournals.org/ at University of Texas at San Antonio on October 12, 2014

Peanut (Arachis hypogaea L.), one of the most important oil legumes in the world, is heavily damaged by white grubs. Tissue-specific promoters are needed to incorporate insect resistance genes into peanut by genetic transformation to control these subterranean pests. Transcriptome sequencing is the most effective way to analyze differential gene expression in this non-model species and to contribute to promoter cloning. The transcriptomes of the roots, seeds and leaves of peanut were sequenced using Illumina technology. A simple digital expression profile was established based on the number of transcripts per million clean tags (TPM) from different tissues. Subsequently, 584 root-specific candidate transcript assembly contigs (TACs) and 316 seed-specific candidate TACs were identified. Among the candidate TACs tested by semi-quantitative RT-PCR, 55.3% of the root candidates were confirmed as root-specific and 64.6% of the seed candidates as seed-specific. Moreover, the consistency of semi-quantitative RT-PCR with the simple digital expression profile was correlated with the length and TPM value of the TACs. Gene ontology analysis showed that some root-specific TACs are involved in stress resistance and the response to auxin stimulus, whereas seed-specific candidate TACs are involved in embryo development, lipid storage and long-chain fatty acid biosynthesis. One root-specific promoter was cloned and characterized. We developed a high-yield screening system in peanut by establishing a simple digital expression profile based on Illumina sequencing. The feasible and rapid method presented in this study can be used in other non-model crops to explore tissue-specific or spatially specific promoters.

Regular Paper

Lili Geng1, Xiaohong Duan1, Chun Liang2,3, Changlong Shu1, Fuping Song1 and Jie Zhang1,*

L. Geng et al. | Mining tissue-specific promoter from peanut

With advances in sequencing technology, RNA-Seq using Illumina, 454 and other next-generation sequencing platforms has become the popular choice for high-throughput studies of gene expression (AC’t Hoen et al. 2008, Marioni et al. 2008) and has improved the robustness, richness and comparability of expression profiling data (AC’t Hoen et al. 2008, Bai et al. 2013, Begara-Morales et al. 2014). Comparative transcriptome analysis has revealed the underlying mechanisms of many important biological processes, such as pollination (Osaka et al. 2013), ovule development (Kubo et al. 2013), seed germination (Endo et al. 2012) and callus formation in young and mature hypocotyls of Arabidopsis (Chen et al. 2012). It has also been used to analyze the responses of alfalfa to salt stress (Postnikova et al. 2013) and the biosynthetic pathways of secondary metabolites in medicinal plants (Ramilowski et al. 2013). Several tissue-specific promoters and motifs were defined in mouse and rice via comparison of transcriptomes from different tissues (Barrera et al. 2008, Jiao et al. 2009). A public genomic database, PeanutDB, integrates peanut transcriptome data obtained by our lab and from other sources, serves as an easy-to-use web portal (Duan et al. 2012) and facilitates genomics research and molecular breeding in peanut. Genes related to the early embryo abortion of peanut have been identified by deep sequencing (Chen et al. 2013). Although many root-specific or seed-specific promoters have been cloned from Arabidopsis thaliana (Nitz et al. 2001, Koyama et al. 2005), Oryza sativa (Wu et al. 1998, Jeong et al. 2010, Li et al. 2013), Nicotiana tabacum (Yamamoto et al. 1991) and Brassica napus (Keddie et al. 1994, Stålberg et al. 1996), the promoters traditionally used in peanut transformation primarily consist of 35S or soybean promoters. Therefore, alternative promoters, particularly tissue-specific promoters, are needed to improve the agronomic traits of peanut through transgenic approaches. In this study, the transcriptome data of peanut generated by Illumina sequencing (Duan et al. 2012) were screened. Many root-specific and seed-specific TACs were identified. One root-specific promoter was obtained, and its function was analyzed and verified.

Results

Digital gene expression profile based on Illumina sequencing

To compare transcript abundances, the transcriptomes of roots, seeds and leaves were sequenced by Illumina sequencing. Approximately 5 Gb of sequence data (33 million reads from leaf, 47 million reads from root and 16 million reads from seed, each 75 bp long) was obtained. Transcript reads containing adaptor sequences were cleaned, and low-quality reads were filtered out with high stringency. About 57 million reads were used for transcript assembly. The digital gene expression profile was established based on the TPM values of the TACs. Five hundred and eighty-four root-specific candidate TACs (Supplementary Table S1) and 316 seed-specific candidate TACs (Supplementary Table S2) were identified, with average lengths of 505 and 514 bp, respectively (varying from 300 to 2895 bp) (Fig. 1).

Semi-quantitative RT-PCR analysis of tissue-specific genes from digital profile

All of the tissue-specific candidate TACs greater than 500 bp in length and only five TACs shorter than 500 bp were selected for semi-quantitative RT-PCR analysis (Fig. 1). Ninety-nine of 179 root-specific candidate TACs were expressed exclusively in roots (Supplementary Fig. S1). Among all of the selected TACs, 55.3% were root-specific, and 36.3% were expressed in the root but also exhibited low expression levels in other tissues, such as leaves, stems or seeds (Fig. 2B). The agreement between the digital expression profile and semi-quantitative RT-PCR analysis increased with increasing TAC length, and 75.6% of TACs greater than 1000 bp were root-specific (Fig. 2A). Of the 40 TACs for which the TPM values of leaf and seed were 0, 67.5% were expressed exclusively in roots (Fig. 2C), a level 12.2 percentage points higher than the average. They also exhibited a high degree of length preference, such that 85.7% of the TACs ranging in length from 602 to 953 bp agreed well with the digital expression profile (Fig. 2C). Moreover, the


Fig. 1 Length distribution of tissue-specific candidate TACs.


Fig. 2 Semi-quantitative RT-PCR of root-specific candidate TACs. The percentage of root-specific TACs rose as TAC length increased (A). Among all of the selected TACs, 55.3% were root-specific, 36.3% were non-root-specific and 8.4% were not amplified by RT-PCR (B). For candidate TACs with TPM values of 0 in both leaf and seed, the percentage of root-specific TACs also increased with TAC length (C). The percentage of root-specific TACs varied by root TPM, and was highest when the root TPM value was greater than 50 (D); n, the sample number.

Fig. 3 Semi-quantitative RT-PCR of seed-specific candidate TACs. The percentage of seed-specific TACs varied with TAC length (A). Among all of the selected TACs, 64.6% were seed-specific, and 35.4% were non-seed-specific (B). The percentage of seed-specific TACs varied among different seed TPM intervals, and it peaked when the seed TPM value of the TACs was greater than 300 (C).

consistency of semi-quantitative RT-PCR with the simple digital expression profile was correlated with the TPM value of the TACs. For TACs with a root TPM value less than 50, 47.6–52.9% were root-specific, whereas 69.8% of TACs with a root TPM value greater than 50 were root-specific (Fig. 2D).
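The root validation rates reported above follow from simple proportions, and a short worked check may make the relationship between the counts and the percentages easier to follow. The sketch below uses only the figures quoted in the text and in the caption of Fig. 2; the counts for the non-root-specific and non-amplified classes (65 and 15) are not stated in the paper and are inferred here from the reported percentages, so they should be read as an assumption.

```python
# Worked check of the root validation rates quoted in the text (a sketch, not the authors' code).

TOTAL_TESTED = 179   # root-specific candidate TACs tested by RT-PCR (from the text)
ROOT_SPECIFIC = 99   # expressed exclusively in roots (from the text)

# Assumption: these two counts are inferred from the reported 36.3% and 8.4%;
# the paper gives only the percentages.
NON_ROOT_SPECIFIC = 65
NOT_AMPLIFIED = 15


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


overall_rate = percent(ROOT_SPECIFIC, TOTAL_TESTED)

# Subset with leaf TPM = seed TPM = 0: 40 TACs, of which 67.5% were validated.
zero_tpm_validated = round(40 * 67.5 / 100)
gain_over_average = round(67.5 - overall_rate, 1)  # in percentage points

print("overall root-specific rate (%):", overall_rate)
print("non-root-specific (%):", percent(NON_ROOT_SPECIFIC, TOTAL_TESTED))
print("not amplified (%):", percent(NOT_AMPLIFIED, TOTAL_TESTED))
print("validated TACs in zero-TPM subset:", zero_tpm_validated)
print("gain over average (percentage points):", gain_over_average)
```

The three classes sum to the 179 tested TACs under the inferred counts, which is consistent with the percentages in Fig. 2B adding up to 100%.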

For the seed-specific gene digital expression profile, 82 of 127 TACs were expressed exclusively in seeds, based on the semi-quantitative RT-PCR analysis (Supplementary Fig. S2). Seed-specific TACs accounted for 64.6% of the selected TACs (Fig. 3B). The percentage of seed-specific TACs varied among


Table 1 Gene ontology classification of peanut root-specific candidate TACs

Ontology            Class                                               Root-specific TACs  Root-specific candidate TACs  Percentage of root-specific TACs (%)
Biological process  Biological adhesion                                 1                   1                             100.0
                    Biological regulation                               4                   9                             44.4
                    Cellular component organization or biogenesis       1                   1                             100.0
                    Cellular process                                    29                  42                            69.0
                    Developmental process                               3                   4                             75.0
                    Establishment of localization                       9                   15                            60.0
                    Localization                                        9                   16                            56.3
                    Metabolic process                                   46                  67                            68.7
                    Multicellular organismal process                    3                   4                             75.0
                    Negative regulation of biological process           1                   2                             50.0
                    Regulation of biological process                    4                   8                             50.0
                    Reproduction                                        1                   1                             100.0
                    Response to stimulus                                14                  21                            66.7
                    Signaling                                           1                   3                             33.3
                    Single-organism process                             12                  20                            60.0
Cellular component  Cell                                                18                  30                            60.0
                    Cell Part                                           17                  29                            58.6
                    Extracellular region                                2                   3                             66.7
                    Membrane                                            8                   17                            47.1
                    Membrane part                                       7                   14                            50.0
                    Organelle                                           9                   17                            52.9
                    Organelle part                                      1                   3                             33.3
Molecular function  Antioxidant activity                                3                   7                             42.9
                    Binding                                             43                  69                            62.3
                    Catalytic activity                                  53                  80                            66.3
                    Enzyme regulator activity                           1                   1                             100.0
                    Nucleic acid binding transcription factor activity  1                   3                             33.3
                    Transporter activity                                3                   6                             50.0

Note: Categories with more than 20 root-specific candidate TACs and a higher percentage of root-specific TACs than the average level (55.3%, Fig. 2B) are marked in light green.

different TAC length intervals, and 76.3% of TACs ranging between 500 and 599 bp were expressed primarily in seeds (Fig. 3A). Moreover, 87.5% of TACs with seed TPM values greater than 300 were seed-specific (Fig. 3C), a level 22.9 percentage points higher than the average (64.6%). To identify the functional categories of these selected TACs, gene ontology (GO) was employed to classify the transcripts annotated by known proteins. The results showed that 157 of 179 root-specific and 71 of 127 seed-specific candidate TACs were assigned to known GO categories. For root-specific candidate TACs, 58 were assigned to the Biological Process, 27 to the Cellular Component and 96 to Molecular Function (Table 1). Nine categories (Table 1) had a higher percentage of root-specific TACs than the average level (55.3%, Fig. 2B), and TACs assigned to Cellular Process, Metabolic Process and Response to Stimulus were more likely to be verified as genuinely root-specific. For example, in our data set, one heat stress transcription factor (Hsf) and four peroxidases were expressed exclusively in roots (Supplementary Table S3). It has been reported that Hsf can be induced by different abiotic stressors, particularly in roots, and that some Hsf mutant lines developed shorter roots and fewer lateral roots (Scharf et al. 2012). Some peroxidases were root-specific and involved in root differentiation (Nakamura et al. 1988, Tarkka et al. 2001). For seed-specific candidate TACs, 22 were assigned to the Biological Process, 24 to the Cellular Component and 27 to Molecular Function (Table 2). The percentage of seed-specific

TACs assigned to Organelle, Cell and Cell Part (Table 2) was slightly higher than the average level (64.6%, Fig. 3B). Seed-specific TACs assigned to these three cellular component categories were also assigned to biological process terms relating to seed storage proteins (arachin, conarachin and globulin) and oil-body proteins (oleosin and steroleosin) (Supplementary Table S3). Peanut seeds have a large endosperm tissue accumulating proteins and lipids. Seed storage proteins and oil-body proteins are important for embryo development and germination of the young seedling. Root and seed play different roles in plant growth. Root-specific and seed-specific TACs were assigned to different biological processes. Some root-specific TACs are involved in stress resistance and the response to auxin stimulus (Supplementary Table S3), and none of these were previously described in peanut except for Contig309016. In contrast, some seed-specific candidate TACs are involved in embryo development, lipid storage and long-chain fatty acid biosynthesis (Supplementary Table S3). These results emphasize the correlation of our digital gene expression profile with the functions of root and seed.

Characteristics of a tissue-specific promoter

Among the tissue-specific TACs, a possible promoter region of contig229999, named Asy, was isolated (GenBank No. JQ780692). To further study the expression pattern of the potential promoter Asy, a fragment containing the 550-bp


Table 2 Gene ontology classification of peanut seed-specific candidate TACs

Ontology            Class                                               Seed-specific TACs  Seed-specific candidate TACs  Percentage of seed-specific TACs (%)
Biological process  Biological regulation                               6                   10                            60.0
                    Cellular component organization or biogenesis       4                   4                             100.0
                    Cellular process                                    16                  32                            50.0
                    Developmental process                               4                   6                             66.7
                    Establishment of localization                       3                   9                             33.3
                    Immune system process                               1                   1                             100.0
                    Localization                                        4                   10                            40.0
                    Metabolic process                                   25                  49                            51.0
                    Multi-organism process                              2                   3                             66.7
                    Multicellular organismal process                    3                   5                             60.0
                    Positive regulation of biological process           2                   2                             100.0
                    Regulation of biological process                    5                   9                             55.6
                    Reproduction                                        1                   3                             33.3
                    Reproductive process                                1                   3                             33.3
                    Response to stimulus                                8                   11                            72.7
                    Signaling                                           2                   2                             100.0
                    Single-organism process                             10                  18                            55.6
Cellular component  Cell                                                21                  32                            65.6
                    Cell part                                           21                  32                            65.6
                    Macromolecular complex                              3                   4                             75.0
                    Membrane                                            6                   10                            60.0
                    Membrane part                                       5                   8                             62.5
                    Organelle                                           16                  24                            66.7
                    Organelle part                                      4                   5                             80.0
Molecular function  Binding                                             20                  35                            57.1
                    Catalytic activity                                  22                  43                            51.2
                    Nucleic acid binding transcription factor activity  1                   3                             33.3
                    Transporter activity                                1                   2                             50.0

Note: Categories with more than 20 seed-specific candidate TACs and a higher percentage of seed-specific TACs than the average level (64.6%, Fig. 3B) are marked in light green.

Fig. 4 GFP observations of kanamycin-resistant shoots of N. tabacum. A and B, two true leaves; C and D, four true leaves; E and F, vigorous growth.

promoter region was placed upstream of the GUSPlus reporter gene in a binary transformation vector (Supplementary Fig. S3) and transformed into A. thaliana. Meanwhile, the GFP reporter gene driven by the Asy promoter (Supplementary Fig. S3) was transformed into Nicotiana tabacum. Upon analysis of the transgenic A. thaliana plants by histochemical assay, GUS activity was detected in the roots of two-leaf, six-leaf and rosette-leaf stage plants (Fig. 5). However, no GUS staining was observed in the leaves and stems of these plants. For the transgenic N. tabacum plants, GFP fluorescence was observed under a fluorescence microscope only in the roots (Fig. 4). No GFP expression occurred in the leaves or stems. These results

indicate that the promoter Asy was root-specific, in agreement with the results of the semi-quantitative RT-PCR analysis of contig229999.

Discussion

Constitutive overexpression of transgenes that disturb normal plant processes illustrates the need for spatial and temporal control of gene expression. One potential approach is to clone tissue-specific promoters to control transgene expression. In this study, we devised a large-scale screening system for


the isolation of tissue-specific genes in peanut using a simple digital expression profile generated from RNA-Seq data. Consequently, 900 candidate TACs and 181 tissue-specific genes were identified, and a root-specific promoter was cloned and characterized. Historically, gene expression was first analyzed by subtractive hybridization (Zimmermann et al. 1980, Diatchenko et al. 1996), which was time consuming and labor intensive. Subsequently, gene expression microarrays were widely used (Kononen et al. 1998, Nelson et al. 2004). However, this method also presented problems, such as background noise, cross-hybridization and detection of only predefined sequences. These problems were overcome with the introduction of deep sequencing technology, which enables monitoring of the expression of unknown and low-abundance genes. Low-cost and high-throughput next-generation sequencing technologies are becoming useful for discovering novel genes and investigating gene expression patterns, particularly via the RNA-Seq approach (AC’t Hoen et al. 2008, Marioni et al. 2008). Wide-ranging gene expression profiling identified 215 growth stage-specific genes expressed universally in leaf blade, leaf sheath and root in japonica rice (Sato et al. 2011), and 177 genes involved in the seed filling process of Glycine max (Severin et al. 2010). In addition, many important genes of non-model plants have been identified using this method, including genes related to sweet potato root formation and development (Wang et al. 2010), novel candidate genes of the flavonoid, theanine and caffeine biosynthesis pathways in Camellia sinensis (Shi et al. 2011), 36 cellulose synthase genes of ramie that are likely responsible for the biosynthesis of bast fiber (Liu et al. 2013), and 92 differentially expressed genes of blueberry involved in regulating fruit metabolism and anthocyanin content (Li et al. 2010).
Here, the transcriptomes of the roots, seeds and leaves of peanut were sequenced by Illumina sequencing to compare the expression levels of genes in different tissues. A simple

digital gene expression profile was established using Microsoft Excel, based on the TPM values of TACs. As a result, 584 root-specific candidate TACs and 316 seed-specific candidate TACs were identified. After RT-PCR analysis, 55.3% of the tested root-specific candidate TACs and 64.6% of the tested seed-specific candidate TACs were confirmed as tissue-specific. The percentage of tissue-specific genes identified in the present study is considerably higher than the 31.2% reported for a study of tomato (Lim et al. 2012), which identified candidate genes using in silico orthologue searches and comparisons using the microarray database of Arabidopsis. When the digital gene expression profile was established, contigs with a TPM ratio greater than 50 were defined as putatively tissue-specific. The percentage of putatively root-specific TACs that were validated by RT-PCR reached its peak value as the TPM ratio increased to 60 to 80 (Supplementary Fig. S4A). When the TPM ratio cutoff increased to 80–323, the number of root-specific TACs decreased, probably because a higher cutoff might increase the number of false negatives. The same trend was shown in seed-specific TACs (Supplementary Fig. S4B), whereas the percentage of seed-specific TACs reached its lowest point when the TPM ratio increased to 150 to 200 (Supplementary Fig. S4C). Other factors, such as contig length, might also affect this result. Moreover, the consistency of the simple digital expression profile with the semi-quantitative RT-PCR analysis was correlated with the lengths and TPM values of the TACs. Within the constraints of length and TPM value, the consistency reached 85.7% for root-specific TACs and 87.5% for seed-specific TACs (Fig. 2C and Fig. 3C). Thus, the simple digital gene expression profile provided an efficient way to mine tissue-specific contigs for promoter cloning. Because peanut is tetraploid, transcripts from homologous genes, including homoeologs, might have been clustered into the same contig. It is not an easy task to separate them because no genome sequence of peanut was available when we assembled the reads. No special criterion was set during the assembly. Nevertheless, we still found contigs that shared high similarity


Fig. 5 GUS staining analysis of kanamycin-resistant shoots of A. thaliana. A, two true leaves; B, six true leaves; C, rosette leaves.


(>98%), and some of them had different expression profiles. For example, the non-seed-specific Contig38549 has 99% similarity with the seed-specific Contig325789 (Supplementary Table S3).

We sequenced and characterized the transcriptomes of peanut roots, seeds and leaves and established a simple digital expression profile to explore root- and seed-specific candidate genes. Our results illustrate the utility of Illumina sequencing for identifying tissue-specific genes in non-model plant species. Mining tissue-specific promoters will accelerate research on the insect and disease resistance of peanut. Additionally, this study provides a valuable method for facilitating future exploration of tissue-specific or spatially specific promoters in other less-studied crops.

RNA isolation, Illumina sequencing and in silico analysis

Cultivated peanut Baisha1016 was kept in a growth chamber with a 14 h/35 µmol m⁻² s⁻¹ photoperiod at 26–28 °C. Leaves, roots and stems were collected from 3-week-old seedlings, and young pods were collected between 25 and 55 days after peg penetration into the soil. Total RNA was isolated from stems, roots, leaves and seeds using the TRIzol reagent (Invitrogen, Germany) and treated with RNase-free DNase (Promega, Germany) according to the manufacturer’s instructions. The RNA samples were sent to the SinoGenoMax Research Center (Beijing, China). The mRNA was purified from 4 µg of total RNA using Sera-mag Magnetic Oligo (dT) beads from Illumina. After the purification, mRNA was fragmented into small pieces (100–400 bp) using divalent cations at 94 °C for 5 min. Double-stranded cDNA was synthesized using the SuperScript Double-Stranded cDNA Synthesis kit (Invitrogen, Camarillo) with random hexamer primers from Illumina. The synthesized cDNA was subjected to end-repair, phosphorylation and paired-end adapter ligation. After these steps, cDNA fragments ranging from 250 to 350 bp were purified and enriched with PCR before sequencing on the Illumina Genome Analyzer IIx platform by the SinoGenoMax Research Center (Beijing, China). The sequencing data have been deposited in the NCBI Sequence Read Archive (SRA, http://www.ncbi.nlm.nih.gov/Traces/sra) under the accession number SRR1105785. Data cleaning, including adapter removal and low-quality trimming, was performed using CLC Genomics Workbench (CLC bio, version 4.8) (Duan et al. 2012). CLC Genomics Workbench 4.8 was also used to conduct de novo transcriptome assembly of the clean sequences, using more stringent parameters (e.g. length fraction = 0.8, similarity = 0.9) (Duan et al. 2012). TACs were annotated against the NCBI nr database using a cutoff e-value of 1e-05 and a minimum coverage length of 33 aa. Functional annotation and gene ontology analysis were carried out using BLAST2GO (http://www.geneontology.org website) (Conesa et al. 2005, Conesa and Götz 2008). To compare expression levels among tissues, tags were normalized to TPM (Morrissy et al. 2009, Tao et al. 2012). A simple digital gene expression profile was established using Excel software as follows: the root TPM value was divided by the leaf TPM and by the seed TPM, and contigs yielding values larger than 50 for both calculations, or for which both the leaf and seed TPM values were 0, were considered root-specific; seed-specific contigs were identified in a similar manner.
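The screening rule described above was applied in Excel, but it can be restated as a short script for readers who prefer to reproduce it programmatically. The following is a minimal sketch, not the authors' implementation: the function names, the example contigs and their TPM values are hypothetical, and TPM is assumed here to be the tag count scaled to one million clean tags. The paper does not say how a contig is treated when only one of the two comparison tissues has a TPM of 0; the sketch assumes that such a zero passes the ratio test for that tissue.

```python
from typing import Sequence


def tpm(tag_count: int, total_clean_tags: int) -> float:
    """Normalize a contig's tag count to transcripts per million clean tags (assumed definition)."""
    if total_clean_tags <= 0:
        raise ValueError("total_clean_tags must be positive")
    return tag_count * 1_000_000 / total_clean_tags


def is_tissue_specific(target_tpm: float,
                       other_tpms: Sequence[float],
                       min_ratio: float = 50.0) -> bool:
    """Apply the ratio rule: target/other must exceed min_ratio for every other tissue.

    A contig with no expression in the target tissue is never called specific.
    Assumption: an 'other' TPM of 0 counts as passing, which covers the stated
    case where both comparison tissues are 0 and extends it to a single zero.
    """
    if target_tpm <= 0:
        return False
    for other in other_tpms:
        if other == 0:
            continue  # no expression in this tissue; treated as passing
        if target_tpm / other <= min_ratio:
            return False
    return True


# Hypothetical (root, leaf, seed) TPM values for illustration only.
examples = {
    "contig_A": (120.0, 2.0, 1.0),   # ratios 60 and 120
    "contig_B": (120.0, 3.0, 1.0),   # ratio 40 against leaf
    "contig_C": (5.0, 0.0, 0.0),     # both comparison tissues are 0
}

for name, (root, leaf, seed) in examples.items():
    print(name, "root-specific:", is_tissue_specific(root, [leaf, seed]))
```

Seed-specific contigs would be screened the same way by passing the seed TPM as the target and the root and leaf TPMs as the comparison values.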

Semi-quantitative RT-PCR analysis

Approximately 300 ng of total RNA was reverse-transcribed using M-MLV transcriptase (Takara, Japan) with random primers (Promega, USA) at 42 °C for 90 min. The gene encoding 18S rRNA was used to normalize the cDNA input. Primer 18S-F, 5′-GAAACGGCTACCACATCCAAG-3′, and primer 18S-R, 5′-CCAACCCAAGGTCCAACTACG-3′, were used to amplify the 18S rRNA. For semi-quantitative RT-PCR analysis of the selected genes, a 1/10 dilution of

Isolation of the root-specific promoter

Genomic DNA was first digested separately with Dra I, EcoR V, Pvu II and Stu I, and ligated to blunt-ended adaptors, a duplex made of the oligos Short-Adapt (5′-PACCAGCCCGGGC-NH2) and Long-Adapt (5′-GTAATACGACTCACTATAGGGCACGCGTGGTCGACGGCCCGGGCTGGT-3′). The 3′ amino modification blocks extension, permitting only the de novo-synthesized strands to serve as templates for the primer AP1 to anneal and extend (Siebert et al. 1995). The possible promoter region of Contig229999 was amplified. The adaptor-specific primer AP1 (5′-GTAATACGACTCACTATAGGGC-3′) and the Contig229999-specific primer GSP1 (5′-GTTGGCAAATTGTAACATCTCTTTCCCT-3′) were used for the first PCR with the following cycles: 95 °C for 3 min, 7 cycles of 95 °C for 25 s + 68 °C for 5 min, 32 cycles of 95 °C for 25 s + 65 °C for 5 min, and 72 °C for 10 min (Duan et al. 2013). The PCR products were diluted 10-fold and a nested PCR was performed with primers GSP2 (5′-CGTCTCTAATTTGGTGTAACACGGTTTC-3′) and AP2 (5′-ACTATAGGGCACGCGTGGT-3′). PCR products were recovered, ligated into a pMDT-19 vector and sequenced by the Shanghai Sangon Biotechnology Co. (Shanghai, China).
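The adaptor and primer sequences quoted above can be checked for internal consistency, which also illustrates why the nested PCR works: AP1 should match the 5′ end of the long adaptor strand, AP2 should lie downstream of the start of AP1 within the same strand, and the short oligo should pair with the 3′ end of the long strand. The sketch below is only an illustration under stated assumptions: the sequences are written without the internal spaces that appear in the printed text (assumed to be line-break artifacts), and the leading "P" of Short-Adapt is assumed to denote a 5′ phosphate rather than a base.

```python
# Consistency check of the adaptor/primer sequences (a sketch, not part of the published protocol).

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# Sequences as quoted in the text, with internal spaces removed (assumption).
LONG_ADAPT = "GTAATACGACTCACTATAGGGCACGCGTGGTCGACGGCCCGGGCTGGT"
SHORT_ADAPT = "ACCAGCCCGGGC"  # leading 'P' (assumed 5' phosphate) and 3' NH2 omitted
AP1 = "GTAATACGACTCACTATAGGGC"
AP2 = "ACTATAGGGCACGCGTGGT"

# AP1 anneals to the complement of the 5' end of the long adaptor strand.
ap1_is_prefix = LONG_ADAPT.startswith(AP1)

# AP2 is nested: it starts downstream of the AP1 start within the long adaptor.
ap2_start = LONG_ADAPT.find(AP2)
ap2_is_nested = ap2_start > 0

# The short oligo base-pairs with the 3' end of the long strand to form the duplex.
short_pairs_with_3prime_end = LONG_ADAPT.endswith(reverse_complement(SHORT_ADAPT))

print("AP1 matches 5' end of Long-Adapt:", ap1_is_prefix)
print("AP2 start position in Long-Adapt (0-based):", ap2_start)
print("Short-Adapt pairs with 3' end of Long-Adapt:", short_pairs_with_3prime_end)
```

Because the check is plain string matching, it says nothing about melting temperatures or secondary structure; it only confirms that the quoted oligos are mutually consistent.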


Materials and methods

the generated cDNA was used as a template for a 20-cycle PCR. The reactions were incubated in a 96-well plate at 94 °C for 5 min, followed by 20 cycles of 94 °C for 30 s, 58 °C for 30 s and 72 °C for 3 min. All reactions were performed in triplicate. All primers used in these experiments are available for download as Supplementary Table S4 and Supplementary Table S5.

Generation of transgenic N. tabacum and A. thaliana plants

Tobacco leaves were cut into approximately 1 cm² sections and then incubated with the A. tumefaciens strain LBA4404, which harbored the GFP gene driven by the Asy promoter (constructed based on pCAMBIA2300). The infected leaf discs were transferred onto a medium lacking both kanamycin and plant hormones for 2 days. The leaf discs were then transferred to MS medium supplemented with 2 mg l⁻¹ 6-BA, 0.2 mg l⁻¹ NAA and 50 mg l⁻¹ kanamycin. Transformation of Arabidopsis thaliana was performed by the floral dip method (Clough and Bent 1998) using the A. tumefaciens strain LBA4404, which harbored the GUSPlus gene driven by the Asy promoter (constructed based on pCAMBIA2300). Seeds were selected on MS medium supplemented with 50 mg l⁻¹ kanamycin.

Histochemical assay for GUS activity

Whole A. thaliana plants were histochemically stained to analyze GUS gene expression according to Jefferson et al. (1987). The transformed plants were vacuum-infiltrated with X-Gluc solution and incubated at 37 °C in the dark for 12 h, followed by washing with 70% ethanol. The GUSPlus gene contained a catalase intron, preventing expression in bacteria (Vickers et al. 2007).

GFP observation

GFP expression in the transformed N. tabacum plants was observed using a stereomicroscope (Olympus SZX16, Ac Adapter) with a 480-nm excitation and a 500- to 550-nm emission filter block.

Supplementary data

Supplementary data are available at PCP online.

Funding

This study was supported by the National Transgenic Major Program (2014ZX08009-013B), Postdoctoral Special Finance Assistance Scheme and 863 Projects of China (2011AA10A203).


References


AC’t Hoen, P., Ariyurek, Y., Thygesen, H.H., Vreugdenhil, E., Vossen, R.H.A.M., de Menezes, R.X. et al. (2008) Deep sequencing-based expression analysis shows major advances in robustness, resolution and inter-lab portability over five microarray platforms. Nucleic Acids Res. 36: e141.
Aya, K., Hobo, T., Sato-Izawa, K., Ueguchi-Tanaka, M., Kitano, H. and Matsuoka, M. (2014) A novel AP2-type transcription factor, SMALL ORGAN SIZE1, controls organ size downstream of an auxin signaling pathway. Plant Cell Physiol. 55: 897–912.
Bai, S., Saito, T., Sakamoto, D., Ito, A., Fujii, H. and Moriguchi, T. (2013) Transcriptome analysis of Japanese pear (Pyrus pyrifolia Nakai) flower buds transitioning through endodormancy. Plant Cell Physiol. 54: 1132–1151.
Barrera, L.O., Li, Z., Smith, A.D., Arden, K.C., Cavenee, W.K., Zhang, M.Q. et al. (2008) Genome-wide mapping and analysis of active promoters in mouse embryonic stem cells and adult organs. Genome Res. 18: 46–59.
Bauer, D., Müller, H., Reich, J., Riedel, H., Ahrenkiel, V., Warthoe, P. et al. (1993) Identification of differentially expressed mRNA species by an improved display technique (DDRT-PCR). Nucleic Acids Res. 21: 4272–4280.
Begara-Morales, J.C., Sánchez-Calvo, B., Luque, F., Leyva-Pérez, M.O., Leterrier, M., Corpas, F.J. et al. (2014) Differential transcriptomic analysis by RNA-seq of GSNO-responsive genes between Arabidopsis roots and leaves. Plant Cell Physiol. 55: 1080–1095.
Chen, C.C., Fu, S.F., Lee, Y.I., Lin, C.Y., Lin, W.C. and Huang, H.J. (2012) Transcriptome analysis of age-related gain of callus-forming capacity in Arabidopsis hypocotyls. Plant Cell Physiol. 53: 1457–1469.
Clough, S.J. and Bent, A.F. (1998) Floral dip: a simplified method for Agrobacterium-mediated transformation of Arabidopsis thaliana. Plant J. 16: 735–743.
Conesa, A. and Götz, S. (2008) Blast2GO: A comprehensive suite for functional analysis in plant genomics. Int. J. Plant Genom. doi:10.1155/2008/619832.
Conesa, A., Götz, S., García-Gómez, J.M., Terol, J., Talón, M. and Robles, M. (2005) Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics 21: 3674–3676.
Diatchenko, L., Lau, Y.F., Campbell, A.P., Chenchik, A., Moqadam, F., Huang, B. et al. (1996) Suppression subtractive hybridization: a method for generating differentially regulated or tissue-specific cDNA probes and libraries. Proc. Nat’l Acad. Sci. USA 93: 6025–6030.
Duan, X., Geng, L., Shu, C., Zhu, Y. and Zhang, J. (2013) Cloning and bioinformatic analysis of symrk gene cloned from Arachis hypogaea L. J. Agric. Sci. Technol. 14: 33–41.
Duan, X., Schmidt, E., Li, P., Lenox, D., Liu, L., Shu, C. et al. (2012) PeanutDB: an integrated bioinformatics web portal for Arachis hypogaea transcriptomics. BMC Plant Biol. 12: 94.
Endo, A., Tatematsu, K., Hanada, K., Duermeyer, L., Okamoto, M., Yonekura-Sakakibara, K. et al. (2012) Tissue-specific transcriptome analysis reveals cell wall metabolism, flavonol biosynthesis and defense responses are activated in the endosperm of germinating Arabidopsis thaliana seeds. Plant Cell Physiol. 53: 16–27.
Geng, L., Chi, J., Shu, C., Gresshoff, P.M., Song, F., Huang, D. et al. (2013) A chimeric cry8Ea1 gene flanked by MARs efficiently controls Holotrichia parallela. Plant Cell Rep. 32: 1211–1218.
Hubank, M. and Schatz, D.G. (1994) Identifying differences in mRNA expression by representational difference analysis of cDNA. Nucleic Acids Res. 22: 5640–5648.
Jefferson, R.A., Kavanagh, T.A. and Bevan, M.W. (1987) GUS fusions: beta-glucuronidase as a sensitive and versatile gene fusion marker in higher plants. EMBO J. 6: 3901.

Jeong, J.S., Kim, Y.S., Baek, K.H., Jung, H., Ha, S.H., Do Choi, Y. et al. (2010) Root-specific expression of OsNAC10 improves drought tolerance and grain yield in rice under field drought conditions. Plant Physiol. 153: 185–197. Jiao, Y., Tausta, S.L., Gandotra, N., Sun, N., Liu, T., Clay, N.K. et al. (2009) A transcriptome atlas of rice cell types uncovers cellular, functional and developmental hierarchies. Nat genet. 41: 258–263. Kabouw, P., van Dam, N.M., van der Putten, W.H. and Biere, A. (2012) How genetic modification of roots affects rhizosphere processes and plant performance. J. Exper. Botany 63: 3475–3483. Kasuga, M., Liu, Q., Miura, S., Yamaguchi-Shinozaki, K. and Shinozaki, K. (1999) Improving plant drought, salt, and freezing tolerance by gene transfer of a single stress-inducible transcription factor. Nature Biotechnol. 17: 287–291. Keddie, J., Tsiantis, M., Piffanelli, P., Cella, R., Hatzopoulos, P. and Murphy, D. (1994) A seed-specific Brassica napus oleosin promoter interacts with a G-box-specific protein and may be bi-directional. Plant Mol. Biol. 24: 327–340. Kononen, J., Bubendorf, L., Kallionimeni, A., Ba¨rlund, M., Schraml, P., Leighton, S. et al. (1998) Tissue microarrays for high-throughput molecular profiling of tumor specimens. Nat. Med. 4: 844–847. Koyama, T., Ono, T., Shimizu, M., Jinbo, T., Mizuno, R., Tomita, K. et al. (2005) Promoter of Arabidopsis thaliana phosphate transporter gene drives root-specific expression of transgene in rice. J. Biosci. Bioeng. 99: 38–42. Kubo, T., Fujita, M., Takahashi, H., Nakazono, M., Tsutsumi, N. and Kurata, N. (2013) Transcriptome analysis of developing ovules in rice isolated by laser microdissection. Plant Cell Physiol. 54: 750–765. Li, R., Zhu, H., Ruan, J., Qian, W., Fang, X., Shi, Z. et al. (2010) De novo assembly of human genomes with massively parallel short read sequencing. Genome Res. 20: 265–272. Li, Y., Han, J., Yu, C., Yu, W. and Mu, W. 
(2012) Toxicity and control effect of seven insecticides to Holotrichia parallela. Acta Phytophylacica Sin. 39: 147–152. Li, Y., Liu, S., Yu, Z., Liu, Y. and Wu, P. (2013) Isolation and characterization of two novel root-specific promoters in rice (Oryza sativa L.). Plant Sci. 207: 37–44. Lim, C.J., Lee, H.Y., Kim, W.B., Lee, B.S., Kim, J., Ahmad, R. et al. (2012) Screening of tissue-specific genes and promoters in tomato by comparing genome wide expression profiles of Arabidopsis orthologues. Mol. Cells 34: 53–59. Liu, T., Zhu, S., Tang, Q., Chen, P., Yu, Y. and Tang, S. (2013) De novo assembly and characterization of transcriptome using Illumina pairedend sequencing and identification of CesA gene in ramie (Boehmeria nivea L. Gaud). BMC Genomics 14: 125. Marioni, J., Mason, C., Mane, S., Stephens, M. and Gilad, Y. (2008) RNA-seq: an assessment of technical reproducibility and comparison with gene expression arrays. Genome Res. 18: 1509–1517. Morrissy, A.S., Morin, R.D., Delaney, A., Zeng, T., McDonald, H., Jones, S. et al. (2009) Next-generation tag sequencing for cancer gene expression profiling. Genome Res 19: 1825–1835. Nakamura, C., Van Telgen, H.J., Mennes, A.M., Ono, H. and Libbenga, K.R. (1988) Correlation between auxin resistance and the lack of a membrane-bound auxin binding protein and a root-specific peroxidase in Nicotiana tabacum. Plant Physiol. 88: 845–849. Nelson, P.T., Baldwin, D.A., Scearce, L.M., Oberholtzer, J.C., Tobias, J.W. and Mourelatos, Z. (2004) Microarray-based, high-throughput gene expression profiling of microRNAs. Nature Meth. 1: 155–161. Nitz, I., Berkefeld, H., Puzio, P.S. and Grundler, F.M. (2001) Pyk10, a seedling and root specific gene and promoter from Arabidopsis thaliana. Plant Sci. 161: 337–346. Osaka, M., Matsuda, T., Sakazono, S., Masuko-Suzuki, H., Maeda, S., Sewaki, M. et al. (2013) Cell type-specific transcriptome of Brassicaceae stigmatic papilla cells from a combination of laser microdissection and RNA sequencing. 
Plant Cell Physiol. 54: 1894–1906.

Plant Cell Physiol. 55(10): 1793–1801 (2014) doi:10.1093/pcp/pcu111

Postnikova, O.A., Shao, J. and Nemchinov, L.G. (2013) Analysis of the alfalfa root transcriptome in response to salinity stress. Plant Cell Physiol. 54: 1041–1055.
Ramilowski, J.A., Sawai, S., Seki, H., Mochida, K., Yoshida, T., Sakurai, T. et al. (2013) Glycyrrhiza uralensis transcriptome landscape and study of phytochemicals. Plant Cell Physiol. 54: 697–710.
Rogers, D.J., Ward, A.L. and Wightman, J.A. (2005) Damage potential of two scarab species on groundnut. Int. J. Pest Manage. 51: 305–312.
Sato, Y., Antonio, B., Namiki, N., Motoyama, R., Sugimoto, K., Takehisa, H. et al. (2011) Field transcriptome revealed critical developmental and physiological transitions involved in the expression of growth potential in japonica rice. BMC Plant Biol. 11: 10.
Scharf, K.D., Berberich, T., Ebersberger, I. and Nover, L. (2012) The plant heat stress transcription factor (Hsf) family: structure, function and evolution. Biochim. Biophys. Acta 1819: 104–119.
Severin, A., Woody, J., Bolon, Y.T., Joseph, B., Diers, B., Farmer, A. et al. (2010) RNA-Seq Atlas of Glycine max: a guide to the soybean transcriptome. BMC Plant Biol. 10: 160.
Shi, C.Y., Yang, H., Wei, C.L., Yu, O., Zhang, Z.Z., Jiang, C.J. et al. (2011) Deep sequencing of the Camellia sinensis transcriptome revealed candidate genes for major metabolic pathways of tea-specific compounds. BMC Genomics 12: 131.
Shu, C., Liu, R., Wang, R., Zhang, J., Feng, S., Huang, D. et al. (2007) Improving toxicity of Bacillus thuringiensis strain contains the cry8Ca gene specific to Anomala corpulenta larvae. Current Microbiol. 55: 492–496.
Shu, C., Yu, H., Wang, R., Fen, S., Su, X., Huang, D. et al. (2009) Characterization of two novel cry8 genes from Bacillus thuringiensis strain BT185. Current Microbiol. 58: 389–392.
Siebert, P.D., Chenchik, A., Kellogg, D.E., Lukyanov, K.A. and Lukyanov, S.A. (1995) An improved PCR method for walking in uncloned genomic DNA. Nucleic Acids Res. 23: 1087.
Stålberg, K., Ellerström, M., Ezcurra, I., Ablov, S. and Rask, L. (1996) Disruption of an overlapping E-box/ABRE motif abolished high transcription of the napA storage-protein promoter in transgenic Brassica napus seeds. Planta 199: 515–519.
Tao, X., Gu, Y.H., Wang, H.Y., Zheng, W., Li, X., Zhao, C.W. et al. (2012) Digital gene expression analysis based on integrated de novo transcriptome assembly of sweet potato [Ipomoea batatas (L.) Lam.]. PLoS One 7: e36234.
Tarkka, M.T., Nyman, T.A., Kalkkinen, N. and Raudaskoski, M. (2001) Scots pine expresses short-root-specific peroxidases during development. Eur. J. Biochem. 268: 86–93.
USDA. (2013) World agricultural production. U.S. Dep. Agric. Foreign Agric. Serv. Circ. WAP 13-09.
Vargas Gil, S., Haro, R., Oddino, C., Kearney, M., Zuza, M., Marinelli, A. et al. (2008) Crop management practices in the control of peanut diseases caused by soilborne fungi. Crop Protect. 27: 1–9.
Vickers, C.E., Schenk, P.M., Li, D., Mullineaux, P.M. and Gresshoff, P.M. (2007) pGFPGUSPlus, a new binary vector for gene expression studies and optimising transformation systems in plants. Biotechnol. Lett. 29: 1793–1796.
Wang, X., Luan, J., Li, J., Bao, Y., Zhang, C. and Liu, S. (2010) De novo characterization of a whitefly transcriptome and analysis of its gene expression during development. BMC Genomics 11: 400.
Wu, C.Y., Adachi, T., Hatano, T., Washida, H., Suzuki, A. and Takaiwa, F. (1998) Promoters of rice seed storage protein genes direct endosperm-specific gene expression in transgenic rice. Plant Cell Physiol. 39: 885–889.
Yamamoto, Y.T., Taylor, C.G., Acedo, G.N., Cheng, C.L. and Conkling, M.A. (1991) Characterization of cis-acting sequences regulating root-specific gene expression in tobacco. Plant Cell 3: 371–382.
Yu, H., Zhang, J., Huang, D., Gao, J. and Song, F. (2006) Characterization of Bacillus thuringiensis strain Bt185 toxic to the Asian cockchafer: Holotrichia parallela. Current Microbiol. 53: 13–17.
Zimmermann, C.R., Orr, W.C., Leclerc, R.F., Barnard, E.C. and Timberlake, W.E. (1980) Molecular cloning and selection of genes regulated in Aspergillus development. Cell 21: 709–715.

