Mol Biol Rep DOI 10.1007/s11033-014-3395-z
Molecular cloning and characterization of NPR1 gene from Arachis hypogaea Qi Wu • Xiu Zhen Wang • Yue Yi Tang • Hong Tao Yu • Yu Fei Ding • Chuan De Yang • Feng Gao Cui • Jian Cheng Zhang • Chuan Tang Wang
Received: 21 January 2013 / Accepted: 2 May 2014 © Springer Science+Business Media Dordrecht 2014
Abstract The NPR1 gene is an important regulator of plant disease resistance. The cDNA of the NPR1 gene was cloned from the peanut cultivar Ri Hua 1 by rapid amplification of cDNA ends-polymerase chain reaction (RACE-PCR). The full-length cDNA of Arachis hypogaea NPR1 consisted of 2,078 base pairs with a 1,446 bp open reading frame encoding 481 amino acids. The predicted NPR1 protein contained a highly conserved functional domain (a BTB/POZ domain from M1 to D116), protein–protein interaction domains (three ankyrin repeats from K158 to L186, N187 to L217 and R221 to D250) and one NPR1-like domain (C262 to S469). The genomic DNA sequence of the NPR1 gene was 2,332 or 2,223 bp. Both sequences contained three introns and four exons. The NPR1 transcripts were expressed mainly in roots and leaves, while fewer signals were detected in the stems. The amount of NPR1 transcript increased significantly 1 h after salicylic acid challenge and reached a level 5.3 times that of the control group. Both the DNA sequence and the coding sequence were obtained from eight cultivars and nine wild species of Arachis. Maximum likelihood analyses of dN/dS ratios for 25 sequences from different species showed that different selection pressures may have acted on different branches.

Keywords NPR1 gene · Arachis hypogaea · Peanut · Selection pressure · Positively selected sites

Abbreviations
bp  Base pair
BW  Bacterial wilt
df  Degrees of freedom
EST  Expressed sequence tag
LRT  Likelihood ratio test
NPR1  Nonexpressor of pathogenesis-related genes 1
PR genes  Pathogenesis-related genes
qRT-PCR  Quantitative real-time reverse transcription polymerase chain reaction
RT-PCR  Reverse transcription polymerase chain reaction
SA  Salicylic acid
SAR  Systemic acquired resistance
UTR  Untranslated region
TGA  Leucine zipper transcription factor TGA
TUA  Alpha-tubulin

Electronic supplementary material The online version of this article (doi:10.1007/s11033-014-3395-z) contains supplementary material, which is available to authorized users.

Q. Wu · X. Z. Wang · Y. Y. Tang · H. T. Yu · Y. F. Ding · C. De Yang · F. G. Cui · J. C. Zhang · C. T. Wang (corresponding author)
Shandong Peanut Research Institute (SPRI), 126 Fushan Rd., Qingdao 266100, People's Republic of China
Introduction Plants respond to a diverse range of pathogenic microorganisms in various ways. Systemic acquired resistance (SAR) is a defense reaction that can be triggered when a plant is infected by pathogens. It is effective against bacterial, fungal and viral pathogens [1–3] through the concerted activation of pathogenesis-related (PR) genes [4, 5]. Certain chemicals, such as salicylic acid (SA), are necessary for SAR. Plants that fail to accumulate SA have an impaired SAR, whereas an elevated level of SA, whether endogenous or exogenous, can enhance broad-spectrum resistance [4]. Nonexpressor of pathogenesis-related genes 1 (NPR1) has been identified as an important component of SA-regulated resistance. The NPR1 gene is vital for both systemic resistance and the control of infection [6, 7]. Previous research has shown that npr1 mutants fail to express PR1, PR2 and PR5 [8]. In npr1 mutant plants, virulent pathogens spread further and form larger lesions than in the wild type [9–11]. Transgenic plants overexpressing NPR1 displayed enhanced resistance and higher expression levels of PR genes [12]. NPR1 contains several ankyrin repeats, a motif found in proteins with diverse biological functions that mediates protein–protein interactions in plants. The functional importance of the ankyrin repeat has been demonstrated by studying its conserved histidine residues, which are crucial for the hydrogen bonds that maintain the 3D structure [6, 13–15]. Peanut (Arachis hypogaea) is an important oil crop and a nutritious food. Global peanut production was 48 million tons [16]. Worldwide, disease is the major constraint on peanut production [17]. Resistance to some diseases of A. hypogaea has been studied [18], and a better understanding of the resistance mechanism will help in developing new resistant peanut cultivars.
To date, some genes associated with disease resistance have been identified and studied [17, 19]. Many bioinformatic tools are now available to drive the study of plant resistance and are improving the cultivation of A. hypogaea. With the completion of whole-genome sequencing projects for many model plants, a large amount of valuable data will become available to support studies of peanut disease resistance. The wild species A. duranensis (diploid, the A genome) and A. ipaensis (diploid, the B genome) were most likely the progenitors of cultivated peanut (tetraploid). The wild species of Arachis are a rich source of resistance-related genes because of their high genetic diversity, and many wild relatives have been selected across a range of environments and biotic stresses [20]. However, the NPR1 cDNA sequence of peanut was not previously known, and little data was available on its expression profile in peanut. This study therefore provides a basic characterization of the NPR1 gene. In this paper, we report its full-length cDNA, genomic organization and expression profile in peanut.
Materials and methods Experimental materials and the immune challenge The cultivated A. hypogaea L. variety Ri Hua 1, a Virginia-type cultivar with high resistance to bacterial wilt both in the field and in the laboratory, was provided by Mr. Dian Wen Zhang (Donggang District Institute of Peanut, Rizhao, China). The wild species of Arachis and the other peanut cultivars used in this research were maintained in our laboratory. All seeds were pre-germinated at 28 °C in darkness and then sown in a controlled climate room maintained at 28 °C with a 12 h photophase (16,000 lx). For the analysis of temporal and spatial expression patterns, three samples were taken once a week starting from the second month after sowing. The roots, stems and leaves of each specimen were dissected out and immediately placed in liquid nitrogen. For the SA challenge experiment, 48 plants at the six-leaf stage were divided into two groups. Each plant in the treated group was sprayed with 6 ml of SA solution (200 µg ml⁻¹) (Sigma, MO, USA) containing 0.01 % Tween 20 as an emulsifier. The control group was treated with 0.01 % Tween 20 buffer only. After treatment, three plants were randomly sampled from each group at 0, 0.5, 1, 3, 6, 12, 24 and 48 h after spraying. Full-length cDNA cloning Leaf RNA was extracted with Trizol (Invitrogen, CA, USA). RNA concentration and integrity were determined from the absorbance at 260 and 280 nm and from the brightness of GelRed-stained (Biotium, CA, USA) bands on an agarose gel. Prior to cDNA synthesis, contaminating DNA was removed with RNase-free DNase (Takara, Japan). M-MLV reverse transcriptase (Promega, WI, USA) was then used for cDNA synthesis (Table 1). The reaction mix was incubated for 1 h at 42 °C, terminated for 5 min at 85 °C and then stored at −80 °C. The primers NPR1-F1 and NPR1-R1 were designed from the homologous sequence of Glycine max (NM_001251729.1) to obtain the NPR1 EST sequence from A. hypogaea (Table 1).
Table 1 Sequences of primers used in this study
Primer | Sequence (5′–3′) | Application
NPR1-F1 | AGATGTGATTCCTGTTCTTATGGCA | EST sequence
NPR1-R1 | CTCGCTCAAGAACATCAACACA | EST sequence
M13-47 | CGCCAGGGTTTTCCCAGTCACGAC | Vector universal primers
RV-M | GAGCGGATAACAATTTCACACAGG | Vector universal primers
NPR1-F2 | ATCCATCCATTTTGGTAGCACTG | 3′ RACE first round PCR
NPR1-F3 | GCAAGGAATCTAACAAAGACAGGCT | 3′ RACE second round PCR
NPR1-R2 | GACATCATCCGAGTCCATTGCCTTAT | 5′ RACE first round PCR
NPR1-R3 | AGCGACTTGATTTCGTCCACAACTTC | 5′ RACE second round PCR
UPM (long) | ctaatacgactcactatagggcAAGCAGTGGTATCAACGCAGAGT | 3′, 5′ RACE universal primer
UPM (short) | ctaatacgactcactatagggc | 3′, 5′ RACE universal primer
NUP | AAGCAGTGGTATCAACGCAGAGT | 3′, 5′ RACE nested universal primer
NPR1-F | ATGTCTGAATTGGTGCCTTATGGGA | Complete cDNA amplification
NPR1-R | TCACTTTTTCCTCACTCTATGGC | Complete cDNA amplification
qNPR1-f | CCATCCATTTTGGTAGCACT | NPR1 qRT-PCR
qNPR1-r | TTCTTCTCATTTCTCGCTCA | NPR1 qRT-PCR
actin-f | TTGGAATGGGTCAGAAGGATGC | actin qRT-PCR
actin-r | AGTGGTGCCTCAGTAAGAAGC | actin qRT-PCR
TUA-f | CTGATGTCGCTGTGCTCTTGG | TUA qRT-PCR
TUA-r | CTGTTGAGGTTGGTGTAGGTAGG | TUA qRT-PCR

PCR was carried out in a thermal cycler in a 25 µl reaction volume
including 12.5 µl of 2× PCR buffer (Tiangen, China), 1.0 µl of each primer (10 µmol l⁻¹) and 1 µl of template (about 50 ng µl⁻¹), with PCR-grade water added to bring the volume to 25 µl. The PCR programme was 94 °C for 5 min, followed by 30 cycles of 94 °C for 40 s, 57 °C for 40 s and 72 °C for 1 min, and a final extension at 72 °C for 10 min. The target PCR products were ligated into the vector pMD18-T (Takara, Japan). The vector was transformed into competent cells (Escherichia coli Top10), and positive clones were selected by ampicillin resistance and by PCR with the vector primers M13-47 and RV-M (Table 1). The target clone was sequenced on an ABI3730 (Applied Biosystems, USA). Based on the EST described above, the SMART-RACE approach (Clontech, CA, USA) was used to determine the 3′ and 5′ ends of the cDNA. The primers NPR1-F2, NPR1-F3 and UPM were used to clone the 3′ end, while NPR1-R2, NPR1-R3 and UPM were used to obtain the 5′ end (Table 1). Genomic sequence of NPR1 Leaf DNA was extracted with a genomic DNA purification kit (Promega, WI, USA). Based on the full-length cDNA of the NPR1 gene from A. hypogaea, the primers NPR1-F and NPR1-R were used to amplify the whole sequence of the NPR1 gene in A. hypogaea (Table 1). The target band of the NPR1 genomic sequence was then purified and sequenced as above.
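The primer pair NPR1-F/NPR1-R is described as amplifying the complete coding region, so one would expect the forward primer to begin at the start codon and the reverse primer to cover the stop codon on the opposite strand. The short Python sketch below checks this for the two sequences listed in Table 1. It is only an illustrative sanity check, not part of the authors' protocol, and the helper name `reverse_complement` is our own.

```python
# Primer sequences copied from Table 1 (written 5'->3').
NPR1_F = "ATGTCTGAATTGGTGCCTTATGGGA"
NPR1_R = "TCACTTTTTCCTCACTCTATGGC"

# Translation table mapping each base to its complement.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TGA", "TAA", "TAG"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


# The forward primer should start with the ATG start codon.
starts_with_atg = NPR1_F.startswith("ATG")

# The reverse primer anneals to the sense strand; its reverse complement
# is the sense-strand sequence and should end with a stop codon.
sense_strand_site = reverse_complement(NPR1_R)
ends_with_stop = sense_strand_site[-3:] in STOP_CODONS

print("Forward primer starts with ATG:", starts_with_atg)
print("Sense-strand site of reverse primer:", sense_strand_site)
print("Ends with a stop codon:", ends_with_stop)
```

Both checks pass for the listed sequences, which is consistent with the primers flanking the open reading frame.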
Sequence analysis and phylogenetic analysis The BLAST algorithm at NCBI was used for homology searches. Sequences were analyzed with BioEdit 7.0.9.0 [21]. Clustal X 1.83 was used for multiple sequence alignment [22]. MEGA 4.0 was used to construct trees with 1,000 bootstrap replicates. The molecular mass and theoretical isoelectric point were predicted with the protein MolWt and AA composition calculator. InterProScan was used for motif analysis. DnaSP 4.10 [23] was used to analyze polymorphic sites. The proportions of nonsynonymous (dN) and synonymous (dS) substitutions per site were analyzed with PAML [24]. NPR1 mRNA distribution and temporal expression profile The expression level of NPR1 mRNA transcripts in roots, stems and leaves at different developmental stages, and the expression profile in SA-treated leaves, were measured by quantitative real-time RT-PCR. RNA extraction and cDNA synthesis were as described above. The qRT-PCR assay was carried out in accordance with the MIQE guidelines on a Roche LightCycler 2.0. The 20 µl PCR mix comprised 10 µl of 2× SYBR Green master mix (Takara, Japan), 1 µl of cDNA template, 0.5 µl of each primer (10 µmol l⁻¹) and 8 µl of DEPC-treated water. A 182 bp PCR product of qNPR1-f and qNPR1-r (Table 1) was used to confirm specificity. The 195 bp β-actin PCR product and the 94 bp alpha-tubulin (TUA) fragment were used to confirm successful cDNA synthesis and to calibrate the template. Serial dilutions of purified PCR product were used as template in qRT-PCR to build the standard curve for the NPR1 transcript. The qRT-PCR programme was 95 °C for 30 s, followed by 45 cycles of 95 °C for 5 s, 60 °C for 20 s and 72 °C for 15 s. For expression-level tests, both the samples and the internal control were run in triplicate. A melting curve analysis was used to confirm that a single PCR product was amplified. The qRT-PCR baseline was set by the software. The relative expression level of NPR1 was analyzed by the comparative Ct method [25]. The Ct values of NPR1 and the internal control, β-actin, were used to calculate the expression level at different developmental stages. The Ct values of NPR1 and TUA were determined only for the SA-challenged samples. ΔCt, the difference in Ct between NPR1 and the internal control, was used to normalize the template amount. ΔΔCt was the difference in ΔCt between the treated sample and the untreated sample. Thus 2^(−ΔΔCt) represented the expression level relative to the untreated sample. The t test was used for statistical analysis, with P values ≤ 0.05 considered significant.
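The comparative Ct calculation described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' analysis script: the function names are our own, the Ct values in the example are hypothetical, and the efficiency formula E = 10^(−1/slope) − 1 is the standard-curve relation quoted in the Results, with the slope taken from a plot of Ct against log10 of relative concentration.

```python
import math


def amplification_efficiency(slope: float) -> float:
    """Efficiency from a standard-curve slope (Ct versus log10 concentration).

    Implements E = 10^(-1/slope) - 1; a value of 1.0 means the product
    doubles in every cycle.
    """
    return 10.0 ** (-1.0 / slope) - 1.0


def fold_change(ct_target_treated: float, ct_ref_treated: float,
                ct_target_control: float, ct_ref_control: float) -> float:
    """Relative expression by the comparative Ct (2^-ddCt) method."""
    # dCt normalizes the target gene to the internal control in each sample.
    delta_ct_treated = ct_target_treated - ct_ref_treated
    delta_ct_control = ct_target_control - ct_ref_control
    # ddCt compares the treated sample with the untreated sample.
    delta_delta_ct = delta_ct_treated - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# A slope of -1/log10(2) (about -3.32) corresponds to perfect doubling.
ideal_slope = -1.0 / math.log10(2.0)
print(f"Efficiency at slope {ideal_slope:.3f}: {amplification_efficiency(ideal_slope):.2f}")

# Hypothetical Ct values for illustration only (not data from this study).
example = fold_change(ct_target_treated=24.0, ct_ref_treated=18.0,
                      ct_target_control=26.0, ct_ref_control=18.0)
print(f"Fold change relative to the untreated sample: {example:.1f}")
```

The 2^(−ΔΔCt) form assumes that the target and reference genes amplify with similar, near-100 % efficiency, which is why the efficiency is usually checked from the standard curve first.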
Results Molecular characterization of NPR1 cDNA The cDNA sequence of NPR1 from A. hypogaea (GenBank accession no. JX188654) consisted of 2,078 bp, including a 1,446 bp ORF encoding 481 amino acids (Fig. 1). The cDNA contained a 366-nucleotide 5′ untranslated region (UTR) and a 266-nucleotide 3′ UTR, which included a poly(A) tail and a polyadenylation signal sequence (AATGAA) located 81 bp upstream of the poly(A) tail. The predicted molecular mass and theoretical isoelectric point of NPR1 were 5.45 kDa and 6.07, respectively. Analysis with InterProScan revealed a typical BTB/POZ domain (M1 to D116) at the N-terminus, three regions with significant homology to ankyrin repeats (K158 to L186, N187 to L217 and R221 to D250) in the middle and one NPR1-like domain (C262 to S469) at the C-terminus. The BTB/POZ domain is an important motif for protein–protein interaction that is often found in C2H2-type transcription factors. The ankyrin repeat is one of the most common protein–protein interaction motifs [26, 27]. The NPR1-like domain has been shown to be important in mediating the binding of the leucine zipper transcription factor TGA to the as-1 motif and in controlling the onset of SAR.
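The reported lengths can be cross-checked with simple arithmetic: the two UTRs and the ORF should sum to the full cDNA length, and the ORF should contain one codon per amino acid plus a stop codon. The Python sketch below performs this check using only the figures quoted in the paragraph above; the variable names are illustrative.

```python
# Lengths reported for the A. hypogaea NPR1 cDNA (in nucleotides).
UTR_5_PRIME = 366
ORF_LENGTH = 1446
UTR_3_PRIME = 266
REPORTED_CDNA_LENGTH = 2078
REPORTED_AMINO_ACIDS = 481

# The three regions should account for the whole cDNA.
total_length = UTR_5_PRIME + ORF_LENGTH + UTR_3_PRIME

# An ORF is a whole number of codons; the last codon is the stop codon.
codons, remainder = divmod(ORF_LENGTH, 3)
encoded_amino_acids = codons - 1

print("5' UTR + ORF + 3' UTR =", total_length)
print("Codons in ORF:", codons, "| remainder:", remainder)
print("Encoded amino acids (excluding stop):", encoded_amino_acids)
```

The sums agree with the reported 2,078 bp cDNA and 481-residue protein.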
To learn more about the NPR1 gene, cDNA sequences from other species were identified using the method described above. The GenBank accession numbers are shown in Table 2. The sequence alignment indicated that these three domains are conserved in A. hypogaea. Genomic sequences of NPR1 The full genomic DNA sequence of the NPR1 gene was 2,332 bp (GenBank accession no. JX188655). It contained three introns and four exons. All exon–intron junctions followed the common splice rule (–AG/GT–). The DNA sequences of NPR1 from cultivars of A. hypogaea and some wild species of Arachis were also obtained. The genomic organization of the NPR1 sequences was highly similar among all the Arachis species tested, and all genes had the same structure. The genomic sequences of the NPR1 genes were either 2,332 or 2,223 bp. The GenBank accession numbers and the exact positions of the exons and introns are shown in Tables 2 and 3. Six sequences of the NPR1 gene were obtained from three samples of the A. hypogaea cultivar Ri Hua 1 (see Appendix S1 in Supporting Information). The polymorphic sites comprised 36 parsimony-informative sites (positions 132, 135, 165, 281, 310, 322, 404, 430, 436, 437, 816, 818, 1,045, 1,207, 1,357, 1,400, 1,414, 1,430, 1,460, 1,477, 1,480, 1,538, 1,549, 1,773, 1,782, 1,849, 1,867, 1,879, 2,006, 2,009, 2,102, 2,111, 2,228, 2,267, 2,279 and 2,280) and 13 singleton variable sites (positions 33, 192, 326, 461, 518, 691, 1,114, 1,299, 1,444, 1,612, 1,893, 2,031 and 2,108), with a nucleotide diversity of Pi = 0.01063 and a theta (per site) from Eta of 0.00970; gaps and missing data were excluded. This showed that the genomic sequences of the NPR1 gene varied within the species: the six sequences shared 92.5–99.8 % nucleotide identity, while the deduced amino acid sequences shared 98.7–100 % identity. The six complete genomic NPR1 sequences could be classified into two types according to size. Type I (GenBank accession no. JX188656) was 2,223 bp and Type II (GenBank accession no. JX188655) was 2,332 bp. The differences between the two types were mainly due to intron length, as the exons were much more similar to each other. Phylogenetic analysis ClustalW was used for sequence alignment. The NPR1 protein sequence of A. hypogaea shared 83 % identity with Glycine max (NM_001251729.1), 80 % identity with Lotus japonicus (AK339295) and 75 % identity with Populus trichocarpa (XM_002300827). A maximum likelihood (ML) phylogenetic tree was constructed (Fig. 2). It showed a remarkable divergence between monocots and dicots. This divergence may reflect the development of the plant disease-resistance response. The phylogenetic tree comprised three main clades. Two of the main clades were composed of dicotyledons, while the third contained all the monocotyledons referenced.

Fig. 1 Nucleotide sequences for the NPR1 gene from A. hypogaea. Letters in black boxes indicate the BTB/POZ domain and gray boxes show the three ankyrin repeats. The NPR1-like domain is underlined. The asterisk marks the stop codon
Table 2 A listing of the species in this study
Species | Lineage | Accession number
A. hypogaea Ri Hua1 | Dicotyledon | JX188654
Arachis hypogaea Lu Feng 2 | Dicotyledon | JX262621
A. hypogaea Si Li Hong | Dicotyledon | JX262622
A. hypogaea Qun Yu 101 | Dicotyledon | JX262623
A. hypogaea CTWe | Dicotyledon | JX262624
A. hypogaea Quan Hua 646 | Dicotyledon | JX262625
A. hypogaea FB4 | Dicotyledon | JX262626
A. hypogaea Hua Yu 22 | Dicotyledon | JX262627
Arachis duranensis (A5) | Dicotyledon | JX262628
Arachis correntina (A9) | Dicotyledon | JX262629
Arachis pusilla (A10) | Dicotyledon | JX262630
Arachis appresipilla (A13) | Dicotyledon | JX262631
Arachis paraguaniensis (A15) | Dicotyledon | JX262632
Arachis rigonii (A18) | Dicotyledon | JX262633
Arachis triseminata (A21) | Dicotyledon | JX262634
Arachis glabrata (A30) | Dicotyledon | JX262635
Arachis appressipila (A33) | Dicotyledon | JX262636
Glycine maxᵃ | Dicotyledon | NM_001251729.1
Medicago truncatulaᵃ | Dicotyledon | XM_003617314.1
Lotus japonicusᵃ | Dicotyledon | AK339295.1
Populus trichocarpaᵃ | Dicotyledon | XM_002300827.1
Arabidopsis thalianaᵃ | Dicotyledon | NM_123879.2
Nicotiana tabacumᵃ | Dicotyledon | AY640382.1
Vitis viniferaᵃ | Dicotyledon | XM_002274009.1
Oryza sativa Japonica Groupᵃ | Monocotyledon | HM991166.1
Zea maysᵃ | Monocotyledon | NM_001154115.1
ᵃ Means the sequence was downloaded from GenBank
Table 3 Nucleotide composition of NPR1 gene
Scientific name | GenBank accession no. | Ex1 | In1 | Ex2 | In2 | Ex3 | In3 | Ex4
A. duranensis (A5) | JX262637 | 1–222 | 223–535 | 536–1289 | 1290–1383 | 1384–1578 | 1579–1956 | 1957–2231
A. correntina (A9) | JX262638 | 1–222 | 223–517 | 518–1271 | 1272–1364 | 1365–1559 | 1560–1862 | 1863–2137
A. pusilla (A10) | JX262639 | 1–222 | 223–582 | 583–1336 | 1337–1430 | 1431–1625 | 1626–2057 | 2058–2332
A. appresipilla (A13) | JX262640 | 1–222 | 223–551 | 552–1305 | 1306–1398 | 1399–1593 | 1594–2010 | 2011–2285
A. paraguaniensis (A15) | JX262641 | 1–222 | 223–551 | 552–1305 | 1306–1398 | 1399–1593 | 1594–2010 | 2011–2285
A. rigonii (A18) | JX262642 | 1–222 | 223–551 | 552–1305 | 1306–1398 | 1399–1593 | 1594–2028 | 2029–2303
A. trierectoides (A21) | JX188659 | 1–222 | 223–551 | 552–1305 | 1306–1398 | 1399–1593 | 1594–2010 | 2011–2285
A. glabrata (A30) | JX188657 | 1–222 | 223–517 | 518–1271 | 1272–1364 | 1365–1559 | 1560–1862 | 1863–2137
A. appressipila (A33) | JX188658 | 1–222 | 223–587 | 588–1341 | 1342–1435 | 1436–1630 | 1631–2057 | 2058–2332
A. hypogaea Lu Feng 2 | JX262643 | 1–222 | 223–527 | 528–1281 | 1282–1375 | 1376–1570 | 1571–1948 | 1949–2223
A. hypogaea Si Li Hong | JX262644 | 1–222 | 223–527 | 528–1281 | 1282–1375 | 1376–1570 | 1571–1948 | 1949–2223
A. hypogaea Qun Yu 101 | JX262645 | 1–222 | 223–527 | 528–1281 | 1282–1375 | 1376–1570 | 1571–1948 | 1949–2223
A. hypogaea CTWe | JX262646 | 1–222 | 223–527 | 528–1281 | 1282–1375 | 1376–1570 | 1571–1948 | 1949–2223
A. hypogaea Quan Hua 646 | JX262647 | 1–222 | 223–527 | 528–1281 | 1282–1375 | 1376–1570 | 1571–1948 | 1949–2223
A. hypogaea FB4 | JX262648 | 1–222 | 223–527 | 528–1281 | 1282–1375 | 1376–1570 | 1571–1948 | 1949–2223
A. hypogaea Hua Yu 22 | JX262649 | 1–222 | 223–527 | 528–1281 | 1282–1375 | 1376–1570 | 1571–1948 | 1949–2223
A. hypogaea Ri Hua1 typeI | JX188655 | 1–222 | 223–587 | 588–1341 | 1342–1435 | 1436–1630 | 1631–2057 | 2058–2332
A. hypogaea Ri Hua1 typeII | JX188656 | 1–222 | 223–527 | 528–1281 | 1282–1375 | 1376–1570 | 1571–1948 | 1949–2223
The transcripts distribution of NPR1 and the temporal expression profile in leaves after SA challenge qRT-PCR was used to determine the distribution of NPR1 transcripts in roots, stems and leaves, with β-actin as the internal control. The expression profile of NPR1 transcripts in SA-challenged leaves was investigated using TUA as the internal control. A single peak in the melting curve for both the target gene and the internal control indicated that amplification was specific. Standard curves were constructed from relative concentration and Ct. The qRT-PCR amplification efficiency for the NPR1 gene was 99 % (E = 10^(−1/slope) − 1). NPR1 mRNA was expressed mainly in the roots and leaves, while fewer signals were detected in the stems (Fig. 3). The level of NPR1 mRNA transcripts increased sharply 1 h after SA challenge to 5.3 times that in the control, which was statistically significant (P = 0.01724 < 0.05). Exposure of plants to a high concentration of SA resulted in high expression levels of NPR1 transcripts up to 1 h after treatment. Expression then fell back to normal levels 3–48 h after SA challenge (Fig. 4).

Fig. 2 Consensus maximum likelihood (ML) tree for the NPR1 gene. The values represent the percentage of 1,000 bootstrap replications (> 50 %). The protein sequences used are as follows: JX262621–JX262636 (Arachis), JX188654 (Arachis), NM_001251729.1 (Glycine max), XM_003617314.1 (Medicago truncatula), AK339295.1 (Lotus japonicus), XM_002300827.1 (Populus trichocarpa), NM_123879.2 (Arabidopsis thaliana), AY640382.1 (Nicotiana tabacum), XM_002274009.1 (Vitis vinifera), HM991166.1 (Oryza sativa Japonica Group) and NM_001154115.1 (Zea mays)

Fig. 3 The expression level of NPR1 mRNA transcripts in roots, stems and leaves at different developmental stages was determined by qRT-PCR with β-actin as the internal control. Vertical bars represent the mean ± SD

Divergence of NPR1 The ratio of nonsynonymous to synonymous substitution rates (ω = dN/dS) is a sensitive measure of selective pressure [24]. A value of dN/dS > 1 implies adaptive evolution, whereas dN/dS < 1 indicates purifying selection [28]. In the present study, NPR1 sequences were analyzed from Oryza sativa, Zea mays, Nicotiana tabacum, Vitis vinifera, Arabidopsis thaliana, Populus trichocarpa, Medicago truncatula, Lotus japonicus, Glycine max and ten species of the genus Arachis. A. hypogaea L. variety Ri Hua 1 was used as the representative of the A. hypogaea cultivars because of the high similarity among them. The ML tree of NPR1 was constructed from the coding sequence alignment (see Appendix S2 in Supporting Information) using PAML. The site models [24, 28] and the branch-site models [29] were used to identify codon sites that may be under positive selection (Table 4). The likelihood ratio test (LRT) was used for statistical analysis. No positively selected sites were identified in the site models, for which 2Δl was 0 (M1a vs. M2a, degrees of freedom, df = 2) and 3.45076 (M7 vs. M8, df = 2). The branch-site model A was used to identify positively selected codon sites by the Bayes Empirical Bayes (BEB) method in the branch of the genus Arachis and in two dicotyledon clades (Table 4, Appendix S2). The 2Δl values for the three comparisons were 19.131884, 15.143752 and 9.80261, respectively; the critical values for the LRT at the 5 and 1 % levels are 3.8415 and 6.6349, respectively. All three values therefore significantly exceeded the critical values (df = 1, P < 0.01), which indicated that a positively selected site was present for NPR1 in the Arachis lineage (18S*; * means probability > 95 %) and in the two dicotyledon lineages (407D*). The data indicate that positive Darwinian selection at these sites may have promoted the divergence and evolution of those lineages.
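The comparison of 2Δl with the chi-square critical values can be reproduced numerically. The following is a minimal Python sketch and not the PAML workflow used in the study. It assumes the LRT statistic is referred to a chi-square distribution with 1 degree of freedom, for which the upper-tail probability has the closed form erfc(√(x/2)). The function name and the clade labels are illustrative.

```python
import math


def chi2_pvalue_df1(statistic: float) -> float:
    """Upper-tail probability of a chi-square variable with df = 1."""
    return math.erfc(math.sqrt(statistic / 2.0))


# 2Δl values reported in the text for the three branch-site comparisons.
lrt_statistics = {
    "Arachis lineage": 19.131884,
    "dicotyledon clade (first)": 15.143752,
    "dicotyledon clade (second)": 9.80261,
}

p_values = {name: chi2_pvalue_df1(stat) for name, stat in lrt_statistics.items()}

for name, p in p_values.items():
    print(f"{name}: 2Δl = {lrt_statistics[name]:.5f}, P = {p:.2e}")

# The quoted critical values should map back to the 5 % and 1 % levels.
print(f"P at 3.8415: {chi2_pvalue_df1(3.8415):.4f}")
print(f"P at 6.6349: {chi2_pvalue_df1(6.6349):.4f}")
```

All three statistics lie above the 1 % critical value, in line with the P < 0.01 stated in the text.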
Fig. 4 The expression profile of NPR1 in SA treated leaves was measured by qRT-PCR with the internal control of TUA gene. Asterisks indicate the significant difference between control and the SA challenge sample. The P values of 0.5 h and 1 h were 0.00492 and 0.01724, respectively
The ratios of dN/dS among species are useful for detecting the selective pressures that may act on a gene [30]. The one-ratio model assumes a single ω value for the entire tree. The ω value for the coding region of NPR1 was 0.1427 under the one-ratio model. Variable ω ratios for NPR1 among the lineages were tested using the free-ratio model, which assumes an independent ω ratio for each branch (Appendix S2). The LRT for these two models showed that the free-ratio model fitted the data better than the one-ratio model (2Δl = 105.00823, df = 35, P < 0.01). Across the whole NPR1 sequence, dS exceeded dN in all branches of the tree (dN/dS < 1.0), which indicated that a functional constraint may have acted on NPR1 throughout the evolution of monocotyledons and dicotyledons. All ω values were < 1 for every lineage of the genus Arachis. This indicates a conserved function for NPR1, which may have undergone purifying selection in these lineages. Furthermore, the dN/dS values in the Lotus japonicus lineage and in the two wild species of Arachis (A. glabrata and A. appresipilla) were 0.3946 and 0.6264, respectively, which indicated a relaxed selective constraint in these two lineages (Appendix S2). The dN/dS value also approached one across one clade of the dicotyledons (0.9428) (Appendix S2). Thus, the NPR1 gene may have been subject to relaxed selection when the plants in this clade diverged from monocotyledons.
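The one-ratio versus free-ratio comparison uses 35 degrees of freedom, so its P value needs the chi-square upper tail for a general integer df. The Python sketch below is one way to compute it with the standard library only. It is an illustration rather than the authors' procedure, and `chi2_sf` is a hypothetical helper. It starts from the closed forms for df = 1 (erfc(√(x/2))) and df = 2 (exp(−x/2)) and applies the recurrence Q(x; k+2) = Q(x; k) + (x/2)^(k/2) e^(−x/2) / Γ(k/2 + 1).

```python
import math


def chi2_sf(statistic: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable with integer df."""
    if statistic <= 0 or df < 1:
        raise ValueError("statistic must be positive and df a positive integer")
    half = statistic / 2.0
    # Closed-form starting point for odd and even degrees of freedom.
    if df % 2 == 1:
        tail, k = math.erfc(math.sqrt(half)), 1
    else:
        tail, k = math.exp(-half), 2
    # Step df up by two at a time; each term is evaluated in log space
    # so that large statistics do not overflow.
    while k < df:
        log_term = (k / 2.0) * math.log(half) - half - math.lgamma(k / 2.0 + 1.0)
        tail += math.exp(log_term)
        k += 2
    return tail


# Free-ratio versus one-ratio model, values as reported in the text.
p_free_vs_one_ratio = chi2_sf(105.00823, 35)
print(f"2Δl = 105.00823, df = 35, P = {p_free_vs_one_ratio:.2e}")
```

The resulting P value is far below 0.01, which matches the rejection of a single ω for the whole tree.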
Table 4 Parameter estimates of the site models and branch-site models
Model | Paraᵃ | Estimates of parameters | 2Δlᵇ | Positively selected sitesᶜ
Site models
M0 (one-ratio) | 1 | ω0 = 0.14053 | | None
M1a (nearly neutral) | 2 | p0 = 0.73388 (p1 = 0.26612); ω0 = 0.08907 (ω1 = 1) | |
M2a (positive selection) | 4 | p0 = 0.73388; p1 = 0.18589 (p2 = 0.08023); ω0 = 0.08907 (ω1 = 1), ω2 = 1 | (M1a vs. M2a) 0 | None
M7 (beta) | 2 | p = 0.46899; q = 2.04100 | |
M8 (beta and ω) | 4 | p0 = 0.99451; p = 0.48236; q = 2.18070; (p1 = 0.00549) ω = 2.20506 | (M7 vs. M8) 3.45076 | None
Branch-site models
Model A (clade1: genus Arachis lineage) | 4 | p0 = 0.79827; p1 = 0.17651 (p2 = 0.02522); ω0 = 0.10797 (ω1 = 1), ω2 = 101.11479 | (Null model vs. Model A) 19.13188 |
Model A (clade2: dicotyledon clades including Arabidopsis thaliana lineage) | 4 | p0 = 0.79938; p1 = 0.16944 (p2 = 0.03118); ω0 = 0.10857 (ω1 = 1), ω2 = Inf | (Null model vs. Model A) 15.14375 |
Model A (clade3: dicotyledon clades including Populus trichocarpa lineage) | 4 | p0 = 0.79537; p1 = 0.16260 (p2 = 0.04203); ω0 = 0.10904 (ω1 = 1), ω2 = 25.55293 | (Null model vs. Model A) 9.80261 |
ᵃ The number of parameters estimated in the ω distribution
ᵇ To detect positive selection. * P < 0.05, Prob > 0.95
ᶜ Amino acid sites under positive selection based on a Bayes Empirical Bayes (BEB) probability > 0.95. No amino acid sites under models M2a and M8 were shown to be under positive selection

Discussion Analysis of the genomic sequences and cDNA transcripts of NPR1 not only determined the similarity within the genus Arachis and between different species, but also highlighted regions or sites of the protein that are critical to its function. The deduced Type I protein contains eleven cysteine residues, while Type II has ten. Cysteine residues may play an important role in forming disulfide bonds that are crucial for the function of NPR1. These two types of NPR1 gene may derive from the two ancestral genomes of Arachis. Ankyrin repeats have been identified in eukaryotes, prokaryotes and viruses. In most instances, proteins containing ankyrin motifs are involved in protein–protein interactions. The crystal structure shows that the β hairpins and α helices of ankyrin repeats form the bottom and the top of an L-shaped structure. The conserved histidine of the first α helix plays a vital role in folding of the structure by forming hydrogen bonds [14]. Changes to the conserved histidine probably destabilize the protein [31]. The NPR1 gene was constitutively expressed in different tissues, with high expression levels in roots and leaves, which suggests that roots and leaves hold most of the NPR1 protein. Treatment with SA significantly increased the expression level of NPR1 after 1 h, which indicated that NPR1 expression was strongly induced by SA treatment. Previous studies have shown that SA can positively regulate plant defense signaling pathways. It has been demonstrated that NPR1, the best characterized ANK protein in peanut, is involved in SA-regulated resistance responses. NPR1 is downstream of the signal molecule SA, and NPR1 transcripts increased when challenged by SA. The results of this study show that SA can regulate the expression level of NPR1 in A. hypogaea. Increasing attention is being paid to gene loci that are under natural selection [32, 33]. Both accelerated evolution, which implies positive selection, and conserved evolution, which indicates negative (purifying) selection, illuminate the functional importance of these regions [34]. The function of NPR1 has been shown to be related to the SA signaling mechanism [10, 11]. The main finding of the divergence analysis was that purifying selection may have acted on NPR1 throughout plant evolution.
Under the one-ratio model, the ω value for the whole NPR1 was < 1 (dN/dS = 0.1427). Under the free-ratio model, there were still no lineages with dN/dS > 1, and the LRT showed that the ratio varied among branches (P < 0.01), so the hypothesis of dN/dS homogeneity among branches was rejected. Varying dN/dS values were found across the phylogenetic tree, which suggested that diverse selective pressures may have acted on different branches. Some dN/dS values approaching one under the free-ratio model indicated that relaxed pressure has acted on these lineages. The branch-site model A is used to identify positively selected amino acid sites in specific lineages [29, 35]. This investigation focused on three lineages (the Arachis clade and two dicotyledon clades) and tested for positively selected sites in these clades. The LRTs showed evidence of positively selected sites. The 2Δl between the null model and the alternative model was 19.13188 (df = 1, P < 0.01) for the Arachis clade, and 15.14375 (df = 1, P < 0.01) and 9.80261 (df = 1, P < 0.01) for the two dicotyledon clades. Only one codon site was detected as being potentially under positive selection (18S*) in the Arachis lineage (using BEB analysis). Furthermore, one site (407D*) was detected in the two dicotyledon clades. The more coding sequences tested, the more powerful the LRT [36]. This study sampled 19 species for phylogenetic analyses. A diverse set of species may support a more accurate result for the selection test of the NPR1 gene. To examine the relationship between the positively selected sites and the three important domains, the sites were plotted against the domains. Only 293Q was in the ANK repeat; the others were in the BTB/POZ-like domain, the NPR1-like domain or the regions between domains. The two significant sites, 18S and 407D, from two lineages were located at the N-terminus of NPR1 and in the NPR1-like domain, respectively. The ANK repeat is one of the ancient protein domains found in all three superkingdoms (bacteria, archaea and eukarya), as well as in a number of viral genomes, and spans a wide range of functions. In general, proteins containing tandem repeats evolve faster than proteins without repeats [37]. Mutations in repeats might be tolerated, so the protein survives when selection pressure is applied. Concerted evolution could then spread such mutations [38]. This suggests that the ANK repeat of NPR1 is not just a simple tandem repeat but is also vital for its structure and for its function in mediating specific protein–protein interactions. The results suggest that sequence conservation is extremely important for consecutive ANK repeats, which stack together to form a characteristic structure in order to maintain their interaction with the target protein.
In conclusion, a set of novel NPR1 genes was cloned from the Arachis genus, and NPR1 was constitutively expressed in the roots, leaves and stems. The SA challenge assay showed that the amount of NPR1 transcript could be positively regulated by SA levels, which suggested that NPR1 could be involved in plant defense against disease. In addition, some NPR1 sites that have experienced significant selective pressure were identified. Natural selection analysis of NPR1 among dicotyledonous and monocotyledonous plants revealed lineage-specific patterns of variation. Different selective pressures may have acted on different regions of the NPR1 gene.
Acknowledgments The authors are grateful to all the laboratory members for continuous technical advice and helpful discussion. This research was supported by the China Agricultural Research System (CARS-14) and the Qingdao Science and Technology Plan Basic Research Project (12-1-4-11-(1)-jch).
References
1. Van Loon LC (2007) Plant responses to plant growth-promoting rhizobacteria. Eur J Plant Pathol 119:243–254
2. Vlot AC, Klessig DF, Park SW (2008) Systemic acquired resistance: the elusive signal(s). Curr Opin Plant Biol 11:436–442
3. Mukhtar FB, Mohammed M, Ajeigbe AH (2009) Effect of benzyl amino purine (BAP), coconut milk (CM) and manure applications on leaf senescence and yield in photoperiod sensitive cowpea variety (Kanannado). Afr J Plant Sci 3:142–146
4. Loake G, Grant M (2007) Salicylic acid in plant defence-the players and protagonists. Curr Opin Plant Biol 10:466–472
5. Sels J, Mathys J, De Coninck BM et al (2008) Plant pathogenesis-related (PR) proteins: a focus on PR peptides. Plant Physiol Biochem 46:941–950
6. Le Henanff G, Heitz T, Mestre P et al (2009) Characterization of Vitis vinifera NPR1 homologs involved in the regulation of pathogenesis-related gene expression. BMC Plant Biol 9:54
7. Chen XK, Zhang JY, Zhang Z et al (2012) Overexpressing MhNPR1 in transgenic Fuji apples enhances resistance to apple powdery mildew. Mol Biol Rep 39:8083–8089
8. Stein E, Molitor A, Kogel KH et al (2008) Systemic resistance in Arabidopsis conferred by the mycorrhizal fungus Piriformospora indica requires jasmonic acid signaling and the cytoplasmic function of NPR1. Plant Cell Physiol 49:1747–1751
9. Wang D, Weaver ND, Kesarwani M et al (2005) Induction of protein secretory pathway is required for systemic acquired resistance. Science 308:1036–1040
10. Divi UK, Rahman T, Krishna P (2010) Brassinosteroid-mediated stress tolerance in Arabidopsis shows interactions with abscisic acid, ethylene and salicylic acid pathways. BMC Plant Biol 10:151
11. Robert-Seilaniantz A, Grant M, Jones JD (2011) Hormone crosstalk in plant disease and defense: more than just jasmonate–salicylate antagonism. Annu Rev Phytopathol 49:317–343
12. Le Henanff G, Farine S, Kieffer-Mazet F et al (2011) Vitis vinifera VvNPR1.1 is the functional ortholog of AtNPR1 and its overexpression in grapevine triggers constitutive activation of PR genes and enhanced resistance to powdery mildew. Planta 234:405–417
13. Shi Z, Maximova SN, Liu Y et al (2010) Functional analysis of the Theobroma cacao NPR1 gene in Arabidopsis. BMC Plant Biol 10:248
14. Stogios PJ, Downs GS, Jauhal JJ et al (2005) Sequence and structural analysis of BTB domain proteins. Genome Biol 6:R82
15. Trujillo M, Shirasu K (2010) Ubiquitination in plant immunity. Curr Opin Plant Biol 13:402–408
16. FAOSTAT (2006) http://www.faostat.fao.org/site/567/default.aspx
17. Luo M, Liang XQ, Dang P et al (2005) Microarray-based screening of differentially expressed genes in peanut in response to Aspergillus parasiticus infection and drought stress. Plant Sci 169:695–703
18. Holbrook CC, Stalker HT (2002) Peanut breeding and genetic resources. Plant Breed Rev 22:297–356
19. Guo AY, He K, Liu D et al (2005) DATF: a database of Arabidopsis transcription factors. Bioinformatics 21:2568–2569
20. Leal-Bertioli SCM, Jose ACVF, Alves-Freitas DMT et al (2009) Identification of candidate genome regions controlling disease resistance in Arachis. BMC Plant Biol 9:112
21. Hall TA (1999) BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucl Acids Symp Ser 41:95–98
22. Thompson JD, Gibson TJ, Plewniak F et al (1997) The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res 25:4876–4882
23. Rozas J, Sanchez-DelBarrio JC, Messeguer X et al (2003) DnaSP, DNA polymorphism analyses by the coalescent and other methods. Bioinformatics 19:2496–2497
24. Yang ZH, Nielsen R (2002) Codon-substitution models for detecting molecular adaptation at individual sites along specific lineages. Mol Biol Evol 19:908–917
25. Livak KJ, Schmittgen TD (2001) Analysis of relative gene expression data using real-time quantitative PCR and the 2(-Delta Delta C(T)) method. Methods 25:402–408
26. Phelps CB, Wang RR, Choo SS et al (2010) Differential regulation of TRPV1, TRPV3, and TRPV4 sensitivity through a conserved binding site on the ankyrin repeat domain. J Biol Chem 285:731–740
27. Sklenovsy P, Otyepka M (2010) In silico structural and functional analysis of fragments of the ankyrin repeat protein p18(INK4c). J Biomol Struct Dyn 27:521–540
28. Nielsen R, Yang ZH (1998) Likelihood models for detecting positively selected amino acid sites and applications to the HIV-1 envelope gene. Genetics 148:929–936
29. Yang Z, Wong WS, Nielsen R (2005) Bayes empirical bayes inference of amino acid sites under positive selection. Mol Biol Evol 22:1107–1118
30. Hurst LD (2002) The Ka/Ks ratio: diagnosing the form of sequence evolution. Trends Genet 18:486–487
31. Cao H, Glazebrook J, Clarke JD et al (1997) The Arabidopsis NPR1 gene that controls systemic acquired resistance encodes a novel protein containing ankyrin repeats. Cell 88:57–63
32. Ding K, Kullo IJ (2006) Molecular evolution of 5′ flanking regions of 87 candidate genes for atherosclerotic cardiovascular disease. Genet Epidemiol 30:557–569
33. Kullo IJ, Ding KY (2007) Patterns of population differentiation of candidate genes for cardiovascular disease. BMC Genet 8:48
34. Thomas JW, Touchman JW, Blakesley RW et al (2003) Comparative analyses of multi-species sequences from targeted genomic regions. Nature 424:788–793
35. Zhang JZ, Nielsen R, Yang ZH (2005) Evaluation of an improved branch-site likelihood method for detecting positive selection at the molecular level. Mol Biol Evol 22:2472–2479
36. Anisimova M, Bielawski JP, Yang ZH (2001) Accuracy and power of the likelihood ratio test in detecting adaptive molecular evolution. Mol Biol Evol 18:1585–1592
37. Armour JA (2006) Tandemly repeated DNA: why should anyone care? Mutat Res 598:6–14
38. Swanson WJ, Vacquier VD (2002) The rapid evolution of reproductive proteins. Nat Rev Genet 3:137–144