Mol Genet Genomics (2014) 289:1289–1306 DOI 10.1007/s00438-014-0890-9

ORIGINAL PAPER

Transcriptome‑wide analysis of WRKY transcription factors in wheat and their leaf rust responsive expression profiling

Lopamudra Satapathy · Dharmendra Singh · Prashant Ranjan · Dhananjay Kumar · Manish Kumar · Kumble Vinod Prabhu · Kunal Mukhopadhyay

Received: 24 April 2014 / Accepted: 18 July 2014 / Published online: 7 August 2014 © Springer-Verlag Berlin Heidelberg 2014

Abstract WRKY, a plant-specific transcription factor family, has important roles in pathogen defense, abiotic cues and phytohormone signaling, yet little is known about the roles and molecular mechanisms of its members in response to rust diseases in wheat. We identified 100 TaWRKY sequences using the wheat Expressed Sequence Tag database, of which 22 WRKY sequences were novel. The identified proteins were characterized based on their zinc finger motifs, and phylogenetic analysis clustered them into six clades consisting of class IIc and class III WRKY proteins. Functional annotation revealed major functions in metabolic and cellular processes in control plants, whereas response to stimuli, signaling and defense predominated in pathogen inoculated plants; the major molecular function was binding to DNA. Tag-based expression analysis of the identified genes revealed differential expression between mock and Puccinia triticina inoculated wheat near isogenic lines. Gene expression analysis was also performed with six rust-related microarray experiments from the Gene Expression Omnibus database. TaWRKY10, 15, 17 and 56 were common to both the tag-based and microarray-based differential expression analyses and could represent rust specific WRKY genes. The results provide insight into the functional characterization of WRKY transcription factors responsive to leaf rust pathogenesis, which can be used as candidate genes in molecular breeding programs to improve biotic stress tolerance in wheat.

Keywords WRKY transcription factors · Wheat · Leaf rust · Gene ontology · Differential expression

Abbreviations
ESTs  Expressed sequence tags
GEO  Gene expression omnibus
GRAVY  Grand average of hydropathicity
IWGSC  International wheat genome sequencing consortium
MEGA  Molecular evolutionary genetic analysis
NCBI  National Centre for Biotechnology Information
NLS  Nuclear localization signals
SAGE  Serial analysis of gene expression
SOLiD  Sequencing by oligonucleotide ligation and detection
SRA  Sequence read archive
R-genes  Resistance genes
TF  Transcription factor

Communicated by S. Hohmann.

Electronic supplementary material The online version of this article (doi:10.1007/s00438-014-0890-9) contains supplementary material, which is available to authorized users.

L. Satapathy · D. Singh · P. Ranjan · D. Kumar · M. Kumar · K. Mukhopadhyay (*) Department of Bio‑Engineering, Birla Institute of Technology, Mesra, Ranchi 835215, India e-mail: [email protected]

K. V. Prabhu Department of Genetics, Indian Agriculture Research Institute, New Delhi 110012, India

Introduction

Regulation of gene expression in plants in response to diverse environmental and biotic agents occurs mostly through transcriptional control requiring the involvement of several transcription factors (TFs) (Eulgem et al. 2000). TFs are proteins that either activate or repress transcription of target genes by binding to specific DNA


sequences. Besides the leaf rust resistance (R) genes, TFs like WRKY, MYB and NAC might also help plants to counter pathogen attack and overcome abiotic stresses (Ulker and Somssich 2004; Rushton et al. 2010). The WRKY TFs modulate spatio-temporal expression of the downstream target genes during pathogen infection in response to pathogen encoded protective signal molecules, the elicitors, as well as defense signaling molecules like abscisic acid, jasmonic acid and salicylic acid (Eulgem and Somssich 2007). The different WRKY genes control various physiological processes at the convergence of biotic and abiotic stress response pathways; therefore, exploring their underlying mechanisms of action might shed light on the cross talk between different stresses (Atkinson and Urwin 2012). The WRKY TFs form one of the 10 largest TF families of higher plants and belong to the zinc-finger motif containing WRKY-GCM1 superfamily (Babu et al. 2006). They contain a DNA-binding region of ~60 amino acids, designated the WRKY domain, comprising the conserved WRKYGQK heptapeptide and a zinc-finger motif. Based on the number of WRKY domains and the type of zinc-finger motif, WRKY TFs have been classified into three classes (Rushton et al. 2010). Class I WRKY proteins have two WRKY domains and a C2H2-type zinc-finger motif (C–X(4–5)–C–X(22–23)–H–X(1)–H), where only the C-terminal WRKY domain is active in DNA binding. Class II proteins contain a single WRKY domain and the same type of zinc-finger motif as class I. The class II WRKY TFs are further divided into five subgroups ‘a–e’ based on variation in additional amino acid motifs present outside the WRKY domain. The class III WRKY TFs also carry a single WRKY domain but differ from classes I and II in their altered C2HC-type zinc-finger motif (C–X(7)–C–X(23)–H–X–C) (Eulgem et al. 2000; Ulker and Somssich 2004). Minor variants of the WRKYGQK signature motif, such as WRKYGKK and WRKYGEK, are occasionally found (Xie et al. 2005).
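The classification rules above can be illustrated with a minimal Python sketch. It is not the procedure used in this study: the regular expressions are a simplified reading of the motif definitions quoted in the text, the function name `classify_wrky` is hypothetical, the class II subgroups 'a–e' are not resolved, and the test sequences are synthetic toy strings rather than real proteins.

```python
import re

# Simplified patterns read from the motif definitions in the text (assumptions).
HEPTAPEPTIDE = re.compile(r"WRKYG[QKE]K")          # WRKYGQK plus the WRKYGKK / WRKYGEK variants
C2H2_FINGER = re.compile(r"C.{4,5}C.{22,23}H.H")   # C-X(4-5)-C-X(22-23)-H-X(1)-H
C2HC_FINGER = re.compile(r"C.{7}C.{23}H.C")        # C-X(7)-C-X(23)-H-X-C


def classify_wrky(protein: str) -> str:
    """Assign a coarse WRKY class from domain count and zinc-finger type."""
    n_domains = len(HEPTAPEPTIDE.findall(protein))
    if n_domains == 0:
        return "not WRKY"
    if C2H2_FINGER.search(protein):
        # Two heptapeptides with a C2H2 finger suggest class I, one suggests class II.
        return "I" if n_domains >= 2 else "II"
    if C2HC_FINGER.search(protein):
        return "III"
    return "unclassified"


# Synthetic toy sequences (alanine filler), for illustration only.
C2H2_TOY = "C" + "A" * 4 + "C" + "A" * 22 + "HAH"
C2HC_TOY = "C" + "A" * 7 + "C" + "A" * 23 + "HAC"
CLASS_II_TOY = "WRKYGQK" + "A" * 10 + C2H2_TOY
CLASS_III_TOY = "WRKYGEK" + "A" * 10 + C2HC_TOY
CLASS_I_TOY = CLASS_II_TOY + "A" * 20 + CLASS_II_TOY

for label, seq in [("toy I", CLASS_I_TOY), ("toy II", CLASS_II_TOY), ("toy III", CLASS_III_TOY)]:
    print(label, "->", classify_wrky(seq))
```

A real analysis would run such patterns only within the aligned WRKY domain and would confirm the calls against a curated domain database, since short regular expressions can match by chance in long proteins.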
WRKY proteins exhibit high binding affinity towards the W-box DNA sequence, (C/T)TGAC(T/C), although alternative binding sites have also been reported (Rushton et al. 2010). The conserved Cys and His residues of the zinc-finger motif are essential for zinc-dependent DNA-binding activity (Pandey and Somssich 2009). The WRKY family has been extensively studied in Arabidopsis, where more than 74 members were identified, many of them associated with defense systems (Dong et al. 2003; Kalde et al. 2003; Eulgem and Somssich 2007). About 109 WRKY TFs have been reported in rice (Ramamoorthy et al. 2008), mostly concerned with responses to various phytohormones and to abiotic and biotic stresses, including those caused by blight and blast pathogens (Ramamoorthy et al. 2008; Ryu et al. 2006). WRKY TFs have also been studied extensively in barley (Mangelsen et al.


2008), maize (Wei et al. 2012), Brachypodium (Tripathi et al. 2012), creosote bush (Zou et al. 2004), soybean (Zhou et al. 2008), banana (Shekhawat et al. 2011), pepper (Dang et al. 2013) and a few lower plants like ferns, mosses and green algae (Rushton et al. 2010). Bread wheat (Triticum aestivum L.) is an important cereal crop that provides one-fifth of the food calories and proteins of the world population (http://www.faostat.fao.org), and its demand is expected to rise by 60 % in the developing countries by 2050 (Rosegrant and Agcaoili 2010). Simultaneously, climate change induced rises in temperature, drought and biotic threats are estimated to reduce wheat production by 29 % (Braun et al. 2010). The large allohexaploid (2n = 6x = 42) genome of ~16.94 Gb, consisting of three homoeologous A, B and D genomes that originated from related progenitor species, poses significant challenges for molecular and functional genomics-based improvement of wheat (Choulet et al. 2010). Moreover, recent polyploidization and a high proportion of transposable elements and repetitive DNA, along with the absence of a completely sequenced genome, limit correct assembly, mapping and functional annotation of closely related sequences (Dubcovsky and Dvorak 2007). Therefore, genomics-based improvement of wheat lags behind that of other major cereal crops like maize and rice (Bevan and Uauy 2013). The rust diseases often jeopardize wheat production worldwide (McIntosh and Pretorius 2011). Among them, leaf (brown) rust, caused by the obligate biotrophic fungus Puccinia triticina Eriks., occurs widely and accounts for ~10 % yield loss annually (Eversmeyer and Kramer 2000; Dean et al. 2012). Various elite lines of wheat have been developed for durable rust resistance by introgression and pyramiding of several R-genes identified in wheat and its related wild species.
These R-genes are generally involved in pathogen recognition and programmed death of infected cells through the hypersensitive response (HR), which leads to complete or near-complete resistance. However, any mutation in an R-gene might affect the ability of the plant to recognize the pathogen, thus shifting the plant from resistant to susceptible (Poland et al. 2009). Also, mutagenic variations in the pathogen genome, prolific sporulation and efficient dissemination of the spores result in frequent breakdown of cultivar resistance (Park and Wellings 2011). The biology of WRKY TFs and their mechanisms of action have been subjects of intense research for years; however, functional analysis of these TFs in an important crop like wheat is rare (Wu et al. 2008). Recent reports described the identification and expression patterns of various WRKY TFs in wheat in response to Fusarium-induced biotic stress and different phytohormones (Wu et al. 2008; Proietti et al. 2010; Bahrini et al. 2011; Niu et al. 2012; Zhu et al. 2013). In our lab, expression analysis of TaWRKY1b was

carried out, which showed 146-fold induction of the gene in resistant plants but only 12-fold induction in susceptible plants during leaf rust infection, as compared to mock inoculated controls (Kumar et al. 2014). In spite of these ongoing studies, the role of WRKY TFs in leaf rust infection has not yet been established. Wheat is predicted to contain many more WRKY TFs than rice, maize or Arabidopsis owing to its large genome size, but most are yet to be identified (Zhu et al. 2013). Since only low (5x) coverage of the wheat genome sequence is available at present (Brenchley et al. 2012; http://www.wheatgenome.org), a quicker and complementary approach to identify wheat genes is through expressed sequence tag (EST) analysis (Manickavelu et al. 2012; Olga et al. 2014). Therefore, the present study was initiated with the objective of transcriptome-wide identification of WRKY TFs in wheat and their comprehensive functional exploration with respect to the rust diseases in general and leaf rust in particular.

Materials and methods

In silico data mining of WRKY TFs and phylogenetic analysis

All available WRKY protein sequences of Oryza sativa (both japonica and indica), Sorghum bicolor, Hordeum vulgare, Zea mays, Brachypodium distachyon and Saccharum officinarum were downloaded from the Gramineae (Mochida et al. 2011) and GRASSIUS (Yilmaz et al. 2009) transcription factor databases. The retrieved sequences were searched for similarity against wheat ESTs using TBLASTN at NCBI (http://www.ncbi.nlm.nih.gov) with an e-value cutoff of 10 (Zhu et al. 2013; Niu et al. 2012; Xiaoming et al. 2014). The non-redundant ESTs were selected and translated in silico using GENSCAN (http://genes.mit.edu/GENSCAN.html), and only the sequences containing conserved WRKY domains were retained for further characterization. BLASTN was performed with their respective nucleic acid sequences to check for novelty. The WRKY protein sequences were aligned using CLC Genomics Workbench 6.5 (CLC bio, Aarhus N, Denmark). A pipeline describing the strategies used in this study for identification and functional characterization of the novel WRKY TFs is provided in Supplementary Fig. S1. Phylogenetic analysis of the newly identified WRKY proteins in wheat was performed to study their evolutionary relationship with known WRKY proteins of Arabidopsis, Oryza sativa japonica, Brachypodium distachyon, Physcomitrella patens and Mucor circinelloides, based on multiple sequence alignments (MSA) and the neighbor joining method, using Molecular Evolutionary Genetic Analysis (MEGA5) software (Tamura et al. 2011).
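The e-value filtering and redundancy removal in the TBLASTN step above can be sketched as follows. This is an illustrative sketch, not the authors' pipeline: it assumes the search results were saved in the standard 12-column BLAST tabular layout (query, subject, percent identity, ..., e-value, bit score), the function name `filter_hits` is hypothetical, the sequence identifiers in the example are invented, and the cutoff is left as a parameter set here to the value stated in the text.

```python
import csv
import io


def filter_hits(tabular_text, max_evalue):
    """Keep, for each subject (EST), the lowest e-value hit at or below max_evalue."""
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject, evalue = row[0], row[1], float(row[10])
        if evalue > max_evalue:
            continue  # hit fails the cutoff
        if subject not in best or evalue < best[subject][1]:
            best[subject] = (query, evalue)  # one entry per EST removes redundancy
    return best


# Invented example rows: two query proteins hit the same EST, a third hit is weak.
example = "\n".join([
    "OsWRKY_x\tEST_A\t85.0\t60\t9\t0\t1\t60\t10\t189\t1e-30\t120",
    "ZmWRKY_y\tEST_A\t90.0\t60\t6\t0\t1\t60\t10\t189\t1e-35\t130",
    "OsWRKY_x\tEST_B\t40.0\t30\t18\t0\t1\t30\t5\t94\t12\t20",
])

hits = filter_hits(example, max_evalue=10)
for est, (query, evalue) in hits.items():
    print(est, query, evalue)
```

Keying the dictionary on the EST identifier means each EST is reported once even when several grass WRKY proteins match it, which mirrors the selection of non-redundant ESTs described in the text.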

Bootstrap analysis of 1,000 replicates was performed to provide confidence estimates for the tree topologies.

In silico characterization of identified WRKY TFs in wheat

The theoretical pI and molecular weight (Bjellqvist et al. 1994) were determined using the ExPASy server (http://us.expasy.org/tools/pi_tool.html), whereas the grand average of hydropathicity (GRAVY) and aliphatic index were analyzed using ProtParam (Gasteiger et al. 2005). Kinase specific phosphorylation sites (Blom et al. 1999) and potential N-glycosylation sites (Gupta et al. 2004) were predicted by the NetPhos 2.0 Server (http://www.cbs.dtu.dk) and NetNGlyc 1.0 Server (http://www.cbs.dtu.dk), respectively. Secondary structures of the novel TaWRKY proteins were predicted using CLC Genomics Workbench 6.5 and PSIPRED (Buchan et al. 2013). Motif scan (Marchler-Bauer et al. 2011) was used to determine different catalytic domains. Additional conserved motifs outside the WRKY domain were identified using Multiple EM for Motif Elicitation (MEME) version 4.9 (Bailey et al. 2009). The limits of minimum width, maximum width and maximum number of motifs were specified as 6, 50 and 20, respectively, with any number of repetitions (Puranik et al. 2013). Transmembrane helices and nuclear localization signals (NLS) present in the newly identified complete WRKY sequences were predicted using HMMTOP software (Garg et al. 2013) and NLStradamus (Nguyen et al. 2009), respectively. Subcellular localization of these sequences was predicted using Wolf PSORT software (Horton et al. 2007; Jia et al. 2013). A cutoff value of 0.6 and the Viterbi algorithm were used to determine the positions and corresponding amino acid sequences of the NLS.

RNA isolation, SAGE library preparation and next generation sequencing

A pair of wheat near isogenic lines (NILs), HD2329 (a leaf rust susceptible phenotype) and HD2329 + Lr28 [resistant (Nest Immune 0-0;)], was used.
The Lr28 gene was derived from Aegilops speltoides (Tauschii) and is effective against all pathotypes of the pathogen in India. Puccinia triticina pathotype 77-5, the most predominant and devastating leaf rust pathogen in all parts of the Indian subcontinent, was selected as the experimental pathogen (Bipinraj et al. 2011). The pathogen inoculum was prepared by mixing urediniospores of P. triticina pathotype 77-5 and talcum powder (ratio 1:1) and was applied gently on leaves of the NIL pair. Both plant types were also mock inoculated with talcum powder only and used as controls. After inoculation, plants were placed under high humidity of >90 % for 24 h in the dark to facilitate infection. The pots were then transferred to the normal growth chamber [22 °C, day time;


14 °C, night time, relative humidity (80 %)] at the National Phytotron Facility, Indian Agriculture Research Institute, New Delhi. Leaf tissues from 15 seedlings each of mock and pathogen inoculated NILs were taken at 24 h post inoculation (hpi) and stored in liquid nitrogen. Total RNA was isolated from leaf samples using TRI REAGENT (Molecular Research Center, Inc., USA) as per the manufacturer’s instructions. The RNA isolation time-point was based on earlier studies on the development of infection structures (Hu and Rijkenberg 1998) and activation of resistance signaling genes (Coram et al. 2008; Singh et al. 2012). The integrity of the isolated RNAs was confirmed using an Agilent Bioanalyser 2100. Four serial analysis of gene expression (SAGE) libraries were prepared from the isolated RNAs [coded as: (i) S-M: HD2329 mock inoculated, (ii) S-PI: HD2329 pathogen inoculated, (iii) R-M: HD2329 + Lr28 mock inoculated and (iv) R-PI: HD2329 + Lr28 pathogen inoculated] using the SOLiD–SAGE kit (Applied Biosystems, CA, USA) following the recommended protocol and sequenced using the sequencing by oligonucleotide ligation and detection (SOLiD) technique at Bay Zoltán Foundation of Applied Research, Institute of Plant Genomics, Human Biotechnology and Bioenergy, Szeged, Hungary. The sequences have been submitted to NCBI SRA061917 (BioSample accession as SAMM01820702, SAMM01820703, SAMM01820704 and SAMM01820705).

Functional annotation of identified WRKY sequences

The WRKY TFs identified in wheat in the present study were mapped to the SOLiD–SAGE reads of the wheat transcript assembly prepared earlier in our laboratory using mock and Puccinia triticina inoculated NILs (Singh et al. 2012). Mapping was carried out using the reference assembly function of CLC Genomics Workbench 6.5 with the default parameters for short reads (Mismatch cost: 2; Insertion cost: 3; Deletion cost: 3 and Global alignment).
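To show how the mismatch, insertion and deletion costs listed above enter a global alignment, the sketch below scores a read against a reference with a basic Needleman-Wunsch recurrence. It is not the CLC Genomics Workbench implementation; the match reward of 1 is an assumption that the text does not state, and the function name `global_alignment_score` and the example sequences are hypothetical.

```python
def global_alignment_score(read, reference, match=1, mismatch=2, insertion=3, deletion=3):
    """Best global alignment score of read vs reference under the given costs.

    match is an assumed reward; mismatch, insertion and deletion are penalties
    taken from the short-read parameters quoted in the text.
    """
    rows, cols = len(read) + 1, len(reference) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = score[i - 1][0] - insertion  # extra read base (insertion)
    for j in range(1, cols):
        score[0][j] = score[0][j - 1] - deletion   # missing reference base (deletion)
    for i in range(1, rows):
        for j in range(1, cols):
            same = read[i - 1] == reference[j - 1]
            diagonal = score[i - 1][j - 1] + (match if same else -mismatch)
            insert = score[i - 1][j] - insertion
            delete = score[i][j - 1] - deletion
            score[i][j] = max(diagonal, insert, delete)
    return score[-1][-1]


print(global_alignment_score("ACGT", "ACGT"))  # perfect match
print(global_alignment_score("ACGT", "AGGT"))  # one mismatch
print(global_alignment_score("ACGT", "ACT"))   # one inserted base in the read
```

Because a gap costs more than a mismatch under these settings, the recurrence prefers to align a single differing base as a substitution rather than as an insertion plus a deletion.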
Consensus sequences were extracted from each of the source SOLiD–SAGE libraries and BLASTN was performed against wheat ESTs. EST hits with minimum e-value and maximum identity in all four libraries were downloaded and redundant ESTs were eliminated. Gene ontology (GO) analysis of the identified ESTs was performed to categorize them in terms of Biological processes, Molecular functions and Cellular components using Blast2GO (B2G) software (Conesa and Gotz 2008).

Expression analysis of identified WRKY TFs in wheat under biotic stress

Expression values for all the identified WRKY TFs in wheat were extracted from the four SOLiD–SAGE datasets using CLC Genomics Workbench 6.5. The read counts


were determined and imported into Cytoscape software (Smoot et al. 2011), where differentially expressed transcripts were represented as different nodes based on up- or down-regulation. The differential expression of the identified WRKY transcripts among the wheat NILs in response to leaf rust pathogenesis was determined with log2 transformed values and represented through heat map, scatter plot and cluster analysis. The expression patterns of the wheat WRKY genes were further investigated using Genevestigator software (Hruz et al. 2008). Six independent rust-related microarray datasets were downloaded from NCBI-GEO (National Centre for Biotechnology Information, Gene Expression Omnibus; http://www.ncbi.nlm.nih.gov/geo/) and uploaded onto the sample set. The experiment IDs included GSE6227, GSE9915, GSE31753, GSE31756, GSE31761 and GSE32151. All the WRKY genes identified in the present study were uploaded in the gene selection panel. Expression data were hierarchically clustered by sample based on Euclidean distance, using log2 transformed values.

Chromosomal localization

The 22 newly identified wheat WRKY sequences were mapped onto chromosomes of wheat from the International Wheat Genome Sequencing Consortium (IWGSC) and of Sorghum, Brachypodium and rice, whose genomes have been completely sequenced, using the Plant Ensembl database with an e-value of 10 as the cutoff criterion (Kersey et al. 2014). Synteny analysis of wheat WRKY genes was also performed using the Wheat Zapper tool and visualized using the Circos online tool (Alnemer et al. 2013).
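The log2 transformation and Euclidean distance used in the expression analysis earlier in this section can be sketched in a few lines of Python. The read counts and gene labels below are invented for illustration, the pseudocount of 1 is an assumption added to avoid division by zero (the text does not mention one), and the function names are hypothetical; only the library codes S-M, S-PI, R-M and R-PI come from the text.

```python
import math


def log2_fold_change(inoculated, mock, pseudocount=1.0):
    """log2 ratio of pathogen inoculated to mock read counts (pseudocount is assumed)."""
    return math.log2((inoculated + pseudocount) / (mock + pseudocount))


def euclidean(a, b):
    """Euclidean distance between two equal-length expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Invented read counts per SOLiD-SAGE library for two hypothetical genes.
counts = {
    "geneA": {"S-M": 3, "S-PI": 7, "R-M": 3, "R-PI": 31},
    "geneB": {"S-M": 7, "S-PI": 7, "R-M": 15, "R-PI": 3},
}

profiles = {}
for gene, c in counts.items():
    profiles[gene] = (
        log2_fold_change(c["S-PI"], c["S-M"]),  # response in the susceptible NIL
        log2_fold_change(c["R-PI"], c["R-M"]),  # response in the resistant NIL
    )
    print(gene, profiles[gene])

print(euclidean(profiles["geneA"], profiles["geneB"]))
```

Positive values indicate up-regulation after inoculation and negative values down-regulation; pairwise distances of this kind are the input that hierarchical clustering uses to group genes or samples.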

Results

In silico identification of WRKY TFs in wheat and phylogenetic analysis

To obtain a robust dataset of wheat WRKY TFs, the redundant ESTs were removed and the presence of the conserved WRKY domain was examined following the pipeline shown in Supplementary Fig. S1. We identified a total of 470 protein sequences (80 of rice, 94 of Brachypodium distachyon, four of Hordeum vulgare, 103 of Sorghum bicolor, 142 of Zea mays and 47 of Saccharum officinarum) containing WRKY domains with orthologs in wheat. A TBLASTN search of these 470 sequences against wheat ESTs yielded 100 ESTs with WRKY domains. Each of the wheat WRKY genes identified in the present study was assigned a unique identifier from TaWRKY 1 to 100 (Supplementary Table S1). Forty-five of these 100

WRKY sequences matched with >95 % identity to previously known wheat WRKY genes (TaWRKY1–45, Supplementary Table S1). The remaining 55 WRKY sequences had either no match with any known WRKY sequences in wheat or had a matching identity of
