TRPLSC 1399 No. of Pages 10

Review

Lessons from Domestication: Targeting Cis-Regulatory Elements for Crop Improvement

Gwen Swinnen,1,2 Alain Goossens,1,2,* and Laurens Pauwels1,2,*

Domestication of wild plant species has provided us with crops that serve our human nutritional needs. Advanced DNA sequencing has propelled the unveiling of underlying genetic changes associated with domestication. Interestingly, many changes reside in cis-regulatory elements (CREs) that control the expression of an unmodified coding sequence. Sequence variation in CREs can impact gene expression levels, but also developmental timing and tissue specificity of expression. When genes are involved in multiple pathways or are active in several organs and developmental stages, CRE modifications are favored over mutations in coding regions because they lack detrimental pleiotropic effects. Therefore, learning from domestication, we propose that CREs are interesting targets for genome editing to create new alleles for plant breeding.

Trends

Crop domestication traits are often caused by mutations in cis-regulatory elements.
Subtle alterations in expression are favored to avoid pleiotropic effects.
Regulatory networks affecting domestication traits are beginning to be unraveled.
Genome editing holds promise for engineering of cis-regulatory elements.

Cis-Regulatory Variants Were Selected during Domestication

For over ten millennia, the domestication of our crops relied only on human selection (see Glossary), which can be considered the simplest of breeding techniques. This process was nevertheless successful in generating plants with traits adapted to our use. Although the whole process is generally referred to as domestication, the selection of initial domestication traits, which are typically fixed within crop species, was followed by the acquisition of improvement traits, which are variable among crop cultivars.

Doebley and colleagues [1] introduced the idea that regulatory mutations contributed markedly to plant domestication. Since then, multiple studies have indeed shown that domestication was accompanied by the rewiring of crop transcriptomes through regulatory changes. Comparative transcriptional profiling of maize (Zea mays), tomato (Solanum lycopersicum), cotton (Gossypium spp.), carrot (Daucus carota), and their respective wild ancestors revealed significant changes in gene expression [2–6]. Genes with differential expression levels between seedlings of maize and its ancestor teosinte are enriched for candidate domestication genes [3,7]. These candidate genes have a lower expression variation than non-candidate genes, which suggests that human selection occurred on cis-regulatory regions linked to the candidate genes [2].

The advent of next-generation sequencing has made it possible to pinpoint the genomic alterations that are responsible for domestication traits [8,9]. Meyer and Purugganan [10] reported in 2013 that almost half of the described causative mutations resided in cis-regulatory regions. Since then, additional changes in cis-regulatory elements (CREs) that underwent selection and

Trends in Plant Science, Month Year, Vol. xx, No. yy

1 Department of Plant Systems Biology, VIB, Technologiepark 927, B-9052 Gent, Belgium 2 Department of Plant Biotechnology and Bioinformatics, Ghent University, Technologiepark 927, B-9052 Gent, Belgium

*Correspondence: [email protected] (A. Goossens) and [email protected] (L. Pauwels).

http://dx.doi.org/10.1016/j.tplants.2016.01.014 © 2016 Elsevier Ltd. All rights reserved.


that underlie initial domestication or improvement traits have been identified (Table 1). Most of the genes affected by such a regulatory change have been found to encode transcription factors (TFs) that regulate developmental processes. Similar observations have been made for the evolution of plant morphology [11]. The CRE mutations usually reside upstream of the domestication gene and often affect spatial and/or temporal expression (Table 1). Until now, the corresponding interacting TFs or microRNAs (miRNAs) have been determined for only a few of the CREs that were selected during domestication, leaving much to be explored in future research. The rare CRE examples in tomato, rice (Oryza sativa), and barley (Hordeum vulgare) for which a gene regulatory network is emerging are discussed in detail. Furthermore, we discuss the reason for this prevalence of CRE mutations in domestication alleles and propose how we can exploit knowledge of gene regulatory networks by using genome editing.

Examples Illustrating the Importance of CREs

Larger Tomato Fruits

The main selection criteria during domestication for nearly all fruit crops were fruit size and shape [12]. Molecular genetic studies in tomato identified two cis-regulatory mutations, locule number (lc) and fasciated (fas), which influence both of these traits by synergistically increasing floral meristem size (Table 1) [13–18]. Both mutations are responsible for larger and flat fruit phenotypes with higher locule (seed compartment) numbers, as seen in beefsteak tomatoes [16,19]. The positive epistatic interaction between lc and fas suggests that they are part of a single regulatory network that controls the size of floral meristems (Figure 1A). The tomato lc allele is associated with two single nucleotide polymorphisms (SNPs) downstream of the gene encoding the homeodomain TF SlWUSCHEL (SlWUS) (Figure 1B,C) [16]. Both SNPs were proposed to disrupt the repression of SlWUS by AG1 (TAG1), the tomato homolog of the Arabidopsis flower development TF AGAMOUS (AG) (Figure 1A–C) [17]. SlWUS functions in a negative feedback loop with the signaling peptide SlCLAVATA3 (SlCLV3) [20]. The fas mutation was introduced by an inversion with one of its breakpoints upstream of SlCLV3 [21], resulting in reduced expression of SlCLV3 [18]. Selection of lc and fas thus fine-tuned the expression of regulators in a network controlling floral meristem size, which resulted in supernumerary locules (Figure 1D).

Slender Rice

During rice domestication, selection frequently occurred for grain size and shape. These characteristics affect both grain quality and yield, two traits that are usually negatively correlated with each other [22]. However, Wang and colleagues [23] recently identified a change in a CRE of the GRAIN WIDTH 7 (GW7) gene promoter that improved grain appearance and quality without inducing a yield penalty. This semidominant GW7TFA allele, commonly present in tropical japonica varieties, partially alleviated the repression of GW7 expression by a specific SQUAMOSA-PROMOTER BINDING PROTEIN-LIKE TF, OsSPL16/GW8, during panicle development (Figure 2A,B). As a consequence, the spatial control of cell division in the spikelet hulls by GW7 was altered, which resulted in the production of more slender grains.

Denser Spikes in Barley

As for all cereal crops, inflorescence and reproductive organ development has been a major target of barley domestication because of its influence on yield (Table 1). Several alleles of a barley APETALA2 ortholog, named HvAP2, were associated with increased spike density [24]. These alleles contain SNPs in an miRNA-binding site of HvAP2 and were shown to disrupt miRNA172-directed cleavage of HvAP2 transcripts during the early stages of reproductive development. As a result of this heterochronic change in HvAP2 expression, spikelet maturation was delayed, yielding denser spikes.
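The tomato lc case lends itself to a small illustration. The Figure 1 legend gives the CArG box consensus as CC[A/T]6GG, and two SNPs within such a box were proposed to abolish TAG1 binding. The following Python sketch scans a sequence for that consensus and shows how two substitutions can remove the match. The sequences, SNP positions, and function names are hypothetical and chosen for illustration only; they are not the tomato locus, and a consensus match is no more than a crude stand-in for TF binding.

```python
import re

# CArG box consensus as written in the Figure 1 legend: CC[A/T]6GG.
# The lookahead makes overlapping sites visible as well. The consensus
# is its own reverse complement, so scanning one strand is sufficient.
CARG_BOX = re.compile(r"(?=(CC[AT]{6}GG))")


def find_carg_boxes(seq):
    """Return (0-based start, site) for every CArG box match in seq."""
    return [(m.start(), m.group(1)) for m in CARG_BOX.finditer(seq.upper())]


def apply_snps(seq, snps):
    """Return seq with the substitutions in snps ({0-based position: base})."""
    bases = list(seq.upper())
    for position, base in snps.items():
        bases[position] = base
    return "".join(bases)


# Hypothetical sequences for illustration only (not the tomato lc locus).
REFERENCE = "ATGCACCTTAATTGGTCGAT"
VARIANT = apply_snps(REFERENCE, {8: "G", 11: "C"})  # two SNPs inside the box

if __name__ == "__main__":
    print("reference:", find_carg_boxes(REFERENCE))
    print("variant:  ", find_carg_boxes(VARIANT))
```

In this toy case the reference carries one CArG box and the two-SNP variant carries none. Whether a real SNP weakens or abolishes binding has to be tested experimentally.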


Glossary

ChIP-seq (chromatin immunoprecipitation followed by sequencing): an in vivo technique in which the genomic fragments that co-precipitate with a DNA-binding protein of interest are sequenced.
Cis-regulatory hypothesis: changes in CREs that control the expression of pleiotropic regulators and their downstream targets are the predominant mechanism for the evolution of morphological diversity.
Cis-regulatory mutation: a mutation within a non-coding DNA sequence that alters the expression of a linked gene.
Cis-regulatory region: a region in the genome [e.g., promoter, 5′ untranslated region (UTR), 3′ UTR, intron] that can encompass multiple CREs.
CRE (cis-regulatory element or cis-element): a TF/miRNA binding site or another usually non-coding DNA sequence that regulates the expression of a linked gene.
CRISPR/Cas9 (clustered regularly interspaced short palindromic repeat/CRISPR-associated protein 9): a genome editing method in which a Cas9 nuclease is guided by a synthetic guide RNA to the target site (protospacer) in the genome.
Directed evolution: an iterative protein engineering strategy that consists of random mutagenesis and/or recombination followed by selection.
ecoTILLING (eco-targeting induced local lesions in genomes): a method to identify SNPs for a specific gene using natural variation within a population as opposed to chemical mutagenesis.
Genome editing: a genetic engineering strategy to introduce heritable mutations at targeted genomic loci using engineered site-specific nucleases.
Gene regulatory network: a structure comprising regulatory factors that interact with each other to define the spatiotemporal expression of a specific set of genes.
Human selection: a process that includes both conscious and unconscious selection, in which humans allow only individuals with desirable traits to reproduce.
Improvement traits: traits that result from the adaptation of crop populations to different growth conditions and human preferences, for example, adaptation to specific climates, fruit morphology, and grain quality.
Initial domestication traits: traits that were selected during the establishment of new crop species, for example, increased yield, loss of seed shattering, and changes in inflorescence architecture.
PAM (protospacer adjacent motif): a nucleotide motif adjacent to the protospacer that is necessary for recognition by Cas9.
Pleiotropy: the phenomenon of a single gene that has multiple independent functions and because of that affects two or more distinct traits.
Promoter: a DNA sequence controlling spatiotemporal gene activity, consisting of both the core promoter immediately upstream of the gene and gene-distal enhancer sequences.
Protein-binding microarray: a high-throughput method to determine the preferred DNA-binding sequences of a DNA-binding protein by probing purified proteins to a dsDNA microarray.
Protospacer: the target genomic site of a single RNA-guided Cas9 nuclease.
TALEN (TRANSCRIPTION ACTIVATOR-LIKE EFFECTOR NUCLEASE): an engineered protein composed of a TALE DNA-binding domain and a nuclease domain that is used for genome editing.
Transient expression assay: an assay to determine the transactivating or transrepressing capacities of a transcription factor on a given promoter using a transient expression system such as protoplasts.
Yeast one-hybrid: a technique in a heterologous yeast system to determine if a protein specifically binds a particular DNA sequence.

[Figure 1 image not reproduced. Panel labels recovered from the figure: (A) Stage 2, Stage 3, Stage 6; CLV3, WUS, WUS + LFY, AG. (B, C) LC and lc; TAG1, CArG, SlWUS; Low, High. (D) lc T1696; fas/lc T954.]

Figure 1. Two Cis-Regulatory Elements (CREs) Control Tomato Fruit Size and Shape. (A) Model for spatiotemporal control of floral meristem size in angiosperms by a conserved gene regulatory network [91,92]. During the initial stages of Arabidopsis floral development (stages 1–2), a transcription factor (TF) that maintains stem cell identity, called WUS, is expressed in the organizing center. Through a negative feedback loop that promotes the CLV signaling pathway, WUS limits its own activity, which guarantees that the floral stem cell pool remains constant throughout these stages. In stage 3, WUS induces, together with LFY, the TF AG [93,94]. AG then targets WUS directly for epigenetic silencing by binding two CArG boxes (CC[A/T]6GG) downstream of WUS [95]. In stage 6, WUS expression is abolished indirectly by AG, causing the timely termination of floral stem cell activity. Modified from [92]. (B) LC situation with inhibition of SlWUS expression through binding of a putative downstream CArG box by TAG1. (C) In the lc mutant, two SNPs in the CArG box were proposed to disrupt TAG1 binding, increasing SlWUS expression during the third stage of flower development [17]. (D) Resulting fruit phenotypes with increased locule numbers. In the fas/lc mutant, SlCLV3 expression is additionally reduced by a chromosomal rearrangement. The fas locus probably arose after the lc locus [16]. Scale bar, 1 cm. Cartoons based on images from solgenomics.wur.nl [96]. Abbreviations: AG, AGAMOUS; CLV, CLAVATA; fas, fasciated; GRN, gene regulatory network; LC, locule number; LFY, LEAFY; SNPs, single nucleotide polymorphisms; WUS, WUSCHEL; CRE, cis-regulatory element; TF, transcription factor; NHEJ, non-homologous end joining; PAM, protospacer adjacent motif; CRISPR/Cas9, clustered regularly interspaced short palindromic repeat/CRISPR-associated protein 9.

Selection of Minimal Pleiotropic Effects

Domestication has generated remarkable variation in plant architecture and appearance. The morphologies of crop plants and other multicellular organisms are determined by regulatory networks, which have hierarchical structures that possess several layers [25]. The position of a gene in such a hierarchical structure influences its level of pleiotropy. Many researchers have argued that selection favors coding mutations with a low pleiotropic effect [26,27], involving genes that are usually positioned at or near terminal points of regulatory networks [28]. By contrast, genes that are located at the center of a network and connected to many other regulators are predicted to carry out independent functions in different cell types and developmental stages. Therefore, changes in their coding sequence are expected to have deleterious effects [29].


Table 1. Examples of CRE Modifications underlying Crop Initial Domestication or Improvement Phenotypes (a)

| Species | Gene | Protein | Mutation | Support | CRE | CRE Location | TF/miRNA | Effect on Expression | Phenotype | Refs |
|---|---|---|---|---|---|---|---|---|---|---|
| Brassica napus | FLC.A10 | TF | TE | Association | | Upstream | | Tissue-specific increase | Flowering time | [67] |
| Citrus sinensis | Ruby | TF | TE | Functional characterization | | Upstream | | Tissue-specific and temperature-induced increase | Anthocyanin production | [68] |
| Glycine max | TFL1b | TF | SNP | Association | SORLIP1 | Upstream | | Light-induced reduction | Determinate growth | [69] |
| Hordeum vulgare | AP2 | TF | SNP | Association and functional characterization | miRNA172 binding site | Exon | miRNA172 | Heterochronic increase | Inflorescence architecture | [24] |
| Malus domestica | MYB10 | TF | Copy number variation | Association and functional characterization | | Upstream | MYB10 | Ectopic increase | Anthocyanin production | [70] |
| Oryza sativa | LG1 | TF | SNP | Association | | Upstream | | Tissue-specific reduction | Inflorescence architecture | [71] |
| Oryza sativa | GS5 | Peptidase | SNP | Functional characterization | ABA-responsive element | Upstream | | Tissue-specific and heterochronic increase | Grain size | [72,73] |
| Oryza sativa | qSH1/RPL | TF | SNP | Association | repeat | Upstream | ABI3 type | Tissue-specific reduction | Seed shattering | [74] |
| Oryza sativa | Ghd7 | Regulator | SNP | Association | | Upstream | | Reduction | Plant height, flowering time, and grain number | [75] |
| Oryza sativa | GIF1 | Cell wall invertase | | Association and functional characterization | | Upstream | | Tissue-specific reduction | Grain quality and size | [76] |
| Oryza sativa | GW8/SPL16 | TF | Indel | Functional characterization | | Upstream | | Reduction | Grain shape, quality, and size | [33] |
| Oryza sativa | GW7 | TRM protein | Indel | Functional characterization | GTAC motif | Upstream | GW8/SPL16 | Increase | Grain shape and quality | [23] |
| Oryza sativa | Kala4 | TF | Duplication and insertion | Association and functional characterization | | Upstream | | Ectopic increase | Anthocyanin production | [77] |
| Physalis philadelphica | POS1 | TF | Copy number variation | Association and functional characterization | | Intron | | Increase | Reproductive organ size | [78] |
| Solanum lycopersicum | FAS/CLV3 | Signal peptide | Inversion | Functional characterization | | Upstream | | Reduction | Locule number | [18,21] |
| Solanum lycopersicum | LC/WUS | TF | SNP | Association | CArG box | Downstream | TAG1 | Tissue-specific and heterochronic increase | Locule number | [15,17] |
| Solanum lycopersicum | FW2.2/CNR | Regulator | SNP | Functional characterization | | Upstream | | Heterochronic increase | Fruit weight | [79,80] |
| Solanum lycopersicum | FW3.2/KLUH | CYP450 | SNP | Association | OSE | Upstream | | Increase | Fruit weight | [81] |
| Triticum aestivum L. | MFT | PEBP protein | SNP | Functional characterization | A-box | Upstream | | Increase | Dormancy | [82] |
| Triticum aestivum L. | VRN1 | TF | SNP and indel | Association | CArG box | Upstream | | Reduction | Vernalization response | [83] |
| Vitis vinifera | MYBA1 | TF | TE | Association | | Upstream | | | White fruit | [84] |
| Zea mays | TB1 | TF | TE | Functional characterization | | Upstream | | Increase | Apical dominance | [85] |
| Zea mays | CCT | TF | TE | Association and functional characterization | | Upstream | | Reduction | Flowering time | [86] |
| Zea mays | TU/ZMM19 | TF | Inversion | Functional characterization | | Upstream | | Ectopic increase | Pod corn | [87,88] |
| Zea mays | Vgt1/Rap2.7 | TF | TE | Association | | Upstream | | Reduction | Flowering time | [89] |
| Zea mays | prol1.1/GT1 | TF | | Association | | Upstream | | Tissue-specific increase | Inflorescence architecture | [90] |

Abbreviations: miRNA, microRNA; SNP, single nucleotide polymorphism; TF, transcription factor; TE, transposable element.
(a) This list is not intended to be exhaustive.


However, selection does work efficiently on mutations in the CREs of central regulators and their target genes, which alter the (often spatiotemporal) expression of these genes [26]. Correspondingly, many domestication genes with CRE mutations are TFs (Table 1). This phenomenon can be explained by the modular organization of cis-regulatory regions, which contain multiple CREs that are assumed to often act independently of each other [26]. Mutating one CRE will only modify a specific part of the expression profile of the gene it controls. Modification of an existing CRE can be established by point mutations, structural variation, or transposable elements that disrupt, generate, or alter the strength of a regulatory link (Table 1). Transposable elements are additionally able to create new CREs [30]. In many cases, this will impact developmental timing or tissue specificity of expression. For instance, the alleviation of SlWUS repression by TAG1 in the lc mutant occurs during a particular stage in floral development, increasing floral meristem size (Figure 1C,D) [17], which is also achieved in the fas mutant by reduced transcription of SlCLV3 earlier during flower development (Figure 1C,D) [18]. Likewise, increased GW7TFA expression, which improved rice grain appearance and quality, is most pronounced in developing panicles (Figure 2B) [23].

Hence, CRE mutations allow the fine-tuning of gene transcription, whereas severe gain- or loss-of-function mutations often result in undesirable effects. For example, a null mutation in SlCLV3 generates enlarged tomato fruits, but is additionally accompanied by profound inflorescence branching and highly fasciated flowers [18]. Tomato plants that are silenced for the TAG1 gene display not only fruits with defects in determinacy but also homeotic conversion of stamens [31]. Constitutive AtWUS overexpression, in turn, induces expansion of shoot and flower meristems, as well as adventitious shoot formation and somatic embryogenesis in seedlings [32].

Finally, most Basmati varieties hold a mutation in the promoter of OsSPL16/GW8, which translates into strongly reduced GW8 expression. Although this also leads to an increase in GW7 expression levels (Figure 2C) and to the production of slender and high-quality grains, this mutation was also associated with a yield penalty [33]. We assume that selection of these severe mutations during domestication would have been improbable because of their detrimental effects. A CRE change, however, which only modifies the expression pattern of a central TF or of one of its targets, is more likely to be favored by selection.

[Figure 2 image not reproduced. Panel labels recovered from the figure: (A) HJX74: GW8 bound at GTACGTAC, GW7 low; short and wide grain, low quality, high yield. (B) NIL-gw7TFA: GW8 binding at GTACGTAC impaired (X), GW7 high; slender grain, high quality, high yield. (C) NIL-gw8Basma: GW8 reduced, GTACGTAC, GW7 high; slender grain, high quality, yield penalty.]

Figure 2. Cis-Regulatory Element (CRE) Mutations Control Rice Slenderness. (A) In most Indica varieties, GW7 expression is restricted by the OsSPL16/GW8 repressor. SPL transcription factors (TFs) contain a highly conserved SBP domain that binds a consensus sequence [97], consisting of a GTAC core motif that is flanked by gene-specific regions [98]. (B) A CRE variation present in most tropical japonica varieties directly upstream of the GTAC motifs reduces binding strength of GW8 to the GW7 promoter. The resulting increase in GW7 expression specifically influences grain appearance and quality [23]. (C) A mutation in the GW8 promoter present in Basmati rice leads to lower GW8 expression levels and subsequently also to higher GW7 expression. However, due to a pleiotropic function of GW8, this mutation also brings about a yield penalty [33]. Abbreviations: GW7, GRAIN WIDTH 7; GW8, GRAIN WIDTH 8; HJX74, Hua-Jing-Xian 74 variety; NIL, near isogenic line; SBP, SQUAMOSA promoter-binding protein; SPL, SQUAMOSA-PROMOTER BINDING PROTEIN-LIKE; TFA, TaifengA.

Therefore, the discussed examples of domestication support the cis-regulatory hypothesis. Although there is an ongoing debate about the relative importance of regulatory changes (including trans-regulatory mutations, e.g., a TF null mutation) and cis-regulatory changes in evolution [34,35], there is no doubt that CRE modifications, because of their low pleiotropic effect, have had an essential part in crop domestication.
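The modularity argument of this section can be made concrete with a deliberately simple toy model. The Python sketch below assumes a hypothetical gene whose expression in each tissue is driven by one independently acting CRE module. It then compares which tissues are affected when a single CRE is weakened with those affected when the coding sequence is knocked out. Gene, module names, tissues, and expression values are all invented for illustration, and real cis-regulatory regions are rarely this cleanly additive.

```python
# Toy model: each CRE module independently contributes expression
# (arbitrary units) in one tissue. All names and values are hypothetical.
TISSUES = ("floral_meristem", "leaf", "root")

CRE_MODULES = {
    "cre_floral_meristem": {"floral_meristem": 5.0},
    "cre_leaf": {"leaf": 3.0},
    "cre_root": {"root": 2.0},
}


def expression_profile(modules, coding_intact=True):
    """Functional gene product per tissue: the sum of all module outputs,
    or zero everywhere when the coding sequence is non-functional."""
    profile = {tissue: 0.0 for tissue in TISSUES}
    if not coding_intact:
        return profile
    for contributions in modules.values():
        for tissue, level in contributions.items():
            profile[tissue] += level
    return profile


def mutate_cre(modules, name, factor):
    """Return a copy of modules in which one CRE's output is scaled by factor."""
    mutated = {key: dict(value) for key, value in modules.items()}
    mutated[name] = {t: level * factor for t, level in mutated[name].items()}
    return mutated


def affected_tissues(wild_type, mutant):
    """Tissues whose expression differs between two profiles."""
    return sorted(t for t in TISSUES if wild_type[t] != mutant[t])


WILD_TYPE = expression_profile(CRE_MODULES)
CRE_MUTANT = expression_profile(mutate_cre(CRE_MODULES, "cre_floral_meristem", 0.5))
CODING_NULL = expression_profile(CRE_MODULES, coding_intact=False)

if __name__ == "__main__":
    print("CRE mutant affects: ", affected_tissues(WILD_TYPE, CRE_MUTANT))
    print("coding null affects:", affected_tissues(WILD_TYPE, CODING_NULL))
```

Under these assumptions the CRE mutation changes expression in a single tissue, whereas the coding null removes the gene product everywhere, which is the pleiotropy contrast drawn in the text.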

Unraveling Gene Regulatory Networks

Surprisingly, little information is available about gene regulatory networks in crops, probably because TF–CRE pairs are difficult to study [30]. If transcriptional regulators of the process of interest are known, a TF-centered approach, such as chromatin immunoprecipitation followed by sequencing (ChIP-seq), can be used to identify target genes and bound DNA regions [36]. The ChIP-seq data can subsequently be used to predict TF-binding motifs [37]. Complementary to this approach are protein-binding microarrays, which have successfully been used to determine DNA-binding specificities of plant TFs [38–40].

In the absence of TFs known to regulate the gene or process of interest, gene expression data can be used to identify candidate regulators by their correlated or anticorrelated gene expression. This ‘guilt by association’ principle becomes more powerful with the increasing availability of RNA-seq transcript profiling data [41]. Alternatively, promoter-centered techniques, such as yeast one-hybrid, can be performed to identify regulating TFs, as recently shown for the aliphatic glucosinolate pathway and secondary cell wall biosynthesis in Arabidopsis [42,43]. Another successful method isolates DNA-binding proteins from plant extracts using an affinity-tagged DNA probe, after which bound TFs are identified by mass spectrometry [44]. In the future, we expect that similar approaches that can isolate endogenous DNA from the cell, such as engineered DNA-binding molecule-mediated ChIP (enChIP), will also be applicable to plants [45].

CREs can also be studied on a genome-wide scale. DNase-seq makes use of the hypersensitivity of TF-bound DNA to DNaseI to identify TF-occupied regions [46] and, when the sequencing depth is sufficient, to reveal bound DNA motifs, although this seems limited to more stably bound factors [47]. In plants, DNaseI footprinting has been applied in Arabidopsis (Arabidopsis thaliana) and rice [48,49]. Moreover, it can be done using different tissues, developmental stages, or after a specific treatment, linking cis-elements to conditions important for breeding, such as drought stress [49–51]. In addition, comparable techniques are emerging that promise cheaper, faster, and more sensitive detection of TF-binding sites [52]. The wealth of sequenced plant genomes available allows the identification of CREs based on sequence conservation [53,54]. Also, polymorphisms in regulatory regions causative for allele-specific expression might be mapped, as suggested in [7]. Ultimately, several of the above-described analyses can be combined to increase the sensitivity of identifying TF target genes and bound CREs [39].

It is difficult to predict in silico whether a change in the CRE sequence will lead to a change in affinity or even complete loss of TF binding. This should be evaluated experimentally. For example, in the above-discussed case of rice GW7, electrophoretic mobility shift assays, yeast one-hybrid, and transient expression assays were used to confirm reduced OsSPL16/GW8 binding [23].
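As a minimal illustration of the ‘guilt by association’ principle described above, the Python sketch below ranks candidate TFs by the absolute Pearson correlation between their expression profile and that of a target gene across a set of samples. The gene names and expression values are hypothetical, and real analyses would use many more samples and a dedicated co-expression method. Correlation only nominates candidates: a strong positive value is compatible with an activator and a strong negative value with a repressor, but neither demonstrates regulation.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


def rank_candidate_regulators(target, tf_profiles):
    """Sort TFs by |r| with the target, strongest (anti)correlation first."""
    scores = {tf: pearson(profile, target) for tf, profile in tf_profiles.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical expression values across six samples (arbitrary units).
TARGET = [1.0, 2.0, 4.0, 8.0, 4.0, 2.0]
TF_PROFILES = {
    "TF_activator_like": [1.1, 2.3, 3.9, 7.6, 4.2, 1.8],  # rises with the target
    "TF_repressor_like": [9.0, 7.5, 5.0, 1.0, 5.5, 8.0],  # falls as the target rises
    "TF_unrelated": [3.0, 3.2, 2.9, 3.1, 3.0, 3.1],       # roughly flat
}

if __name__ == "__main__":
    for tf, r in rank_candidate_regulators(TARGET, TF_PROFILES):
        print(f"{tf}: r = {r:+.2f}")
```

Ranking on the absolute value keeps anticorrelated candidates, such as putative repressors, alongside positively correlated ones.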

Concluding Remarks: Genome Editing of Cis-Elements

Natural sequence variation in CREs could be used in breeding programs for crop improvement. Promising CRE alleles can now be found in the breeding population by ongoing whole-genome resequencing programs or by more targeted approaches such as ecoTILLING [55]. However, a SNP or indel in a specific small-sized CRE that results in loss of TF binding may be rare or absent in the population.


[Figure 3 image not reproduced. Panel labels recovered from the figure: (A)–(D); CRE, TF, Cas9, PAM, NHEJ; Off, On.]

Figure 3. CRISPR/Cas9-Mediated Genome Editing of CREs. (A) The promoter of the gene of interest contains a CRE (yellow) bound by a repressive TF that shuts down expression in the tissue or condition in which the TF is expressed. (B) Genome editing using Streptococcus pyogenes Cas9 and a guide RNA, which recognize, respectively, the NGG PAM sequence (pink) and a protospacer (green) containing the CRE. The RNA-guided endonuclease cuts at a precise location in the CRE (red triangles), generating a double-stranded break (DSB). (C) In the absence of a template, the plant cell repairs the DSB by NHEJ, resulting most often in random indels at this position. As an example, an insertion of one base pair is shown. (D) The disrupted promoter sequence is no longer recognized by the TF and gene expression is induced. Abbreviations: CRE, cis-regulatory element; CRISPR/Cas9, clustered regularly interspaced short palindromic repeat/CRISPR-associated protein 9; NHEJ, non-homologous end joining; PAM, protospacer adjacent motif; TF, transcription factor.
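The targeting step in Figure 3B can be sketched in a few lines of Python. The function below scans both strands of a sequence for a 20-nt protospacer followed by an NGG PAM and keeps only those sites whose predicted cut falls inside a given CRE interval. Two things are assumptions that are not taken from this article: the cut is placed 3 bp upstream of the PAM, which is the commonly reported position for SpCas9, and the toy promoter is an invented sequence containing a GTACGTAC stretch reminiscent of Figure 2 (it is not the real GW7 promoter). Function and variable names are hypothetical, and off-target scoring and guide efficiency are ignored.

```python
import re
from typing import List, NamedTuple


class GuideSite(NamedTuple):
    strand: str        # "+" or "-" relative to the input sequence
    protospacer: str   # 5'->3' on the strand that carries the PAM
    pam: str
    cut: int           # break lies between forward-strand bases cut-1 and cut


def revcomp(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_guides_cutting_cre(seq: str, cre_start: int, cre_end: int,
                            protospacer_len: int = 20) -> List[GuideSite]:
    """List NGG-PAM sites whose predicted break lies strictly inside the
    CRE, given as a 0-based, end-exclusive interval on the forward strand."""
    seq = seq.upper()
    n = len(seq)
    # Lookahead so that overlapping candidate sites are all reported.
    site = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    hits = []
    for strand, template in (("+", seq), ("-", revcomp(seq))):
        for match in site.finditer(template):
            # Assumed SpCas9 cut position: 3 bp upstream of the PAM.
            local_cut = match.start() + protospacer_len - 3
            cut = local_cut if strand == "+" else n - local_cut
            if cre_start < cut < cre_end:
                hits.append(GuideSite(strand, match.group(1), match.group(2), cut))
    return hits


# Invented toy promoter: flank + protospacer ending in GTACGTAC + PAM + flank.
TOY_PROMOTER = "ATATTATAAT" + "TATATTATACATGTACGTAC" + "AGG" + "TATAATTATA"
CRE_START = TOY_PROMOTER.index("GTACGTAC")
CRE_END = CRE_START + len("GTACGTAC")

if __name__ == "__main__":
    for hit in find_guides_cutting_cre(TOY_PROMOTER, CRE_START, CRE_END):
        print(hit)
```

Restricting the output to sites that cut inside the element reflects the point that small CREs may offer few or no usable NGG PAMs, which is why Cas9 variants with other PAM requirements are of interest.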

Recently, genome editing has become more accessible and may form an attractive approach to introduce sequence variation in CREs. Genome editing is based on targeting a nuclease to specific loci in the plant genome to make a double-stranded break (DSB). Although site-specific targeting has been facilitated by technologies such as TRANSCRIPTION ACTIVATOR–LIKE EFFECTOR NUCLEASE (TALEN), only the advent of RNA-guided nucleases, such as the clustered regularly interspaced short palindromic repeat/CRISPR-associated protein 9 (CRISPR/Cas9) system, has brought genome editing within reach of most researchers; this system is anticipated to become the dominant genome editing technology. When a DSB is made in the genome by the RNA-guided endonuclease, the imperfect repair by non-homologous end joining (NHEJ) may make small insertions or deletions at the site of the DSB, disrupting a targeted cis-element (Figure 3A–D). Alternatively, using two guide RNAs, the entire CRE or cis-regulatory region might be removed by two flanking DSBs [56].

Interestingly, ChIP-seq analysis of catalytically disabled Cas9 has shown that it preferentially associates with accessible chromatin, represented by DNase I hypersensitive regions [57,58]. This suggests that CREs are readily accessible targets for CRISPR/Cas9. When targeting small cis-elements, an important consideration is where the nuclease can land: for CRISPR/Cas9, this is restricted by a protospacer adjacent motif (PAM) sequence, which is ‘NGG’ for the commonly used Streptococcus pyogenes Cas9 (SpCas9). However, other bacteria have Cas9 orthologs with different PAM prerequisites [59,60], and SpCas9 proteins with different PAM specificities have been engineered using directed evolution [61]. It is therefore foreseeable that in the near future, CRISPR/Cas9 will no longer have sequence restrictions.

Alternatively, when a DNA template can be provided, the DSB can be repaired by homologous recombination (HR), and the DSB site can be introduced further upstream or downstream of the cis-element. This allows the cis-element to be modified as desired. First results of CRISPR/Cas9-mediated HR in Arabidopsis have been reported [62].

Recently, several examples of the power of genome editing to disrupt coding sequences have been published. One of the most striking examples is the knockout of all three homoeoalleles of the MILDEW RESISTANCE LOCUS in hexaploid bread wheat (Triticum aestivum) using the TALEN technology, resulting in tolerance to the pathogen Blumeria graminis [63]. To our knowledge, only one example of genome editing of a CRE in plants has been reported to date. The effector-binding element (EBE) in the OsSWEET14/Os-11N3 promoter is bound by the Xanthomonas oryzae TALE effector AvrXa7, which functions as a transcriptional activator. TALEN-mediated disruption of the EBE resulted in tolerance to Xanthomonas strains carrying the AvrXa7 effector [64]. Although completely knocking out the OsSWEET14 gene also results in resistance, knockout mutants have severely delayed growth and small seeds, making them unattractive for resistance breeding [65]. Importantly, no polymorphisms in the EBE have been identified in rice germplasm [64].

Eventually, genome editing may also contribute to finding and studying CREs. Recently, two publications used site-directed mutagenesis to scan the human BCL11A erythroid enhancer in a tiling manner and pinpoint an essential motif [56,66]. In addition, Vierstra and colleagues [66] used the randomness of indels caused by NHEJ and the resulting allelic series to derive a consensus motif for a TF recognition site.

In conclusion, the combination of in-depth understanding of gene regulatory networks and genome editing to find and alter CREs at the single nucleotide level in plant genomes may provide a promising engineering strategy for future crop improvement (see Outstanding Questions).

Outstanding Questions

Can we modify the CRISPR/Cas9 system to make every single base in the genome a target? Or alternatively, will HR be generally applicable in crops?

Acknowledgments

We thank Annick Bleys for help in preparing the manuscript. This research was supported by funding from the Research Foundation Flanders (FWO) through the project G005312N and a postdoctoral fellowship to L.P.

References

1. Doebley, J.F. et al. (2006) The molecular genetics of crop domestication. Cell 127, 1309–1321
2. Hufford, M.B. et al. (2012) Comparative population genomics of maize domestication and improvement. Nat. Genet. 44, 808–811
3. Swanson-Wagner, R. et al. (2012) Reshaping of the maize transcriptome by domestication. Proc. Natl. Acad. Sci. U.S.A. 109, 11878–11883
4. Koenig, D. et al. (2013) Comparative transcriptomics reveals patterns of selection in domesticated and wild tomato. Proc. Natl. Acad. Sci. U.S.A. 110, E2655–E2662
5. Yoo, M-J. and Wendel, J.F. (2014) Comparative evolutionary and developmental dynamics of the cotton (Gossypium hirsutum) fiber transcriptome. PLoS Genet. 10, e1004073
6. Rong, J. et al. (2014) New insights into domestication of carrot from root transcriptome analyses. BMC Genomics 15, 895
7. Lemmon, Z.H. et al. (2014) The role of cis regulatory evolution in maize domestication. PLoS Genet. 10, e1004745
8. Morrell, P.L. et al. (2011) Crop genomics: advances and applications. Nat. Rev. Genet. 13, 85–96
9. Gepts, P. (2014) The contribution of genetic and genomic approaches to plant domestication studies. Curr. Opin. Plant Biol. 18, 51–59
10. Meyer, R.S. and Purugganan, M.D. (2013) Evolution of crop species: genetics of domestication and diversification. Nat. Rev. Genet. 14, 840–852
11. Doebley, J. and Lukens, L. (1998) Transcriptional regulators and the evolution of plant form. Plant Cell 10, 1075–1082
12. Pickersgill, B. (2007) Domestication of plants in the Americas: insights from Mendelian and molecular genetics. Ann. Bot. 100, 925–940
13. Lippman, Z. and Tanksley, S.D. (2001) Dissecting the genetic pathway to extreme fruit size in tomato using a cross between the small-fruited wild species Lycopersicon pimpinellifolium and L. esculentum var. Giant Heirloom. Genetics 158, 413–422
14. Barrero, L.S. and Tanksley, S.D. (2004) Evaluating the genetic basis of multiple-locule fruit in a broad cross section of tomato cultivars. Theor. Appl. Genet. 109, 669–679
15. Barrero, L.S. et al. (2006) Developmental characterization of the fasciated locus and mapping of Arabidopsis candidate genes involved in the control of floral meristem size and carpel number in tomato. Genome 49, 991–1006
16. Muños, S. et al. (2011) Increase in tomato locule number is controlled by two single-nucleotide polymorphisms located near WUSCHEL. Plant Physiol. 156, 2244–2254
17. van der Knaap, E. et al. (2014) What lies beyond the eye: the molecular mechanisms regulating tomato fruit weight and shape. Front. Plant Sci. 5, 227
18. Xu, C. et al. (2015) A cascade of arabinosyltransferases controls shoot meristem size in tomato. Nat. Genet. 47, 784–792

Trends in Plant Science, Month Year, Vol. xx, No. yy

19. Rodríguez, G.R. et al. (2011) Distribution of SUN, OVATE, LC, and FAS in the tomato germplasm and the relationship to fruit shape diversity. Plant Physiol. 156, 275–285
20. Schoof, H. et al. (2000) The stem cell population of Arabidopsis shoot meristems is maintained by a regulatory loop between the CLAVATA and WUSCHEL genes. Cell 100, 635–644
21. Huang, Z. and van der Knaap, E. (2011) Tomato fruit weight 11.3 maps close to fasciated on the bottom of chromosome 11. Theor. Appl. Genet. 123, 465–474
22. Sakamoto, T. and Matsuoka, M. (2008) Identifying and exploiting grain yield genes in rice. Curr. Opin. Plant Biol. 11, 209–214
23. Wang, S. et al. (2015) The OsSPL16-GW7 regulatory module determines grain shape and simultaneously improves rice yield and grain quality. Nat. Genet. 47, 949–954
24. Houston, K. et al. (2013) Variation in the interaction between alleles of HvAPETALA2 and microRNA172 determines the density of grains on the barley inflorescence. Proc. Natl. Acad. Sci. U.S.A. 110, 16675–16680
25. Mejia-Guerra, M.K. et al. (2012) From plant gene regulatory grids to network dynamics. Biochim. Biophys. Acta Gene Regul. Mech. 1819, 454–465
26. Carroll, S.B. (2008) Evo-devo and an expanding evolutionary synthesis: a genetic theory of morphological evolution. Cell 134, 25–36

Which transcription factors and gene regulatory networks impinge on the CREs associated with domestication traits? What is the minimal amount of knowledge about a gene regulatory network needed to engineer a CRE in a rational way?

How does CRE sequence variation affect the binding strength with the transcription factor? Can we use this knowledge to engineer the spatiotemporal expression of a target gene in a quantitative way?


27. Streisfeld, M.A. and Rausher, M.D. (2011) Population genetics, pleiotropy, and the preferential fixation of mutations during adaptive evolution. Evolution 65, 629–642
28. Lenser, T. and Theißen, G. (2013) Molecular mechanisms involved in convergent crop domestication. Trends Plant Sci. 18, 704–714
29. Wessinger, C.A. and Rausher, M.D. (2012) Lessons from flower colour evolution on targets of selection. J. Exp. Bot. 63, 5741–5749
30. Wittkopp, P.J. and Kalay, G. (2011) Cis-regulatory elements: molecular mechanisms and evolutionary processes underlying divergence. Nat. Rev. Genet. 13, 59–69
31. Pan, I.L. et al. (2010) Functional diversification of AGAMOUS lineage genes in regulating tomato flower and fruit development. J. Exp. Bot. 61, 1795–1806
32. Zuo, J. et al. (2002) The WUSCHEL gene promotes vegetative-to-embryonic transition in Arabidopsis. Plant J. 30, 349–359
33. Wang, S. et al. (2012) Control of grain size, shape and quality by OsSPL16 in rice. Nat. Genet. 44, 950–954
34. Hoekstra, H.E. and Coyne, J.A. (2007) The locus of evolution: evo devo and the genetics of adaptation. Evolution 61, 995–1016
35. Wray, G.A. (2007) The evolutionary significance of cis-regulatory mutations. Nat. Rev. Genet. 8, 206–216
36. Kaufmann, K. et al. (2010) Chromatin immunoprecipitation (ChIP) of plant transcription factors followed by sequencing (ChIP-SEQ) or hybridization to whole genome arrays (ChIP-CHIP). Nat. Protoc. 5, 457–472
37. Thomas-Chollier, M. et al. (2012) RSAT peak-motifs: motif analysis in full-size ChIP-seq datasets. Nucleic Acids Res. 40, e31
38. Franco-Zorrilla, J.M. et al. (2014) DNA-binding specificities of plant transcription factors and their potential to define target genes. Proc. Natl. Acad. Sci. U.S.A. 111, 2367–2372
39. Lindemose, S. et al. (2014) A DNA-binding-site landscape and regulatory network analysis for NAC transcription factors in Arabidopsis thaliana. Nucleic Acids Res. 42, 7681–7693
40. Weirauch, M.T. et al. (2014) Determination and inference of eukaryotic transcription factor sequence specificity. Cell 158, 1431–1443
41. Goossens, A. (2015) It is easy to get huge candidate gene lists for plant metabolism now, but how to get beyond? Mol. Plant 8, 2–5
42. Li, B. et al. (2014) Promoter-based integration in plant defense regulation. Plant Physiol. 166, 1803–1820
43. Taylor-Teeples, M. et al. (2014) An Arabidopsis gene regulatory network for secondary cell wall synthesis. Nature 517, 571–575
44. Shaikhali, J. et al. (2012) The CRYPTOCHROME1-dependent response to excess light is mediated through the transcriptional activators ZINC FINGER PROTEIN EXPRESSED IN INFLORESCENCE MERISTEM LIKE1 and ZML2 in Arabidopsis. Plant Cell 24, 3009–3025
45. Fujita, T. et al. (2013) Identification of telomere-associated molecules by engineered DNA-binding molecule-mediated chromatin immunoprecipitation (enChIP). Sci. Rep. 3, 3171
46. John, S. et al. (2013) Genome-scale mapping of DNase I hypersensitivity. Curr. Protoc. Mol. Biol. 27, 21.27
47. Sung, M-H. et al. (2014) DNase footprint signatures are dictated by factor dynamics and DNA sequence. Mol. Cell 56, 275–285
48. Zhang, W. et al. (2012) Genome-wide identification of regulatory DNA elements and protein-binding footprints using signatures of open chromatin in Arabidopsis. Plant Cell 24, 2719–2731
49. Sullivan, A.M. et al. (2014) Mapping and dynamics of regulatory DNA and transcription factor networks in A. thaliana. Cell Rep. 8, 2015–2030
50. Sullivan, A.M. et al. (2015) DNase I hypersensitivity mapping, genomic footprinting, and transcription factor networks in plants. Curr. Plant Biol. 3–4, 40–47
51. Pajoro, A. et al. (2014) Dynamics of chromatin accessibility and gene regulation by MADS-domain transcription factors in flower development. Genome Biol. 15, R41
52. Buenrostro, J.D. et al. (2015) ATAC-seq: a method for assaying chromatin accessibility genome-wide. Curr. Protoc. Mol. Biol. 109, 21.29
53. Van de Velde, J. et al. (2014) Inference of transcriptional networks in Arabidopsis through conserved noncoding sequence analysis. Plant Cell 26, 2729–2745
54. De Witte, D. et al. (2015) BLSSpeller: exhaustive comparative discovery of conserved cis-regulatory elements. Bioinformatics 31, 3758–3766
55. Varshney, R.K. et al. (2014) Harvesting the promising fruits of genomics: applying genome sequencing technologies to crop breeding. PLoS Biol. 12, e1001883
56. Canver, M.C. et al. (2015) BCL11A enhancer dissection by Cas9-mediated in situ saturating mutagenesis. Nature 527, 192–197
57. Kuscu, C. et al. (2014) Genome-wide analysis reveals characteristics of off-target sites bound by the Cas9 endonuclease. Nat. Biotechnol. 32, 677–683
58. Wu, X. et al. (2014) Genome-wide binding of the CRISPR endonuclease Cas9 in mammalian cells. Nat. Biotechnol. 32, 670–676
59. Ran, F.A. et al. (2015) In vivo genome editing using Staphylococcus aureus Cas9. Nature 520, 186–191
60. Steinert, J. et al. (2015) Highly efficient heritable plant genome engineering using Cas9 orthologues from Streptococcus thermophilus and Staphylococcus aureus. Plant J. 84, 1295–1305
61. Kleinstiver, B.P. et al. (2015) Engineered CRISPR-Cas9 nucleases with altered PAM specificities. Nature 523, 481–485
62. Schiml, S. et al. (2014) The CRISPR/Cas system can be used as nuclease for in planta gene targeting and as paired nickases for directed mutagenesis in Arabidopsis resulting in heritable progeny. Plant J. 80, 1139–1150
63. Wang, Y. et al. (2014) Simultaneous editing of three homoeoalleles in hexaploid bread wheat confers heritable resistance to powdery mildew. Nat. Biotechnol. 32, 947–951
64. Li, T. et al. (2012) High-efficiency TALEN-based gene editing produces disease-resistant rice. Nat. Biotechnol. 30, 390–392
65. Antony, G. et al. (2010) Rice xa13 recessive resistance to bacterial blight is defeated by induction of the disease susceptibility gene Os-11N3. Plant Cell 22, 3864–3876
66. Vierstra, J. et al. (2015) Functional footprinting of regulatory DNA. Nat. Methods 12, 927–930
67. Hou, J. et al. (2012) A Tourist-like MITE insertion in the upstream region of the BnFLC.A10 gene is associated with vernalization requirement in rapeseed (Brassica napus L.). BMC Plant Biol. 12, 238
68. Butelli, E. et al. (2012) Retrotransposons control fruit-specific, cold-dependent accumulation of anthocyanins in blood oranges. Plant Cell 24, 1242–1255
69. Liu, B. et al. (2010) The soybean stem growth habit gene Dt1 is an ortholog of Arabidopsis TERMINAL FLOWER1. Plant Physiol. 153, 198–210
70. Espley, R.V. et al. (2009) Multiple repeats of a promoter segment causes transcription factor autoregulation in red apples. Plant Cell 21, 168–183
71. Zhu, Z. et al. (2013) Genetic control of inflorescence architecture during rice domestication. Nat. Commun. 4, 2200
72. Li, Y. et al. (2011) Natural variation in GS5 plays an important role in regulating grain size and yield in rice. Nat. Genet. 43, 1266–1269
73. Xu, C. et al. (2015) Differential expression of GS5 regulates grain size in rice. J. Exp. Bot. 66, 2611–2623
74. Konishi, S. et al. (2006) An SNP caused loss of seed shattering during rice domestication. Science 312, 1392–1396
75. Lu, L. et al. (2012) Evolution and association analysis of Ghd7 in rice. PLoS ONE 7, e34021
76. Wang, E. et al. (2008) Control of rice grain-filling and yield by a gene with a potential signature of domestication. Nat. Genet. 40, 1370–1374
77. Oikawa, T. et al. (2015) The birth of a black rice gene and its local spread by introgression. Plant Cell 27, 2401–2414
78. Wang, L. et al. (2014) Regulatory change at Physalis Organ Size 1 correlates to natural variation in tomatillo reproductive organ size. Nat. Commun. 5, 4271
79. Frary, A. et al. (2000) fw2.2: a quantitative trait locus key to the evolution of tomato fruit size. Science 289, 85–88


80. Cong, B. et al. (2002) Natural alleles at a tomato fruit size quantitative trait locus differ by heterochronic regulatory mutations. Proc. Natl. Acad. Sci. U.S.A. 99, 13606–13611
81. Chakrabarti, M. et al. (2013) A cytochrome P450 regulates a domestication trait in cultivated tomato. Proc. Natl. Acad. Sci. U.S.A. 110, 17125–17130
82. Nakamura, S. et al. (2011) A wheat homolog of MOTHER OF FT AND TFL1 acts in the regulation of germination. Plant Cell 23, 3215–3229
83. Zhang, J. et al. (2012) A single nucleotide polymorphism at the Vrn-D1 promoter region in common wheat is associated with vernalization response. Theor. Appl. Genet. 125, 1697–1704
84. This, P. et al. (2007) Wine grape (Vitis vinifera L.) color associates with allelic variation in the domestication gene VvmybA1. Theor. Appl. Genet. 114, 723–730
85. Studer, A. et al. (2011) Identification of a functional transposon insertion in the maize domestication gene tb1. Nat. Genet. 43, 1160–1163
86. Yang, Q. et al. (2013) CACTA-like transposable element in ZmCCT attenuated photoperiod sensitivity and accelerated the postdomestication spread of maize. Proc. Natl. Acad. Sci. U.S.A. 110, 16969–16974
87. Han, J-J. et al. (2012) Pod corn is caused by rearrangement at the Tunicate1 locus. Plant Cell 24, 2733–2744
88. Wingen, L.U. et al. (2012) Molecular genetic basis of pod corn (Tunicate maize). Proc. Natl. Acad. Sci. U.S.A. 109, 7115–7120
89. Salvi, S. et al. (2007) Conserved noncoding genomic sequences associated with a flowering-time quantitative trait locus in maize. Proc. Natl. Acad. Sci. U.S.A. 104, 11376–11381


90. Wills, D.M. et al. (2013) From many, one: genetic control of prolificacy during maize domestication. PLoS Genet. 9, e1003604
91. Miwa, H. et al. (2009) Plant meristems: CLAVATA3/ESR-related signaling in the shoot apical meristem and the root apical meristem. J. Plant Res. 122, 31–39
92. Sun, B. and Ito, T. (2015) Regulation of floral stem cell termination in Arabidopsis. Front. Plant Sci. 6, 17
93. Lenhard, M. et al. (2001) Termination of stem cell maintenance in Arabidopsis floral meristems by interactions between WUSCHEL and AGAMOUS. Cell 105, 805–814
94. Lohmann, J.U. et al. (2001) A molecular link between stem cell regulation and floral patterning in Arabidopsis. Cell 105, 793–803
95. Liu, X. et al. (2011) AGAMOUS terminates floral stem cell maintenance in Arabidopsis by directly repressing WUSCHEL through recruitment of Polycomb Group proteins. Plant Cell 23, 3654–3670
96. Fernandez-Pozo, N. et al. (2015) The Sol Genomics Network (SGN) – from genotype to phenotype to breeding. Nucleic Acids Res. 43, D1036–D1041
97. Klein, J. et al. (1996) A new family of DNA binding proteins includes putative transcriptional regulators of the Antirrhinum majus floral meristem identity gene SQUAMOSA. Mol. Gen. Genet. 250, 7–16
98. Liang, X. et al. (2008) Identification of a consensus DNA-binding site for the Arabidopsis thaliana SBP domain transcription factor, AtSPL14, and binding kinetics by surface plasmon resonance. Biochemistry 47, 3645–3653
