Accepted Manuscript
A recurring syndrome of accelerated plastid genome evolution in the angiosperm tribe Sileneae (Caryophyllaceae)
Daniel B. Sloan, Deborah A. Triant, Nicole J. Forrester, Laura M. Bergner, Martin Wu, Douglas R. Taylor
PII: S1055-7903(13)00439-9
DOI: http://dx.doi.org/10.1016/j.ympev.2013.12.004
Reference: YMPEV 4771
To appear in: Molecular Phylogenetics and Evolution
Received Date: 7 September 2013
Revised Date: 5 December 2013
Accepted Date: 17 December 2013
Please cite this article as: Sloan, D.B., Triant, D.A., Forrester, N.J., Bergner, L.M., Wu, M., Taylor, D.R., A recurring syndrome of accelerated plastid genome evolution in the angiosperm tribe Sileneae (Caryophyllaceae), Molecular Phylogenetics and Evolution (2013), doi: http://dx.doi.org/10.1016/j.ympev.2013.12.004
This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
A recurring syndrome of accelerated plastid genome evolution in the angiosperm tribe Sileneae (Caryophyllaceae)
Daniel B. Sloan (a,*), Deborah A. Triant (b), Nicole J. Forrester (b), Laura M. Bergner (b), Martin Wu (b), Douglas R. Taylor (b)
(a) Department of Biology, Colorado State University, Fort Collins, CO, 80523, United States
(b) Department of Biology, University of Virginia, Charlottesville, VA, 22904, United States
*Corresponding Author:
[email protected] 970.491.2256
Keywords: chloroplast genome, intron loss, inversions, mutation rate, positive selection
ABSTRACT
In flowering plants, plastid genomes are generally conserved, exhibiting slower rates of sequence evolution than the nucleus and little or no change in structural organization. However, accelerated plastid genome evolution has occurred in scattered angiosperm lineages. For example, some species within the genus Silene have experienced a suite of recent changes to their plastid genomes, including inversions, shifts in inverted repeat boundaries, large indels, intron losses, and rapid rates of amino acid sequence evolution in a subset of protein genes, with the most extreme divergence occurring in the protease gene clpP. To investigate the relationship between the rates of sequence and structural evolution, we sequenced complete plastid genomes from three species (Silene conoidea, S. paradoxa, and Lychnis chalcedonica), representing independent lineages within the tribe Sileneae that were previously shown to have accelerated rates of clpP evolution. We found a high degree of parallel evolution. Elevated rates of amino acid substitution have occurred repeatedly in the same subset of plastid genes and have been accompanied by a recurring pattern of structural change, including cases of identical inversions and intron loss. This “syndrome” of changes was not observed in the closely related outgroup Agrostemma githago or in the more slowly evolving Silene species that were sequenced previously. Although no single mechanism has yet been identified to explain the correlated suite of changes in plastid genome sequence and structure that has occurred repeatedly in angiosperm evolution, we discuss a possible mixture of adaptive and non-adaptive forces that may be responsible.
1. INTRODUCTION
One of the fundamental challenges in molecular evolution is to understand the relationship between the rate of nucleotide substitution and the rate of evolution in genome structure. Although these rates are often correlated with each other, the mechanisms underlying the correlation remain incompletely understood (Hardison et al., 2003; Shao et al., 2003; Xu et al., 2006; Tian et al., 2008; Zhu et al., 2009). The history of plastid genome evolution across the flowering plants offers one example of an association between substitution rates and structural rearrangements (Jansen et al., 2007). In most angiosperm lineages, plastid gene order is identical, suggesting that the structural organization of the genome has been stable for more than 150 Myr (Raubeson and Jansen, 2005; Bell et al., 2010). However, scattered angiosperm lineages harbor extensively rearranged plastid genomes (Jansen et al., 2007). Although rates of nucleotide substitution are generally slow in angiosperm plastids (approximately three-fold lower than in the nucleus; Wolfe et al., 1987; Drouin et al., 2008), species with rearranged plastid genomes also experience frequent nucleotide substitutions (Jansen et al., 2007; Guisinger et al., 2008).
Sequencing of complete mitochondrial and plastid genomes from four closely related species in the genus Silene (Caryophyllaceae) has provided examples of recent and dramatic changes in rates of organelle genome evolution (Sloan et al., 2012a; Sloan et al., 2012b). Two of these species (S. conica and S. noctiflora) have experienced increased rates of sequence and structural evolution in both organelles. The plastid genomes in these species reveal a history of inversions, intron losses, shifts in inverted repeat (IR) boundaries, and large insertions and deletions (indels). They have also experienced accelerated rates of nucleotide substitution in a subset of protein-coding genes, particularly at nonsynonymous sites, resulting in rapid evolution of amino acid sequences. The most dramatic acceleration involves the clpP gene, which is almost unrecognizable at the nucleotide sequence level in S. conica and S. noctiflora (Erixon and Oxelman, 2008b; Sloan et al., 2012b). This gene codes for the ATP-dependent proteolytic subunit of the caseinolytic peptidase and is involved in protein metabolism within the plastid. A number of other plastid genes with non-photosynthetic functions are also very divergent in these two species, but genes encoding the core photosynthetic complexes remain highly conserved (Sloan et al., 2012b). In contrast to S. conica and S. noctiflora, two other species (S. latifolia and S. vulgaris) have maintained slowly evolving plastid genomes that are largely indistinguishable from the ancestral structural organization found in most angiosperms. All four of these Silene species are estimated to have diverged from each other in the last 5-10 Myr and are members of subgenus Behenantha (Mower et al., 2007; Frajman et al., 2009; Sloan et al., 2009). The phylogenetic relationships among the major lineages within this subgenus have proven difficult to resolve, so it is unclear whether the similar patterns of organelle genome evolution in S. conica and S. noctiflora are the result of independent evolutionary events or shared ancestry (Erixon and Oxelman, 2008a; Sloan et al., 2009; Rautenberg et al., 2012; Sloan et al., 2012b).
Broader sampling of clpP evolution in the tribe Sileneae has shown that this gene has been subject to independent accelerations within Silene and in the genus Lychnis (Erixon and Oxelman, 2008b), but very little is known about plastid evolution at the whole-genome level in these cases. Here, we report the sequencing of complete plastid genomes from three species, representing three independent clades with highly divergent clpP sequences: Lychnis (L. chalcedonica), Silene subgenus Behenantha (S. conoidea), and Silene subgenus Silene (S. paradoxa; note that, although clpP sequence was not previously available from S. paradoxa, it was expected to have an accelerated evolutionary rate for this gene based on the extreme divergence that was documented in the closely related species S. fruticosa; Erixon and Oxelman, 2008b; Sloan et al., 2009). We also sequenced the complete plastid genome from the outgroup Agrostemma githago, which is the sister lineage to the rest of the Sileneae and appears to have maintained the slow, ancestral rate of clpP sequence evolution (Erixon and Oxelman, 2008a). These genomes reveal remarkable parallelism in a suite of changes in sequence and structural evolution that has occurred repeatedly in independent Sileneae lineages.
2. MATERIALS AND METHODS
2.1 Plastid DNA Extraction and Sequencing
Seeds from a single maternal plant for each of four target species (Table 1) were germinated and grown in the greenhouse with supplemental lighting on a 16-hr/8-hr light-dark cycle and regular watering and fertilization. For each species, 200 g of leaf tissue was harvested from multiple maternal siblings and disrupted and filtered using standard protocols. Chloroplasts were then isolated by differential centrifugation followed by separation on discontinuous sucrose gradients (Palmer, 1986; Jansen et al., 2005). Isolated chloroplasts were lysed and plastid DNA was purified by phenol:chloroform extraction. DNA samples were used for construction of paired-end sequencing libraries with the NEBNext DNA Sample Prep Reagent set 1 (New England Biolabs) and Illumina TruSeq adapters. To generate sufficient material for library construction, the A. githago DNA sample was amplified with GenomiPhi V2 (GE Healthcare). Libraries were sequenced as part of a multiplexed 2×101 bp lane on an Illumina HiSeq2000 at the University of Minnesota’s Biomedical Genomics Center.
2.2 Plastid Genome Assembly and Annotation
Illumina sequencing generated a minimum of 112,000 plastid read pairs per library, providing deep coverage of the plastid genomes. A subset of 50,000 read pairs from each library was assembled with Velvet v1.1.06 using a k-mer of 51 and an expected insert length of 350 (Zerbino and Birney, 2008). Assembly gaps were closed by manually inspecting contig ends in Tablet v1.12.12.05 (Milne et al., 2010) and performing additional assemblies of a larger number of reads with either Velvet or MIRA v3.4.0 (Chevreux et al., 1999). Misassemblies and sequencing errors were identified and corrected by mapping all reads from each library to the closed genome sequence with SOAP v2.21 (Li et al., 2009). Finished genomes were annotated using DOGMA (Wyman et al., 2004) and deposited in GenBank (Table 1). Plastid genome maps were generated with OGDraw (Lohse et al., 2013). Repetitive content was identified by comparing each genome to itself with NCBI-BLASTN+ v2.2.24 (MEGABLAST), using a word size of 7 and an E-value threshold of 1e-6 (Zhang et al., 2000). Tandem repeats were also identified with Phobos v3.3.12 (C. Mayer, http://www.ruhr-uni-bochum.de/ecoevo/cm/cm_phobos.htm). This analysis was restricted to sequences of at least 20 bp in length, containing two or more copies of a perfect repeating unit from 2 to 40 bp in length. One copy of the large IR was removed from each genome prior to repeat analyses.
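As a rough illustration of the tandem-repeat criteria stated above (perfect repeats, unit length of 2 to 40 bp, at least two copies, total length of at least 20 bp), the following Python sketch scans a sequence for such repeats. It is not Phobos and does not claim to reproduce its algorithm or output; the function name, the scanning strategy, and the return format are hypothetical choices made for this example.

```python
def find_perfect_tandem_repeats(seq, min_unit=2, max_unit=40,
                                min_copies=2, min_len=20):
    """Return (start, unit_length, whole_copies) for perfect tandem repeats.

    Illustrative only: the thresholds mirror the text, but the scanning
    strategy is an assumption and is not the Phobos algorithm.
    """
    seq = seq.upper()
    n = len(seq)
    hits = []      # reported repeats
    covered = []   # (start, end) spans already reported with a shorter unit
    for unit in range(min_unit, max_unit + 1):
        i = 0
        while i + unit <= n:
            # Extend while each base equals the base one unit upstream.
            j = i + unit
            while j < n and seq[j] == seq[j - unit]:
                j += 1
            span = j - i
            if span // unit >= min_copies and span >= min_len:
                # Skip spans nested in a repeat already found with a shorter
                # unit (an AT repeat would otherwise reappear as ATAT).
                if not any(s <= i and j <= e for s, e in covered):
                    hits.append((i, unit, span // unit))
                    covered.append((i, j))
                i = j
            else:
                i += 1
    return hits


if __name__ == "__main__":
    # A 24-bp AT dinucleotide repeat flanked by unrelated bases.
    print(find_perfect_tandem_repeats("GGG" + "AT" * 12 + "CCC"))
```

A 10-bp repeat such as five copies of AT falls below the 20-bp length threshold and is not reported by this sketch.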
2.3 Phylogenetic and Substitution Rate Analysis
Protein-coding gene sequences were extracted from each of the newly sequenced plastid genomes and from previously published genomes from four additional Silene species (Sloan et al., 2012b). Plastid genomes from Spinacia oleracea (Schmitz-Linneweber et al., 2001) and Arabidopsis thaliana (Sato et al., 1999) were also included as outgroups. Extracted sequences were aligned in frame using MUSCLE v3.7 (Edgar, 2004). Unalignable sequences at the 5’ and 3’ ends of genes were manually removed, and matK alignments were modified to accommodate previously identified frameshifts in S. conica and S. paradoxa (Sloan et al., 2009). Structural divergence in accD was too extensive to provide a useful alignment of all species, so this gene was not analyzed further. The rapid structural evolution of accD within Silene has been described previously (Sloan et al., 2012b).
To infer the phylogenetic relationships among the sampled species, protein-coding genes were concatenated into a single alignment. Sequences from ribosomal protein genes, accD, clpP, ycf1, and ycf2 were excluded from this concatenation, because they were known to exhibit dramatic variation in rates of nucleotide substitution among species within the Sileneae (Erixon and Oxelman, 2008b; Sloan et al., 2012b). Analyses were performed both with and without third codon positions in the alignment because of the potential influence of homoplasy at these sites (Jeffroy et al., 2006). The resulting alignments contained 15,024 codons and were analyzed with RAxML v7.4.4, using the GTRGAMMA model (Stamatakis, 2006). Bipartition support for each node in the inferred phylogeny was assessed with 100 bootstrap replicate alignments (analyzed with the “-f T” option in RAxML). The same concatenated alignments were also analyzed with MrBayes v3.2.1, using a GTR+Gamma+I model (Ronquist et al., 2012). Independent chains were run for 500,000 generations with trees sampled every 1000 generations. The first 25% of the sampled tree-set was discarded as burn-in.
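Two of the steps described above are simple enough to sketch in Python: removing third codon positions from an in-frame alignment row, and counting how many sampled trees a 25% burn-in discards. This is a minimal illustration, not the authors' pipeline. The function names are hypothetical, and the sample count assumes that the starting tree at generation 0 is also recorded, which may not match the settings actually used.

```python
def drop_third_codon_positions(row):
    """Keep codon positions 1 and 2 from one in-frame alignment row."""
    if len(row) % 3 != 0:
        raise ValueError("alignment row length must be a multiple of 3")
    return "".join(row[i:i + 2] for i in range(0, len(row), 3))


def burn_in_samples(generations, sample_every, burn_in_fraction):
    """Return (total sampled trees, trees discarded as burn-in).

    Assumption: one tree is recorded at generation 0 and then one
    every `sample_every` generations.
    """
    n_samples = generations // sample_every + 1
    n_discarded = int(n_samples * burn_in_fraction)
    return n_samples, n_discarded


if __name__ == "__main__":
    print(drop_third_codon_positions("ATGGCCTAA"))
    print(burn_in_samples(500_000, 1000, 0.25))
```

Under these assumptions, the run settings given in the text correspond to 501 sampled trees per chain, of which 125 would be discarded.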
To analyze rates of nucleotide substitution, individual genes were concatenated into a single alignment for each multi-subunit complex, including ATP synthase, cytochrome b6f, NADH-plastoquinone oxidoreductase, photosystem I (including ycf3 and ycf4), photosystem II, RNA polymerase, and the small and large ribosomal subunits. All other genes were analyzed individually. In L. chalcedonica, which contains two distinct clpP genes, the Lc1 copy was used for rate analyses. The level of nonsynonymous (dN) and synonymous (dS) sequence divergence for each gene or concatenated gene set was estimated for a constrained topology with PAML v4.7 (Yang, 2007). Nodes that were not clearly resolved in unconstrained phylogenetic analyses (Fig. 1) were left as polytomies in the constrained topology. Codon frequencies were determined with an F3×4 model and dN/dS ratios were estimated separately for each branch. Relative ratio tests were used to identify differences among genes or concatenated gene sets in the proportionality of branch lengths (Muse and Gaut, 1997). These tests utilize likelihood ratios to accept or reject the null hypothesis that any two trees differ by only a proportional scaling of their respective branch lengths. Tests were implemented in HyPhy v2.1020120320beta(MP) for all pairwise gene combinations and performed separately for synonymous and nonsynonymous divergence (Pond et al., 2005), using an MG94×HKY85 model of evolution with codon frequencies estimated with an F3×4 model.
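The likelihood-ratio logic of the relative ratio test described above can be written out as follows. This is a generic sketch rather than the exact HyPhy formulation: the symbols (B for the number of branches in the shared topology, c for the scaling factor, L_0 and L_1 for the maximized likelihoods under the null and alternative hypotheses) are notation introduced here, and the degrees of freedom follow from ordinary parameter counting (2B free branch lengths under the alternative versus B + 1 under the null), which is an assumption about how the test was configured. The block uses the amsmath package.

```latex
% H0: branch lengths of gene 2 are proportional to those of gene 1,
%     b_i^{(2)} = c \, b_i^{(1)}  for every branch i = 1, \dots, B
% H1: both sets of branch lengths are estimated freely
\[
  \Lambda = 2\left[\ln L_{1} - \ln L_{0}\right],
  \qquad
  \Lambda \;\overset{H_0}{\sim}\; \chi^{2}_{B-1}
  \quad \text{(approximately)}
\]
```

A large value of the statistic relative to the chi-square reference distribution indicates that the two genes differ in more than a uniform rescaling of branch lengths.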
3. RESULTS
3.1 Complete Plastid Genomes
Paired-end Illumina sequencing of purified plastid DNA produced deep coverage (>100×), enabling de novo assembly of complete plastid genomes for four species within the angiosperm tribe Sileneae (Tables 1 and 2). Each of the four genomes was assembled into a conventional circular map consisting of two single-copy regions separated by a pair of large inverted repeats (Fig. S1). Total genome sizes ranged from 147.9 to 151.7 kb, which is consistent with results from other Sileneae species (Sloan et al., 2012b).
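The reported coverage can be checked with back-of-the-envelope arithmetic from numbers given in the text (at least 112,000 plastid read pairs, 2×101 bp reads, genomes of at most 151.7 kb). The Python sketch below is only that, a sanity check: it assumes every read is full length and maps to the plastid genome, and the function name is hypothetical.

```python
def expected_coverage(read_pairs, read_length, genome_size):
    """Mean depth = total sequenced bases / genome size.

    Assumes full-length reads that all map to the genome.
    """
    return read_pairs * 2 * read_length / genome_size


if __name__ == "__main__":
    # Minimum library size against the largest reported genome (151.7 kb).
    print(round(expected_coverage(112_000, 101, 151_700), 1))
    # The 50,000-pair subset used for the initial Velvet assemblies.
    print(round(expected_coverage(50_000, 101, 151_700), 1))
```

Under these assumptions, even the smallest library yields roughly 149× depth on the largest genome, consistent with the stated >100× coverage.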
3.2 Phylogenetic relationships
Phylogenetic analysis of plastid genome sequences provided strong support for a number of relationships among the sampled species (Fig. 1). There are two parts of the tree, however, that remain incompletely resolved. The first is the relationship among L. chalcedonica and the two Silene subgenera. We found some evidence that S. paradoxa (subgenus Silene) is more closely related to L. chalcedonica than to the other Silene species (subgenus Behenantha), which is consistent with previous conclusions that the genus Lychnis may be nested within Silene (Erixon and Oxelman, 2008a; Sloan et al., 2009; Greenberg and Donoghue, 2011). Our analysis also failed to confidently resolve the relationships among the major lineages within Silene subgenus Behenantha. Analysis of first and second codon positions produced weak evidence for a sister relationship between S. noctiflora and the two representatives of section Conoimorpha (S. conica and S. conoidea), but this received very little statistical support in the maximum likelihood and Bayesian analyses (Fig. 1b). In contrast, analysis of all codon positions placed S. noctiflora sister to the rest of the subgenus Behenantha (Fig. 1a).
3.3 Gene and Intron Content
The newly sequenced plastid genomes contain a set of genes (Table 2) that is essentially identical to the previously described plastid gene content in Silene, encoding 77 proteins, 30 tRNAs, and 4 rRNAs (Sloan et al., 2012b). The only clear differences among species result from the gain or loss of duplicated genes. Most changes in gene copy number reflect shifts in the boundaries of the large IR (see Inverted Repeat Boundary Shifts). However, L. chalcedonica contains two divergent clpP copies (82.7% nucleotide identity) that are located outside the IR and correspond to the Lc1 and Lc2 copies that were reported previously in this species (Erixon and Oxelman, 2008b). The Lc2 copy was initially described as a pseudogene because of a 1-bp frameshift deletion, but the gene is intact in our L. chalcedonica sample. Interestingly, although the Lc1 and Lc2 copies of clpP are divergent across most of their lengths, they share a 207-bp internal stretch with 100% identity, suggesting occasional gene conversion between the copies. Even outside of the region affected by recent gene conversion, these two copies form a sister group in phylogenetic analysis (data not shown), suggesting that the gene duplication occurred after the divergence of Lychnis (Erixon and Oxelman, 2008b). We found no evidence of the previously reported Lc3 and Lc4 clpP fragments in the L. chalcedonica plastid genome (Erixon and Oxelman, 2008b), indicating that they were either amplified from nuclear or mitochondrial DNA in the earlier study or that they are present in only a subset of individuals within the species.
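The two summary statistics quoted above for the clpP copies (overall nucleotide identity and the length of an internal stretch with 100% identity) can be computed from a pairwise alignment along the following lines. This Python sketch is illustrative only. How the reported values were actually calculated is not stated, so the treatment of gaps here (gapped columns are excluded from the identity calculation and break an identical run) is an assumption, and the function names are hypothetical.

```python
def percent_identity(a, b):
    """Percent identical bases over aligned columns without gaps."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


def longest_identical_run(a, b):
    """Length of the longest run of consecutive identical, ungapped columns."""
    best = run = 0
    for x, y in zip(a, b):
        if x == y and x != "-":
            run += 1
            best = max(best, run)
        else:
            run = 0  # a mismatch or a gap ends the current run
    return best


if __name__ == "__main__":
    # Toy aligned sequences, not real clpP data.
    seq1, seq2 = "ACGT-A", "ACCTTA"
    print(percent_identity(seq1, seq2), longest_identical_run(seq1, seq2))
```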
The sequence of the A. githago plastid genome confirms that the common ancestor of the Sileneae contained a total of 20 plastid introns (Sloan et al., 2012b), but L. chalcedonica, S. conoidea, and S. paradoxa have all experienced subsequent intron losses (Fig. 2). Similar to S. noctiflora and S. conica, these three species have each lost the two introns in the fast-evolving clpP gene (Erixon and Oxelman, 2008b; Sloan et al., 2012b). In addition, S. conoidea lacks the rpoC1 intron. This intron is also absent from S. conica, suggesting that it was lost early on in the evolution of Silene section Conoimorpha. Notably, however, S. conoidea has retained the atpF intron, indicating that it must have been lost very recently from S. conica (Fig. 2).
3.4 Chromosomal Inversions
The A. githago plastid genome has maintained the same gene order found in S. latifolia, S. vulgaris, and the inferred ancestor of all angiosperms (Raubeson and Jansen, 2005; Sloan et al., 2012b). In contrast, L. chalcedonica, S. conoidea, and S. paradoxa have each undergone genome rearrangements (Fig. 2). All three of these species have experienced an inversion with the same breakpoints that were described previously in S. conica (Sloan et al., 2012b). These inversions appear to be the result of recombination between a small pair of repeats (ca. 170 bp) that are present in all sequenced Sileneae plastid genomes, including the outgroup A. githago, but have only resulted in rearrangements in the L. chalcedonica, S. conica, S. conoidea, and S. paradoxa lineages. This pair of repeats share approximately 80% sequence identity in each species except S. paradoxa, in which they are identical and were presumably homogenized by gene conversion. The L. chalcedonica plastid genome also harbors an additional inversion with a unique pair of intergenic breakpoints (ycf3-trnS and rbcL-accD) that have not been observed in other species within the Sileneae (Fig. 2).
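The effect of an inversion on gene order can be made concrete with a small Python sketch. A genome segment is modeled as a list of (gene, strand) pairs, and an inversion between two breakpoints reverses the order of the enclosed genes and flips their strands. The gene labels below are placeholders rather than the actual genes flanking the Sileneae breakpoints, and the function name is hypothetical.

```python
def invert(order, start, end):
    """Invert order[start:end]: reverse the block and flip each strand.

    `order` is a list of (gene, strand) tuples with strand +1 or -1.
    """
    flipped = [(gene, -strand) for gene, strand in reversed(order[start:end])]
    return order[:start] + flipped + order[end:]


if __name__ == "__main__":
    # Placeholder gene labels, not real plastid gene names.
    ancestral = [("g1", 1), ("g2", 1), ("g3", -1), ("g4", 1)]
    derived = invert(ancestral, 1, 3)
    print(derived)
    # Applying the same inversion again restores the ancestral order,
    # which is why identical breakpoints alone cannot show how many
    # times an inversion has occurred without phylogenetic context.
    print(invert(derived, 1, 3) == ancestral)
```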
3.5 Repeat Content
The sequenced plastid genomes differ substantially in repetitive content. After excluding the large IR, each of the rearranged genomes (L. chalcedonica, S. paradoxa, S. conoidea, S. conica, and S. noctiflora) had more total repetitive sequence and more tandem repeats than each of the genomes with conserved gene synteny (Table 3). Notably, tandem repeats were found in a number of genes that exhibit elevated substitution rates such as clpP, ycf1, and ycf2 (see section 3.7), in addition to introns and intergenic regions (data not shown).
3.6 Inverted Repeat Boundary Shifts
All Sileneae plastid genomes sequenced to date retain the typical IR structures, but IR location has changed substantially in some species with movement of the boundaries between repeat and single-copy regions. The outgroup A. githago exhibits boundary positions that are nearly identical to those previously reported in the slowly evolving plastid genomes of S. latifolia and S. vulgaris, suggesting that they represent the ancestral state for the tribe (Fig. 3). In contrast, other Sileneae species have experienced substantial shifts in boundary positions, resulting in the gain or loss of entire genes from the IR (Fig. 3).
3.7 Rates of Sequence Evolution
The plastid genome sequences of L. chalcedonica, S. conoidea, and S. paradoxa confirm that the clpP gene has experienced multiple independent increases in substitution rate within the Sileneae (Erixon and Oxelman, 2008b; Sloan et al., 2012b) and reveal that correlated accelerations have also occurred in some other plastid genes, including ycf1, ycf2, and ribosomal protein genes (Fig. 4). Although accelerations have affected both synonymous and nonsynonymous sites (Table S1), they are most pronounced at nonsynonymous sites, resulting in large increases in dN/dS ratios that point to altered selection pressures (Fig. 2). In many cases, dN/dS ratios are substantially greater than 1 (Table S1). In contrast, plastid genes encoding core components of the photosynthetic machinery show little evidence of acceleration in the Sileneae (Fig. 4). Relative ratio tests confirmed that there is a strong interaction between genes and species in the rate of sequence evolution. In particular, trees derived from clpP, ycf1, and ycf2 exhibited highly significant conflicts in branch-length proportionality with all other plastid genes (Fig. 5).
4. DISCUSSION
Our results demonstrate that the repeated rate accelerations in a single plastid gene (clpP) observed across the tribe Sileneae extend to a genome-wide syndrome of parallel evolution in plastid DNA sequence and structure (Erixon and Oxelman, 2008b). Although we cannot determine whether the similar patterns of organelle genome divergence in S. conica and S. noctiflora represent independent evolutionary events or changes in a common ancestor, we have found that parallel accelerations have occurred within Lychnis and two Silene subgenera (Behenantha and Silene). Although L. chalcedonica and S. paradoxa appear to be sister taxa in our phylogenetic analysis, the results from broader sampling of cpDNA within the tribe Sileneae confirm that the shared inversions, intron losses, and substitution rate accelerations in these species occurred independently. Erixon and Oxelman (2008a, 2008b) reported that S. schafta and S. pseudoatocion (two species within the same subgenus as S. paradoxa) have slowly evolving clpP genes that retain their introns, indicating that the observed changes in this gene are not ancestral. Likewise, sequences from the psaI-ycf4 region confirm that S. schafta and S. pseudoatocion lack the inversion that is shared by S. paradoxa and L. chalcedonica.
Across deeper phylogenetic scales, similar combinations of structural instability and elevated rates of amino acid substitution in the same plastid genes have occurred in scattered angiosperm lineages (Jansen et al., 2007). The best-studied examples are found in the Geraniaceae, a family exhibiting many patterns of plastid genome evolution that are similar to our observations in the Sileneae (Chumley et al., 2006; Guisinger et al., 2008; Blazier et al., 2011; Guisinger et al., 2011; Weng et al., 2012).
These repeated accelerations in sequence and structural evolution in angiosperm plastid genomes suggest a common mechanism. There are many evolutionary forces, both adaptive and non-adaptive, that simultaneously affect genome sequence and structure. These include changes in rates of DNA damage, recombination, and repair or changes in the efficacy of selection resulting from reduced effective population size (Ne). But many of the simplest explanations involving a single mechanism seem inadequate in the face of the complex mixture of changes observed in many of these genomes.
One possibility is a disruption in the nuclear-encoded DNA replication, recombination, and repair machinery that regulates the plastid genome (Day and Madesis, 2007). Loss of genes involved in recombination and double-stranded break repair can have significant effects on plant organelle genome evolution. For example, genes such as MSH1 and RECA3 suppress recombination between short repeats, and loss of these genes causes extensive genome rearrangement (Shedge et al., 2007; Maréchal and Brisson, 2010; Xu et al., 2011). Interestingly, we found that identical inversion events occurred independently at least three times in the Sileneae (Fig. 2). This recurring inversion appears to be mediated by recombination between a pair of small, imperfect repeats. Although this pair of repeats is present throughout the Sileneae, it has not led to inversions in the more highly conserved plastid genomes of A. githago, S. latifolia, and S. vulgaris. This suggests that, in addition to differences in the amount of repetitive content (Table 3), variation in rates of plastid genome rearrangement may be driven by changes in recombinational activity of existing repeats.
Recombination is intimately related to mismatch repair, so changes or disruption in recombinational machinery may also affect point mutation rates. Although cases of “localized hypermutation” have been proposed in plant organelle genomes (Sloan et al., 2009; Magee et al., 2010), it is not clear why increases in underlying mutation rates would be concentrated in the same repeated subset of genes. In addition, differences in mutation rates cannot explain the highly disproportional increases in nonsynonymous substitution rate, which instead suggest that plastid genes are experiencing altered selection pressures: relaxed purifying selection, increased positive selection, or a combination of the two.
Many of the observed changes in plastid genome sequence and structure are consistent with reduced intensity and/or efficiency of natural selection. For example, inefficient selection resulting from a reduction in Ne could facilitate the accumulation of both genome rearrangements and changes in amino acid sequence. Reduced Ne could also explain differential rate accelerations across the genome if the distribution of fitness effects caused by nucleotide substitutions (i.e., the fitness spectrum) differs among genes, because a variable fraction of sites in each gene would be shifted into the nearly neutral category and become subject to drift (Ohta, 1992). Notably, some of the fastest-evolving genes in the Sileneae (accD, clpP, ycf1, and ycf2) have been lost entirely in other angiosperm lineages that exhibit parallels to the observed patterns of plastid genome evolution, including the Campanulaceae, Geraniaceae, Passifloraceae, and Poaceae (Jansen et al., 2007). This supports the interpretation that these genes could be subject to relaxed selection.
In and of itself, however, relaxed purifying selection generally cannot drive dN/dS above the neutral value of 1 (cf. Lawrie et al., 2011). Nevertheless, a handful of Sileneae plastid genes exhibit values that exceed 1, in some cases by very large margins (Table S1; Erixon and Oxelman, 2008b; Sloan et al., 2012b). Therefore, although most of the observed changes in plastid genome sequence and structure found in independent Sileneae lineages are consistent with relaxed selection, strong positive selection appears to play at least some role. A complete understanding of these recurring patterns of genomic change will likely involve multiple evolutionary mechanisms.
There are no obvious ecological or physiological differences among the sampled Sileneae species that would explain why their plastid genomes would be evolving under highly divergent selection pressures. However, we still have a very incomplete understanding of the diverse metabolic roles performed by plastids. None of the fast-evolving plastid genes in Sileneae have a known function that is directly related to photosynthesis. The enzymes encoded by clpP and accD are involved in protein metabolism and fatty acid biosynthesis, respectively (Peltier et al., 2004; Kode et al., 2005; Stanne et al., 2009). The two largest genes in the plastid genome, ycf1 and ycf2, are known to be essential for cell viability (Drescher et al., 2000), but only recently has there been any insight into the functional role of either gene, with the finding that ycf1 is associated with an inner membrane complex involved in protein translocation (Kikuchi et al., 2013). It has also been pointed out that the expression of fast-evolving non-photosynthetic genes is largely under the control of a different RNA polymerase than that of their photosynthetic counterparts, raising the possibility that interactions with transcriptional machinery might affect rates of molecular evolution in the plastid genome (Guisinger et al., 2008), though the precise mechanism behind such an effect remains unclear.
Another potential factor affecting rates of plastid genome evolution is the functional interactions among the three genomic compartments that exist in plant cells. The history of rapid mitochondrial evolution in S. conica, S. noctiflora, and members of the Geraniaceae (Parkinson et al., 2005; Sloan et al., 2012a) raised the possibility of a causal mechanism linking divergent evolutionary patterns in both organellar genomes (Sloan et al., 2012b). This correlation apparently does not extend to every case of plastid genome acceleration in the Sileneae. For example, rapid sequence evolution has been documented in some S. paradoxa mitochondrial genes, but those increases have not occurred on a genome-wide scale (Sloan et al., 2009). In addition, although L. chalcedonica exhibits a history of accelerated plastid genome evolution, its mitochondrial genome does not exhibit increased rates of nucleotide substitution (unpublished data). Therefore, if mitochondrial interactions play a role in the recurring “syndrome” of plastid genome evolution observed in Sileneae and other angiosperms, they must involve lineage-specific effects.
ACKNOWLEDGEMENTS
We would like to thank Luis Giménez and Kew Millennium Seed Bank for providing seeds for this project and Nichole Peterson and the University of Minnesota’s Biomedical Genomics Center for sequencing efforts. This work was supported by the National Science Foundation (MCB-1022128).
350 351
REFERENCES
352
Bell, C.D., Soltis, D.E., and Soltis, P.S., 2010. The age and diversification of the angiosperms re-
353 354 355 356
revisited. Am. J. Bot. 97, 1296-1303. Blazier, J.C., Guisinger, M.M., and Jansen, R.K., 2011. Recent loss of plastid-encoded ndh genes within Erodium (Geraniaceae). Plant Mol. Biol. 76, 263-272. Chevreux, B., Wetter, T., and Suhai, S., 1999. Genome sequence assembly using trace signals
357
and additional sequence information. Computer Science and Biology: Proceedings of the
358
German Conference on Bioinformatics (GCB) 99 45-56.
359
Chumley, T.W., Palmer, J.D., Mower, J.P., Fourcade, H.M., Calie, P.J., Boore, J.L., and Jansen,
360
R.K., 2006. The complete chloroplast genome sequence of Pelargonium x hortorum:
361
organization and evolution of the largest and most highly rearranged chloroplast genome
362
of land plants. Mol. Biol. Evol. 23, 2175.
17
363
Day, A. and Madesis, P., 2007. DNA replication, recombination and repair in plastids. In: Bock, R. (Ed.), Topics in current genetics: Cell & molecular biology of plastids. Springer, Berlin, Germany, pp. 65-119.
Drescher, A., Ruf, S., Calsa, T., Jr., Carrer, H., and Bock, R., 2000. The two largest chloroplast genome-encoded open reading frames of higher plants are essential genes. Plant J. 22, 97-104.
Drouin, G., Daoud, H., and Xia, J., 2008. Relative rates of synonymous substitutions in the mitochondrial, chloroplast and nuclear genomes of seed plants. Mol. Phylogenet. Evol. 49, 827-831.
Edgar, R.C., 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32, 1792-1797.
Erixon, P. and Oxelman, B., 2008a. Reticulate or tree-like chloroplast DNA evolution in Sileneae (Caryophyllaceae)? Mol. Phylogenet. Evol. 48, 313-325.
Erixon, P. and Oxelman, B., 2008b. Whole-gene positive selection, elevated synonymous substitution rates, duplication, and indel evolution of the chloroplast clpP1 gene. PLoS ONE 3, e1386.
Frajman, B., Eggens, F., and Oxelman, B., 2009. Hybrid origins and homoploid reticulate evolution within Heliosperma (Sileneae, Caryophyllaceae) – a multigene phylogenetic approach with relative dating. Systematic Biology 58, 328-345.
Greenberg, A.K. and Donoghue, M.J., 2011. Molecular systematics and character evolution in Caryophyllaceae. Taxon 60, 1637-1652.
Guisinger, M.M., Kuehl, J.V., Boore, J.L., and Jansen, R.K., 2008. Genome-wide analyses of Geraniaceae plastid DNA reveal unprecedented patterns of increased nucleotide substitutions. Proc. Natl. Acad. Sci. 105, 18424-18429.
Guisinger, M.M., Kuehl, J.V., Boore, J.L., and Jansen, R.K., 2011. Extreme reconfiguration of plastid genomes in the angiosperm family Geraniaceae: rearrangements, repeats, and codon usage. Mol. Biol. Evol. 28, 583-600.
Hardison, R.C., Roskin, K.M., Yang, S., Diekhans, M., Kent, W.J., Weber, R., Elnitski, L., Li, J., O'Connor, M., Kolbe, D., et al., 2003. Covariation in frequencies of substitution, deletion, transposition, and recombination during eutherian evolution. Genome Res. 13, 13-26.
Jansen, R.K., Cai, Z., Raubeson, L.A., and Daniell, H., 2007. Analysis of 81 genes from 64 plastid genomes resolves relationships in angiosperms and identifies genome-scale evolutionary patterns. Proc. Natl. Acad. Sci. 104, 19369.
Jansen, R.K., Raubeson, L.A., Boore, J.L., dePamphilis, C.W., Chumley, T.W., Haberle, R.C., Wyman, S.K., Alverson, A.J., Peery, R., Herman, S.J., et al., 2005. Methods for obtaining and analyzing whole chloroplast genome sequences. Methods Enzymol. 395, 348-384.
Jeffroy, O., Brinkmann, H., Delsuc, F., and Philippe, H., 2006. Phylogenomics: the beginning of incongruence? Trends Genet. 22, 225-231.
Kikuchi, S., Bedard, J., Hirano, M., Hirabayashi, Y., Oishi, M., Imai, M., Takase, M., Ide, T., and Nakai, M., 2013. Uncovering the protein translocon at the chloroplast inner envelope membrane. Science 339, 571-574.
Kode, V., Mudd, E.A., Iamtham, S., and Day, A., 2005. The tobacco plastid accD gene is essential and is required for leaf development. Plant J. 44, 237-244.
Lawrie, D.S., Petrov, D.A., and Messer, P.W., 2011. Faster than neutral evolution of constrained sequences: the complex interplay of mutational biases and weak selection. Genome Biol. Evol. 3, 383-395.
Li, R., Yu, C., Li, Y., Lam, T.W., Yiu, S.M., Kristiansen, K., and Wang, J., 2009. SOAP2: an improved ultrafast tool for short read alignment. Bioinformatics 25, 1966-1967.
Lohse, M., Drechsel, O., Kahlau, S., and Bock, R., 2013. OrganellarGenomeDRAW--a suite of tools for generating physical maps of plastid and mitochondrial genomes and visualizing expression data sets. Nucleic Acids Res. 41, W575-581.
Magee, A.M., Aspinall, S., Rice, D.W., Cusack, B.P., Sémon, M., Perry, A.S., Stefanović, S., Milbourne, D., Barth, S., Palmer, J.D., Gray, J.C., Kavanagh, T.A., and Wolfe, K.H., 2010. Localized hypermutation and associated gene losses in legume chloroplast genomes. Genome Res. 20, 1700-1710.
Maréchal, A. and Brisson, N., 2010. Recombination and the maintenance of plant organelle genome stability. New Phytol. 186, 299-317.
Milne, I., Bayer, M., Cardle, L., Shaw, P., Stephen, G., Wright, F., and Marshall, D., 2010. Tablet--next generation sequence assembly visualization. Bioinformatics 26, 401-402.
Mower, J.P., Touzet, P., Gummow, J.S., Delph, L.F., and Palmer, J.D., 2007. Extensive variation in synonymous substitution rates in mitochondrial genes of seed plants. BMC Evol. Biol. 7, 135.
Muse, S.V. and Gaut, B.S., 1997. Comparing patterns of nucleotide substitution rates among chloroplast loci using the relative ratio test. Genetics 146, 393-399.
Ohta, T., 1992. The nearly neutral theory of molecular evolution. Annu. Rev. Ecol. Syst. 23, 263-286.
Palmer, J.D., 1986. Isolation and structural analysis of chloroplast DNA. Methods Enzymol. 118, 167-186.
Parkinson, C.L., Mower, J.P., Qiu, Y.L., Shirk, A.J., Song, K., Young, N.D., DePamphilis, C.W., and Palmer, J.D., 2005. Multiple major increases and decreases in mitochondrial substitution rates in the plant family Geraniaceae. BMC Evol. Biol. 5, 73.
Peltier, J.B., Ripoll, D.R., Friso, G., Rudella, A., Cai, Y., Ytterberg, J., Giacomelli, L., Pillardy, J., and van Wijk, K.J., 2004. Clp protease complexes from photosynthetic and non-photosynthetic plastids and mitochondria of plants, their predicted three-dimensional structures, and functional implications. J. Biol. Chem. 279, 4768-4781.
Pond, S.L.K., Frost, S.D.W., and Muse, S.V., 2005. HyPhy: hypothesis testing using phylogenies. Bioinformatics 21, 676-679.
Raubeson, L.A. and Jansen, R.K., 2005. Chloroplast genomes of plants. In: Henry, R.J. (Ed.), Plant diversity and evolution: genotypic and phenotypic variation in higher plants. CABI, Wallingford, UK, pp. 45-68.
Rautenberg, A., Sloan, D.B., Aldén, V., and Oxelman, B., 2012. Phylogenetic relationships of Silene multinervia and Silene section Conoimorpha (Caryophyllaceae). Systematic Bot. 37, 226-237.
Ronquist, F., Teslenko, M., van der Mark, P., Ayres, D.L., Darling, A., Hohna, S., Larget, B., Liu, L., Suchard, M.A., and Huelsenbeck, J.P., 2012. MrBayes 3.2: efficient Bayesian phylogenetic inference and model choice across a large model space. Syst. Biol. 61, 539-542.
Sato, S., Nakamura, Y., Kaneko, T., Asamizu, E., and Tabata, S., 1999. Complete structure of the chloroplast genome of Arabidopsis thaliana. DNA Res. 6, 283-290.
Schmitz-Linneweber, C., Maier, R.M., Alcaraz, J.P., Cottet, A., Herrmann, R.G., and Mache, R., 2001. The plastid chromosome of spinach (Spinacia oleracea): complete nucleotide sequence and gene organization. Plant Mol. Biol. 45, 307-315.
Shao, R., Dowton, M., Murrell, A., and Barker, S.C., 2003. Rates of gene rearrangement and nucleotide substitution are correlated in the mitochondrial genomes of insects. Mol. Biol. Evol. 20, 1612-1619.
Shedge, V., Arrieta-Montiel, M., Christensen, A.C., and Mackenzie, S.A., 2007. Plant mitochondrial recombination surveillance requires unusual RecA and MutS homologs. Plant Cell 19, 1251-1264.
Sloan, D.B., Alverson, A.J., Chuckalovcak, J.P., Wu, M., McCauley, D.E., Palmer, J.D., and Taylor, D.R., 2012a. Rapid evolution of enormous, multichromosomal genomes in flowering plant mitochondria with exceptionally high mutation rates. PLoS Biol. 10, e1001241.
Sloan, D.B., Alverson, A.J., Wu, M., Palmer, J.D., and Taylor, D.R., 2012b. Recent acceleration of plastid sequence and structural evolution coincides with extreme mitochondrial divergence in the angiosperm genus Silene. Genome Biol. Evol. 306.
Sloan, D.B., Oxelman, B., Rautenberg, A., and Taylor, D.R., 2009. Phylogenetic analysis of mitochondrial substitution rate variation in the angiosperm tribe Sileneae (Caryophyllaceae). BMC Evol. Biol. 9, 260.
Stamatakis, A., 2006. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22, 2688-2690.
Stanne, T.M., Sjogren, L.L., Koussevitzky, S., and Clarke, A.K., 2009. Identification of new protein substrates for the chloroplast ATP-dependent Clp protease supports its constitutive role in Arabidopsis. Biochem. J. 417, 257-268.
Tian, D., Wang, Q., Zhang, P., Araki, H., Yang, S., Kreitman, M., Nagylaki, T., Hudson, R., Bergelson, J., and Chen, J.Q., 2008. Single-nucleotide mutation rate increases close to insertions/deletions in eukaryotes. Nature 455, 105-108.
Weng, M.-L., Ruhlman, T.A., Gibby, M., and Jansen, R.K., 2012. Phylogeny, rate variation, and genome size evolution of Pelargonium (Geraniaceae). Mol. Phylogenet. Evol. 64, 654-670.
Wolfe, K.H., Li, W.H., and Sharp, P.M., 1987. Rates of nucleotide substitution vary greatly among plant mitochondrial, chloroplast, and nuclear DNAs. Proc. Natl. Acad. Sci. 84, 9054-9058.
Wyman, S.K., Jansen, R.K., and Boore, J.L., 2004. Automatic annotation of organellar genomes with DOGMA. Bioinformatics 20, 3252-3255.
Xu, Y.Z., Arrieta-Montiel, M.P., Virdi, K.S., de Paula, W.B.M., Widhalm, J.R., Basset, G.J., Davila, J.I., Elthon, T.E., Elowsky, C.G., Sato, S.J., et al., 2011. MutS HOMOLOG1 is a nucleoid protein that alters mitochondrial and plastid properties and plant response to high light. Plant Cell 23, 3428-3441.
Xu, W., Jameson, D., Tang, B., and Higgs, P.G., 2006. The relationship between the rate of molecular evolution and the rate of genome rearrangement in animal mitochondrial genomes. J. Mol. Evol. 63, 375-392.
Yang, Z., 2007. PAML 4: Phylogenetic Analysis by Maximum Likelihood. Mol. Biol. Evol. 24, 1586-1591.
Zerbino, D.R. and Birney, E., 2008. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 18, 821-829.
Zhang, Z., Schwartz, S., Wagner, L., and Miller, W., 2000. A greedy algorithm for aligning DNA sequences. J. Comput. Biol. 7, 203-214.
Zhu, L., Wang, Q., Tang, P., Araki, H., and Tian, D., 2009. Genomewide association between insertions/deletions and the nucleotide diversity in bacteria. Mol. Biol. Evol. 26, 2353-2361.

Table 1. Source material for plastid genome sequencing

Species                  Source                               Collected/Provided By           GenBank Accession
Agrostemma githago L.    Giles County, VA, USA                Stephanie Goodrich              KF527884
Lychnis chalcedonica L.  Frankfurt Botanical Garden, Germany  Luis Giménez                    KF527886
Silene conoidea L.       Royal Botanic Gardens, Kew, UK       Millennium Seed Bank (0002015)  KF527885
Silene paradoxa L.       Lamole, Italy                        Michael Hood                    KF527887

Table 2. Summary of all sequenced plastid genomes in the tribe Sileneae

Species^a             Size (bp)  IR Size (bp)  GC Content (%)  Genes (protein/tRNA/rRNA)^b  Introns^b
Agrostemma githago    151,733    25,440        36.4            77 / 30 / 4                  20
Lychnis chalcedonica  148,081    23,540        36.3            78 / 30 / 4                  18
Silene paradoxa       151,632    25,454        36.6            77 / 30 / 4                  18
Silene conoidea       147,896    26,828        36.0            77 / 30 / 4                  17
Silene conica         147,208    26,858        36.1            77 / 30 / 4                  16
Silene noctiflora     151,639    29,891        36.5            77 / 30 / 4                  16
Silene latifolia      151,736    25,906        36.4            77 / 30 / 4                  20
Silene vulgaris       151,583    26,008        36.3            77 / 30 / 4                  20

^a Bolded species names indicate genomes that were sequenced for this study.
^b Gene and intron totals do not include identical duplicates in the IR, but the L. chalcedonica gene count does include both divergent clpP copies.

Table 3. Summary of repetitive sequence content in Sileneae plastid genomes

Species               Size (bp)^a  Repetitive Sequence (bp)^a  Tandem Repeats (bp)^a
Agrostemma githago    126,293      1238                        433
Lychnis chalcedonica  124,541      4234                        940
Silene paradoxa       126,178      2023                        1369
Silene conoidea       121,068      1433                        687
Silene conica         120,350      1550                        637
Silene noctiflora     121,748      2730                        1031
Silene latifolia      125,830      956                         344
Silene vulgaris       125,575      890                         548

^a Reported genome sizes and repeat contents exclude one copy of the large IR.
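
As a consistency check on the tabulated values, the sizes in Table 3 should equal the full genome sizes in Table 2 minus one IR copy, as stated in the Table 3 footnote. The following minimal Python sketch, which is not part of the original analysis, recomputes that difference from values transcribed from the two tables. The dictionary and function names are illustrative only.

```python
# Values transcribed from Tables 2 and 3 (all in bp).
# Tuple order: (full genome size, IR size, size reported in Table 3).
GENOMES = {
    "Agrostemma githago": (151_733, 25_440, 126_293),
    "Lychnis chalcedonica": (148_081, 23_540, 124_541),
    "Silene paradoxa": (151_632, 25_454, 126_178),
    "Silene conoidea": (147_896, 26_828, 121_068),
    "Silene conica": (147_208, 26_858, 120_350),
    "Silene noctiflora": (151_639, 29_891, 121_748),
    "Silene latifolia": (151_736, 25_906, 125_830),
    "Silene vulgaris": (151_583, 26_008, 125_575),
}


def size_without_one_ir(total_bp, ir_bp):
    """Genome length after removing one of the two IR copies."""
    return total_bp - ir_bp


# Collect any species whose recomputed size disagrees with Table 3.
mismatches = {
    species: (size_without_one_ir(total, ir), reported)
    for species, (total, ir, reported) in GENOMES.items()
    if size_without_one_ir(total, ir) != reported
}

for species, (total, ir, reported) in GENOMES.items():
    computed = size_without_one_ir(total, ir)
    print(f"{species}: {total} - {ir} = {computed} (Table 3: {reported})")
print("mismatches:", mismatches)
```

An empty mismatch dictionary indicates that the two tables agree for every species.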
FIGURE LEGENDS
Figure 1. Phylogenetic relationships within the tribe Sileneae inferred from a concatenation of all plastid protein-coding genes except accD, clpP, ycf1, ycf2, and ribosomal protein genes. Analyses are based on either all sites (A) or only first and second codon positions (B). Values at each node indicate bootstrap support and Bayesian posterior probabilities.
Figure 2. History of inversions and intron losses in the evolution of Sileneae plastid genomes. Values below each branch indicate dN/dS calculated for a concatenation of all plastid protein-coding genes (except accD). Although L. chalcedonica and S. paradoxa may form a sister group within this tree (Fig. 1), plastid gene sequences from related species indicate that the inversions and intron losses in these lineages were independent events (Erixon and Oxelman, 2008a). If S. noctiflora, S. conica, and S. conoidea form a monophyletic group (Fig. 1B), it is possible that some of their intron losses are the result of shared ancestral events.
Figure 3. Shift in IR boundaries. Thick lines represent IRs, and thin lines represent the adjacent large single-copy (LSC) and small single-copy (SSC) regions. Species with accelerated rates of sequence and structural evolution in the plastid genome are highlighted in black.
Figure 4. Nonsynonymous sequence divergence in plastid protein-coding genes and concatenated gene sets. Branch lengths in all trees are drawn to the same scale based on the number of nonsynonymous substitutions per site. Gray shading highlights species with accelerated rates of sequence and structural evolution in the plastid genome.
Figure 5. Summary of relative ratio tests between pairs of plastid genes and concatenated gene sets. Darker shading indicates stronger disproportionality in the relative branch lengths within the corresponding trees. Shading is based on the likelihood ratio test statistic. Values above 40 are significant at the α = 0.05 level after Bonferroni correction for 210 pairwise tests. Cells above and below the diagonal are based on nonsynonymous and synonymous sequence divergence, respectively.
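
The count of 210 tests in the Figure 5 legend is consistent with comparing every pair of the 15 genes and gene sets labeled on the Figure 5 axes, once for nonsynonymous and once for synonymous divergence. The short Python sketch below works through that arithmetic and the resulting per-test significance level. The decomposition into 2 x 105 comparisons is an inference from the figure labels rather than a statement in the text, and the variable names are illustrative.

```python
from math import comb

# Gene and gene-set labels as they appear on the Figure 5 axes.
GENE_SETS = [
    "ATP Synthase", "ccsA", "cemA", "matK", "NADH Dehydrogenase",
    "Cytochrome b6f", "Photosystem I", "Photosystem II", "rbcL",
    "RNA Polymerase", "Ribosomal Large Subunit", "Ribosomal Small Subunit",
    "ycf1", "ycf2", "clpP",
]

ALPHA = 0.05  # family-wise significance level given in the legend

# Unordered pairs of gene sets for one class of sites.
pairs_per_site_class = comb(len(GENE_SETS), 2)

# Assumption: each pair is tested twice, once on nonsynonymous and once
# on synonymous divergence (cells above and below the diagonal).
n_tests = 2 * pairs_per_site_class

# Bonferroni correction: divide the family-wise level by the test count.
bonferroni_alpha = ALPHA / n_tests

print(f"pairs per site class: {pairs_per_site_class}")   # 105
print(f"total tests: {n_tests}")                         # 210
print(f"per-test alpha: {bonferroni_alpha:.6f}")         # about 0.000238
```

Under this reading, each individual relative ratio test must reach a P-value below roughly 2.4 x 10^-4 to be called significant.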
[Figure 1, panels A and B: phylogenetic trees of Arabidopsis thaliana, Spinacia oleracea, Agrostemma githago, Lychnis chalcedonica, Silene paradoxa, Silene latifolia, Silene vulgaris, Silene noctiflora, Silene conica, and Silene conoidea, with bootstrap/posterior probability values at each node and a scale bar of 0.01 in each panel.]
[Figure 2: tree of Agrostemma githago, Lychnis chalcedonica, Silene paradoxa, Silene latifolia, Silene vulgaris, Silene noctiflora, Silene conica, and Silene conoidea, with dN/dS values on each branch and lettered events mapped to branches.]
INVERSIONS: A. psaA-ycf3 : psaI-ycf4; B. ycf3-trnS : rbcL-accD; C. psbM-trnD : trnE-trnT; D. accD-psaI : clpP-psbB; E. psbM-trnE : accD-clpP; F. trnT-psbD : psbE-petL
INTRON LOSSES: G. clpP-1 & clpP-2; H. rpoC1; I. rpl16; J. atpF
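[Figure 3: IR boundary positions at the LSC-IRA and SSC-IRB junctions for Agrostemma, Lychnis, S. paradoxa, S. latifolia, S. vulgaris, S. noctiflora, S. conica, and S. conoidea, shown relative to the genes rps3, rpl22, rps19, rpl2, trnI, ycf2, ndhH, rps15, and ycf1.]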
[Figure 4: trees for Arabidopsis thaliana, Spinacia oleracea, and the eight Sileneae species, one panel each for ATP Synthase, NADH-Plastoquinone Oxidoreductase, Cytochrome b6f, Large Ribosomal Subunit, Small Ribosomal Subunit, RNA Polymerase, Photosystem I, Photosystem II, ycf1, ycf2, and clpP; scale bar of 0.02.]
[Figure 5: pairwise matrix of ATP Synthase, ccsA, cemA, matK, NADH Dehydrogenase, Cytochrome b6f, Photosystem I, Photosystem II, rbcL, RNA Polymerase, Ribosomal Large Subunit, Ribosomal Small Subunit, ycf1, ycf2, and clpP on both axes; the two halves are labeled Nonsynonymous Sites and Synonymous Sites; shading scale labeled 0 and 1000.]
Highlights
• Multiple species in the angiosperm tribe Sileneae harbor divergent plastid genomes.
• At least three lineages exhibit parallel increases in the rate of genome evolution.
• These lineages have elevated substitution rates in the same subset of plastid genes.
• Independent lineages also share identical inversions and intron losses.
• Multiple mechanisms are likely responsible for this repeated evolutionary pattern.