Pagination not final (cite DOI) / Pagination provisoire (citer le DOI) 1

ARTICLE

Genome Downloaded from www.nrcresearchpress.com by UNIVERSITY OF PITTSBURGH on 03/09/15 For personal use only.

Genome size evolution in Ontario ferns (Polypodiidae): evolutionary correlations with cell size, spore size, and habitat type and an absence of genome downsizing

Thomas A. Henry, Jillian D. Bainard, and Steven G. Newmaster

Abstract: Genome size is known to correlate with a number of traits in angiosperms, but less is known about the phenotypic correlates of genome size in ferns. We explored genome size variation in relation to a suite of morphological and ecological traits in ferns. Thirty-six fern taxa were collected from wild populations in Ontario, Canada. 2C DNA content was measured using flow cytometry. We tested for genome downsizing following polyploidy using a phylogenetic comparative analysis to explore the correlation between 1Cx DNA content and ploidy. There was no compelling evidence for the occurrence of widespread genome downsizing during the evolution of Ontario ferns. The relationship between genome size and 11 morphological and ecological traits was explored using a phylogenetic principal component regression analysis. Genome size was found to be significantly associated with cell size, spore size, spore type, and habitat type. These results are timely as past and recent studies have found conflicting support for the association between ploidy/genome size and spore size in fern polyploid complexes; this study represents the first comparative analysis of the trend across a broad taxonomic group of ferns.

Key words: genome size, genome downsizing, phenotype, principal component regression, phylogenetic generalized least squares.

Résumé : Il est reconnu que la taille du génome est corrélée avec plusieurs caractères chez les angiospermes, mais moins de choses sont connues au sujet de la corrélation entre les phénotypes et la taille du génome chez les fougères. Les auteurs ont exploré la variation de la taille du génome en rapport avec un ensemble de caractères morphologiques et écologiques chez les fougères. Trente-six taxons de fougères ont été échantillonnés à partir de populations sauvages situées en Ontario, au Canada. Le contenu en ADN 2C a été mesuré par cytométrie en flux. Les auteurs ont vérifié s’il y avait réduction de la taille du génome en réponse à la polyploïdie au moyen d’une analyse phylogénétique comparative permettant d’explorer la corrélation entre le contenu en ADN 1C et la ploïdie. Aucune évidence convaincante d’une réduction généralisée de la taille du génome en réponse à la polyploïdie n’a été trouvée au cours de l’évolution des fougères présentes en Ontario. Les relations entre la taille du génome et 11 caractères morphologiques ou écologiques ont été explorées au moyen d’une analyse phylogénétique de régression en composantes principales. La taille du génome était significativement associée à la taille des cellules, la taille des spores, le type de spores et le type d’habitat. Ces résultats arrivent à point car des études antérieures et récentes sont arrivées à des conclusions opposées en ce qui a trait à l’association entre la ploïdie/taille du génome et la taille des spores chez des complexes de fougères polyploïdes. Cette étude constitue la première analyse comparative de la tendance parmi un large groupe taxonomique de fougères. [Traduit par la Rédaction]

Mots-clés : taille du génome, réduction de la taille du génome, phénotype, régression des composantes principales, analyse phylogénétique par la méthode des moindres carrés généralisée.

Introduction

Nuclear DNA content varies substantially between species and groups of organisms, but in most cases it is relatively constant within a species (Gregory 2005). It was once thought that plant lineages would proceed on a “one-way path to genomic obesity” due to the possibility of rapid genomic expansion via polyploidy and retrotransposon amplification (Bennetzen and Kellogg 1997), but it is now recognized that genome size evolution in plants is considerably more dynamic, and that both increases and decreases are common (Wendel et al. 2002; Hawkins et al. 2008). A number of different mechanisms can bring about changes in genome size through evolutionary time; these include genome duplications, transposable element activity, recombinatorial mechanisms and double-stranded break repair, insertion–deletion biases, and large deletion events such as whole chromosome losses (Kirik et al. 2000; Bennetzen 2002; Petrov 2002; Leitch and Bennett 2004; Gregory 2004; Hawkins et al. 2009; Ågren and Wright 2011). It was established relatively early that genome size does not correlate with gene number or organismal complexity in multicellular eukaryotes (Mirsky and Ris 1951; Thomas 1971), and thus it is difficult to explain the huge amount of genome size variation observed among extant eukaryotic species (the “C-value enigma”; Gregory 2001a). It has been suggested that genome size evolution is driven primarily by the accumulation of noncoding DNA due to drift and intragenomic selection, or alternatively that genome size is under selection for its effects on cell size, cell metabolism, and other morphological and ecological traits (Gregory 2001a).

Received 1 June 2014. Accepted 6 February 2015. Corresponding Editor: T. Schwarzacher. T.A. Henry, J.D. Bainard, and S.G. Newmaster. Centre for Biodiversity Genomics, Department of Integrative Biology, University of Guelph, Guelph, ON N1G 2W1, Canada. Corresponding author: Thomas A. Henry (e-mail: [email protected]). Genome 57: 1–12 (2014) dx.doi.org/10.1139/gen-2014-0090

Published at www.nrcresearchpress.com/gen on xx xxx xxxx.


It is becoming increasingly evident that, whatever the proposed force(s) behind genome size evolution, genome size variation at the species level has significant phenotypic consequences. A strong positive relationship between nuclear DNA content and cell volume exists in most groups of organisms, and this relationship has important consequences for cellular metabolism, cell cycle duration, and developmental time (Van’t Hof and Sparrow 1963; Bennett 1971; Cavalier-Smith 1978; Gregory 2001b). This has been referred to as the nucleotypic effect: nuclear DNA content affects the “nucleotype”, and the nucleotype and genotype act together to affect the phenotype (Bennett 1971; Gregory 2001a). In angiosperms, genome size is associated with guard cell length and density (Beaulieu et al. 2008). Genome size is also correlated with other traits such as seed mass and leaf mass per unit area (Beaulieu et al. 2007; Knight and Beaulieu 2008). In general, the effects on gross morphology are weaker than those at the cellular level; that is, there are diminishing effects of genome size at “higher phenotypic scales” (Knight and Beaulieu 2008). The effects of nuclear DNA content also manifest at the ecological and evolutionary scales. In plants, DNA content is negatively correlated with minimum generation time and life cycle duration through its effects on cell size and cell cycle duration (Bennett 1972, 1987). There is also some evidence that plant species with large genomes are excluded from extreme environments and from regions with shorter growing seasons (Knight and Ackerly 2002; Knight et al. 2005). Other research suggests that endangered plant species tend to have large genomes and that plant lineages with larger genomes are less diverse on average than those with smaller genomes (Vinogradov 2003; Knight et al. 2005). 
In addition, there appear to be contrasting effects of genome size and ploidy level on plant invasiveness: plant invasiveness is positively correlated with chromosome number and ploidy level but is negatively correlated with genome size (Pandit et al. 2014). These trends have led some to postulate that there are upper limits on genome size that act to oppose genome size expansion caused by polyploidy and retrotransposon amplification (the “large genome constraint hypothesis”; Knight et al. 2005).

Leptosporangiate ferns (Polypodiidae) are a particularly interesting group in which to study genome size evolution, due in part to the large genomes and unique life cycle characteristics that are typical of this group. Ferns are currently underrepresented in genome size research, as genome size estimates exist for less than 1% of extant fern species. The mean 2C DNA content in leptosporangiate ferns (19.15 pg; n = 80) is substantially larger than the mean 2C DNA content in angiosperms (11.59 pg; n = 7541) (Plant DNA C-values Database: Bennett and Leitch 2012), as is the mean chromosome number (2n = 110.54 in pteridophytes vs. 2n = 31.98 in angiosperms) (Klekowski and Baker 1966). The high chromosome numbers and large genome sizes observed in most fern species are thought to be the result of ancient polyploidy events followed by gene silencing and diploidization without substantial DNA or chromosome loss (reviewed in Soltis and Soltis 1987; Nakazato et al. 2008; Barker and Wolf 2010), although strong evidence to support this hypothesis is lacking (Nakazato et al. 2008). Interestingly, endopolyploidy (ploidy level variation within an individual) seems to be absent from the sporophytic tissues of ferns (Bainard et al. 2011) but is common in most mosses and in certain angiosperm families (Barow and Meister 2003; Bainard and Newmaster 2010; Leitch and Leitch 2012). Furthermore, the fern life cycle is significantly different from the angiosperm life cycle: in ferns, dispersal occurs via airborne spores, the gametophytic stage is free-living, and fertilization occurs via flagellate sperm that must swim through water prior to fertilization. Given the unique aspects of the fern life cycle along with their large genomes, ferns are a particularly intriguing group within which to study genome size evolution.

The importance of DNA removal following polyploidy in fern genome size evolution is unclear. Genome downsizing following
polyploidy is a widespread phenomenon in angiosperms and can be detected across angiosperms as a whole (Leitch and Bennett 2004) as well as within specific genera (e.g., Artemisia; Pellicer et al. 2010). Genome downsizing is apparent as a less-than-proportional relationship between ploidy level and 2C DNA content or alternatively as a negative relationship between ploidy level and 1Cx DNA content (1Cx = 2C/ploidy, i.e., the mean size of a basic genome within a polyploid; Greilhuber et al. 2005). In ferns, 2C DNA content seems to increase linearly with chromosome number and ploidy level (Nakazato et al. 2008; Bainard et al. 2011), which implies that chromosome size is relatively conserved in ferns (Nakazato et al. 2008) and suggests that the rate of DNA removal following genome duplication events is relatively low (Leitch and Leitch 2012). However, this hypothesis has not been statistically tested in ferns. Furthermore, a thorough review of the literature suggests that a phylogenetic comparative method to test for genome downsizing has not been applied in any group of plants. This is problematic for several reasons. First, if there is substantial variation in mean 1Cx DNA content between lineages, the evidence for genome downsizing may be partially obscured by this background noise. Second, there is some evidence that polyploids tend to occur in lineages with smaller ancestral 1Cx DNA values (Grif 2000; Peruzzi et al. 2009), and this also could generate an observed correlation between ploidy level and 1Cx DNA content, when in fact no genome downsizing need occur. If there is sufficient phylogenetic signal in 1Cx DNA content, then applying a phylogenetic correction should reduce the signal generated by the nonrandom distribution of polyploidy in relation to 1Cx DNA content. In ferns, it has long been assumed that spore size is related to ploidy level (Barrington et al. 1986; Tryon and Lugardon 1991). 
This could be driven by a relationship between spore size and genome size, assuming that species with higher ploidy levels have larger genomes. However, there are some exceptions to this trend, where in some cases spore size does not appear to correlate with ploidy level (Barrington et al. 1986; Tryon and Lugardon 1991), and it is unclear if this is due to variation in genome size that is independent of ploidy level or if there are other factors that are more important for explaining spore size variation in ferns. Dyer et al. (2013) explored the correlation between genome size and spore size in the Asplenium monanthes fern complex and found that while spore size was significantly correlated with genome size in an ordinary least squares regression analysis, this trend was not significant when adjusting for phylogenetic relatedness using phylogenetic independent contrasts. Similar results have been found for pollen size in seed plants — Knight et al. (2010) examined 464 seed plants and found that pollen size was significantly correlated with genome size in an ordinary least squares analysis but not in a phylogenetic independent contrasts analysis. Knight et al. (2010) attributed this result to a single large divergence in genome size and pollen size between gymnosperms and angiosperms. These results suggest that genome size may not be the primary factor contributing to spore size variation in ferns, but this has never been tested in a phylogenetic context for a broad taxonomic group of ferns. The objectives of this study were thus twofold: first, to test for genome downsizing following polyploidy using a phylogenetic comparative method; and second, to explore the morphological and ecological correlates of genome size in Ontario ferns. 
We undertook a survey of the fern flora (Polypodiidae) of southern Ontario and obtained genome size estimates using flow cytometry for 34 species and 2 hybrid taxa, along with data on 11 morphological and ecological traits, and sequence data from three plastid regions. Genome downsizing was explored using a phylogenetic generalized least squares (PGLS) approach to test for a negative correlation between 1Cx DNA content and ploidy. The morphological and ecological correlates of genome size in Ontario ferns were explored by using a multivariate approach to characterize genome size variation in relation to the principal axes of variation in multivariate trait space using a phylogenetic principal component regression analysis.
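The two signatures of genome downsizing described in the Introduction (a less-than-proportional increase in 2C DNA content with ploidy, or equivalently a lower 1Cx value in the polyploid) can be illustrated with a minimal sketch. The DNA contents and function names below are hypothetical and chosen only for illustration; they are not data from this study.

```python
def monoploid_genome_size(two_c_pg, ploidy):
    """1Cx = 2C / ploidy: mean size of one basic genome, in pg."""
    return two_c_pg / ploidy


def downsizing_fraction(diploid_2c_pg, polyploid_2c_pg, polyploid_ploidy):
    """Fraction of the expected DNA that is missing from a polyploid.

    The expectation assumes that 2C scales in direct proportion to ploidy,
    i.e. that the polyploid keeps the 1Cx value of the diploid.
    """
    expected_2c = monoploid_genome_size(diploid_2c_pg, 2) * polyploid_ploidy
    return 1.0 - polyploid_2c_pg / expected_2c


# Hypothetical values (assumptions, not measurements).
diploid_2c = 10.0     # pg
tetraploid_2c = 17.0  # pg; strict proportionality would predict 20 pg

loss = downsizing_fraction(diploid_2c, tetraploid_2c, 4)
print(f"1Cx diploid    = {monoploid_genome_size(diploid_2c, 2):.2f} pg")
print(f"1Cx tetraploid = {monoploid_genome_size(tetraploid_2c, 4):.2f} pg")
print(f"DNA missing relative to expectation: {loss:.0%}")
```

A positive fraction corresponds to a lower 1Cx value in the polyploid; a value near zero corresponds to the linear scaling of 2C DNA content with ploidy that has been reported for ferns.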


Materials and methods

Sample collection

Fern species were collected from 11 different locations across southern Ontario, Canada, during the period from July to September 2011 (see Supplemental Table S1¹ for more information). For each individual, a voucher specimen consisting of a frond with mature sori was taken and used for identification purposes and spore length measurements. Healthy frond tissue devoid of sori was also collected, stored on moist paper towel, and kept refrigerated at 4 °C for up to one week for use in genome size analysis. A small piece of frond tissue was desiccated and stored on silica beads in preparation for DNA extraction. Voucher specimens are accessioned in the OAC (BIO) Herbarium at the University of Guelph. The classification of the fern species follows Smith et al. (2006) with updates by Christenhusz et al. (2011) and Rothfels et al. (2012b).

Genome size and cytological data

Samples were prepared for flow cytometry using the method established by Bainard et al. (2011). In brief, frond tissue was co-chopped with standards of known genome size (see Supplemental Table S1 for the standards used) in LB01 buffer (Doležel et al. 1989) with 1% polyvinylpyrrolidone 40 (PVP-40, Sigma-Aldrich), 100 μg/mL propidium iodide (Sigma-Aldrich), and 50 μg/mL RNase A. The solution was filtered through a 30 μm filter and stained on ice in the dark for 50–80 min. An additional 1% PVP-40 was added to poor quality samples (particularly Dryopteris spp.), as this was found to improve sample quality. The relative fluorescent intensities of at least 1000 nuclei from both the fern sample and the standard were measured on a linear scale using a Partec CyFlow SL (Partec, Münster, Germany) equipped with a blue solid-state laser tuned at 20 mW and operating at 488 nm. Quality was monitored daily using 3 μm calibration beads (Partec, Münster, Germany). The data were analyzed using Partec FlowMax software (version 2.52, 2007).
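Converting a flow cytometry reading to DNA content is a simple ratio against the co-chopped internal standard, with replicate runs averaged per individual and then per species. A minimal sketch of that arithmetic follows; the peak means, the standard's 2C value, the replicate values, and the function names are hypothetical and serve only as an illustration.

```python
from statistics import mean


def two_c_from_peaks(sample_peak_mean, standard_peak_mean, standard_2c_pg):
    """2C = (sample peak mean / standard peak mean) x standard 2C, in pg."""
    return sample_peak_mean / standard_peak_mean * standard_2c_pg


def species_estimate(individuals):
    """Average replicate runs within each individual, then across individuals.

    `individuals` is a list with one inner list of replicate 2C values (pg)
    per individual.
    """
    return mean(mean(runs) for runs in individuals)


# Hypothetical run: sample peak at channel 240, standard peak at channel 200,
# and a standard whose 2C value is assumed to be 10.0 pg.
single_run = two_c_from_peaks(240.0, 200.0, 10.0)

# Hypothetical species with two individuals, three runs each (pg).
species_2c = species_estimate([[12.0, 12.2, 11.8], [12.4, 12.6, 12.2]])

print(f"single run 2C = {single_run:.2f} pg")
print(f"species 2C    = {species_2c:.2f} pg")
```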
2C DNA content was calculated as 2C = [2C sample peak mean/2C standard peak mean] × standard 2C (pg), 1C DNA content was calculated as 1C = 2C/2, and 1Cx DNA content was calculated as 1Cx = 2C/ploidy (Greilhuber et al. 2005). For each individual, three independent genome size measurements were taken on separate days and then averaged. To obtain a single estimate for each species, genome size estimates were averaged across individuals; a list of the individuals collected for each species can be found in Supplemental Table S1. Chromosome counts and ploidy estimates for each species were compiled from a variety of previously published studies. The cytology of North American fern species was extensively surveyed in the second half of the 20th century and in almost all cases published chromosome counts were available for Ontario cytotypes of the fern species collected. Ploidy was determined for each species by comparing the published chromosome counts for each species against the basic chromosome number for the genus as reported in the Flora of North America Volume 2 (Flora of North America Editorial Committee 1993), i.e., ploidy = 2n/x, where x is the basic chromosome number for the genus. Some fern species are known to show ploidy-level variation across their ranges (Britton 1974) and this could be problematic because incorrect assumptions about ploidy level could lead to erroneous estimates of 1Cx DNA content for those species. However, this did not appear to be a problem in this study; across the dataset, the predicted ploidy levels were in strong agreement with the 2C genome size estimates. The only species collected with known ploidy level variation across its range in Ontario was Asplenium trichomanes, but these specimens were confidently identified to subsp. quadrivalens (tetraploid) based on morphology and habitat type, and their genome size estimates matched their predicted ploidy levels compared to other published genome size estimates for the genus. Thus, we are confident that the predicted ploidy levels accurately reflect the cytology of the individuals collected for this study.

Trait data

Data were compiled for 11 morphological and ecological traits. These included four continuous traits (stomatal length, spore length, frond length, frond width) and seven categorical traits (habitat type, light preference, boreal distribution, rarity, type of reproduction, persistent evergreen fronds, spore type). Data on the traits were compiled from a combination of measurements, field observations, and data taken from the Ferns and Fern Allies of Canada (Cody and Britton 1989), the Flora of North America Volume 2 (Flora of North America Editorial Committee 1993), and the Flora of Ontario Integrated Botanical Information System (FOIBIS; Newmaster and Ragupathy 2012). Information on how species were scored for each trait can be found in the Supplemental Methods.

Sequence data

For each species included in this study, sequence data for the plastid regions rbcL, atpA, and matK were obtained either by sequencing directly or from GenBank. For each species, rbcL and atpA were sequenced from one individual; information on the DNA extraction, PCR amplification, and sequencing protocols is provided in the Supplemental Methods. For both rbcL and atpA a small number of species consistently failed to amplify and data for these species were obtained from GenBank when available. Due to the difficulty in obtaining universal matK primers for ferns (CBOL Plant Working Group 2009; Li et al. 2011), matK was not sequenced directly for this study and sequence data were instead obtained from GenBank, when available, for each species.
GenBank accession numbers for the sequences used in this study are provided in Supplemental Table S3. Sequences were aligned using ClustalW 2.1 (Larkin et al. 2007) with the default settings, and the alignment was manually adjusted where necessary using BioEdit 7.1.3 (Hall 1999). The noncoding regions flanking atpA and the 5′ and 3′ ends of the matK coding region were removed prior to phylogenetic analysis because these regions contained large amounts of missing data and (or) were difficult to align across the species included in this study. Three additional taxa, Botrychium virginianum, Equisetum arvense, and Equisetum hyemale, were included as outgroups to root the tree.

Phylogenetic reconstruction

For each region, the best-fitting nucleotide substitution model was chosen from the 24 nucleotide substitution models implemented in MrModeltest 2.3 (Nylander 2004). Phylogenetic analyses were performed in MrBayes 3.2.1 (Ronquist et al. 2012) using a Bayesian Metropolis-coupled Markov chain Monte Carlo (MCMC) analysis. First, preliminary analyses were performed on each of the single gene datasets. As the resulting topologies were largely congruent and differed mostly in support for conflicting nodes (trees not shown), the combined three-gene dataset was analyzed with each gene assigned a separate model of nucleotide substitution (GTR + Γ + I for all three regions) and with the model parameters (including overall rate) unlinked across the partitions. The final Bayesian MCMC analysis was run for 10 000 000 generations and consisted of four independent runs of four chains each (three heated and one cold; temperature parameter set to 0.15) with trees sampled from the cold chains every 1000 generations. An examination of the standard deviation of split frequencies as calculated by MrBayes and an assessment of the trends in split posterior probabilities both within and between runs using AWTY (Nylander et al. 2008) suggested that the runs did not stabilize and converge until after ~750 000 generations, so to be conservative the first 1 000 000 generations were discarded as burn-in. The final analysis thus comprised four independent runs of 9000 sampled trees each. A majority-rule consensus tree was calculated from the 36 000 pooled sample trees using MrBayes. Branches with posterior probabilities less than 0.75 were collapsed to polytomies prior to performing any statistical analyses. Note that using the fully resolved topology or either of the two best-supported alternate topologies (posterior probabilities > 0.05) caused only minor perturbations in the overall results (data not shown) and would not change the overall conclusions. The tree was plotted using TreeGraph 2 (Stöver and Müller 2010).

Table 1. Cytological information and genome size estimates for the 34 species and 2 hybrids of ferns (Polypodiidae) collected for this study; accession numbers, genome size estimates, and coefficients of variation (CVs) for individual specimens are reported in Supplemental Table S1.

| Family | Species | 2n (refs)ᵃ | xᵇ | Ploidy | 2C ± SE (pg) | 1Cx (pg) |
| Aspleniaceae | Asplenium rhizophyllum | 72 (12) | 36 | 2 | 8.15 ± 0.04 | 4.07 |
| Aspleniaceae | Asplenium ruta-muraria | 144 (7, 13) | 36 | 4 | 12.86 ± 0.14 | 3.21 |
| Aspleniaceae | Asplenium trichomanes | 144 (1, 2, 7) | 36 | 4 | 18.26 ± 0.21 | 4.57 |
| Aspleniaceae | Asplenium trichomanes-ramosum | 72 (1, 2, 6) | 36 | 2 | 8.93 ± 0.06 | 4.46 |
| Athyriaceae | Athyrium filix-femina | 80 (1, 6, 7) | 40 | 2 | 14.57 ± 0.09 | 7.29 |
| Athyriaceae | Deparia acrostichoides | 80 (2, 13) | 40 | 2 | 18.79 ± 0.09 | 9.39 |
| Cystopteridaceae | Cystopteris bulbifera | 84 (1, 5, 6) | 42 | 2 | 8.42 ± 0.08 | 4.21 |
| Cystopteridaceae | Cystopteris laurentiana | 252 (4, 5, 6) | 42 | 6 | 22.44 ± 0.33 | 3.74 |
| Cystopteridaceae | Cystopteris tenuis | 168 (4, 5) | 42 | 4 | 16.81 ± 0.11 | 4.20 |
| Cystopteridaceae | Gymnocarpium dryopteris | 160 (1, 7) | 40 | 4 | 15.41 ± 0.11 | 3.85 |
| Dennstaedtiaceae | Pteridium aquilinum | 104 (1, 7) | 26 | 4 | 16.29 ± 0.15 | 4.07 |
| Diplaziopsidaceae | Homalosorus pycnocarpos | 80 (2, 13) | 40–41 | 2 | 12.84 ± 0.11 | 6.42 |
| Dryopteridaceae | Dryopteris × benedictii | 205 (4) | 41 | 5 | 40.38 ± 0.78 | 8.08 |
| Dryopteridaceae | Dryopteris × triploidea | 123 (13) | 41 | 3 | 26.14 ± 0.12 | 8.71 |
| Dryopteridaceae | Dryopteris carthusiana | 164 (3) | 41 | 4 | 33.72 ± 0.36 | 8.43 |
| Dryopteridaceae | Dryopteris clintoniana | 246 (3) | 41 | 6 | 46.82 ± 0.33 | 7.80 |
| Dryopteridaceae | Dryopteris cristata | 164 (3) | 41 | 4 | 33.60 ± 0.23 | 8.40 |
| Dryopteridaceae | Dryopteris filix-mas | 164 (3) | 41 | 4 | 31.94 ± 0.14 | 7.98 |
| Dryopteridaceae | Dryopteris goldiana | 82 (3) | 41 | 2 | 16.84 ± 0.05 | 8.42 |
| Dryopteridaceae | Dryopteris intermedia | 82 (3) | 41 | 2 | 17.73 ± 0.10 | 8.87 |
| Dryopteridaceae | Dryopteris marginalis | 82 (3) | 41 | 2 | 13.47 ± 0.10 | 6.74 |
| Dryopteridaceae | Polystichum acrostichoides | 82 (1, 2) | 41 | 2 | 15.72 ± 0.08 | 7.86 |
| Onocleaceae | Matteuccia struthiopteris | 80 (1) | 40 | 2 | 26.71 ± 0.15 | 13.36 |
| Onocleaceae | Onoclea sensibilis | 74 (1, 6, 13) | 37 | 2 | 31.29 ± 0.18 | 15.65 |
| Osmundaceae | Osmunda claytoniana | 44 (2, 7, 13) | 22 | 2 | 27.39 ± 0.10 | 13.70 |
| Osmundaceae | Osmunda regalis | 44 (1, 7, 13) | 22 | 2 | 27.26 ± 0.26 | 13.63 |
| Osmundaceae | Osmundastrum cinnamomeum | 44 (1, 2, 7) | 22 | 2 | 31.54 ± 0.14 | 15.77 |
| Polypodiaceae | Polypodium virginianum | 148 (1, 2, 5) | 37 | 4 | 31.28 ± 0.06 | 7.82 |
| Pteridaceae | Adiantum pedatum | 58 (1, 9, 13) | 29–30 | 2 | 10.47 ± 0.05 | 5.23 |
| Pteridaceae | Cryptogramma stelleri | 60 (2) | 30 | 2 | 9.31 ± 0.06 | 4.65 |
| Pteridaceae | Pellaea glabella | 116 (1, 10, 13) | 29 | 4 | 13.70 ± 0.09 | 3.43 |
| Thelypteridaceae | Phegopteris connectilis | 90 (1, 6, 8) | 30 | 3 | 14.30 ± 0.20 | 4.77 |
| Thelypteridaceae | Thelypteris noveboracensis | 54 (1, 2, 11) | 27–36 | 2 | 9.04 ± 0.04 | 4.52 |
| Thelypteridaceae | Thelypteris palustris | 70 (1, 11, 13) | 27–36 | 2 | 15.87 ± 0.06 | 7.93 |
| Woodsiaceae | Woodsia ilvensis | 82 (1, 7, 13) | 38–41 | 2 | 8.35 ± 0.06 | 4.18 |
| Woodsiaceae | Woodsia oregana | 152 (4) | 38–41 | 4 | 17.39 ± 0.23 | 4.35 |

ᵃ Chromosome count references (in parentheses): 1, Britton 1953; 2, Britton 1964; 3, Britton and Soper 1966; 4, Cody and Britton 1989; 5, Haufler and Soltis 1986; 6, Löve 1976; 7, Manton 1950; 8, Mulligan and Cody 1979; 9, Paris and Windham 1988; 10, Rigby 1973; 11, Tryon and Tryon 1973; 12, Wagner 1954; 13, Wagner and Wagner 1966.
ᵇ Basic chromosome numbers for genera, taken from the Flora of North America Volume 2 (Flora of North America Editorial Committee 1993).

¹Supplementary data are available with the article through the journal Web site at http://nrcresearchpress.com/doi/suppl/10.1139/gen-2014-0090.

Statistical analyses

All statistical analyses were carried out with R 3.1.0 (R Core Team 2014) using components of the ade4 (Dray and Dufour 2007), ape (Paradis et al. 2004), and phytools (Revell 2012) packages.

Genome downsizing

To test for genome downsizing following polyploidy, the correlation between 1Cx DNA content and ploidy level was investigated
for the 36 taxa collected for this study using both an uncorrected Pearson’s correlation coefficient and a PGLS correlation coefficient (Martins and Hansen 1997; Garland and Ives 2000; Rohlf 2001). To accommodate deviations from Brownian motion in the PGLS correlation analysis, the expected phylogenetic species covariance matrix was scaled using the multivariate maximum-likelihood estimate (MLE) of Pagel’s λ (Pagel 1999; Freckleton et al. 2002; Revell and Harmon 2008), estimated for 1Cx DNA content and ploidy level, as implemented in the R package phytools. To supplement this analysis, the (nonphylogenetic) correlation between 1Cx DNA content and ploidy was also explored for the two most heavily sampled genera in the fern genome size literature, Dryopteris and Asplenium. Genome size estimates were compiled for species in these genera from the Plant DNA C-values Database (Bennett and Leitch 2012) as well as several recent publications (Ekrt et al. 2009, 2010; Dyer et al. 2013) and the association between 1Cx DNA content and ploidy was examined for each genus using Pearson’s correlation coefficients.

Fig. 1. Bayesian phylogeny based on the three-gene dataset (rbcL + atpA + matK) for the 36 fern taxa collected for this study. Posterior probabilities (PP) are indicated above the branches; bold branches indicate PP > 0.95 and asterisks (*) indicate PP = 1.00. Note that prior to performing any statistical analyses, nodes with PP < 0.75 were collapsed to polytomies. Branch lengths are indicated below the branches.

Phylogenetic principal component regression

In brief, the multivariate analysis involved two major steps. In the first step, a phylogenetic principal component analysis (PPCA; Revell 2009) was performed on the trait matrix (excluding the two hybrid taxa). Note that this analysis differs from the implementation of PPCA in the R package phytools because the trait matrix was first scaled using the Hill–Smith scaling method (Hill and Smith 1976), as implemented in the R package ade4, to accommodate the presence of both continuous and categorical variables. Also, to accommodate deviations from Brownian motion, the expected phylogenetic species covariance matrix was scaled using the multivariate MLE of Pagel’s λ, estimated on the trait matrix excluding DNA content, as implemented in the R package phytools. The principal components with eigenvalues > 1 were retained and used in subsequent analyses. In the second step, the principal components were regressed against 2C DNA content using the PGLS method. The multivariate MLE of Pagel’s λ was estimated for the principal components and 2C DNA content and then used to scale the expected phylogenetic species covariance matrix prior to performing the PGLS analysis.
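The λ-scaled PGLS regression described above can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the study used the R package phytools and estimated λ by maximum likelihood, whereas here λ is simply supplied by the caller. The four-taxon covariance matrix, the trait values, and all function names are hypothetical.

```python
def lambda_transform(cov, lam):
    """Scale off-diagonal covariances by Pagel's lambda; keep the diagonal."""
    n = len(cov)
    return [[cov[i][j] if i == j else lam * cov[i][j] for j in range(n)]
            for i in range(n)]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def pgls_fit(x, y, cov, lam):
    """Return (intercept, slope) of y ~ x under GLS with a lambda-scaled covariance."""
    def dot(u, v):
        return sum(p * q for p, q in zip(u, v))

    c = lambda_transform(cov, lam)
    ones = [1.0] * len(x)
    ci_1 = solve(c, ones)  # C^-1 applied to the intercept column
    ci_x = solve(c, x)     # C^-1 applied to the predictor
    # Normal equations (X' C^-1 X) beta = X' C^-1 y, with X = [1, x]
    lhs = [[dot(ones, ci_1), dot(ones, ci_x)],
           [dot(x, ci_1), dot(x, ci_x)]]
    rhs = [dot(ci_1, y), dot(ci_x, y)]
    intercept, slope = solve(lhs, rhs)
    return intercept, slope


# Hypothetical Brownian-motion covariance for the tree ((A,B),(C,D)):
# total depth 1.0, with each sister pair sharing 0.6 of its history.
TREE_COV = [[1.0, 0.6, 0.0, 0.0],
            [0.6, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.6],
            [0.0, 0.0, 0.6, 1.0]]
x_trait = [1.0, 2.0, 3.0, 4.0]  # hypothetical predictor
y_trait = [1.0, 3.0, 2.0, 5.0]  # hypothetical response

for lam in (0.0, 0.5, 1.0):
    b0, b1 = pgls_fit(x_trait, y_trait, TREE_COV, lam)
    print(f"lambda = {lam:.1f}: intercept = {b0:.3f}, slope = {b1:.3f}")
```

With λ = 0 the covariance matrix becomes diagonal and, for this ultrametric example, the fit reduces to ordinary least squares; with λ = 1 the full Brownian-motion covariance is used.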

The regression coefficients for the components (and their associated variances) were back-transformed into the original variable space to obtain estimates of the relationships between the original traits and genome size. Bonferroni’s correction was used to correct for multiple comparisons. The residuals were assessed for normality and homoscedasticity; note that analysis of the residuals was performed in the phylogenetically independent species space and not in the original species space (discussed in Revell 2009). A more detailed description of the method and an R function to implement it can be found in the Supplemental Methods.

The relationships between genome size and stomatal length, spore length, spore type, and habitat type were also explored individually using bivariate PGLS analyses. In most fern species, spores and gametophytes contain the 1C DNA amount but apogamous species such as Pellaea glabella and Phegopteris connectilis produce spores and gametophytes with unreduced nuclei containing the 2C DNA amount (Manton 1950; Rigby 1973; Mulligan and Cody 1979), so for the bivariate analysis of genome size and spore length, gametophytic DNA content (1C DNA content for sexually reproducing species and 2C DNA content for apogamous species) was used as the measure of genome size.
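The gametophytic DNA content rule described above can be written as a short sketch. The 2C values are taken from Table 1; the function name is hypothetical, and treating Athyrium filix-femina as sexually reproducing is an assumption made here for illustration.

```python
def gametophytic_dna_content(two_c_pg, apogamous):
    """DNA content of spores and gametophytes, in pg.

    Sexually reproducing species: reduced nuclei, 1C = 2C / 2.
    Apogamous species: unreduced nuclei, the full 2C amount.
    """
    return two_c_pg if apogamous else two_c_pg / 2.0


# (species, 2C in pg from Table 1, apogamous?)
examples = [
    ("Pellaea glabella", 13.70, True),
    ("Phegopteris connectilis", 14.30, True),
    ("Athyrium filix-femina", 14.57, False),  # assumed sexual for illustration
]

for species, two_c, apogamous in examples:
    content = gametophytic_dna_content(two_c, apogamous)
    print(f"{species}: {content:.2f} pg")
```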


Fig. 2. Correlation between 1Cx DNA content and ploidy level for 36 fern taxa; the relationship was not significant after adjusting for evolutionary relatedness (λMLE = 0.92, r = –0.234, p = 0.169).
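The uncorrected (nonphylogenetic) counterpart of the relationship in Fig. 2 can be recomputed from the ploidy and 1Cx columns of Table 1. The sketch below assumes a standard two-sided t-test with n − 2 degrees of freedom for the p-value, which may differ from the exact procedure used in the study; the function names are hypothetical, and the tail area is obtained by numerical integration to keep the example free of external libraries. The printed values can be compared with the correlations reported in the Results.

```python
import math

# Ploidy and 1Cx (pg) for the 36 taxa, in the row order of Table 1.
PLOIDY = [2, 4, 4, 2, 2, 2, 2, 6, 4, 4, 4, 2,
          5, 3, 4, 6, 4, 4, 2, 2, 2, 2, 2, 2,
          2, 2, 2, 4, 2, 2, 4, 3, 2, 2, 2, 4]
ONE_CX_PG = [4.07, 3.21, 4.57, 4.46, 7.29, 9.39, 4.21, 3.74, 4.20, 3.85, 4.07, 6.42,
             8.08, 8.71, 8.43, 7.80, 8.40, 7.98, 8.42, 8.87, 6.74, 7.86, 13.36, 15.65,
             13.70, 13.63, 15.77, 7.82, 5.23, 4.65, 3.43, 4.77, 4.52, 7.93, 4.18, 4.35]


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)


def two_sided_p(r, n, steps=2000):
    """Two-sided p-value for H0: rho = 0, from t = r * sqrt((n - 2) / (1 - r^2))."""
    df = n - 2
    t = abs(r) * math.sqrt(df / (1 - r * r))
    h = t / steps
    # Composite Simpson's rule for the t density over [0, |t|] (steps is even).
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3
    return 1 - 2 * central_area


r_all = pearson_r(PLOIDY, ONE_CX_PG)
print(f"All taxa: n = {len(PLOIDY)}, r = {r_all:.3f}, p = {two_sided_p(r_all, len(PLOIDY)):.3f}")

# Genus-level coefficients as reported in the Results (r and n only).
for genus, r, n in [("Asplenium", -0.467, 14), ("Dryopteris", -0.270, 17)]:
    print(f"{genus}: n = {n}, r = {r:.3f}, p = {two_sided_p(r, n):.3f}")
```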


Results

In total, 91 individuals were collected from 34 species and 2 hybrids spanning 20 genera and 12 families. Genome size estimates (averaged across individuals) and cytological data are presented for each species in Table 1; genome size estimates and coefficients of variation (CVs) for each individual can be found in Supplemental Table S1. New estimates are reported for Asplenium trichomanes-ramosum, Cryptogramma stelleri, Cystopteris laurentiana, Dryopteris × benedictii (carthusiana × clintoniana), Dryopteris × triploidea (carthusiana × intermedia), Woodsia ilvensis, and Woodsia oregana subsp. cathcartiana. The estimates for Cryptogramma stelleri and Woodsia spp. are the first for their respective genera and the estimates for Woodsia spp. are the first for Woodsiaceae. The 2C DNA amounts ranged from 8.15 pg (Asplenium rhizophyllum) to 46.82 pg (Dryopteris clintoniana) and the overall mean 2C DNA content was 20.11 pg. CVs for most samples ranged from 4% to 6%, with the exception of species in a few genera (particularly Dryopteris) that had CVs from 6% to 8% (Supplemental Table S1). The poor quality of these samples is likely due to the presence of cytosolic compounds in the leaf tissue which provide a significant technical barrier toward obtaining high-quality histograms. In nearly all cases, estimates from multiple individuals of the same species were highly consistent, the only notable exception being several individuals of Dryopteris carthusiana. Repeat estimates were also generally in agreement with previously published genome size estimates (Ekrt et al. 2009, 2010; Bainard et al. 2011; Bai et al. 2012).

The complete three-gene alignment was 3820 base pairs in length and contained 103 of 117 possible sequences for the 39 taxa included, with 11.8% missing data. Summary statistics for each of the three genes are provided in Supplemental Table S4. The majority-rule consensus tree is presented in Fig. 1.
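The range and mean of 2C DNA content quoted above follow directly from the species-level estimates in Table 1. A minimal sketch that recomputes them is given below; the variable names are arbitrary.

```python
from statistics import mean

# Species-level 2C DNA content (pg), transcribed from Table 1.
TWO_C_PG = {
    "Asplenium rhizophyllum": 8.15,
    "Asplenium ruta-muraria": 12.86,
    "Asplenium trichomanes": 18.26,
    "Asplenium trichomanes-ramosum": 8.93,
    "Athyrium filix-femina": 14.57,
    "Deparia acrostichoides": 18.79,
    "Cystopteris bulbifera": 8.42,
    "Cystopteris laurentiana": 22.44,
    "Cystopteris tenuis": 16.81,
    "Gymnocarpium dryopteris": 15.41,
    "Pteridium aquilinum": 16.29,
    "Homalosorus pycnocarpos": 12.84,
    "Dryopteris x benedictii": 40.38,
    "Dryopteris x triploidea": 26.14,
    "Dryopteris carthusiana": 33.72,
    "Dryopteris clintoniana": 46.82,
    "Dryopteris cristata": 33.60,
    "Dryopteris filix-mas": 31.94,
    "Dryopteris goldiana": 16.84,
    "Dryopteris intermedia": 17.73,
    "Dryopteris marginalis": 13.47,
    "Polystichum acrostichoides": 15.72,
    "Matteuccia struthiopteris": 26.71,
    "Onoclea sensibilis": 31.29,
    "Osmunda claytoniana": 27.39,
    "Osmunda regalis": 27.26,
    "Osmundastrum cinnamomeum": 31.54,
    "Polypodium virginianum": 31.28,
    "Adiantum pedatum": 10.47,
    "Cryptogramma stelleri": 9.31,
    "Pellaea glabella": 13.70,
    "Phegopteris connectilis": 14.30,
    "Thelypteris noveboracensis": 9.04,
    "Thelypteris palustris": 15.87,
    "Woodsia ilvensis": 8.35,
    "Woodsia oregana": 17.39,
}

mean_2c = mean(TWO_C_PG.values())
smallest = min(TWO_C_PG.items(), key=lambda item: item[1])
largest = max(TWO_C_PG.items(), key=lambda item: item[1])

print(f"taxa: {len(TWO_C_PG)}")
print(f"smallest: {smallest[0]} ({smallest[1]:.2f} pg)")
print(f"largest:  {largest[0]} ({largest[1]:.2f} pg)")
print(f"mean 2C:  {mean_2c:.2f} pg")
```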
Genome downsizing

No significant correlation was found between 1Cx DNA content and ploidy level. A weak negative relationship was found across species (Pearson’s correlation r = –0.318, p = 0.059; Fig. 2), but this relationship was not significant after adjusting for evolutionary history (PGLS λMLE = 0.92, r = –0.234, p = 0.169). In addition, the nonphylogenetic correlation between 1Cx DNA content and ploidy level was not significant in either of the most heavily sampled genera in the fern genome size literature (Asplenium: n = 14, r = –0.467, p = 0.092; Dryopteris: n = 17, r = –0.270, p = 0.294; Supplemental Table S5).

Genome size and morphological and ecological traits

The trait data for each species can be found in Supplemental Table S6. There was considerable covariance among the traits for the species collected for this study. The phylogenetic PCA performed on the trait matrix yielded 6 of 15 principal components with eigenvalues > 1 (Table 2). These six components captured 76.4% of the variation from the original traits. The r² values between each axis and each trait are presented in Table 2, along with their row totals (total variance explained for each trait) and column totals (eigenvalues, i.e., total variance explained by each component). The species ordination is plotted for the first two components in Fig. 3. The species exhibited strong clustering by habitat type along the dimensions of highest variance in multivariate trait space (Fig. 3). Of the six components, only PC2 and PC3 were significantly associated with 2C DNA content (p < 0.001 and p = 0.002, respectively; Table 3).
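The genus-level nonphylogenetic correlations reported under Genome downsizing can be checked by hand. The sketch below assumes the usual t-test for Pearson's r (t = r·sqrt((n − 2)/(1 − r²)) with n − 2 degrees of freedom); the text does not state which test was used, so this is an assumption. The function names are illustrative, and the t density is integrated with Simpson's rule only to keep the example free of third-party libraries.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def pearson_p_value(r, n, steps=2000):
    """Return (|t|, two-tailed p) for Pearson's r under H0: no correlation.

    Assumes the standard t-test for a correlation coefficient; `steps`
    must be even because Simpson's rule is used for the integration.
    """
    df = n - 2
    t = abs(r) * math.sqrt(df / (1 - r * r))
    # Simpson's rule for the area under the density between 0 and |t|.
    h = t / steps
    area = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area *= h / 3
    # The central area is twice that integral; the two tails are the rest.
    return t, 1 - 2 * area

# n and r as reported in the text for the two most heavily sampled genera.
for genus, n, r in [("Asplenium", 14, -0.467), ("Dryopteris", 17, -0.270)]:
    t, p = pearson_p_value(r, n)
    print(f"{genus}: |t| = {t:.2f}, df = {n - 2}, p = {p:.3f}")
```

Under this assumption the computed p-values agree with the reported ones (0.092 and 0.294) to within the rounding of r.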

Several traits were significantly associated with 2C DNA content. In the phylogenetic principal component regression analysis, stomatal length, spore length, habitat type, and spore type were found to be significantly associated with 2C DNA content (all p < 0.001; Table 4). In particular, stomatal length and spore length were both positively related to 2C DNA content (Table 4; Figs. 4A–4B), wetland ferns had larger genomes on average than ferns growing in soil or on rock (Table 4; Fig. 4C), and species with chlorophyllous spores had larger genomes on average (Table 4). In the bivariate PGLS analyses, stomatal length (λMLE = 0.71, r² = 0.52, p < 0.001), spore length (λMLE = 0.70, r² = 0.50, p < 0.001), and habitat type (λMLE = 0.60, r² = 0.33, p = 0.002) were significantly associated with genome size (Fig. 4), while spore type was not (λMLE = 0.94, r² = 0.08, p = 0.098).

Discussion

Phylogeny

The phylogeny generated for this study is largely congruent with other recent fern phylogenies (Schuettpelz and Pryer 2007; Kuo et al. 2011; Rothfels et al. 2012a). The majority of nodes (32 of 35) were strongly supported with Bayesian posterior probabilities >0.95. There were several nodes with lower support concerning the placement of Dennstaedtiaceae relative to Pteridaceae and the Eupolypod ferns and the placement of Diplaziopsidaceae relative to Cystopteridaceae, Aspleniaceae, and the rest of the Eupolypod II ferns; in other studies these same nodes have been difficult to recover with strong support. In this study, Dennstaedtiaceae was recovered as sister to the Eupolypod ferns with low support; in other recent studies Dennstaedtiaceae has been placed as sister to Pteridaceae + Eupolypod ferns with low to moderate support (Schuettpelz and Pryer 2007; Kuo et al. 2011). This study also recovered Homalosorus (Diplaziopsidaceae) as sister to the rest of the Eupolypod II ferns with low support; other studies have placed Diplaziopsidaceae (including Homalosorus) as sister to Aspleniaceae with low support (Kuo et al. 2011; Rothfels et al. 2012a).

Genome downsizing

This study provides evidence that widespread and substantial DNA loss following polyploidy has not been a major factor in the evolution of Ontario ferns.


Table 2. Summary of variation explained by the principal components with eigenvalues > 1 from a phylogenetic principal component analysis (λMLE = 0.56) for 34 fern species.

| Trait | PC1 | PC2 | PC3 | PC4 | PC5 | PC6 | Variation explained | Total variation | Proportion explained |
|---|---|---|---|---|---|---|---|---|---|
| Stomatal length | 0.02 | 0.27 | 0.41 | 0.00 | 0.00 | 0.06 | 0.76 | 1 | 0.76 |
| Spore length | 0.06 | 0.66 | 0.01 | 0.15 | 0.03 | 0.00 | 0.91 | 1 | 0.91 |
| Blade length | 0.61 | 0.01 | 0.06 | 0.01 | 0.07 | 0.03 | 0.79 | 1 | 0.79 |
| Blade width | 0.70 | 0.04 | 0.03 | 0.01 | 0.07 | 0.00 | 0.84 | 1 | 0.84 |
| Habitat type | 0.57 | 0.75 | 0.13 | 0.07 | 0.00 | 0.03 | 1.56 | 2 | 0.78 |
| Light preference | 0.13 | 0.26 | 0.36 | 0.08 | 0.42 | 0.09 | 1.34 | 2 | 0.67 |
| Boreal species | 0.01 | 0.01 | 0.06 | 0.42 | 0.09 | 0.29 | 0.87 | 1 | 0.87 |
| Rarity | 0.29 | 0.02 | 0.59 | 0.10 | 0.33 | 0.13 | 1.44 | 2 | 0.72 |
| Reproduction | 0.29 | 0.01 | 0.20 | 0.60 | 0.10 | 0.29 | 1.49 | 2 | 0.75 |
| Evergreen fronds | 0.29 | 0.05 | 0.00 | 0.03 | 0.10 | 0.18 | 0.66 | 1 | 0.66 |
| Chlorophyllous spores | 0.17 | 0.59 | 0.01 | 0.01 | 0.01 | 0.01 | 0.80 | 1 | 0.80 |
| Eigenvalues | 3.15 | 2.67 | 1.85 | 1.48 | 1.21 | 1.10 | 11.46 | 15 | 0.76 |

Note: Entries represent r² values between the components and the original variables. Entries with r² > 0.25 are indicated in bold and were considered to represent traits that were significantly associated with a particular component.
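The totals in Table 2 can be verified with a few lines of arithmetic. The sketch below uses only values printed in the table: it applies the eigenvalue > 1 retention rule described in the text, sums the retained eigenvalues, and divides by the total variation of 15 to recover the 76.4% figure. The variable names are illustrative.

```python
# Eigenvalues of the six components listed in the bottom row of Table 2.
eigenvalues = {"PC1": 3.15, "PC2": 2.67, "PC3": 1.85,
               "PC4": 1.48, "PC5": 1.21, "PC6": 1.10}
total_variation = 15  # bottom entry of the "Total variation" column

# Retention rule stated in the text: keep components with eigenvalue > 1.
retained = {pc: ev for pc, ev in eigenvalues.items() if ev > 1}
explained = sum(retained.values())
proportion = explained / total_variation

# Row check for one trait: the r2 values of stomatal length on PC1 to PC6
# should add up to its "Variation explained" entry (0.76).
stomatal_r2 = [0.02, 0.27, 0.41, 0.00, 0.00, 0.06]
stomatal_total = sum(stomatal_r2)

print(f"components retained: {len(retained)}")
print(f"variation explained: {explained:.2f} of {total_variation}")
print(f"proportion explained: {proportion:.1%}")
print(f"stomatal length row total: {stomatal_total:.2f}")
```

Some row totals computed this way differ from the printed ones by 0.01 or 0.02 (for example blade width and rarity), which is consistent with the individual entries having been rounded to two decimals.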

Fig. 3. Species ordination for the first two principal components from a phylogenetic principal component analysis for 34 fern species. The numbers indicated for each point correspond to the species identification numbers in Supplemental Table S6. Note that axis scores are presented in the original species space and not in the phylogenetically-independent species space (discussed in Revell 2009). To avoid clutter, trait vectors were included only for traits that were significantly associated with at least one of the two components.

Table 3. Estimates of regression coefficients between the principal components and 2C DNA content using a phylogenetic principal component regression analysis (λMLE = 0.68).

|  | Coefficient | SE | p |
|---|---|---|---|
| Intercept | 26.59 | 3.00 |  |
| PC1 | 0.90 | 1.14 |  |
| PC2 | 7.06 | 1.24 |  |
| PC3 | 5.07 | 1.46 |  |
| PC4 | 0.32 | 1.62 |  |
| PC5 | 1.71 | 1.81 |  |
| PC6 | 2.59 | 1.87 |  |
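Because the p column of Table 3 is not available in this copy, a rough sense of which components matter can be had from the ratio of each coefficient to its standard error. The sketch below is only a heuristic: the threshold of 2 is an assumption rather than the authors' test, and absolute values are used because any minus signs may not have survived extraction. The names `wald_ratio` and `ROUGH_THRESHOLD` are illustrative.

```python
# (term, coefficient, standard error) as listed in Table 3.
table3 = [
    ("Intercept", 26.59, 3.00),
    ("PC1", 0.90, 1.14),
    ("PC2", 7.06, 1.24),
    ("PC3", 5.07, 1.46),
    ("PC4", 0.32, 1.62),
    ("PC5", 1.71, 1.81),
    ("PC6", 2.59, 1.87),
]

ROUGH_THRESHOLD = 2.0  # assumption: a common rule of thumb, not the paper's test

def wald_ratio(coefficient, se):
    """Magnitude of the coefficient relative to its standard error."""
    return abs(coefficient) / se

ratios = {term: wald_ratio(coef, se) for term, coef, se in table3}

# Components whose ratio exceeds the rough threshold (intercept excluded).
flagged = [term for term, ratio in ratios.items()
           if term != "Intercept" and ratio > ROUGH_THRESHOLD]

for term, ratio in ratios.items():
    print(f"{term:<9} |coefficient| / SE = {ratio:.2f}")
print("flagged components:", flagged)
```

Only PC2 and PC3 clear the threshold, which agrees with the statement in the Results that these were the only components significantly associated with 2C DNA content.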
