J Mol Evol DOI 10.1007/s00239-014-9621-4

ORIGINAL ARTICLE

Scombroid Fishes Provide Novel Insights into the Trait/Rate Associations of Molecular Evolution Fan Qiu • Andrew Kitchen • J. Gordon Burleigh Michael M. Miyamoto



Received: 15 December 2013 / Accepted: 19 April 2014 Ó Springer Science+Business Media New York 2014

Abstract The study of which life history traits primarily affect molecular evolutionary rates is often confounded by the covariance of these traits. Scombroid fishes (billfishes, tunas, barracudas, and their relatives) are unusual in that their mass-specific metabolic rate is positively associated with body size. This study exploits this atypical pattern of trait variation, which allows for direct tests of whether mass-specific metabolic rate or body size is the more important factor of molecular evolutionary rates. We inferred a phylogeny for scombroids from a supermatrix of molecular and morphological characters and used new phylogenetic comparative approaches to assess the associations of body size and mass-specific metabolic rate with substitution rate. As predicted by the body size hypothesis, there is a negative correlation between body size and substitution rate. However, unexpectedly, we also find a negative association between mass-specific metabolic and substitution rates. These relationships are supported by analyses of the total molecular data, separate mitochondrial and nuclear genes, and individual loci, and they are robust to phylogenetic uncertainty. The molecular evolutionary rates of scombroids are primarily tied to body size. This study demonstrates that groups with novel patterns of trait variation can be particularly informative for identifying Electronic supplementary material The online version of this article (doi:10.1007/s00239-014-9621-4) contains supplementary material, which is available to authorized users. F. Qiu (&)  J. G. Burleigh  M. M. Miyamoto Department of Biology, University of Florida, Box 118525, Gainesville, FL 32611-8525, USA e-mail: [email protected] A. Kitchen Department of Anthropology, University of Iowa, Iowa City, IA 52242-1322, USA

which life history traits are the primary factors of molecular evolutionary rates. Keywords Molecular evolutionary rates  Comparative methods  Scombroidei  Mass-specific metabolic rate  Body size

Introduction Molecular evolutionary rates vary greatly among taxa, and elucidating the mechanisms that drive this variation remains an important challenge in evolutionary biology (Lanfear et al. 2010). A number of biological traits, including mass-specific metabolic rate, body size, generation time, population size, DNA repair mechanisms, and habitat, have been associated with molecular evolutionary rates (Smith and Donoghue 2008; Bromham 2009, 2011). However, it can be difficult to distinguish among the individual effects of these traits because they often co-vary (Nabholz et al. 2008; Welch et al. 2008; Lanfear et al. 2013). For example, the metabolic rate hypothesis (Martin and Palumbi 1993; Gillooly et al. 2005; Santos 2012) predicts that species with a high mass-specific metabolism generate many mutagenic byproducts (e.g., reactive oxygen species, reactive aldehydes, and singlet oxygen) through cellular respiration. These byproducts introduce mutations by way of oxidative DNA damage, which increases the nucleotide substitution rate along with the mutation rate. Correspondingly, species with high mass-specific metabolic rates will have higher molecular evolutionary rates than species with lower metabolism. However, species with high mass-specific metabolic rates often have small body sizes, short generation times, short lifespans, and high fecundities, all of which may also result in higher

123

J Mol Evol Table 1 Scombroid taxonomy a

Species

Common names

b

Families

Genera

Gempylidae

16

24

Snake mackerels and escolars

3

11

Sailfishes, spearfishes, and marlins

15

51

Tunas*, bonitos, mackerels, and butterfly mackerel* Barracudas

Istiophoridae* Scombridae Sphyraenidae

1

21

Trichiuridae

10

39

Xiphiidae*

1

1

46

147

Totals

Cutlassfishes, hairtails, scabbardfishes, and frostfishes Swordfish

This taxonomy conforms to the traditional arrangement of six families (Johnson 1986; Carpenter et al. 1995; Nelson 2006; Wiley and Johnson 2010) and the National Center for Biotechnology Information classification a

The numbers of genera and species per family follow Nelson (2006)

b

Asterisks mark the regionally endothermic billfishes (Istiophoridae and Xiphiidae), tunas (Allothunnus, Auxis, Euthynnus, Katsuwonus, and Thunnus), and butterfly mackerel (Gasterochisma melampus)

nucleotide substitution rates (Welch et al. 2008; Bromham 2009, 2011). Bony fishes of the suborder Scombroidei (Table 1) are unusual as they include both regionally endothermic and ectothermic groups (Block et al. 1993; Dickson 1995; Dickson and Graham 2004). Regional endothermy is one of many physiological, morphological, and biochemical adaptations for an active predatory lifestyle in the pelagic zone. The regional endotherms, which include the billfishes, tunas, and butterfly mackerel, are all able to warm their viscera, muscles, and/or cranial regions to above ambient temperatures. Direct experimental measures of mass-specific standard metabolic rate (SMR) are available from only seven scombrid species (Supplementary Table 1). This paucity of direct experimental SMRs is due to the many technical challenges with obtaining and interpreting the metabolic rates of fish species that are often large, active, pelagic, and continuously swimming (Blank et al. 2007; Fitzgibbon et al. 2008). Nevertheless, these seven direct estimates show that the SMRs of regional endotherms are higher than those of ectothermic species (Dickson and Graham 2004; Fitzgibbon et al. 2008). Furthermore, the available measures of oxygen consumption versus swimming speed from four scombrid species indicate that the mass-specific maximum metabolic rates (MMRs) of the regional endotherms are also similarly higher (Sepulveda and Dickson 2000; Sepulveda et al. 2003). Fish physiologists have synthesized these different lines of evidence to conclude that regional endotherms have a higher mass-specific metabolic rate than do ectothermic scombroids (Dickson and Graham 2004; Blank et al. 2007; Fitzgibbon et al. 2008). Correspondingly,

123

according to the metabolic rate hypothesis, the regional endotherms should also have higher rates of molecular evolution. Body size is often negatively correlated with molecular evolutionary rate (Martin and Palumbi 1993; Gillooly et al. 2005; Welch et al. 2008; Santos 2012). This may be because body size is tied to the number of cell divisions that are required by an organism for its growth, maturation, and maintenance (Bromham 2011). In this scenario, species with more cell divisions may place a greater adaptive premium on the efficiency of DNA replication and repair than species with fewer cell divisions to ensure their successful development and reproduction. Conversely, body size may be associated with some other life history trait(s) that in turn has a greater direct effect on molecular evolutionary rates (Bromham 2009). For example, generation time, longevity, and population size often co-vary with body size, and these factors have been implicated in molecular evolutionary rate variation (Ohta 1992; Nabholz et al. 2008; Thomas et al. 2010). In contrast to most other animal taxa, the mass-specific metabolic rates and body sizes of scombroids are positively correlated (Schmidt-Nielsen 1984; Bromham 2009, 2011). The largest scombroids, including billfishes, tunas, and butterfly mackerel, have higher mass-specific metabolic rates than other smaller scombroids (Dickson 1995; Dickson and Graham 2004). Thus, unlike other animal groups, the metabolic rate hypothesis makes a contradictory prediction about the molecular evolutionary rates of the billfishes, tunas, and butterfly mackerel than we would expect based on body size. If metabolic inefficiency is a major source of substitutions, then the metabolic rate hypothesis predicts higher molecular evolutionary rates for the regionally endothermic billfishes, tunas, and butterfly mackerel. Conversely, if body size is a more important factor, then the body sizes of these species suggest lower molecular evolutionary rates. This study exploits these contradictory predictions to directly test whether mass-specific metabolic rate or body size is the more important factor of molecular evolutionary rate variation. We first inferred a scombroid phylogeny from a molecular and morphological supermatrix and then tested for associations between mass-specific metabolic rate, body size, and nucleotide substitution rate using three recently developed comparative methods that directly account for the phylogenetic non-independence of species (Lartillot and Poujol 2011; Mayrose and Otto 2011; Felsenstein 2012). These tests support the body size prediction and refute mass-specific metabolic rate as a primary predictor of the molecular evolutionary rate variation. Our investigation demonstrates how studying groups with unusual patterns of trait variation can provide novel insights into which traits are most closely associated with variable molecular evolutionary rates.

J Mol Evol

Methods

Phylogenetic Inference and Reference Phylogenies

Molecular and Morphological Systematic Data

We used the Akaike Information Criterion (AIC) in MODELTEST v3.7 to identify the preferred set of partitions and associated substitution models for the molecular data (Posada and Crandall 1998). We examined various partitioning schemes, including the use of one partition for all molecular data, separate partitions for mtDNA and nDNA, two subsets for protein-coding and rRNA genes, four subdivisions for the protein-coding and rRNA loci of both mtDNA and nDNA, and a separate subdivision for each gene. We also extended these partitioning schemes to include subdivisions for the first, second, and third codon positions of the protein-coding genes and/or stems and loops of the rRNA loci. The log likelihoods for the best substitution models were summed across all partitions to generate a total likelihood score and AIC for that partitioning strategy. Based on a comparison of the total AICs for each partitioning strategy, we used a separate partition for the first, second, and third codon positions of each protein-coding gene and for the stems and loops of each rRNA locus, with various substitution models for these 56 partitions (Supplementary Table 2). The morphological characters comprised their own partition, and we used the Mk model for this subdivision (Lewis 2001). Maximum likelihood phylogenetic inference of the partitioned molecular and morphological data was performed with GARLI v2.0 using two sequential chains of 100 million generations each (Zwickl 2006). We also performed 1,000 nonparametric bootstrap replicates to assess phylogenetic uncertainty (Felsenstein 1985). For the comparative analyses, we used the ML phylogeny as well as 20 randomly selected bootstrap trees to examine the possible effects of phylogenetic uncertainty. We were limited to 20 bootstrap trees by the intensive computational demands of our comparative analyses. The branch lengths of the ML phylogeny and 20 bootstrap trees were estimated by GARLI with the molecular data only. We also transformed the molecular branch lengths of the 21 trees to make them ultrametric using the penalized likelihood method implemented in r8s v1.8 (Sanderson 2003) prior to their use as the reference phylogenies in the comparative tests. These ultrametric conversions were performed with a smoothing parameter of 10, which was selected by cross validation based on 12 well-documented calibration dates from the scombroid fossil record (Supplementary Table 3).

All scombroid core nucleotide sequences were downloaded from the National Center for Biotechnology Information (NCBI) on October, 2009 using the NCBI Taxonomy Browser (Sayers et al. 2009). To generate clusters of homologous sequences, we first performed an all-by-all BLASTN search of all sequences (Altschul et al. 1990). Any pair of sequences that had a significant BLAST hit, with a maximum E-value of 10e-10 and the pairwise alignment length covering C50 % of the length of both sequences, was considered homologous. We then performed single linkage clustering to obtain all clusters of sequences (representing homologous loci) that formed a connected component in a graph with nodes (sequences) linked by edges representing significant BLAST hits. We identified clusters of loci that had been used in phylogenetic inferences of bony fishes (Little et al. 2010; Betancur et al. 2013; Miya et al. 2013) and confirmed that their sequences were annotated similarly in GenBank. These clusters likely correspond to orthologous sequences, given that gene duplications in mitochondrial DNA (mtDNA) are rare, the nuclear DNA (nDNA) sets represent low copy genes, and there is little obvious conflict among gene trees. We then edited these clusters of putative orthologs by deleting any sequences that were not associated with a formal species designation and by removing those clusters with sequences from fewer than four species. If a species had more than one sequence in the same cluster, then we deleted all but the longest one. Multiple sequence alignments for each gene cluster were generated with MUSCLE v3.8.31 (Edgar 2004). The initial MUSCLE alignments for the mitochondrial 12S and 16S rRNA genes required further edits involving the reassignment of unpaired gaps in stems to adjacent loops as guided by the secondary structures of 12S and 16S rRNA in teleost fishes (Cannone et al. 2002). We also assembled a character matrix for the 62 morphological characters described by Carpenter et al. (1995). These characters were originally scored for different non-overlapping genera and families that were accepted as monophyletic by these authors. These accepted genera and families do not include the three genera and one family that are not monophyletic in our maximum likelihood (ML) phylogeny (Supplementary Sect. 1). For each accepted genus and family, we assigned their morphological states from Carpenter et al. (1995) to all species of that group found in the supermatrix. We concatenated the final alignments for all 20 genes and 62 morphological characters to form a single supermatrix.

Rate Variation Among Lineages The degree of rate heterogeneity among lineages was quantified with the index of dispersion [R(t); Bedford and Hartl 2008]. Substitution rates were estimated by r8s for all

123

J Mol Evol

internodes of the ML phylogeny after the ultrametric conversion of its branch lengths for the total molecular data. These estimates of substitution rates were converted into weighted counts of substitutions per branch following the procedure of Bedford and Hartl (2008). The index of dispersion was then calculated for scombroids as the variance-to-mean ratio of these weighted substitution counts. The expected value for this ratio is 1 if the weighted substitution counts are Poisson distributed. Body Size and Mass-Specific Metabolic Rate Data Estimates of maximum body mass (in kg) were derived from different sources (Supplementary Table 4) and then log transformed. Scombroids are a relatively well-studied group (Collette 2010; FishBase October 2012; http://www. fishbase.org), which thereby minimizes the problem of sampling error in estimations of both maximum and average body mass. Maximum body mass was chosen over average body mass because it is reported for the entire species, whereas the average is often presented for a particular stock or population. For example, Yamashita et al. (2005) reported an average body mass of 3.6 kg for skipjack tuna (Katsuwonus pelamis) from around Japan, whereas Kaneko and Ralston (2007) published an average body mass of 8.6 kg for this species from near Hawaii. Mass-specific metabolic rate was scored as a binary character with ‘‘high’’ states for regionally endothermic billfishes, tunas, and butterfly mackerel and ‘‘low’’ states for the ectothermic scombroids. Mass-specific metabolic rate was scored as a binary, rather than continuous, trait because direct experimental SMRs are available for only seven scombrids (Supplementary Table 1). The scoring of regionally endothermic and ectothermic species as ‘‘high’’ and ‘‘low’’, respectively, follows the widely accepted conclusion of fish physiologists (Sepulveda and Dickson 2000; Sepulveda et al. 2003; Dickson and Graham 2004; Blank et al. 2007; Fitzgibbon et al. 2008). Correlation of Body Size and Mass-Specific Metabolic Rate We tested for a correlation between body size and massspecific metabolic rate with the Comparative Threshold Test (CTT) of THRESHML (Felsenstein 2012). The CTT assumes that each binary character (i.e., high/low massspecific metabolic rate) is determined by an unobserved continuous trait known as the liability. Liability values that exceed an estimated threshold result in state ‘‘1’’ (e.g., high mass-specific metabolic rate) for the discrete character. The CTT uses the current provisional covariance matrix to transform the continuous and liability characters of the internal and/or external nodes of the tree into a set of new

123

variables that are evolving along the phylogeny by independent Brownian motion. This set of variables is updated by Markov chain Monte Carlo (MCMC) sampling that is operating within a likelihood framework. The resultant MCMC samples allow for the re-estimation of the covariance matrix, and thereby, an inference of the correlation between the evolutionary changes of the continuous and discrete characters along the phylogeny. Our CTT was performed with 30 consecutive chains of one million generations each (i.e., successive cycles of updated covariance matrix and transformed characters). The proposal size for the Metropolis updates of tip liabilities was set to 7.5. Burn-in was retained at the default length of 1,000 initial liabilities. Such long runs and larger proposal sizes were necessary to maintain the acceptance rate of the newly proposed tip liabilities at *0.47 and to minimize the transformation errors to \2 %. The relationship between log maximum body mass and mass-specific metabolic rate was determined with the Pearson product-moment correlation coefficient (Pearson’s r). Association of Mass-Specific Metabolic and Molecular Evolutionary Rates The association between the binary character for mass-specific metabolic rate and substitution rate was assessed with a likelihood ratio test (LRT) in TRAITRATE v1.1 (Mayrose and Otto 2011). The LRT evaluates the null hypothesis of no association between the mass-specific metabolic and molecular evolutionary rates. Specifically, it compares the fit of a two-clock model, which estimates separate substitution rates for each character state, to that of a null model with a single substitution rate for the entire tree. The two-clock model estimates r, which represents the ratio of the substitution rates associated with the two character states. We performed 100 stochastic character mappings using the total molecular data, separate mtDNA and nDNA genes, and eight individual mtDNA and nDNA loci with sequences from [20 species from all six families (Supplementary Table 2). TRAITRATE does not allow partitioned data nor does it estimate the proportion of invariable sites (I). It also provides only four substitution models (JC, K80, HKY, and GTR; Mayrose and Otto 2011). Thus, each LRT was performed using a single partition and the preferred model (of the four available in TRAITRATE) as determined by MODELTEST. Furthermore, the gamma distribution alone (C) was used to account for rate variation among sites even when the best model called for ‘‘C?I’’ or ‘‘I’’. The significance of each LRT was evaluated using the Chi square distribution with a of 0.0005. We used a conservative cutoff to evaluate the test statistic (d) because parametric bootstrapping was not computationally feasible for all datasets.

J Mol Evol

Correlation of Body Size and Molecular Evolutionary Rate

Results Highly Variable Rates of Molecular Evolution

The correlation between body size and substitution rate was assessed with the Phylogenetic Corrected Covariance Test (PCCT) of COEVOL v1.3c (Lartillot and Poujol 2011). The PCCT uses a Bayesian MCMC approach to generate a phylogenetic corrected covariance matrix for the continuous traits and substitution rate parameters under study. The evolution of the traits and substitution rates is modeled along the phylogeny using a multivariate Brownian diffusion process. The PCCT imposes a prior on the covariance matrix that is conjugate to the Brownian process and also relies on data augmentation, whereby the MCMC sampling is conditional on a complete history (mapping) of substitutions for all sites across the phylogeny. These strategies greatly improve the computational efficiency of the PCCT. Log maximum body mass and substitution rate were the trait and molecular evolutionary parameter of interest in our PCCTs, respectively. The PCCTs were performed for the total molecular data, separate mtDNA and nDNA genes, and eight individual loci with sequences from [20 species from all six families using the GTR model that is provided in the program. Each PCCT consisted of 50,000 consecutive cycles of data augmentation (i.e., successive re-samplings of the substitution history as conditioned on the current parameter values) and a 10 % burn-in. Such long runs were necessary to ensure effective sample sizes of [200 for all datasets. The prior mean variance parameter for the two components of the covariance matrix was set to a truncated Jeffreys’ prior. The mean substitution rate along each branch was calculated with the geodesic averaging procedure. A posterior probability of \0.025 for a positive Pearson’s r was considered decisive for a negative correlation between body size and substitution rate (Lartillot and Poujol 2011). The PCCTs were conducted with the same 12 well-documented dates from the scombroid fossil record that were used to date the reference phylogenies with r8s (Supplementary Table 3). COEVOL also requires that an age estimate be provided for the root of the tree. In the absence of such an estimate from the fossil record, we used the average and standard deviation age of the roots for the ML phylogeny and 1,000 bootstrap trees as estimated by penalized likelihood in r8s (152.5 ? 17.3 million years ago). Data Accessibility Our molecular and morphological supermatrix, reference phylogenies, and input/output files for the comparative analyses are available on FigShare (http://dx.doi.org/10. 6084/m9.figshare.1011438).

The index of dispersion for the weighted substitution counts for all branches of the ML phylogeny (Fig. 1) indicates a high level of rate variation among scombroid lineages. The variance-to-mean ratio of the weighted substitution counts for scombroids is 14.0. This ratio for scombroids exceeds the Poisson expectation of 1 by an even greater amount than the overdispersed R(t) estimates of *5 for mammals and *1.5–3.0 for Drosophila (Bedford and Hartl 2008; Bedford et al. 2008). Thus, scombroid substitution rates are highly variable, and it is this heterogeneity that remains the focus of our comparative tests with mass-specific metabolic rate and body size (Tables 2, 3). Positive Correlation Between Body Size and MassSpecific Metabolic Rate The CTT with the ML phylogeny (Fig. 1) supports a significant positive correlation between log maximum body mass and the liability character for high/low mass-specific metabolic rate (Pearson’s r = 0.396, degrees of freedom = 95, P \ 0.001). The covariance between body size and mass-specific metabolic rate is 0.188, and body size explains 15.7 % of the total variation in the liability of metabolic rate. The CTTs with the 20 bootstrap trees also consistently support a significant positive correlation between body size and mass-specific metabolic rate (Pearson’s r, range of 0.348–0.491; P \ 0.001 in each case). Collectively, these results provide new phylogenetic comparative corroboration for the previous conclusion of fish physiologists (Dickson 1995; Dickson and Graham 2004; Fitzgibbon et al. 2008) that billfishes, tunas, and butterfly mackerel share a high mass-specific metabolic rate and are also the bigger scombroids. Negative Association Between Mass-Specific Metabolic and Substitution Rates The LRTs with the ML phylogeny support a significant association between mass-specific metabolic and substitution rates for the total molecular data as well as the mtDNA and nDNA genes (Table 2). The ML estimates of r (the substitution rate ratio for high and low metabolisms) range from 0.449 to 0.705 for these three datasets, indicating that high substitution rates are associated with low mass-specific metabolic rates. A significant association is similarly found for all eight individual mtDNA and nDNA genes with the larger samples. Parameter r for these eight genes ranges from 0.220 to 0.851 (mean = 0.464). The LRTs also recover a significant negative association in C95 % of

123

J Mol Evol

123

J Mol Evol b Fig. 1 Molecular and morphological ML phylogeny. This ML

phylogeny is rooted with Sphyraenidae (Carpenter et al. 1995; Nelson 2006). Bootstrap scores are presented for internal branches with [50 % support. Branches are proportional to their molecular branch lengths. For display purposes, branches with double slashes are shortened by 50 %. Red and blue highlight species with high and low mass-specific metabolic rates, respectively, disk shading corresponds to the log maximum body mass (kg) of each species (Supplementary Table 4), the numbers of sampled sequences per species are given in parentheses, and the 12 dated nodes for the r8s and COEVOL analyses are numbered according to Supplementary Table 3. Details about the molecular and morphological supermatrix, the relationships within the tree, the bootstrap support values, and the estimated dates for the internal nodes without fossil calibrations are provided in Supplementary Sects. 1 and 2

the bootstrap replicates for all datasets, except for Cyt b and ND 2 (75 and 60 %, respectively). Still, r is \1.0 for these two genes in 85 and 95 % of their bootstrap replicates, respectively. Collectively, the LRTs support a negative association between mass-specific metabolic and substitution rates, whereby species with a high mass-specific metabolic rate (regionally endothermic billfishes, tunas, and butterfly mackerel) have molecular evolutionary rates that are *50 % slower than those of species with a low metabolism (ectothermic scombroids). This negative (not positive) association contradicts the metabolic rate hypothesis as a primary explanation of the molecular evolutionary rate variation. Negative Correlation Between Body Size and Substitution Rate The PCCTs with the ML phylogeny support a decisive negative correlation between log maximum body mass and

substitution rate for the total molecular data as well as the mtDNA and nDNA genes (Table 3). Body size explains between 13.8 and 29.9 % of the total variation in substitution rate for these three datasets. A decisive negative correlation is similarly found for three of the five mtDNA and two of the three nDNA genes with the larger samples. Although non-decisive, a negative covariance is still supported for Cyt b, 16S rRNA, and Rhod. The PCCTs with the 20 bootstrap trees also recover a decisive negative correlation in C95 % of the replicates for the same datasets as those using the ML phylogeny, except for the mtDNA genes, ND 2, and 12S rRNA (80, 55, and 75 %, respectively). Furthermore, a negative covariance is found in C95 % of all bootstrap replicates for all datasets. Collectively, the PCCTs support a negative correlation between body size and substitution rate, whereby the large species (predominantly billfishes, tunas, and butterfly mackerel) have lower molecular evolutionary rates than the small ones. This relationship is consistent with the prediction of the body size hypothesis.

Discussion Evidence Contradicting the Metabolic Rate Hypothesis The metabolic rate hypothesis was supported early on by mtDNA and nDNA analyses of endothermic and ectothermic vertebrates (Martin et al. 1992; Martin and Palumbi 1993). Lanfear et al. (2007) later rejected the metabolic rate hypothesis based on a study of 12 nDNA and mtDNA genes for [300 metazoan species. Recently, Santos (2012)

Table 2 Likelihood ratio tests per gene and genome with the ML phylogeny as the reference tree (Fig. 1) and high/low mass-specific metabolic rate as the trait of interest Genomes

Genesa

MtDNA

COX I (73)

-18,251.90

-18,149.20

205.40

0.560

1.00 {1.00}

Cyt b (72)

-22,315.70

-22,306.30

18.80

0.851

0.75 {0.85}

ND 2 (42)

-15,484.70

-15,465.30

38.80

0.751

0.60 {0.95}

-5,802.35

-5,733.44

137.82

0.220

1.00 {1.00}

Log likelihood under the null model

16S rRNA (40) 12S rRNA (40) NDNA

MtDNA and nDNA a

Log likelihood under the two-clock model

Likelihood ratio test statistic (d)b

Relative rate parameter (r)

Bootstrap proportionsc

-4,670.42

-4,557.38

226.08

0.223

1.00 {1.00}

-81,073.30 -3,594.45

-80,848.50 -3,575.50

449.60 37.90

0.705 0.370

1.00 {1.00} 1.00 {1.00}

Rhod (38)

-4,844.95

-4,823.38

43.14

0.413

0.95 {1.00}

RAG 2 (21)

-5,391.34

-5,341.02

100.64

0.326

1.00 {1.00}

All 9 genes (64)

-20,704.00

-20,621.60

164.80

0.449

1.00 {1.00}

All 20 genes (97)

-102,357.00

-102,101.00

512.00

0.651

1.00 {1.00}

All 11 genes (96) Tmo-4c4 (57)

The numbers of sampled species per gene and genome are given in parentheses

b

All of these LRTs are significant at a = 0.0005

c

Bootstrap proportions refer to the frequencies of 20 bootstrap trees that recover a significant association or an r \ 1.0 (in curly brackets)

123

J Mol Evol Table 3 Phylogenetic corrected covariance tests per gene and genome with the ML phylogeny as the source tree (Fig. 1) and maximum body mass as the trait of interest Genomes

Genesa

MtDNA

COX I (73)

NDNA

a

-5.040

Pearson’s r

Posterior probabilityc

Bootstrap proportionsd

-0.664

0.000*

1.00 {1.00}

Cyt b (72)

-1.310

-0.170

0.180

0.00 {1.00}

ND 2 (42)

-1.820

-0.613

0.003*

0.55 {1.00}

16S rRNA (40)

-1.490

-0.192

0.280

0.00 {0.95}

12S rRNA (40)

-7.650

-0.892

0.000*

0.75 {1.00}

All 11 genes (96)

-3.330

-0.372

0.013*

0.80 {1.00}

Tmo-4c4 (57) Rhod (38)

-5.130 -2.840

-0.619 -0.256

0.000* 0.170

1.00 {1.00} 0.00 {1.00}

RAG 2 (21) MtDNA and nDNA

Covarianceb

-12.300

-0.956

0.000*

1.00 {1.00}

All 9 genes (64)

-4.880

-0.547

0.001*

1.00 {1.00}

All 20 genes (97)

-4.060

-0.380

0.006*

0.95 {1.00}

The numbers of sampled species per gene and genome are given in parentheses

b

Covariance and Pearson’s r are estimated as the means of their posterior distributions

c

Asterisks highlight posterior probabilities of \0.025, which are decisive for a negative Pearson’s r

d

Bootstrap proportions refer to the frequencies of 20 bootstrap trees that support a decisive posterior probability of \0.025 or a negative covariance (in curly brackets)

demonstrated that the mass-specific active (but not resting) metabolic rates of poison frogs are positively correlated with their nDNA and mtDNA substitution rates. These massspecific active and resting metabolic rates correspond to the MMRs and SMRs of fishes, respectively (Sepulveda and Dickson 2000; Sepulveda et al. 2003). We find that the nucleotide substitution rates of scombroids are negatively associated with their mass-specific metabolic rates (Table 2). This negative association is corroborated across all molecular data, genomes, and genes, which suggests that it is an organism, and not a gene or genome, specific effect and not an artifact of sampling error. It is the opposite trend predicted if a high metabolic rate is a major source of substitutions. In particular, the lack of a positive association in the mtDNA datasets is telling, given that the mitochondrion is the primary site in the cell where mutagenic metabolic byproducts are produced (Martin and Palumbi 1993; Lanfear et al. 2007). Although a positive relationship between mass-specific metabolic and substitution rates may be masked by other correlated characters (i.e., body size), the negative association in our analyses contradicts the metabolic rate hypothesis as a primary explanation of molecular evolutionary rate variation. Correspondingly, our negative relationship likely reflects the unusual positive co-variation of scombroid mass-specific metabolic rate with body size (Dickson 1995; Dickson and Graham 2004; our CTT results). While mass-specific metabolic rates may play a major role in some systems, our study provides further evidence that the metabolic rate hypothesis is not a general explanation of molecular evolutionary rate variation across

123

animals (Lanfear et al. 2007). This raises the question of whether previous support for the metabolic rate hypothesis may have been driven by covariates of mass-specific metabolic rate (Bromham 2011). Support for an Association of Body Size and Molecular Evolutionary Rate The negative correlation between body size and substitution rate that we find across all molecular data, genomes, and genes (Table 3) is consistent with the prediction that larger species should have lower molecular evolutionary rates (Bromham 2009, 2011). However, we cannot reject the possibility that a covariate of body size, and not body size itself, has a more direct influence on substitution rates. For example, generation time, which is often positively correlated with body size, is frequently implicated as a factor in the molecular evolutionary rates of animals and plants (Smith and Donoghue 2008; Thomas et al. 2010; Wilson Sayres et al. 2011). Also, longevity, another common covariate of body size, is often proposed to influence molecular evolutionary rates, especially in mtDNA studies (Nabholz et al. 2008; Welch et al. 2008). In animals with deterministic development, body size, generation time, and longevity can be mechanistically related to the rate of mitosis hypothesis, which posits that the accumulation of germ line mutations and substitutions per unit time is linked to the number of mitotic cellular divisions from zygote to reproductive adult (Lanfear et al. 2013). As more reliable estimates of these and other life history traits become available, we recommend that future studies of

J Mol Evol

scombroid molecular evolutionary rates first test whether these factors (as for mass-specific metabolic rate) are correlated with body size in atypical ways (i.e., in a negative direction for generation time and lifespan).

correlation that is found between body size and substitution rate when mass-specific metabolic rate is controlled for.

New Comparative Methods for Traits and Substitution Rates

The molecular evolutionary rate for a group can be considered a character of its life history (Bromham 2011). Understood in this way, it is not surprising that the substitution rate for a group is commonly intertwined in complex ways with many of its other life history traits. Parsing the effects of individual components within such complex interrelationships will benefit from the study of groups with different patterns of trait variation (Nabholz et al. 2008; Welch et al. 2008). In this study, we exploit the unusual positive association of scombroid mass-specific metabolic rate and body size to generate opposing predictions that allow for direct testing of the primacy of the traits. We thereby document how groups with novel patterns of trait variation can help to untangle the relative importance of typically confounded factors. Cartilaginous fishes of the family Myliobatidae constitute another marine group with larger species that share morphological specializations for regional endothermy (Dickson and Graham 2004; Bernal et al. 2012). Coupled with their active and pelagic lifestyles, manta and devil rays have been considered ‘‘warm bodied’’ (Alexander 1996), which thereby identifies their family as another fish group with the rare positive association between body size and mass-specific metabolic rate. However, direct estimates of mass-specific metabolic rate are known for only three ectothermic myliobatid species (i.e., no such measures exist for manta and devil rays; Neer et al. 2006). Furthermore, the taxon sampling of their DNA sequences remains limited, as most comparative molecular studies of the family have focused on DNA barcoding and population genetic questions (Schluessel et al. 2010; Naylor et al. 2012). Nevertheless, myliobatid rays remain a promising group to independently test the conclusions and recommendations of this study. Future comparative analyses of their unusual trait variation and substitution rates may provide further insights into which typically confounded life history factors are central to the pace of molecular evolution.

Our analyses take advantage of recent advances in phylogenetic comparative methods that provide an explicit statistical framework to test associations of characters with each other and with molecular evolutionary rates. THRESHML is presented as the first fully developed procedure for the phylogenetic comparison of both binary and continuous traits (Felsenstein 2012). This method differs from the Phylogenetic Logistic Regression procedure of Ives and Garland (2010) in that it accounts for the evolution of both the binary and continuous traits along the phylogeny, while Phylogenetic Logistic Regression fixes the continuous characters as known variables of the external tips. In turn, TRAITRATE (Mayrose and Otto 2011) and COEVOL (Lartillot and Poujol 2011) are new fully integrated statistical methods to test for the correlation of a binary or continuous trait and the substitution rate, respectively. In particular, TRAITRATE differs from the ML method of O’Connor and Mundy (2009) in that it tests for a phenotype/genotype association among all positions rather than at a subset of the sites. These three new comparative methods make complete use of all branches of the phylogeny, and thus the full covariance structure among species, in their phylogenetic comparisons. In addition to simple linear regressions (i.e., as used in Table 3), COEVOL can also perform phylogenetic multiple regressions of two or more continuous traits and substitution rates (Lartillot and Poujol 2011). We were unable to conduct robust COEVOL multiple regressions of scombroid mass-specific metabolic rate, maximum body mass, and substitution rate due to the availability of direct SMR estimates for only seven scombrid species (Supplementary Table 1). This paucity of direct continuous SMRs necessitated the TRAITRATE analyses of mass-specific metabolic rate as a binary trait (Table 2). Still, we have completed preliminary COEVOL multiple regressions for the seven known scombrid species as an illustration of how such tests can be done once a sufficient sample of experimental SMRs is obtained (Supplementary Sect. 3). Obviously, the results of these preliminary tests must be treated with caution as they have very low power. Nevertheless, it is intriguing that we find no pattern of a consistent positive partial correlation between SMR and substitution rate when body size is held constant (contra the metabolic rate hypothesis; Supplementary Table 5). This inconsistency stands in contrast to the uniform trend of a negative partial

The Utility of Groups with Novel Trait Variations

Acknowledgments We thank Charles Baer, Keith Choe, Jamie Gillooly, Debra Murie, Larry Page, Jose´ Ponciano, Michele Tennant, and Ying Wang for their helpful recommendations.

References Alexander RL (1996) Evidence of brain-warming in the mobulid rays, Mobula tarapacana and Manta birostris (Chondrichthyes:

123

J Mol Evol Elasmobranchii: Batoidea: Myliobatiformes). Zool J Linn Soc 118:151–164 Altschul SF, Gish W, Miller W et al (1990) Basic local alignment search tool. J Mol Biol 215:403–410 Bedford T, Hartl DL (2008) Overdispersion of the molecular clock: temporal variation of gene-specific substitution rates in Drosophila. Mol Biol Evol 25:1631–1638 Bedford T, Wapinski I, Hartl DL (2008) Overdispersion of the molecular clock varies between yeast, Drosophila, and mammals. Genetics 179:977–984 Bernal D, Carlson JK, Goldman KJ, Lowe CG (2012) Energetics, metabolism, and endothermy in sharks and rays. In: Carrier JC, Musick JA, Heithaus MR (eds) Biology of sharks and their relatives, 2nd edn. CRC Press, Boca Raton, pp 211–237 Betancur RR, Broughton RE, Wiley EO et al (2013) The tree of life and a new classification of bony fishes. PLOS Curr 5:18 Blank JM, Farwell CJ, Morrissette JM et al (2007) Influence of swimming speed on metabolic rates of juvenile Pacific bluefin tuna and yellowfin tuna. Physiol Biochem Zool 80:167–177 Block BA, Finnerty JR, Stewart AFR, Kidd J (1993) Evolution of endothermy in fish: mapping physiological traits on a molecular phylogeny. Science 260:210–214 Bromham L (2009) Why do species vary in their rate of molecular evolution? Biol Lett 5:401–404 Bromham L (2011) The genome as a life-history character: why rate of molecular evolution varies between mammal species. Philos Trans R Soc B 366:2503–2513 Cannone JJ, Subramanian S, Schnare MN et al (2002) The comparative RNA web (CRW) site: an online database of comparative sequence and structure information for ribosomal, intron, and other RNAs. BMC Bioinform 3:2 Carpenter KE, Collette BB, Russo JL (1995) Unstable and stable classifications of scombroid fishes. Bull Mar Sci 56:379–405 Collette BB (2010) Reproduction and development in epipelagic fishes. In: Cole KS (ed) Reproduction and sexuality in marine fishes: patterns and processes. University of California Press, Berkeley, pp 21–63 Dickson KA (1995) Unique adaptations of the metabolic biochemistry of tunas and billfishes for life in the pelagic environment. Environ Biol Fish 42:65–97 Dickson KA, Graham JB (2004) Evolution and consequences of endothermy in fishes. Physiol Biochem Zool 77:998–1018 Edgar RC (2004) MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res 32:1792–1797 Felsenstein J (1985) Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39:783–791 Felsenstein J (2012) A comparative method for both discrete and continuous characters using the threshold model. Am Nat 179:145–156 Fitzgibbon QP, Baudinette RV, Musgrove RJ, Seymour RS (2008) Routine metabolic rate of southern bluefin tuna (Thunnus maccoyii). Comp Biochem Physiol A 150:231–238 Gillooly JF, Allen AP, Brown JH, West GB (2005) The rate of DNA evolution: effects of body size and temperature on the molecular clock. Proc Natl Acad Sci USA 102:140–145 Ives AR, Garland T Jr (2010) Phylogenetic logistic regression for binary dependent variables. Syst Biol 59:9–26 Johnson GD (1986) Scombroid phylogeny: an alternative hypothesis. Bull Mar Sci 39:1–41 Kaneko JJ, Ralston NVC (2007) Selenium and mercury in pelagic fish in the central North Pacific near Hawaii. Biol Trace Elem Res 119:242–254 Lanfear R, Thomas JA, Welch JJ et al (2007) Metabolic rate does not calibrate the molecular clock. Proc Natl Acad Sci USA 104:15388–15393

123

Lanfear R, Welch JJ, Bromham L (2010) Watching the clock: studying variation in rates of molecular evolution between species. Trends Ecol Evol 25:495–503 Lanfear R, Ho SYW, Davies TJ et al (2013) Taller plants have lower rates of molecular evolution. Nat Commun 4:1879 Lartillot N, Poujol R (2011) A phylogenetic model for investigating correlated evolution of substitution rates and continuous phenotypic characters. Mol Biol Evol 28:729–744 Lewis PO (2001) A likelihood approach to estimating phylogeny from discrete morphological character data. Syst Biol 50:913–925 Little AG, Lougheed SC, Moyes CD (2010) Evolutionary affinity of billfishes (Xiphiidae and Istiophoridae) and flatfishes (Plueronectiformes): independent and trans-subordinal origins of endothermy in teleost fishes. Mol Phylogenet Evol 56:897–904 Martin AP, Palumbi SR (1993) Body size, metabolic rate, generation time, and the molecular clock. Proc Natl Acad Sci USA 90:4087–4091 Martin AP, Naylor G, Palumbi SR (1992) Rates of mitochondrial DNA evolution in sharks are slow compared with mammals. Nature 357:153–155 Mayrose I, Otto SP (2011) A likelihood method for detecting traitdependent shifts in the rate of molecular evolution. Mol Biol Evol 28:759–770 Miya M, Friedman M, Satoh TP et al (2013) Evolutionary origin of the Scombridae (tunas and mackerels): members of a paleogene adaptive radiation with 14 other pelagic fish families. PLoS One 8:e73535 Nabholz B, Glemin S, Galtier N (2008) Strong variations of mitochondrial mutation rate across mammals: the longevity hypothesis. Mol Biol Evol 25:120–130 Naylor GJP, Caira JN, Jensen K et al (2012) A DNA sequence-based approach to the identification of shark and ray species and its implications for global elasmobranch diversity and parasitology. Bull Am Mus Nat Hist 367:1–262 Neer JA, Carlson JK, Thompson BA (2006) Standard oxygen consumption of seasonally acclimatized cownose rays, Rhinoptera bonasus (Mitchill 1815), in the northern Gulf of Mexico. Fish Physiol Biochem 32:67–71 Nelson JS (2006) Fishes of the world, 4th edn. Wiley, Hoboken O’Connor TD, Mundy NI (2009) Genotype–phenotype associations: substitution models to detect evolutionary associations between phenotypic variables and genotypic evolutionary rate. Bioinformatics 25:i94–i100 Ohta T (1992) The nearly neutral theory of molecular evolution. Annu Rev Ecol Syst 23:263–286 Posada D, Crandall KA (1998) MODELTEST: testing the model of DNA substitution. Bioinformatics 14:817–818 Sanderson MJ (2003) r8s: inferring absolute rates of molecular evolution and divergence times in the absence of a molecular clock. Bioinformatics 19:301–302 Santos JC (2012) Fast molecular evolution associated with high active metabolic rates in poison frogs. Mol Biol Evol 29:2001–2018 Sayers EW, Barrett T, Benson DA et al (2009) Database resources of the National Center for Biotechnology Information. Nucleic Acids Res 37:D5–D15 Schluessel V, Broderick D, Collin SP, Ovenden JR (2010) Evidence for extensive population structure in the white spotted eagle ray Aetobatus narinari within the Indo-Pacific. J Zool 281:46–55 Schmidt-Nielsen K (1984) Scaling: why is animal size so important?. Cambridge University Press, Cambridge Sepulveda C, Dickson KA (2000) Maximum sustainable speeds and cost of swimming in juvenile kawakawa tuna (Euthynnus affinis) and chub mackerel (Scomber japonicus). J Exp Biol 203:3089–3101 Sepulveda CA, Dickson KA, Graham JB (2003) Swimming performance studies on the eastern Pacific bonito Sarda chiliensis, a

J Mol Evol close relative of the tunas (family Scombridae). J Exp Biol 206:2739–2748 Smith SA, Donoghue MJ (2008) Rates of molecular evolution are linked to life history in flowering plants. Science 322:86–89 Thomas JA, Welch JJ, Lanfear R, Bromham L (2010) A generation time effect on the rate of molecular evolution in invertebrates. Mol Biol Evol 27:1173–1180 Welch JJ, Bininda-Emonds ORP, Bromham L (2008) Correlates of substitution rate variation in mammalian protein-coding sequences. BMC Evol Biol 8:53 Wiley EO, Johnson GD (2010) A teleost classification based on monophyletic groups. In: Nelson JS, Schultze HP, Wilson MVH (eds) Origin and phylogenetic interrelationships of teleosts. Verlag Dr. Friedrich Pfeil, Mu¨nchen, pp 123–182

Wilson Sayres MA, Venditti C, Pagel M, Makova KD (2011) Do variations in substitution rates and male mutation bias correlate with life-history traits? A study of 32 mammalian genomes. Evolution 65:2800–2815 Yamashita Y, Omura Y, Okazaki E (2005) Total mercury and methylmercury levels in commercially important fishes in Japan. Fish Sci 71:1029–1035 Zwickl DJ (2006) Genetic algorithm approaches for the phylogenetic analysis of large biological sequence datasets under the maximum likelihood criterion. Dissertation, University of Texas

123

rate associations of molecular evolution.

The study of which life history traits primarily affect molecular evolutionary rates is often confounded by the covariance of these traits. Scombroid ...
870KB Sizes 3 Downloads 3 Views