Accepted Article 1

Title: An improved method to set significance thresholds for β diversity testing in microbial

2

community comparisons.1

3 4

Authors: Arda Gülay1 and Barth F. Smets1*

5 6

*Corresponding author

7 8

1

9

2800 Kgs Lyngby, Denmark. Phone: +45 45251600. FAX: +45 45932850. e-mail: [email protected]*,

10

Department of Environmental Engineering, Technical University of Denmark, Building 113, Miljøvej,

[email protected]

11 12

Summary

13

Exploring the variation in microbial community diversity between locations (β diversity) is a central

14

topic in microbial ecology. Currently there is no consensus on how to set the significance threshold

15

for β diversity. Here, we describe and quantify the technical components of β diversity, including

16

those associated with the process of subsampling. These components exist for any proposed β

17

diversity measurement procedure. Further, we introduce a strategy to set significance thresholds for

18

β diversity of any groups of microbial samples using rarefaction, invoking the notion of a meta-

19

community. The proposed technique was applied to several in silico generated OTU libraries and

20

experimental 16S rRNA pyrosequencing libraries. The latter represented microbial communities

21

from different biological rapid sand filters at a full-scale waterworks. We observe that β diversity,

22

after subsampling, is inflated by intra-sample differences; this inflation is avoided in the proposed

23

method. In addition, microbial community evenness (Gini > 0.08) strongly affects all β diversity

24

estimations due to bias associated with rarefaction. Where published methods to test β significance

25

often fail, the proposed meta-community based estimator is more successful at rejecting

26

insignificant β diversity values. Applying our approach, we reveal the heterogeneous microbial

27

structure of biological rapid sand filters both within and across filters.

28

This article has been accepted for publication and undergone full peer review but has not been through the copyediting, typesetting, pagination and proofreading process, which may lead to differences between this version and the Version of Record. Please cite this article as doi: 10.1111/1462-2920.12748

1

This article is protected by copyright. All righ1ts reserved.

Accepted Article 29 30

Introduction

31

The term β diversity was initially introduced in the macro-ecological literature to describe variation

32

in community composition between different sampling units as a component of diversity (Whittaker

33

(1960, 1972). It has been used to answer many ecological and evolutionary questions on macro-

34

organisms (e.g., Condit et al., 2002; McKnight et al., 2007). Using β-diversity, direct links could be

35

made between community composition and factors that affect them, particularly environmental

36

heterogeneity (Freestone et al., 2006; Mandl et al., 2009). Different measures for β diversity have

37

been proposed, such as (i) the variation in species richness, (ii) the variation in species composition

38

(Jurasinski et al., 2009) and (iii) the variation in phylogenetic composition (Martin, 2002) between

39

different sampling units. All β-diversity measures (reviewed in Tuomisto 2010a, 2010b) show

40

different properties and application of different measures on a single data set can result in different

41

outcomes and interpretations (Anderson et al., 2011). Consequently, comparisons between results of

42

these different measures can be elusive (Koleff et al., 2003).

43

With the use of high throughput molecular techniques to measure microbial community

44

composition (Parks et al., 2012), the concept of β diversity is now also used in the field of microbial

45

ecology. Macro- and microbial ecology differ greatly in the magnitude of individuals analyzed with

46

millions to billons individuals and potential significant variation between samples, characterizing

47

microbial ecology. Uneven sampling depth, common in next generation sequencing methodologies,

48

can strongly effects β diversity measurements (Chao et al., 2004; Lozupone et al., 2011), in that

49

microbial communities represented by fewer individuals will artificially be detected as different

50

(Lozupone et al., 2011). Normalization procedures (e.g., rarefaction; Gotelli et al., 2001) or

51

multivariate analysis (Ley et al., 2008; Lozupone et al., 2011) are most commonly applied to

52

minimize the effect of sampling depth (Schloss et al., 2009; Caporaso et al., 2010).

53

Bell (2010), Zinger et al. (2011) and Youssef et al. (2010) all used quantitative Bray-Curtis

54

diversity measures to reveal distance-decay relationship of bacteria in water-filled tree-holes, global

55

biological variation of bacteria in seawater ecosystem, and fine-scale microbial community

56

variation in sediment samples, respectively. On the other hand, Shade et al. (2013) used weighted

57

and unweighted UniFrac diversity measures to investigate temporal patterns of microbial and

58

macrobial communities. Other studies, still, use multiple measures: in the Human Microbiome

59

Project (2012) Bray-Curtis, Euclidean and weighted and unweighted UniFrac diversity measures are

60

implemented to explore variation of microbial communities at different body regions and test their 2

This article is protected by copyright. All righ1ts reserved.

Accepted Article 61

relation with environmental variables. β diversity methods commonly used in microbial ecology

62

were reviewed in the study of Lozupone et al (2008).

63

There is, however, no consensus about which measure is more appropriate to describe β diversity,

64

or whether different measures are relevant to different ecological hypotheses (reviewed in

65

Magurran, 2004). Some advocate the use of multiple and complimentary algorithms (Anderson et

66

al., 2011; Parks et al., 2012) for ecological interpretations. Yet, singular measures are common, and

67

the most popular approaches include distance-matrix based Bray-Curtis (Bray and Curtis 1957) and

68

UniFrac (Hamady et al., 2010) algorithms and raw data based principal component (PCA) and

69

correspondence (CA) analysis (Ramette, 2007). Further, although any positive value of β diversity

70

between two samples would indicate dissimilarity in community composition (Moshe et al., 2004),

71

there is no consensus about the methodology to determine whether this difference is significant. As

72

a result, significance is typically not discussed in microbial ecological studies. Nevertheless, it

73

seems critical to be able to answer the very question whether variations in community composition

74

between samples is significant or not (Legendre et al., 2005) or, in other words, whether the

75

community compositions of different samples are subsets or replicates of each other or not (Schloss,

76

2008).

77

Different methods have been proposed to assess the statistical significance of β diversity: First,

78

Hutcheson (1970) developed a method to calculate a significance value by comparing the Shannon

79

indeces of two independent large samples using a classical t-test. With the addition of phylogenetic

80

information into diversity analysis, Martin (2002) introduced the fixation index (FST) and the

81

phylogenetic test (P test). Then, the significance test of the UniFrac method, developed by

82

Lozupone et al. (2006), was introduced. These techniques are based on observed community

83

phylogenetic trees and several synthetically created phylogenetic trees (permutation based Monte

84

Carlo simulations) wherein the sequences are randomly distributed between environments of

85

interest. Another phylogenetic β diversity measure, βNTI, and associated significance method,

86

mean-nearest-taxon-distance (MNTD), was developed by Kraft et al. (2007). In the significance test

87

of MNTD the degree of non-random phylogenetic community structure is detected by randomizing

88

OTUs and their relative abundances across the tips of the phylogeny, permutating the MNTD

89

several times, and finally comparing the observed MNTD to the mean of this distribution (Stegen et

90

al., 2012). Although phylogeny-based techniques are powerful tools to detect dissimilarity patterns,

91

they are sensitive to

92

communities in samples that are being compared (Lozupone et al., 2011). In addition to

sampling depth (number of individuals per sample) and evenness of

3

This article is protected by copyright. All righ1ts reserved.

Accepted Article 93

phylogenetic based methods, Chase et al. (2011) introduced the Raup-Crick (after Raup and Crick

94

(1979)) metric with a meta-community based significance test. Like in the other significance testing

95

methods, the Raup-Crick metric measures the degree to which pairwise communities are more

96

different (or more similar) to each other than can be expected by chance. In this method an observed

97

meta-community from the communities of local samples is created and random sub-communities

98

are drawn from the observed meta-community proportional to communities among-site occupancy

99

in local samples (e.g. 1000 times), and then the null distribution of random sub-communities are

100

compared with the observed β diversity (shared species) to calculate the significance (Chase et al.,

101

2011). Proper application of the Raup Crick method requires relatively equal sampling depths

102

(Vellend et al., 2007) and evenness of local communities (Chase et al., 2011).

103

Youssef et al. (2010) and Lozupone et al. (2007) introduced multivariate ordination based

104

significance testing procedures which could be applied with any type of β diversity measure.

105

Youssef et al. (2010) hypothesized that random subsamples from the community at one sample

106

location would be more similar to each other than random subsamples from communities at two

107

separate sample locations, if these communities were significantly different. The β diversity value

108

between subsamples at one location was then used as a threshold for comparing microbial

109

communities at different sample locations: β diversity values above this threshold would be

110

significant and suggest a true non-random community difference. In their study, using Bray-Curtis

111

as the β diversity measure, all β diversities between samples were retained as significant (Youssef et

112

al., 2010). While this method suggests a threshold and can avoid the effect of sampling depth, it

113

ignores the fact that this significance threshold, can change with subsampling size. Lozupone et al.

114

(2012) developed an approach that combines subsampling with UPGMA (Unweighted Pair Group

115

with Arithmetic Mean), PCoA (Principle Coordinate Analysis) and Jackknifing techniques: In this

116

method, possible PCoA locations of subsamples of a sample are displayed with ellipses around the

117

original sample point representing the interquartile range (IQR) on each axis of the PCoA plots. β

118

diversity between any pair of samples is called significant if the distance between these two samples

119

is larger than the sum of the radii of the two ellipses around the individual sample locations in the

120

PCoA plots. Although this method is also β measure-free (i.e., it can be used together with any β

121

diversity algorithm), the IQR ranges can be underestimated due to the biases associated with each

122

step of the method (Fig.S1).

123

Here we propose a rigorous sequential procedure to assess the significance of a β diversity measure.

124

Our method takes advantage of rarefaction (Olszewski, 2004) to standardize pair-wise comparisons, 4

This article is protected by copyright. All righ1ts reserved.

Accepted Article 125

and the notion of a meta-community (Leibold et al., 2004) from which all samples derive. We begin

126

by recognizing that any measure of β diversity contains components that derive from differences

127

within (intra) and between (inter) microbial communities in different samples. We quantify these

128

components and analyze their properties. Then, we investigate how community evenness and

129

sample size affect these components. This procedure allows us to isolate the true (between-sample)

130

β diversity. Next, we propose a threshold value above which this β diversity must lie to be called

131

significant. This threshold value is calculated from random subsampling the observed meta-

132

community, which we approximate by summing all sample communities. We test our procedure on

133

in silico data of log-normally distributed microbial communities to assess the sensitivity of our

134

method to the sampling size. Finally, our method is applied to estimate β diversities within an

135

experimental data set of replicate 16S rRNA tag-based sequence libraries taken from different

136

sampling locations within and between biological rapid sand filters (RSFs) at a full-scale

137

waterworks. Here, we tested whether the environmental conditions in RSFs, which consists of high

138

and continuous throughputs of source waters and related electron donors, create on-average

139

homogeneous conditions within RSF. Hence, we were looking for significant β diversity values,

140

which would indicate sustained heterogeneity in environmental conditions.

141

Results and discussion

142

Identifying components of observed β diversity

143

Quantitative dissimilarity indices are commonly used to describe β diversity in microbial ecology.

144

While each dissimilarity index may have its own limitations, sample size (the number of individuals

145

in a sample) affects them all (Chao et al., 2004; Chase et al., 2011; Lozupone et al., 2011). To avoid

146

the bias of different sample sizes in dissimilarity and significance calculations, sample sizes are

147

generally standardized by rarefaction (Gotelli et al., 2001; Gilbert et al., 2011; Stegen et al., 2012)

148

or differences are visualized with multivariate ordination techniques (Lozupone et al., 2011). While

149

standardization of sample size is needed, subsampling by rarefaction adds a potential error into

150

sample comparisons, which should be removed from the β diversity estimation in order to estimate

151

the true difference between the samples.

152

The latter contribution, associated with rarefaction of an individual sample, originates from β

153

diversity ‘within’ a sample (or intra sample differences). We introduce the term βAA' to refer to the β

154

diversity calculated from intra sample (in this case: sample A) differences, while we use the term

155

βAB to refer to β diversity calculated from inter sample (in this case: samples A and B) differences.

5

This article is protected by copyright. All righ1ts reserved.

Accepted Article 156

Below, we carefully cover the meaning and importance of βAA' because it is an essential element in

157

our argument and method.

158

[FIG. 1]

159

Consider that one wishes to estimate the dissimilarity in community composition between sample A

160

and sample B (βAB), which are characterized by sequence library A and B, containing 10000 and

161

13000 individuals, respectively (Fig. 1). One would rarefy the samples to a consistent size (samples

162

A' and B'). However, when a sample is rarefied (A to A' and B to B' in Fig. 1), two errors are

163

introduced ([Eq.(1)]): (i) β1AA' which is the β diversity calculated between random subsamples at a

164

defined sample size (two random same-size subsamples are not identical) and (ii) β2AA' which is the

165

β diversity between a sample and its subsample arising from taxon loss (in case, OTU loss)

166

associated with the decrease in sample size. β1AA' and β1BB' can be calculated from the mean β

167

diversity of multiple equally sized subsamples from sample A and B. We propose that the inter-

168

sample β diversity, βAB, can then be corrected by subtracting the average value of β1AA' and β1BB'

169

([Eq. (1) and (2)]). Correction for β2AA' and β2BB' is more difficult, but is minimized if microbial

170

community evenness is similar in the compared samples. We propose to use the β1 diversity

171

estimated from differences between random subsamples of the meta-community as a threshold for

172

significance (see below).

173

(1)

174

(2)

175

Validation of the β diversity correction

176

Providing emperical evidence on the discussed errors in the β diversity estimation is difficult when

177

biological samples are being compared because, in such case, the true β diversity is not known.

178

Hence, we use an in silico created synthetic community of 13000 individuals distributed over 1000

179

OTUs. This community is compared with an identical replicate community. By comparing the

180

community against itself, the true βAB diversity is 0 for the original samples and all subsamples.

181

Contribution of β2AA' is also avoided, as the probability of OTU loss during rarefaction is the same

182

for all samples (identical) to be compared. We contend that if any non-zero βA'B' diversity is

183

calculated at any sampling size, it would originate from β1AA' diversity. We, then, evaluate whether

184

the true β diversity can be estimated, at any sampling size, by correcting for the calculated β1AA'

185

diversity.

186

We conducted our test on three identical log-normally distributed in silico samples and subsamples

187

of different sizes. Fig. 2 shows the results of this analysis for the quantitative Jaccard and Bray6

This article is protected by copyright. All righ1ts reserved.

Accepted Article 188

Curtis indices. In both cases, the calculated β diversities, between any pair of the three samples,

189

increases with decreasing subsampling depth. Values for β1AA' diversity of each sample also

190

increase with decreasing subsampling depth. At all subsampling depths and for both indices,

191

correction of β by β1 ([Eq. (1) and (2)]) adjusts the β A'B' values to 0 (±0.008, Fig. 2), which is the

192

true value of βAB. Hence, as the probability of OTU loss (β2AA') is the same for any pair of samples,

193

the calculated β derives entirely from β1AA'. This β1AA' effect must be removed from sample

194

comparisons to avoid inflation of the true βAB diversity. Since each dissimilarity index has its own β

195

algorithm and outcome (Fig.S2), β1AA' and β must be calculated with the same index before

196

correction.

197

Youssef et al. (2010) proposed the mean β1AA' of equally sized subsamples of an individual sample

198

as a threshold value to assess significance of a β value calculated between subsamples of two

199

samples. Our results indicate that the mean β1AA' diversity equals the βA'B' diversity when

200

completely identical samples are compared to each other (because β2AA' is the same for all samples

201

at the same subsampling depth). Hence, even after correction for β1AA', the smallest possible true β

202

difference between two replicates would be called significant using this criterion, and no pairwise β

203

diversity analysis could ever reject the null hypothesis that communities from the replicate

204

communities were identical. Hence, this criterion is not sufficiently stringent to determine

205

significance of β diversity.

206

The method of jackknife UPGMA clustering combined with PCoA plots (Lozupone et al., 2007)

207

was applied to the same data in order to compare the observed mean β1AA' diversity with the

208

calculated IQRs of the samples at the same subsampling depth. The observed mean β1AA' diversity,

209

determined by rarefaction, is far larger than the IQRs calculated by jackknifing (see Fig.S1). Hence,

210

IQRs underestimate the β1AA' diversity, and similar samples can erroneously be classified as

211

significantly different. Hence, an improved method to assess β significance appears warranted.

212

[FIG. 2]

213

Relation of evenness and diversity with inter β components

214

The corrected βAB diversity of any pair of samples (A and B) should have the same value,

215

irrespective of the degree of rarefaction, if the β2 diversities of A and B follow the same trend (i.e.,

216

the probability of OTU loss by rarefaction is same for A and B). The rarefaction process, however,

217

does not consider the underlying distribution of the community (Gotelli et al., 1996), but assumes

218

that individuals in a community are randomly distributed. Here, we examine to what extent β1AA'

7

This article is protected by copyright. All righ1ts reserved.

Accepted Article 219

and β2AA' vary with the evenness of the community structure at different sampling depths, and how

220

this affects β diversity calculations when the analyzed communities have different structures.

221

[TABLE 1]

222

In the first analysis, we created 3 different in silico communities (Fig.S3) with same OTU richness,

223

but different evenness (Table 1; see Fig.S4 for Lorenz curves), in order to determine the effect of

224

rarefaction on β1AA' and β2AA', as well as on pairwise βAB diversity. Moreover, we removed

225

singletons and doubletons from the log-normal community to observe the effect of rare OTUs on

226

β1AA' and β2AA'. β diversity analysis contingent on rarefaction is clearly sensitive to the evenness of

227

the considered communities (Fig. 3). The β1AA’ diversity, using the Bray–Curtis index (Fig. 3A-3B),

228

is highest for uniform and chi-squared distributed communities which display highest evenness. In

229

contrast, β1AA' diversity calculated with the Jaccard index (Fig. 3D-3E) is highest for the log-normal

230

community, irrespective of degree of rarefaction. This observation is consistent with a previous

231

study which revealed that each dissimilarity index has its own sensitivity to rare species (Parks et

232

al., 2012). Nevertheless, as seen before, the degree of rarefaction has a more dominant effect on

233

β1AA' than the community structure itself increasing rapidly with decreasing subsampling size. The

234

pairwise β diversities, corrected per Eq. (1), are not independent of subsampling size, except for

235

situations where communities are close in evenness (Fig. 3C-3F). With different initial evenness,

236

increasing sampling size typically suggest increasing dissimilarity, and even negative β values were

237

obtained by comparing the chi-squared and log-normal distributed communities, at smallest sample

238

size. For communities with similar evenness (e.g., log-normal vs. log-normal trimmed; uniform vs.

239

chi-squared), pairwise β diversity are nearly constant from the minimum to maximum subsampling

240

size.

241

[FIG. 3]

242

In the second analysis, we created 9 different log-normally distributed in silico communities with

243

same OTU richness, but varying in evenness (see Fig.S5 for rank-abundance curves) to determine to

244

what extent we can ignore the effect of evenness on β1AA' and β2AA' at different sampling depths.

245

Although all selected in-silico communities follow the log-normal distribution, their differences in

246

evenness affected pairwise β diversities corrected per Eq. (1) (green lines; Fig.S6). As expected, the

247

highest difference between corrected β diversity at the lowest and highest subsampling depth was

248

detected for communities having the highest difference in evenness (Gini values). The corrected β

249

diversities are independent of subsampling size when compared communities have a Gini value

250

difference less than 0.08 (Fig.S6). To minimize this artefact, one should verify whether the 8

This article is protected by copyright. All righ1ts reserved.

Accepted Article 251

communities in the compared samples are similar in evenness; the difference in the Gini values

252

should not exceed 0.08.

253

Artefactual increases or decreases in β values with sampling size must be recognized, when

254

comparing communities with different evenness. This is especially important when very small β

255

diversity values are used to claim changes in community composition along a temporal, spatial, or

256

environmental gradient (e.g. Herlemann et al., 2011; Lauber et al., 2009; Thompson et al., 2011).

257 258 259

Randomly subsampling the observed meta-community to set a β diversity significance threshold

260

Replication is essential when exploring and comparing microbial diversity, especially, if inferences

261

about large-scale diversity are being drawn from local-scale sample diversity (Prosser, 2010).

262

Replicates, properly sampled across a spatial scale, can be treated as local communities of a meta-

263

community due to dispersal in physically connected systems (Van der Gucht et al., 2007).

264

Constructing representative meta-communities from local communities, as introduced by Hubbell

265

(2001), has been used to explain evolutionary and ecological processes shaping microbial diversity

266

patterns (Palacious et al. 2008). Here, we approximate the meta-community as the combination of

267

all replicate communities. We propose the diversity associated with random subsampling from the

268

meta-community (i.e., the meta-community’s β1AA' called β1MC) as a significance threshold against

269

which any pairwise β diversity between replicates can be compared.

270

Alternatively, a distribution of β1MC (β1AA' of meta-community) values can be obtained by

271

calculating β1MC between subsamples (e.g. 1000 comparisons) randomly drawn from the observed

272

meta-community. Then a p-value can be estimated by comparing observed β1A'B' to the pattern

273

expected under β1MC distribution. This methodology can then be applied with any chosen index to

274

estimate β diversity.

275

[TABLE 2]

276

The proposed method was applied to compare synthetic replicate communities (Table 2) with

277

similar evenness, observed OTUs, but different total individuals. With structurally similar

278

communities, we expect no significant differences after sample size equalization by rarefaction. In

279

addition, we implemented previously published β significance tests (Lozupone et al., 2007; Youssef

280

et al., 2010; Chase et al., 2011) to compare with the results from our method. The performace of the

281

proposed method was also tested at different degrees of βAB using synthetic communities which follow

282

the same evenness distribution (Fig.S7). 9

This article is protected by copyright. All righ1ts reserved.

Accepted Article 283

The β1MC of the meta-community, calculated using either the Bray Curtis or Jaccard index, far

284

exceeded the pairwise βAB estimates (Fig. 4A). All βAB diversities were found to be insignificant

285

using the proposed procedure. This outcome is expected as the differences between samples

286

originated from the number of individuals and could be equalized using rarefaction. However, using

287

either the approach of Youssef et al. (2010; Fig. 4B) or Lozupone et al. (2007; Fig. 4C), all in silico

288

samples were significantly different from each other. In addition, the Raup-Crick method (Chase et

289

al., 2011) detected 2 βAB diversities among 3 as insignificant and 1 βAB diversity as significant (see

290

Table S1). Although the Raup-Crick method uses a meta-community concept to estimate

291

significance thresholds (Chase et al., 2011), the inability to reject the beta diversity as significant in

292

this example, likely stems from uncorrected comparisons for β1AA' (in this case correction for shared

293

species). Therefore, β analyses, without correction for β1AA' or without consideration of a meta-

294

community, may not be sufficiently rigorous. To equalize sampling depth and avoid bias from

295

β1AA', we recommend the use of multiple equally sized subsamples of communities that are being

296

compared for any of  diversity measure or the use our method together with a choice of any β

297

measure.

298

[FIG. 4]

299

A case study: Community heterogeneity at fine-scale in rapid sand filters

300

The introduced methods were applied to examine dissimilarity between communities from replicate

301

samples within an individual and between different rapid sand filters. All filters were located at the

302

same waterworks and receive, on average, the same groundwater feed throughout their operation.

303

These libraries allowed us to consider both phylogeny-based (weighted UniFrac (Hamady et al.,

304

2010)) as well as taxon-based (Bray-Curtis and Jaccard) β diversity measures. Our goal was to

305

assess whether β diversity between spatially segregated replicate samples (replicates) was

306

significant, which would be indicative of sustained spatially heterogeneous conditions in each

307

biofilter. This case study, in addition, allowed us to compare significance thresholds from our

308

method versus previous methods.

309

A total of 68856 sequences passed quality checks and were binned into 7871 OTUs at 97%

310

phylogenetic similarity. Calculated Gini coefficients revealed similar evenness between the

311

replicates, fully permitting our methodology. On the other hand, based on typical descriptors like

312

Shannon index, Chao1 estimator, and observed OTU counts (Table 3), community structures

313

differed between the replicates and between the biofilters. Rank-abundance curves of dominant and

314

rare OTUs of all samples are shown in Fig.S8. 10

This article is protected by copyright. All righ1ts reserved.

Accepted Article 315

[TABLE 3]

316

Pairwise β values (Fig. 5) were calculated using both taxon and phylogeny based estimators as

317

follows: the observed β value was calculated and corrected for β1AA' at different levels of

318

rarefaction; the corrected β was then compared with the threshold β1MC estimated from sampling the

319

meta community at the same subsampling depth. If the corrected β exceeded β1MC, β was considered

320

significant (see Fig.S9).

321

This analysis revealed several key points. First, as observed before, β1MC decreases with

322

subsampling depth – irrespective of the metric – but β1MC is less subject to subsampling depth for

323

the phylogenetic diversity metric. Second, all corrected β values increase with sampling depth.

324

Hence, the ability to identify significant β diversity increases with increased sampling depth (at

325

smallest sampling depths, most pairwise β values were not significant), but less sampling depth is

326

required when using phylogeny-based measures (the smallest subsampling depth was sufficient to

327

identify significant differences in several instances). Third, different metrics have different abilities

328

to resolve replicate communities: the Jaccard-based β values are very similar for all pairwise

329

comparisons, while the Unifrac-based β values are clearly different for different replicate pairs.

330

Fourth, conclusions about significant differences between replicates are not always consistent

331

across the different dissimilarity indices. For example, for several samples the Jaccard-based β

332

diversity would suggest no difference (Replicate 2 vs. 3, Replicate 1 vs. 3 in biofilter 2; and

333

Replicate 1 vs. 2, and 2 vs. 3 in biofilter7) whereas, Weighted UniFrac and Bray Curtis measures

334

indicate the opposite. Finally, the UniFrac-based analysis more readily reveals significant βAB

335

diversities, because it uses additional information (pairwise similarities of sequences) (Lozupone et

336

al., 2008) beyond the taxon-based measures. As expected, the previously proposed – but less

337

stringent – methods to test β diversity concluded that all samples are significantly different in terms

338

of microbial diversity (Fig.S10).

339

Using these strict criteria to determine significance of β diversity values, we conclude that even

340

over the small spatial scales examined, microbial communities in the rapid sand filters are

341

significantly different: these differences stem in large part from the rare OTUs (Table S2), as seen

342

by others (Youssef et al., 2010).

343

[FIG. 5]

344

To evaluate the stringency of our threshold values versus previous methods, we compare the mean

345

value of β diversities between random trees, following the unweighted UniFrac significance method

346

(Lozupone et al., 2008) to the β1MC of our approach using same diversity measure (Fig. 6). The β1MC 11

This article is protected by copyright. All righ1ts reserved.

Accepted Article 347

is more dependent on sampling depth than the significance threshold of unweighted Unifrac:

348

significance thresholds estimated with unweighted UniFrac are stricter than β1MC at higher sampling

349

depth, and vice versa at lower levels of rarefaction. This mainly stems from increases of β1MC with

350

decreasing sampling depth (red circles in Fig. 4A-4B and 5) and similar mean UniFrac values of

351

random trees at multiple subsampling scales (Lozupone et al., 2011). Therefore, we suggest to use

352

of our method to validate the p-values calculated with any β significance testing method, as the

353

observed β diversity should exceed the mean β1MC for any pair of samples to be called dissimilar or

354

not subsets of each other.

355

Conclusions

356

In this study, we have identified two independent technical components of β diversity, β1AA' and

357

β2AA', which are associated with the necessary process of rarefaction or subsampling when

358

comparing sample sequence libraries. Observed pairwise β diversities are, hence, inflated by the

359

β1AA' of both samples, and we propose a procedure for correction. Additionally, we demonstrate that

360

β2AA' can add to the bias of the β calculations, especially when the difference in evenness of the

361

compared sample communities exceeds a critical value (Gini > 0.08). We suggest that evenness is

362

checked in all sequence libraries before performing β analysis. We propose to use the β1AA' of the

363

observed meta-community (β1MC) as a threshold value against which to compare any pairwise β

364

value of replicate samples. This method can be applied irrespective of the chosen β diversity metric.

365

We demonstrate that this threshold is more rigorous than previously proposed methods to assess

366

significance in β diversity. In addition, due to variation in degree of strictness at different sampling

367

scales, we also suggest to use our method to validate any chosen β significance testing method.

368

Finally, by using the proposed method and 454 16S tag based pyrosequencing, we demonstrate that

369

the microbial community composition of rapid sand filters is spatially heterogeneous. We believe

370

our technique is useful in studies that focus on elucidating spatial patterns in microbial composition

371

and that explore compositional heterogeneity in microbial communities in any engineered or natural

372

ecosystem.

373

Experimental Procedures

374

Empirical in silico datasets

375

In silico OTU libraries were created using the “Stats” package (version 2.15.1) in R (R

376

Development Core Team, 2011). Log-normal, uniform and non-central chi-squared (9 d.f.)

377

distributed OTU libraries were created using the “rlnorm”, “runif” and “rchisq” commands,

378

respectively, each containing 8000 individuals. Additional log-normally distributed OTU libraries 12

This article is protected by copyright. All righ1ts reserved.

Accepted Article 379

were created with 4700, 7500, 8000, 13000 and 13500 individuals. All OTU libraries were rounded

380

to integers and transformed into the biological observation matrix (biom) format. All used

381

commands are listed in the Fig.S3.

382

Site description and sampling (16S rRNA dataset)

383

Sediment samples were obtained from 3 different rapid sand filters (approx. dimensions of the

384

filterbed: 18 m2 (area) x 0.7 m (depth)) at a waterworks in Copenhagen. Core samples of 60 cm

385

length were taken at 3 randomly chosen locations at each filter. Each core was sectioned in 10 cm

386

increments, starting from the top 5 cm using a metal spatula. Sections were transferred to the

387

laboratory on ice and homogenized in sterile plastic bags using a rotating drum.

388

DNA extraction

389

A 0.5 gr. drained subsample of each homogenized section was subject to genomic DNA extraction

390

using the MP FastDNA™ SPIN Kit (MP Biomedicals LLC., Solon, USA) per manufacturer’s

391

instructions. The concentration and purity of extracted DNA was checked by NanoDrop 2000

392

spectrophotometer (NanoDrop Technologies, Wilmington, DE, USA). Equal volumes (10 µl) of the

393

DNA extracts of each section excluding top 5 cm from a core sample were combined and used for

394

further analysis.

395

PCR amplification and pyrosequencing

396

10 ng of extracted DNA was PCR amplified using Phusion (Pfu) DNA polymerase (Finnzymes,

397

Finland) and the 16S rRNA gene targeted (rDNA) universal primers PRK341F (5'-

398

CCTAYGGGRBGCAACAG-3') and PRK806R (5'-GGACTACNNGGGTATCTAAT-3') (Yu et al.,

399

2005). PCR was performed by the following cycle conditions: an initial denaturation at 98 °C for 30

400

s, 30 cycles of denaturation at 98 °C for 5 s, annealing at 56 °C for 20 s and elongation at 72 °C for

401

20 s, and a final elongation step at 72 °C for 5 min. Subsequent to PCR amplification, the samples

402

were held at 60 °C for 3 min and then immediately placed on ice. PCR products were analyzed and

403

cut from 1% agarose gel and purified by QIAEX II Gel Extraction Kit (QIAGEN). Adapters and

404

sample specific tags were added to the amplicon in a subsequent PCR amplification of 15 cycles at

405

same PCR conditions (Dowd et al., 2008). Amplified fragments were quantified using a QubitTM

406

fluorometer (Invitrogen) and sequenced in a two-region 454 run on a 70-75 GS PicoTiterPlate using

407

a Titanium kit and GS FLX pyrosequencing system at the National High-throughput DNA

408

Sequencing Center (University of Copenhagen, DK) per manufacturer instructions (Roche), as

409

described elsewhere (Sutton et al., 2013).

410

Bioinformatic analyses 13

This article is protected by copyright. All righ1ts reserved.

Accepted Article 411

All raw 16S rRNA sequences were quality-checked with Ampliconnoise (Quince et al., 2011) and

412

chimeras were removed using UCHIME (Edgar et al., 2011) using default parameters. Individual

413

sequences were classified into OTUs at 97% pairwise sequence identity (called OTU0.03) and

414

aligned against the Greengenes reference set (DeSantis et al., 2006) using the Pynast algorithm

415

(Caporaso, Bittinger, et al., 2010). Aligned representative sequences for each OTU were then used

416

to build phylogenetic trees using the “Fast Tree” method (Hamady et al., 2010). After quality

417

checks, all analyses were performed using QIIME software (Caporaso, Kuczynski, et al., 2010).

418

Abundant and rare OTUs in each sequence library were defined at 1% relative abundance threshold

419

(Gobet et al., 2010). With both in silico and real datasets, subsampling was employed to equalize

420

sample sizes for β diversity comparisons. A pseudo-random number generator (Mersenne twister

421

PRNG with NumPy`s default) was used for rarefaction analysis, to generate subsampled OTU

422

libraries (subsampling, without replacement). 10 random subsamples of each OTU library were

423

created and transferred into OTU-tables. Meta communities were created by combining related

424

OTU libraries and adding into the OTU-tables of original samples for further subsampling process.

425

Published β diversity significance methods including the Jackknife technique (Lozupone et al.,

426

2007) and the technique of Youssef et al. (2010) were conducted to the same analysis. Jackknife

427

process is a combination of any dissimilarity measure, UPGMA, PCoA and Jackknife techniques to

428

measure the robustness of the β diversity of any pair of samples by displaying the IQRs with

429

ellipses in Jackknife PCoA scatter plots. An IQR indicates the possible range where a random

430

subsample, at a certain sample size, can be located in. Raup-Crick, a published β diversity

431

significance method, was implemented using the R code in vegan package (version 2.0-10; Dixon,

432

2003) after sample size normalization at 4500 individuals. Calculations were implemented using the

433

model “r1” as the method in “oecosimu” function, which uses column marginal frequencies as

434

probabilities. In vegan package, the Raup-Crick code was originally developed by Chase et al.

435

(2011), and then modified by Jari Oksanen.

436

To assess to what extent β1AA' and β2AA' vary with evenness at different sampling depths, we run

437

simulations with communities having various degrees of evenness in R environment, and we chose

438

simulations having Gini differences of communities in the range of 0.013 to 0.385. Rarefaction at

439

different sampling scales, calculation and correction of Bray-Curtis β diversities were implemented

440

using phyloseq (McMurdie and Holmes, 2013) and vegan (Dixon, 2003) packages in R

441

environment. In addition, to determine the performance of our methods at different degree of βAB,

442

we created log normal distributed in silico OTU libraries having various degrees of evenness and 14

This article is protected by copyright. All righ1ts reserved.

Accepted Article 443

we chose comparisons having Bray-Curtis β diversities 0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, and 0.7 using

444

simulations in R environment. Rarefaction at different sampling scales, calculation and correction

445

of Bray-Curtis β diversities were implemented using phyloseq (McMurdie and Holmes, 2013) and

446

vegan (Dixon, 2003) packages in R environment.

447

To compare significance thresholds, β values estimated with unweighted UniFrac between

448

randomly created synthetic trees extracted using “unifrac.unweighted” command implemented in

449

Mothur pipeline (Schloss et al., 2009). Otutable and phylogenetic tree of the real 454 dataset for

450

Mothur analysis were transferred from QIIME environment and converted manually. Boxplot based

451

comparisons of β1MC and mean unweighted UniFrac β values of randomized trees were generated in

452

R environment.

453

Statistical analyses

454

Alpha diversity of synthetic and experimental OTU libraries was computed by counting observed

455

species and by using the Chao1 (Chao, 1987) and Shannon index (Magurran, 2004) algorithms in

456

QIIME. Evenness was assessed with the Gini coefficient (Gcorr), calculated geometrically by

457

calculating the area under the Lorenz curves per OTU0.03 pair (Wittebolle et al., 2009) with

458

correction to minimize bias (Edwards et al., 2011).The compositional dissimilarity between OTU

459

libraries (i.e. the β diversity) was calculated using seven (Abundance Jaccard, Bray-Curtis, Pearson,

460

Chisq,

461

Weighted_UniFrac) metrics for the synthetic and experimental OTU libraries, respectively. Results

462

from calculated distance matrices were displayed using PCoA. Mean and standard deviation of β

463

diversities, obtained from the distance matrices, were calculated using R software (R Development

464

Core Team, 2011).

465

To determine whether β diversity originates from rare or dominant OTUs, an OTU-table was

466

created containing samples with and without rare OTUs. The effect of rare OTUs was calculated by

467

subtracting the distance between adapted samples from the distance between original samples.

468

Acknowledgement

469

We thank George Kwarteng Amoako for his assistance in the sample collection. We also thank Dr

470

Waleed Abu Al-Soud, Dr. Sanin Musovic, Dr. Simon Rasmussen, Dr. Hans-Jørgen Albrechtsen and

471

Dr Arnaud Dechesne for their technical support or discussions on various elements of this

472

manuscript. This research was financed by The Danish Council for Strategic Research (Project DW

473

Biofilter). The authors declare that they have no conflict of interest.

Euclidean,

Manhattan,

Soergel)

or

nine

474

15

This article is protected by copyright. All righ1ts reserved.

(in

addition:

Unweighted

UniFrac,

Accepted Article 475 476

References

477 478 479

Anderson,M.J., Crist,T.O., Chase,J.M., Vellend,M., Inouye,B.D., Freestone,A.L., et al. (2011) Navigating the multiple meanings of β diversity: a roadmap for the practicing ecologist. Ecol. Lett. 14: 19–28.

480

Bell,T. (2010) Experimental tests of the bacterial distance-decay relationship. ISME J. 4: 1357–65.

481 482

Bray,J.R. and Curtis,J.T. (1957) An Ordination of the Upland Forest Communities of Southern Wisconsin. Ecol. Monogr. 27: 325–349.

483 484 485

Caporaso,J.G., Bittinger,K., Bushman,F.D., DeSantis,T.Z., Andersen,G.L., and Knight,R. (2010) PyNAST: a flexible tool for aligning sequences to a template alignment. Bioinformatics 26: 266–7.

486 487 488

Caporaso,J.G., Kuczynski,J., Stombaugh,J., Bittinger,K., Bushman,F.D., Costello,E.K., et al. (2010) QIIME allows analysis of high- throughput community sequencing data. Nat. Methods 7: 335– 336.

489 490

Chao,A. (1987) Estimating the Population Size for Capture-Recapture Data with Unequal Catchability. Biometrics 43: 783–791.

491 492 493

Chao,A., Chazdon,R.L., Colwell,R.K., and Shen,T.-J. (2004) A new statistical approach for assessing similarity of species composition with incidence and abundance data. Ecol. Lett. 8: 148–159.

494 495 496

Chase,J.M., Kraft,N.J.B., Smith,K.G., Vellend,M., and Inouye,B.D. (2011) Using null models to disentangle variation in community dissimilarity from variation in α-diversity. Ecosphere 2: art24.

497 498

Condit,R., Pitman,N., Leigh,E.G., Chave,J., Terborgh,J., Foster,R.B., et al. (2002) Beta-diversity in tropical forest trees. Science (80-. ). 295: 666–669.

499 500 501

DeSantis,T.Z., Hugenholtz,P., Larsen,N., Rojas,M., Brodie,E.L., Keller,K., et al. (2006) Greengenes, a chimera-checked 16S rRNA gene database and workbench compatible with ARB. Appl. Environ. Microbiol. 72: 5069–72.

502 503

Dixon,P. (2003) VEGAN, a package of R functions for community ecology. J. Veg. Sci. 14: 927– 930.

504 505 506

Dowd,S.E., Sun,Y., Wolcott,R.D., Domingo,A., and Carroll,J.A. (2008) Bacterial Tag–Encoded FLX Amplicon Pyrosequencing (bTEFAP) for Microbiome Studies: Bacterial Diversity in the Ileum of Newly Weaned Salmonella-Infected Pigs. Foodborne Pathog. Dis. 5: 459–472.

16

This article is protected by copyright. All righ1ts reserved.

Accepted Article 507 508

Edgar,R.C., Haas,B.J., Clemente,J.C., Quince,C., and Knight,R. (2011) UCHIME improves sensitivity and speed of chimera detection. Bioinformatics 27: 2194–200.

509 510 511

Edwards,A., Anesio,A.M., Rassner,S.M., Sattler,B., Hubbard,B., Perkins,W.T., et al. (2011) Possible interactions between bacterial diversity, microbial activity and supraglacial hydrology of cryoconite holes in Svalbard. ISME J. 5: 150–60.

512 513

Freestone,A.L. and Inouye,B.D. (2006) Dispersal limitation and environmental heterogeneity shape scale-dependent diversity patterns in plant communities. Ecology 87: 2425–32.

514 515

Gilbert,J. a, Steele,J. a, Caporaso,J.G., Steinbrück,L., Reeder,J., Temperton,B., et al. (2011) Defining seasonal marine microbial community dynamics. ISME J. 1–11.

516 517

Gobet,A., Quince,C., and Ramette,A. (2010) Multivariate Cutoff Level Analysis (MultiCoLA) of large community data sets. Nucleic Acids Res. 38: e155.

518 519

Gotelli,N.J. and Colwell,R.K. (2001) Quantifying biodiversity : procedures and pitfalls in the measurement and comparison of species richness. Ecol. Lett. 4: 379–391.

520

Gotelli,N.J. and Graves,G.R. (1996) Species diversity. In, Null Models in Ecology., pp. 21–46.

521 522 523

Van der Gucht,K., Cottenie,K., Muylaert,K., Vloemans,N., Cousin,S., Declerck,S., et al. (2007) The power of species sorting: local factors drive bacterial community composition over a wide range of spatial scales. Proc. Natl. Acad. Sci. U. S. A. 104: 20404–9.

524 525 526

Hamady,M., Lozupone,C., and Knight,R. (2010) Fast UniFrac: facilitating high-throughput phylogenetic analyses of microbial communities including analysis of pyrosequencing and PhyloChip data. ISME J. 4: 17–27.

527 528 529

Herlemann,D.P., Labrenz,M., Jürgens,K., Bertilsson,S., Waniek,J.J., and Andersson,A.F. (2011) Transitions in bacterial communities along the 2000 km salinity gradient of the Baltic Sea. ISME J. 1571–1579.

530 531

Hubbell,S.P. (2001) The Unified Neutral Theory of Biodiversity and Biogeography Princeton University Press, New Jersey.

532 533

Human,T. and Project,M. (2012) Structure, function and diversity of the healthy human microbiome. Nature 486: 207–14.

534 535

Hutcheson,K.A. (1970) A Test for Comparing Diversities Based on the Shannon Formula. J. Theor. Biol. 29: 151–154.

536 537

Jurasinski,G., Retzer,V., and Beierkuhnlein,C. (2009) Inventory, differentiation, and proportional diversity: a consistent terminology for quantifying species diversity. Oecologia 159: 15–26.

538 539

Koleff,P., Gaston,K.J., and Lennon,J.J. (2003) Measuring Beta Fiversity for Presence-Absence Data. J. Anim. Ecol. 72: 367–382.

17

This article is protected by copyright. All righ1ts reserved.

Accepted Article 540 541

Kraft,N.J.B., Cornwell,W.K., Webb,C.O., and Ackerly,D.D. (2007) Trait evolution, community assembly, and the phylogenetic structure of ecological communities. Am. Nat. 170: 271–83.

542 543 544

Lauber,C.L., Hamady,M., Knight,R., and Fierer,N. (2009) Pyrosequencing-based assessment of soil pH as a predictor of soil bacterial community structure at the continental scale. Appl. Environ. Microbiol. 75: 5111–20.

545 546

Legendre,P., Borcard,D., and Peres-neto,P.R. (2005) Analysing Beta Diversity : Partitioning The Spatial Variation of Community Composition Data. Ecol. Monogr. 75: 435–450.

547 548 549

Leibold,M. a., Holyoak,M., Mouquet,N., Amarasekare,P., Chase,J.M., Hoopes,M.F., et al. (2004) The metacommunity concept: a framework for multi-scale community ecology. Ecol. Lett. 7: 601–613.

550 551

Ley,R.E., Hamady,M., Lozupone,C., Turnbaugh,P.J., Ramey,R.R., Bircher,J.S., et al. (2008) Evolution of mammals and their gut microbes. Science 320: 1647–51.

552 553

Lozupone,C. a and Knight,R. (2008) Species divergence and the measurement of microbial diversity. FEMS Microbiol. Rev. 32: 557–78.

554 555 556

Lozupone,C., Hamady,M., Kelley,S.T., and Knight,R. (2007) Quantitative and qualitative beta diversity measures lead to different insights into factors that structure microbial communities. Appl. Environ. Microbiol. 73: 1576–85.

557 558

Lozupone,C., Hamady,M., and Knight,R. (2006) UniFrac--an online tool for comparing microbial community diversity in a phylogenetic context. BMC Bioinformatics 7: 371.

559 560

Lozupone,C., Lladser,M.E., Knights,D., Stombaugh,J., and Knight,R. (2011) UniFrac: an effective distance metric for microbial community comparison. ISME J. 5: 169–72.

561

Magurran,A.E. (2004) Measuring biological diversity Blackwell Publishing Ltd, Oxford.

562 563 564

Mandl,N. a., Kessler,M., and Robbert Gradstein,S. (2009) Effects of environmental heterogeneity on species diversity and composition of terrestrial bryophyte assemblages in tropical montane forests of southern Ecuador. Plant Ecol. Divers. 2: 313–321.

565 566

Martin,A.P. (2002) Phylogenetic Approaches for Describing and Comparing the Diversity of Microbial Communities. Appl. Environ. Microbiol. 68: 3673–3682.

567 568 569

McKnight,M.W., White,P.S., McDonald,R.I., Lamoreux,J.F., Sechrest,W., Ridgely,R.S., and Stuart,S.N. (2007) Putting beta-diversity on the map: broad-scale congruence and coincidence in the extremes. PLoS Biol. 5: e272.

570 571

McMurdie,P.J. and Holmes,S. (2013) phyloseq: an R package for reproducible interactive analysis and graphics of microbiome census data. PLoS One 8: e61217.

572 573

Moshe,K. and Matthew,S. (2004) Confidence Intervals and Hypothesis Testing for Beta Diversity. Ecology 85: 2895–2900. 18

This article is protected by copyright. All righ1ts reserved.

Accepted Article 574 575

Olszewski,T.D. (2004) A Unified Mathematical Framework for the Measurement of Richness and Evenness within and among Multiple Communities. Oikos 104: 377–387.

576 577

Palacios,C., Zettler,E., Amils,R., and Amaral-Zettler,L. (2008) Contrasting microbial community assembly hypotheses: a reconciling tale from the Río Tinto. PLoS One 3: e3853.

578 579

Parks,D.H. and Beiko,R.G. (2012) Measures of phylogenetic differentiation provide robust and complementary insights into microbial communities. ISME J. 7: 173–183.

580

Prosser,J.I. (2010) Replicate or lie. Environ. Microbiol. 12: 1806–10.

581 582

Quince,C., Lanzen,A., Davenport,R.J., and Turnbaugh,P.J. (2011) Removing noise from pyrosequenced amplicons. BMC Bioinformatics 12: 38.

583 584

R Development Core Team,R. (2011) R: A Language and Environment for Statistical Computing. R Found. Stat. Comput. 1: 409.

585

Ramette,A. (2007) Multivariate analyses in microbial ecology. FEMS Microbiol. Ecol. 62: 142–60.

586 587

Raup,D.M. and Crick,R.E. (1979) Measurement of Faunal Similarity in Paleontology. J. Paleontol. 53: 1213–1227.

588 589

Schloss,P.D. (2008) Evaluating different approaches that test whether microbial communities have the same structure. ISME J. 2: 265–75.

590 591 592

Schloss,P.D., Westcott,S.L., Ryabin,T., Hall,J.R., Hartmann,M., Hollister,E.B., et al. (2009) Introducing mothur: open-source, platform-independent, community-supported software for describing and comparing microbial communities. Appl. Environ. Microbiol. 75: 7537–41.

593 594

Shade,A., Gregory Caporaso,J., Handelsman,J., Knight,R., and Fierer,N. (2013) A meta-analysis of changes in bacterial and archaeal communities with time. ISME J. 1–14.

595 596

Stegen,J.C., Lin,X., Konopka,A.E., and Fredrickson,J.K. (2012) Stochastic and deterministic assembly processes in subsurface microbial communities. ISME J. 6: 1653–64.

597 598 599

Sutton,N.B., Maphosa,F., Morillo,J. a, Abu Al-Soud,W., Langenhoff,A. a M., Grotenhuis,T., et al. (2013) Impact of long-term diesel contamination on soil microbial community structure. Appl. Environ. Microbiol. 79: 619–30.

600 601 602

Thompson,F.L., Bruce,T., Gonzalez,A., Cardoso,A., Clementino,M., Costagliola,M., et al. (2011) Coastal bacterioplankton community diversity along a latitudinal gradient in Latin America by means of V6 tag pyrosequencing. Arch. Microbiol. 193: 105–114.

603 604 605

Tuomisto,H. (2010a) A diversity of beta diversities: straightening up a concept gone awry. Part 1. Defining beta diversity as a function of alpha and gamma diversity. Ecography (Cop.). 33: 2– 22.

19

This article is protected by copyright. All righ1ts reserved.

Accepted Article 606 607

Tuomisto,H. (2010b) A diversity of beta diversities: straightening up a concept gone awry. Part 2. Quantifying beta diversity and related phenomena. Ecography (Cop.). 33: 23–45.

608 609 610

Vellend,M., Verheyen,K., Flinn,K.M., Jacquemyn,H., Kolb,A., Van Calster,H., et al. (2007) Homogenization of forest plant communities and weakening of species-environment relationships via agricultural land use. J. Ecol. 95: 565–573.

611 612

Whittaker,R.. H.. (1972) Evolution and Measurement of Species Diversity. Int. Assoc. Plant Taxon. 21: 213–251.

613 614

Whittaker,R.H. (1960) Vegetation of the Siskiyou Mountains, Oregan and California. Ecol. Monogr. 30: 279–338.

615 616

Wittebolle,L., Marzorati,M., Clement,L., Balloi,A., Daffonchio,D., Heylen,K., et al. (2009) Initial community evenness favours functionality under selective stress. Nature 458: 623–6.

617 618 619

Youssef,N.H., Couger,M.B., and Elshahed,M.S. (2010) Fine-scale bacterial beta diversity within a complex ecosystem (Zodletone Spring, OK, USA): the role of the rare biosphere. PLoS One 5: e12414.

620 621 622

Zinger,L., Amaral-Zettler,L. a, Fuhrman,J. a, Horner-Devine,M.C., Huse,S.M., Welch,D.B.M., et al. (2011) Global patterns of bacterial beta-diversity in seafloor and seawater ecosystems. PLoS One 6: e24570.

623

20

This article is protected by copyright. All righ1ts reserved.

Accepted Article 624 625

Fig. Legends:

626 627 628

Fig. 1 Components of observed β diversity between sample A and B, calculated from rarefied

629

samples A' and B'. Numbers inside the boxes refer to the number of individuals. Equation estimates

630

the true β diversity between sample A and B by correcting the observed β diversity, βA῾B῾, with β1AA῾

631

and β1BB῾.

632

21

This article is protected by copyright. All righ1ts reserved.

Accepted Article 633

634 635

Fig. 2 Estimations of β diversities between identical replicates of an in silico OTU library using

636

Bray-Curtis and Jaccard indices. S, S' and S'' refer to the identical replicate samples. Blue, green

637

and red colors represent the replicate OTU libraries or their subsamples. (A, C) Observed β

638

diversities (β1AA' and βA'B') as a function of subsampling depth (B, D) β diversities as a function of

639

subsampling depth corrected per Eq. (2).

640

22

This article is protected by copyright. All righ1ts reserved.

Accepted Article 641

642 643

Fig. 3 β diversities of in silico OTU libraries with same richness but different evenness (see Table

644

1) calculated with Bray-Curtis and Jaccard indices. (A-C) β1AA' diversities as a function of

645

subsampling depth. Blue, green, orange and red colors represent the different OTU library and their

646

subsamples (B-D). βA'B' diversities, corrected per Eq. (2), as a function of subsampling depth.

647

23

This article is protected by copyright. All righ1ts reserved.

Accepted Article 648 649

Fig. 4 β diversity analysis of three log normal distributed in silico OTU libraries with different

650

individuals (4500, 7500 and 13500) using Bray-Curtis and Jaccard indices. βAB, βA'B' and β1MC

651

diversity as a function of subsampling depth using quantitative Bray-Curtis (A) and Jaccard (B)

652

index. Observed ( - - ) and corrected ( - -

653

different subsampling depths. The meta-community ( ) was created by combining all individuals in

654

all libraries. (C) PCoA plot (Bray Curtis) of in silico OTU libraries at 4500 and subsamples at 3500

655

individuals to simulate the subsampling approach (Youssef et al., 2010). (D) PCoA plot (Bray

656

Curtis) of in silico OTU libraries at 4500 individuals using Jackknife technique (Lozupone et al.

657

2012). IQRs were calculated at subsampling depth of 2000 individuals and represented by ellipses

658

(a representative Fig. is shown in the top-right corner).

659

) mean β diversities (per Eqn. (2)) are plotted at

24

This article is protected by copyright. All righ1ts reserved.

Accepted Article 660

661 662

Fig. 5 β diversity analysis applied to triplicate samples from triplicate biofilters using taxon based

663

and phylogeny based indices, at different sampling depths. (Replicate 1 vs. Replicate 2:

664

Replicate 1 vs. Replicate 3:

665

Eq.(2) using β1AA' of each sample at different subsampling depths. The meta-community was

666

created by combining all individuals of triplicate libraries, and β1MC (red) was calculated at

667

consistent sampling depth.

,

, Replicate 2 vs. Replicate 3: ). βAB (green) was corrected (blue) per

668

25

This article is protected by copyright. All righ1ts reserved.

Accepted Article 669

670 671 672

Fig. 6 Estimated β significance threshold values at sampling depths of 2500 and 5000 individuals.

673

Red line represents β1MC estimated with proposed method together in combination with Unweighted

674

UniFrac algorithm. Box plots represents the values (3 comparisons for a filter) of mean β diversities

675

between 1000 randomly created synthetic trees with the Unweighted UniFrac significance test.

676 677

26

This article is protected by copyright. All righ1ts reserved.

Accepted Article 678 679

Table legends

680

Table 1 Characteristics of in silico communities with different underlying distributions Community: Gini coefficient (Ginicorr.) Observed OTUs Chao1 Shannon index (H') Total Individuals

681 682

1 2

Log_normal 0.60 1000 1124 8.90 8000

Log_normal_trimmed1 Uniform 0.50 0.29 623 1000 623 1007 9.76 8.54 7497 8000

Chi-Squared2 0.26 1000 1000 9.81 8000

OTUs containing singletons and doubletons were removed from the Log_normal OTU library Chi-Squared OTU library was created with 9 degrees of freedom

683

27

This article is protected by copyright. All righ1ts reserved.

Accepted Article 684 685

Table 2 Characteristics of replicate log normal microbial communities and the derived meta-

686

community

687

Gini coefficient (Ginicorr.) Observed OTUs Chao1 Shannon entropy (H') Total individuals

Log_normal_a

Log_normal_b

Log_normal_c

0.51 1000 1087 9.16 4700

0.55 1000 1065 9.04 7500

0.54 1000 1019 9.17 13500

688 689

28

This article is protected by copyright. All righ1ts reserved.

Observed meta-community 0.54 1000 1006 9.14 25700

Accepted Article 690 691

Table 3 Summary descriptors of microbial community structures in replicate biofilter samples

Filter 1 Filter 2 Filter 3

692

Rep.1 Rep.2 Rep.3 Rep.1 Rep.2 Rep.3 Rep.1 Rep.2 Rep.3

Gini coefficient (Ginicorr.)

Observed OTU0.03

Chao1

Shannon entropy (H')

Total sequence count

0.91 0.90 0.90 0.83 0.86 0.86 0.82 0.81 0.86

533 614 563 1075 999 648 1262 1371 806

1496 1848 1659 3190 2999 1865 3579 5084 2277

3.33 3.84 3.66 5.36 4.95 4.29 6.09 6.39 4.72

7523 7649 7907 7712 5339 8447 8834 8415 7030

693 694

29

This article is protected by copyright. All righ1ts reserved.

An improved method to set significance thresholds for β diversity testing in microbial community comparisons.

Exploring the variation in microbial community diversity between locations (β diversity) is a central topic in microbial ecology. Currently, there is ...
2MB Sizes 0 Downloads 8 Views