Biometrics 70, 954–961 December 2014

DOI: 10.1111/biom.12238

Rotation-Based Multiple Testing in the Multivariate Linear Model Aldo Solari,1, * Livio Finos,2 and Jelle J. Goeman3 1

Department of Economics, Management and Statistics, University of Milano-Bicocca, Milan, Italy 2 Department of Statistical Sciences, University of Padua, Padua, Italy 3 Department for Health Evidence, Radboud University Medical Center, Nijmegen, The Netherlands ∗ email: [email protected] Summary. In observational microarray studies, issues of confounding invariably arise. One approach to account for measured confounders is to include them as covariates in a multivariate linear model. For this model, however, the application of permutation-based multiple testing procedures is problematic because exchangeability of responses, in general, does not hold. Nevertheless, it is possible to achieve rotatability of transformed responses at the cost of a distributional assumption. We argue that rotation-based multiple testing, by allowing for adjustments for confounding, represents an important extension of permutation-based multiple testing procedures. The proposed methodology is illustrated with a microarray observational study on breast cancer tumors. Software to perform the procedure described in this article is available in the flip R package. Key words:

Exchangeability; Familywise error rate; Microarray; Multiple testing; Permutation test; Rotation test.

1. Introduction Multiple testing has found its way in applications in genomics, where a large number of hypotheses are tested simultaneously. An archetypical example is a microarray study where the aim is to identify which of a large number of genes are differentially expressed between two groups of subjects. We will refer to this two-groups comparison for illustration purposes, although it is just one of the many instances of this problem. In microarray studies, however, random assignment of subjects to groups is not always feasible. In the context where the data arise from an observational study, the two groups may be defined by two different conditions, such as cancer and noncancer tissues, so one cannot control the assignment of subjects to conditions. Another often overlooked complication in microarray studies is batch effects, which occur because measurements are affected by laboratory conditions, reagent lots and personnel differences (Leek et al., 2010). In these contexts, issues of confounding will invariably arise. Confounding is a long-standing problem and precise formal definitions of related concepts can be found in Greenland, Robins, and Pearl (1999). By way of illustration, let us suppose that the two groups systematically differ, with older subjects belonging to one group and younger subjects belonging to the other group, and that age affects gene expression. Then, age is related both to group assignment and gene expression. It follows that a simple comparisons of average gene expression between groups that ignore age will be misleading because the group effect will be confounded with the effect of age. We can quantify the bias incurred by excluding a confounding covariate in the context where a linear regression model is appropriate and there is only one confounding covariate. First define the “correct” specification as y = γ0 + θ · x + γ · z + ε,

954

where y is the gene expression, x is the group covariate with coefficient θ, z is the age covariate with coefficient γ, and γ0 is the intercept term. If instead the confounding covariate, z, is ignored, one can fit the model y = γ0∗ + θ ∗ · x + ε∗ . When γ = 0, problems arise. On the one hand, if there is no group effect (θ = 0) and an imbalance in age in the two groups, then without adjusting for age, it can happen to observe a significant difference in expression between the two groups (ˆθ ∗ = 0), resulting in a false positive, as illustrated in the left plot of Figure 1. On the other hand, if there is a group effect (θ = 0) and an imbalance in age in the two groups, then without adjusting for age, it can happen that we observe no difference in expression between the two groups (ˆθ ∗ ≈ 0), resulting in a false negative, as illustrated in the right plot of Figure 1. The present work has primarily been motivated by the need for methodology that adequately addresses the problem of confounding. The proposed methodology will be illustrated with a microarray observational study on breast cancer tumors (Sotiriou et al., 2003) by discussing the comparison with a previous analysis that does not account for confounding (Korn and Freidlin, 2008). Multiple testing approaches with adjustments for confounding have been proposed in differential expression analyses for microarray data (Ghosh, 2008; Wagner et al., 2008; Heller, Manduchi, and Small, 2009). In particular, the multiple testing method proposed by Wagner et al. (2008) takes advantage of permutation testing to deal with the unknown dependence structures of p-values. Methods that do not incorporate the dependence structure of the p-values, such as the Bonferroni method, can be quite conservative when the genes are strongly dependent, resulting in a loss of power to detect truly differentially expressed genes. © 2014, The International Biometric Society

Rotation-Based Multiple Testing

955

− θ

group = 1 group = 0



− θ^* −

age

gene expression

gene expression

θ − − − θ^*

age

Figure 1. An artificial example showing the occurrence of a false positive (left) and of a false negative (right) which may arise by omitting the confounding covariate “age” from the model.

A popular permutation-based method for controlling the familywise error rate (FWER) is the method of Westfall and Young (1993), which provides finite-sample exactness and have been shown to be asymptotically optimal for a broad class of correlation structure (Meinshausen, Maathuis, and Buhlmann, 2011), although exchangeability is a key requirement for its validity (Pesarin, 2001; Westfall and Troendle, 2008). The method of Wagner et al. (2008), however, is based on permuting non-exchangeable residuals and, as such, it does not provide finite-sample exactness. With small sample sizes, this approach may lead to quite anticonservative tests, as we will show by simulations in Section 5. In this article we consider the multivariate linear model, where the multivariate response represents the gene expression profile, and the set of available covariates is divided into covariates of interest and covariates which play the role of confounders. When the model is correctly specified and it includes all confounding covariates, differentially expressed genes can be properly identified. For this model, we show that it is possible to construct a resampling-based multiple testing procedure that achieves finite-sample exactness at the cost of a distributional assumption, and how this result is linked to the theory of rotation testing (Langsrud, 2005; L¨ auter, Glimm, and Eszlinger, 2005; Perry and Owen, 2010). We argue that rotation-based multiple testing, by allowing for adjustments for confounding, represents an important extension of permutation-based multiple testing procedures such as that of Westfall and Young (1993). The outline of this article is as follows. In the next section, we give the formulation of the multivariate general linear model. Section 3 illustrates the concepts of exchangeability and rotatability and their role in permutation and rotation tests’ construction. A multiple testing procedure controlling the probability of k or more false rejections (k-FWER) is given in Section 4. In Section 5 we investigate, by means of simulations, the rotation procedure’s robustness properties under theoretically unfavorable conditions. The application of the proposed approach to a real microarray observational study on breast cancer tumors is presented in Section 6. Software to perform the procedures described in this article is available in the flip R package.

2. Multivariate Linear Model Suppose we have a sample of n independent subjects, for which we have an (n × m) matrix of m responses Y and an n × (q + c) design matrix, with c < n. We partition the design matrix into an n × q design matrix X of covariates of interest and an n × c matrix Z of potential confounders. Consider the multivariate general linear model defined by Y = X + Z + E,

(1)

where  and  are (q × m) and (c × m) matrices of parameters of interest and nuisance parameters, respectively. We usually suppose that the first column of Z equals 1n , a nvector of ones, which means that the first row of  contains the intercept term. The random matrix E is a matrix of errors. We do not assume a distributional form for E, but we specify its first two moments: E ∼ (0n×m , In ⊗ ),

(2)

where In denotes the (n × n) identity matrix and, for a random matrix M ∼ (A, B ⊗ C), the notation means that its elements have E(mij ) = aij and Cov(mij , mkl ) = bik cjl for matrices A, B and C of appropriate dimensions. Since we have assumed independent subjects, the rows of E are independent draws from some m-variate distribution with mean 0 and covariance , where  in the microarray setting represents the (m × m) gene-gene covariance matrix. The univariate linear model for the jth response is yj = Xθ j + Zγ j + ej ,

(3)

where yj , θ j , γ j , and ej are the jth column of Y , , , and E, respectively. Here ej has mean 0 and covariance σj2 In , where σj2 is the jth diagonal element of . The null hypothesis of no relationship between yj and X can be specified as Hj : θ j = 0.

(4)

956

Biometrics, December 2014

We have obtained a collection H1 , . . . , Hm of null hypotheses, one for each gene, where H=

m 

(5)

Hj ,

j=1

represents the global null hypothesis of no relationship between Y and X. In the microarray setting, this general framework looks as follows: m gene expressions are measured in n independent subjects (represented by Y ), and the aim is to test for association between each gene expression (represented by yj ) and a covariate of interest, such as a binary or continuous phenotype (represented by X), while taking into account additional subject characteristics, such as sex and age (represented by Z). 3. Null-Invariants The probability distribution of a random matrix can enjoy some useful symmetries. Starting with the pioneering work of de Finetti (1930), probabilistic symmetries have been considered in various contexts (Kallenberg, 2005), but two of them exchangeability and rotatability—stand out as especially interesting for permutation and rotation tests’ construction. Permutation and rotation tests’ construction can be seen as a special case of the randomization tests’ construction discussed in Lehmann and Romano (2005). This construction requires an algebraic group of transformations O such that d

Y = OY

under H,

resulted in identical distributions, that is, condition (6) does not hold. This problem can be remedied in the situation with a discrete confounding covariate, where we could permute the rows of Y within each covariate value, but condition (6) is fundamentally unrepairable in the situation with a continuous covariate unless we make a distributional assumption, as we will see. Consider projecting the response Y into the subspace orthogonal to the subspace spanned by the columns of Z. This can be done by premultiplying both sides of (1) by the projection matrix In − H, where H = Z(ZT Z)−1 ZT , obtaining ˆ = (In − H)X + (In − H)E. E

(7)

ˆ = (In − H)Y are the estimated residuals obtained Note that E from fitting the null model Y = Z + E. For model (7), we can define the exchangeability condition as d ˆ = ˆ E OE

under H,

(8)

for every permutation matrix O ∈ O. Wagner et al. (2008) and Zeng et al. (2011) considered perˆ but this construction gives permutation muting the rows of E, tests which are only asymptotically valid. In order to obtain ˆ should be row-exchangeable, that finite-sample exactness, E is, its null distribution should not change if we permute the ˆ To see the lack of exchangeability in (8), note that rows of E. under H we have

(6) ˆ ∼ (0n×m , (In − H) ⊗ ) E

d

for every n × n matrix O ∈ O, where “=” denotes equality in distribution. The practical usefulness of condition (6) is that it provides different observations OY which are equally likely under H. We refer to condition (6) as the randomization hypothesis and to transformations O as null-invariants. The name null-invariants comes from the fact that the transformation of Y by O does not change its null distribution (Goeman and Solari, 2010). In particular, when the null invariants are permutations, Y is called row-exchangeable, and when they are rotations, that is, orthogonal transformations, Y is called leftrotatable. 3.1. Exchangeability Exchangeability requires permutations as null-invariants. Here, the group O is formed by all the n! possible permutation matrices O, where O is a square matrix that have exactly one entry 1 in each row and each column and 0s elsewhere. It is easy to see that, in the particular case of no confounding covariates with Z = 1n , condition (6) holds for every permutation matrix O ∈ O, that is, Y is row-exchangeable. Indeed, in that case, the rows of Y are i.i.d. random vectors, and hence exchangeable, from some common m-variate distribution with mean vector (γ1 , . . . , γm )T and covariance matrix . For this model, permutation tests are both exact and distribution-free. However, when one or more confounding covariates are present, the rows of Y are independent but not identically distributed, and consequently permuting them would not have

and

ˆ ∼ (0n×m , O(In − H)O ⊗ ), OE T

thus a necessary condition for (8) to hold is (In − H) = O(In − H)OT , for every permutation matrix O ∈ O, which, in general, is not true (Commenges, 2003). 3.2. Rotatability The randomization hypothesis may be recovered, but only at the cost of a distributional assumption. Write (In − H) = QQT according to its eigenvalue decomposition, where Q is a semi-orthogonal (n × n − c) matrix whose columns are the eigenvectors corresponding to non-zero eigenvalues and QT Q = In−c . By premultiplying both sides of (7) by QT , we obtain

 +E  = X , Y

(9)

 = QT X, and E  = QT Y , X  = QT E. For model (9), where Y we can define the randomization hypothesis as  Y =O Y d

under H,

(10)

∈O  . Note that under for every (n − c) × (n − c) matrix O  H, we have Y ∼ (0(n−c)×m , In−c ⊗ ), and to achieve the ran-

Rotation-Based Multiple Testing domization hypothesis in (10), it is necessary for the null to satisfy invariants O

 In−c O  T. In−c = O  must be an orthogonal matrix with O O T = It follows that O  In−c . The group O formed by all possible orthogonal matrices  is the orthogonal group of degree n − c. Note or rotations O that, in contrast to permutations, which are finite in number, rotations are uncountably infinite. In addition, it is known that the class of distributions sat is the orthogonal group is the class of isfying (10) when O left-spherical distributions (Dawid, 1981; Finos, 2011). This  , we need to assume that means that to have rotatability of Y  is left-spherically distributed. Y By Lemma 1 in Dawid (1981), assuming that Y is left is left-spherically spherically distributed implies that Y distributed. However, the gain in generality that flows from allowing any left-spherical distribution for Y is somewhat illusory. Indeed, a classic result due to Maxwell (1860) shows that independence of observations and left-sphericity characterizes the normal law of errors, that is E ∼ N(0n×m , In ⊗ ).

(11)

Summing up, in the multivariate linear model, rotation tests are exact but not distribution-free, as they require  , which is fulfilled when E is multivariate left-sphericity for E normal. Without distributional assumptions, what we obtain is a weaker form of invariance, that we refer as second-moment null-invariance, which replaces (10) by

 Y ) and Cov(Y ) = Cov(O  Y ) under H.  ) = E(O E(Y

(12)

 This corresponds to second-moment exchangeability when O is a permutation matrix (Commenges, 2003). Rotation tests based on (12) are only asymptotically exact, but in any case they represent an improvement over the construction of Wagner et al. (2008) and Zeng et al. (2011), which is based ˆ and satisfies only first-moment on permuting the rows of E ˆ ˆ under H. null-invariance, that is, E(E) = E(OE) 4. Rotation-Based k-FWER Control A key feature of the rotation-based construction is that it provides a basis for carrying out multiple testing procedures. Controlling the FWER in microarray studies is typically too stringent a criterion, as it may lead to limited power to detect real differences. In this Section we will focus on the rotation-based version of the method of Korn and Freidlin (2008) for k-FWER control, although it is possible to provide the rotation-based version of other permutation-based multiple testing procedures, such as the method of Meinshausen (2006) for controlling the False Discovery Proportion. Unfortunately, permutation or rotation multiple testing procedures with finite-sample False Discovery Rate control are not straightforward to obtain. The permutation-based k-FWER controlling method of Korn and Freidlin (2008) is valid in the case of no confounding

957

covariates, that is, when Z = 1n . Indeed, Korn and Freidlin wrote in their discussion that “for situations in which a permutation test cannot be directly used, e.g., testing one independent variable in multiple regression [...] a promising approach is to use an approximate permutation test based on permuting the residuals.” Without loss of generality, we present the rotation-based procedure for controlling the k-FWER which uses p-values as  ) for the test statistics. We require that each p-value pj = pj (Y  hypothesis Hj must depend on Y only through its jth column  ) comes  yj . This requirement is fulfilled for example when pj (Y from the F statistic

  T  −1  T ˜j /q ˜T y j X(X X) X y   T  −1  T yj /(n − c − q) ˜T y j (In−c − X(X X) X )˜

,

which follows the F distribution with q and n − c − q degrees of freedom when the errors are normally distributed and Hj is true. ∈O  Y ) be the p-value calculated for  , let pj (O For any O  Y , and p(1) (O  Y ) ≤ · · · ≤ p(m) (O  Y ) the the rotated data set O ordered values. Define the random variable



 ) = max c : cα,k = cα,k (Y





O∈ O 

 Y ) < c}dμH ≤ α , 1{p(k) (O (13)

where 1{·} denotes the indicator function and μH is the Haar  (Stewart, 1980). In other words, probability measure on O cα,k is the α-quantile of the rotation distribution of the kth smallest p-value. As in Korn and Freidlin (2008), we consider the singlestep procedure that uses cα,k as critical value and rejects all the hypotheses Hj for which pj < cα,k . For the more powerful step-down procedure, see Romano and Wolf (2007). ForJ ⊆ {1, . . . , m}, let HJ be the intersection hypothesis  J denote the (n − c) × #J matrix consistHJ = j∈J Hj and Y yj , j ∈ J. ing of the columns  The following theorem, whose proof is given in the Appendix, shows that the single-step rotation procedure controls the k-FWER at α. Theorem 1. Suppose that the generalized randomization hypothesis

 Y J J = O Y d

under HJ ,

(14)

∈O  and J ⊆ {1, . . . , m}. Then the rotationholds for every O based single-step procedure that rejects any Hj for which pj is less than cα,k in (13) strongly controls the k-FWER at α. The condition (14) is a generalization of the randomization hypothesis (6) (Romano and Wolf, 2005). Indeed, although the randomization hypothesis does not hold if only a subset d

 Y under HJ with  = O of the hypotheses are true, that is, Y J ⊂ {1, . . . , m}, we just need it to hold for the subset of data  J used for the calculation of pj , j ∈ J. Y

958

Biometrics, December 2014

 is essenIt is worth noting that the group structure of O tial for the validity of the result, as remarked in Southworth, Kim and Owen (2008). Furthermore, the proof extends that given in Lehmann and Romano (2005), which is based on a group with finite elements (such as the permutation group), to a group with uncountably infinite elements (such as the orthogonal group). In practice, however, we need to replace cα,k in (13) by 

 ) = max c : ˆcα,k = ˆcα,k (Y

1

 #O



 Y ) < c} ≤ α , 1{p(k) (O

 O∈ O (15)

 because computationally we can only generate a finite set O  , by using for instance the of random orthogonal matrices O method described in Stewart (1980). It can be verified that the result of Theorem 1 still holds with ˆcα,k replacing cα,k for a sufficiently large number of random rotations. 5. Simulation Study We conducted a simulation study with the aim to better understand, for small to large sample sizes, the behavior of rotation tests in case of violations of the assumption of normality of errors given in (11). Furthermore, we were interested in how ˆ performs the procedure based on permuting the residuals E compared to the rotation procedure in terms of asymptotic convergence to the nominal level. In particular, we generated multivariate data according to model (1) where X is a n-vector with xi = 1{i ≤ n/2}, which defines the two groups to be compared, and Z is an n × 2 matrix containing the intercept term z1,i = 1 and a confounding covariate z2,i = i, i = 1, . . . , n. Parameters  and  were set to 01×n and 12×m , respectively, which means that the two groups were not differentially expressed, but there was a confounding factor. We considered four types of distribution for E: multivariate normal N(0n×m , In ⊗ ), heavy-tailed multivariate Student’s t Tg (0n×m , In ⊗ ) with g = 2 degrees of freedom, tailless Gaussian copula by applying the probability integral transform to N(0n×m , In ⊗ ) and multivariate skew normal SN(0n×m , In ⊗ , s) with shape parameter s equal to 100. We investigated different dimensions m = {1, 100}, sample sizes n = {4, 8, 16, 32, 64, 128} and  with on-diagonal elements equal to 1 and off-diagonal elements equal to ρ = {0, 0.5}. In total, the number of different combinations of error distribution, m, n, and ρ was 72. For each combination, the FWER was estimated from 5000 datasets as the average rejection rate of at least one true null hypothesis at the significance level α = 0.05. Both procedures were based on F  = 1000 random transformations. statistics with #O Simulation results are reported in Figure 2. As expected, we find that under non-normality the procedures’ error rate converges to the nominal level with increasing sample sizes in both the univariate (Figure 2U) and the multivariate (Figure 2M) scenarios. What is perhaps most surprising about Figure 2M is that, even under normality, the residualbased procedure is prone to a very anticonservative behavior with small sample sizes due to the lack of second-moment

null-invariance in (12). The performance of the two procedures in this simulation illustrates that, even if both are asymptotically valid, the rotation procedure performs reasonably well in moderately large sample sizes and outperforms the procedure based on permuting the residuals.

6. Application The data set by Sotiriou et al. (2003) consists of cDNA gene expression profiles from n = 99 tumor specimens from breast cancer patients. In addition to gene expression values for 7650 genes (probes), the data set contains informations about the estrogen receptor (ER) status (“+” or “−”) and the tumor grade (dichotomized in “1 or 2” and “3”) for each patient. ER biology plays a central role in breast carcinogenesis defining the configuration of the final tumor (Sotiriou et al., 2003). This motivates the research question of whether the ER status is associated with differential gene expression. However, despite tumor grade also affects the behavior of breast cancer patients, an adjustment for tumor grade in comparing ER statuses was not taken into account in earlier analyses of this data set. Korn and Freidlin (2008) used a single-step permutation procedure for controlling the k-FWER in two separate analyses: one analysis compares 65 patients with ER+ versus 34 patients with ER−, and the other analysis compares 54 patients with grade 1 or 2 tumors versus 45 patients with grade 3 tumors. Gene expression patterns were found to be strongly associated with ER status and moderately associated with tumor grade. Here we are concerned with the comparison of Korn and Freidlin’s analysis of ER+ and ER− patients, which does not account for tumor grade, and our analysis, that adjusts for it. There is a clear imbalance in the distribution of tumor grade across the ER+ and ER− groups: patients with grade 3 tumors are the 32.3% in the ER+ group, in contrast to 70.6% in the ER− group (a Fisher’s exact test for testing independence between estrogen receptor status and tumor grade gives a pvalue of 0.0005). Since tumor grade affects gene expression, the comparison of gene expression between ER+ and ER− patients that ignores tumor grade can be misleading because the estrogen receptor effect can be confounded with the tumor grade effect. Following Korn and Freidlin (2008), gene expression values were preprocessed as described in Sotiriou et al. (2003) and we restricted attention to the m = 7470 genes for which the number of missing values was less than the size of the ER− group minus 2. With reference to model (1), X is a vector of ER status, while Z is a matrix with two columns representing the intercept and the tumor grade covariate. By fitting a normal linear model for each gene, we obtained the values of 7470 two-sample t-test statistics for the ER coefficients. By controlling for the number of false positives at the level α = 0.05 and allowing for 10 errors (k = 11), the permutation procedure without correction for tumor grade identifies 258 genes, whereas the rotation procedure with correction for tumor grade identifies 180 genes. Both procedures were based  = 10 000 random transformations. The multiplicityon #O adjusted p-values from the two procedures are compared in Figure 3.

Rotation-Based Multiple Testing

0.10

64

M: Heavy−tail

M: Tail−less

M: Skew

0.20 0.15

FWER

0.05 n

128

64

32

8

16

4

128

64

32

16

8

4

0.00

0.05 0.00 128

64

32

8

16 n

0.10

0.20 0.15

FWER

0.05 0.00 4

128

64

32

16

8

0.10

0.20 0.15

FWER

0.10

0.20 0.15 0.10 0.05

n

0.25

M: Normal

0.25

n

0.25

n

128

32

8

16

4

64

128

32

8

16

4

0.00

0.02

0.04

Type I error

0.02 0.00 64

128

32

8

16

0.06

0.08

0.10 0.08 0.06

Type I error

0.02 0.00 4

64

128

32

8

4

16

n

0.25

n

0.00 4

U: Skew

0.04

0.08 0.06

Type I error

0.04

0.08 0.06 0.04 0.00

0.02

Type I error FWER

U: Tail−less

0.10

U: Heavy−tail

0.10

U: Normal

959

n

Figure 2. Estimated type I error and FWER for the rotation procedure (black line) and the procedure based on permuting the residuals (gray line) for U: m = 1 and M: m = 100 with solid line: ρ = 0 and dashed line: ρ = 0.5. Simulation margin of error for α = 0.05 is ±0.006, as displayed by the dotted lines. Among the discoveries, 171 genes were identified by both procedures, 87 by only the permutation procedure and 9 by only the rotation procedure. Although the rotation procedure discovered fewer genes than the permutation procedure, we have more confidence that the differential expression of the discovered genes can indeed be attributed to ER status. In fact, 58 out of the 87 genes (66.6%) discovered only by the permutation procedure showed significant associations with tumor grade (unadjusted p-values less that 0.05), while none of the 9 genes discovered only by rotation procedure did. We now assess how top ranked genes from the permutation analysis are reported in the rotation analysis. We focus on the top ranked genes because the agreement between the ranks of relevant genes is more meaningful than the agreement between the ranks of irrelevant genes. The proportion of genes in the top-258 list of the permutation analysis that are also in the top-258 list of the rotation analysis is 86.4%. Spearman’s and Kendall’s correlation coefficients between

the two rankings based on the set of the genes appearing in at least one top-258 list are 0.767 and 0.582, respectively, reflecting that the adjustment for tumor grade may be critical in judging the relevant genes.

7. Discussion We have presented a general framework for rotation-based multiple testing in the multivariate linear model. In observational microarray studies or more generally in highthroughput experiments, naive application of permutation methods may lead to invalid inferences because of the presence of confounding. Permutation testing is a very powerful and popular approach to multiple testing, and rotation testing represents an important extension of it. Software to perform the proposed approach is available in the R package flip, downloadable from CRAN. We have included in the Web Appendix a

Biometrics, December 2014

3 2 1 0

− log10 adj.p (rotation procedure)

4

960

0

1

2

3

4

− log10 adj.p (permutation procedure)

Figure 3. Comparison of multiplicity-adjusted p-values (in the − log10 scale) between the procedures with and without correction for tumor grade. At α = 0.05 and k = 10, the permutation and rotation procedures reject 180 and 258 hypotheses, respectively. Dark, open, and plus dots indicate the genes identified by both procedures, only the permutation procedure, and only the rotation procedure, respectively. commented R file showing the application to the example described in this article. 8. Supplementary Materials A Web Appendix, which contains the R code for producing the results described in Section 6, is available with this paper at the Biometrics website on Wiley Online Library.

Acknowledgements The authors thank Dr Lara Lusa (Institute for Biostatistics and Medical Informatics, University of Ljubljana) for providing the data set used in the application and the anonymous referees for several suggestions which have helped to improve the presentation of the article.

References Commenges, D. (2003). Exact and asymptotically robust permutation tests. Journal of Nonparametric Statistics 15, 171–185. Dawid, A. (1981). Some matrix-variate distribution theory: Notational considerations and a Bayesian application. Biometrika 68, 265–274. de Finetti, B. (1930). Funzione caratteristica di un fenomeno aleatorio. Accademia Nazionale dei Lincei. Mem. Serie VI. 4, 89–133. Finos, L. (2011). A note on left-spherically distributed test with covariates. Statistics & Probability Letters 81, 639–641. Ghosh, D. (2008). Multiple testing procedures under confounding. In: Beyond Parametrics in Interdisciplinary

Research: Festschrift in Honor of Professor Pranab K. Sen, N. Balakrishnan, E. A. Pe˜ na, and M. J. Silvapulle (eds), Volume 1, 243–256, Beachwood, Ohio: Institute of Mathematical Statistics. Goeman, J. and Solari, A. (2010). The sequential rejection principle of familywise error control. The Annals of Statistics 38, 3782–3810. Greenland, S., Robins, J. M., and Pearl, J. (1999). Confounding and collapsibility in causal inference. Statistical Science 14, 29–46. Heller, R., Manduchi, E., and Small, D. (2009). Matching methods for observational microarray studies. Bioinformatics 25, 904–909. Kallenberg, O. (2005). Probabilistic Symmetries and Invariance Principles. New York: Springer. Korn, E. and Freidlin, B. (2008). A note on controlling the number of false positives. Biometrics 64, 227–231. Langsrud, O. (2005). Rotation tests. Statistics and Computing 15, 53–60. L¨ auter, J., Glimm, E., and Eszlinger, M. (2005). Search for relevant sets of variables in a high-dimensional set up keeping the familywise error rate. Statistica Neerlandica 59, 298– 312. Leek, J. T., Scharpf, R. B., Bravo, H. C., Simcha, D., Langmead, B., Johnson, W. E., Geman, D., Baggerly, K., and Irizarry, R. A. (2010). Tackling the widespread and critical impact of batch effects in high-throughput data. Nature Reviews Genetics 11, 733–739. Lehmann, E. L. and Romano, J. P. (2005). Testing Statistical Hypotheses. New York: Springer. Maxwell, J. C. (1860). Illustrations of the dynamical theory of gases. Philosophical Magazine 19, 964–970. Meinshausen, N. (2006). False discovery control for multiple tests of association under general dependence. Scandinavian Journal of Statistics 33, 227–237. Meinshausen, N., Maathuis, M., and Buhlmann, P. (2011). Optimality of the Westfall-Young permutation procedure for multiple testing under dependence. The Annals of Statistics 39, 2795–3443. Perry, P. and Owen, A. (2010). A rotation test to verify latent structure. Journal of Machine Learning Research 11, 603– 624. Pesarin, F. (2001). Multivariate Permutation Tests: With Applications in Biostatistics. Chichester: Wiley. Romano, J. and Wolf, M. (2005). Exact and approximate stepdown methods for multiple hypotheses testing. Journal of the American Statistical Association 100, 94–108. Romano, J. and Wolf, M. (2007). Control of generalized error rates in multiple testing. The Annals of Statistics 35, 1378– 1408. Sotiriou, C., Neo, S., McShane, L., Korn, E., Long, P., Jazaeri, A., Martiat, P., Fox, S., Harris, A., and Liu, E. (2003). Breast cancer classification and prognosis based on gene expression profiles from a population-based study. Proceedings of the National Academy of Sciences 100, 10393–10398. Southworth, L., Kim, S., and Owen, A. (2008). Properties of balanced permutations. Journal of Computational Biology 16, 625–638. Stewart, G. W. (1980). The efficient generation of random orthogonal matrices with an application to condition estimators. SIAM Journal on Numerical Analysis 17, 403–409. Wagner, B. D., Zerbe, G. O., Mexal, S., and Leonard, S. S. (2008). Permutation-based adjustments for the significance of partial regression coefficients in microarray data analysis. Genetic Epidemiology 32, 1–8.

Rotation-Based Multiple Testing Westfall, P. H. and Troendle, J. F. (2008). Multiple testing with minimal assumptions. Biometrical Journal 50, 745–755. Westfall, P. H. and Young, S. S. (1993). Resampling-Based Multiple Testing: Examples and Methods for p-Value Adjustment. New York: Wiley. Zeng, C., Pan, Z., MaWhinney, S., Bar-n, A., and Zerbe, G. (2011). Permutation and F distribution of tests in the multivariate general linear model. The American Statistician 65, 31–36.

961

 is a group in the algebraic sense, it holds O = Because O  ◦O  , thus O J J  Y )  ) = cα,k cα,k (Y (O

and by the generalized randomization hypothesis, under HJ , ∈O  for all O J J  Y ) < cα,k  Y )}.  ) < cα,k  )} = 1{pJ(k) (O 1{pJ(k) (Y (Y (O

Received July 2013. Revised April 2014. Accepted June 2014.

d

Consequently Appendix Proof of Theorem 1. Let J ⊆ {1, . . . , m} be the set of indices  Y ) ≤ corresponding to true null hypotheses, and let pJ(1) (O

 Y ) be the ordered values of {pj (O  Y ), j ∈ J}, for · · · ≤ pJ(#J) (O

∈O  . Define the random variable any O 

J  ) = max c : (Y cα,k

 O∈ O 



 Y ) < c}dμH ≤ α , 1{pJ(k) (O

 satisfying where μH is the Haar probability measure on O  ◦  ) = 1 and μH (  μH (O S) = μH (O S) for any measurable  S⊆O ∈O . and any O





J  ) < cα,k ) P pJ(k) (Y (Y

 =



 O∈ O



=E



J  Y ) < cα,k  Y )} dμH E 1{pJ(k) (O (O

O∈ O 

J  Y ) < cα,k  )}dμH 1{pJ(k) (O (Y



≤ α,

where in the last equality we interchange the order of integration and the expectation by the Fubini’s theorem. Finally, J  ) ≤ cα,k  ) with cα,k (Y ) defined because by construction cα,k (Y (Y





 ) < cα,k (Y ) in (13), we obtain P pJ(k) (Y

≤ α.

Copyright of Biometrics is the property of Wiley-Blackwell and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use.

Rotation-based multiple testing in the multivariate linear model.

In observational microarray studies, issues of confounding invariably arise. One approach to account for measured confounders is to include them as co...
322KB Sizes 2 Downloads 5 Views