Appl. Statist. (2014) 63, Part 3, pp. 499–514

Optimal sampling ratios in comparative diagnostic trials

Ting Dong, Liansheng Larry Tang and William F. Rosenberger
George Mason University, Fairfax, USA

[Received January 2013. Final revision July 2013]

Summary. A subjective sampling ratio between the case and the control groups is not always an efficient choice to maximize the power or to minimize the total required sample size in comparative diagnostic trials. We derive explicit expressions for an optimal sampling ratio based on a common variance structure shared by several existing summary statistics of the receiver operating characteristic curve. We propose a two-stage procedure to estimate adaptively the optimal ratio without pilot data. We investigate the properties of the proposed method through theoretical proofs, extensive simulation studies and a real example in cancer diagnostic studies.

Keywords: Area under the curve; Diagnostic accuracy; Partial area under the curve; Power; Receiver operating characteristic curve; Two-stage design

1. Introduction

Diagnostic trials evaluate the diagnostic accuracy of a marker or compare the diagnostic accuracy of two markers. For example, in a diagnostic trial by Hendrick et al. (2008), investigators compared the accuracy of digital mammography with screen film mammography. Pepe et al. (2001) referred to these trials as phase III diagnostic trials. In these trials, the true disease status of subjects is known. To evaluate the diagnostic accuracy of a binary marker, sensitivity and specificity are used. Sensitivity is the probability of having a positive test result for a case subject. Specificity is the probability of having a negative test result for a control subject. The false positive rate (FPR) is 1− specificity. For continuous markers, we obtain the sensitivity and FPR on the basis of a threshold that distinguishes the test result as being positive or negative. Varying thresholds allow a number of sensitivities and FPRs to be computed simultaneously. The receiver operating characteristic (ROC) curve plots sensitivities against FPRs for all thresholds (Zhou et al., 1998, 2002). Typically the ratio between the number of cases versus the number of controls is fixed in advance. A lung cancer prevention trial used an equal ratio with 71 prostate cancer cases and 71 age-matched controls (Etzioni et al., 1999). Some studies use other case–control ratios; for example, the controls who were enrolled for prostate cancer screening were four times as many as the cases in the Physicians Health Study (Etzioni et al., 2003). In a cancer diagnostic trial from Goddard and Hinberg (1990), 135 cancer patients and 218 non-cancer patients were recruited. A traditional biomarker A and newly developed diagnostic biomarkers were used to test blood samples from each subject. The power for comparing biomarkers A and D is below 45% by using

Address for correspondence: Liansheng Larry Tang, Department of Statistics, George Mason University, MS 4A7, 4400 University Drive, Fairfax, VA 22030, USA. E-mail: [email protected] © 2013 Royal Statistical Society

0035–9254/14/63499


the original sampling ratio of 0.62. Thus, these ratios may not be the best choice to maximize the test power, and optimal sampling ratios need to be derived and utilized to improve the power. Janes and Pepe (2006) provided the optimal ratio for evaluating a continuous marker to maximize the power for a fixed total sample size (SS). Their method is the first attempt to address the optimal sampling ratio in diagnostic trials. However, pilot data are required to estimate the optimal ratio. Without pilot data, some distributions must be assumed to perform the calculations. An optimal ratio from an incorrect distributional assumption may lead to an underpowered study. It is desirable to recalculate the optimal ratio when data become available during the trial. In addition, optimal ratios for comparative diagnostic trials have not been discussed in the literature. We make the following methodological contributions in this paper: (a) a design to update the optimal ratio for evaluating a single marker with questionable parametric assumptions; (b) extension of Janes and Pepe (2006) to two continuous markers and ordinal markers; (c) a design to update the optimal ratio for comparing two markers and rigorous proof of its properties. In this paper we first derive a general expression for the optimal sampling ratio of cases to controls in diagnostic trials. The ratio proposed is based on a common variance structure that is shared among existing ROC summary statistics. Special cases of these statistics include the non-parametric area under the ROC curve statistic AUC that was proposed by DeLong et al. (1988) and the weighted AUC-statistic by Wieand et al. (1989). The method proposed can be used in evaluating one marker or comparing two markers. The rest of the paper is organized as follows. In Section 2, we start with the optimal ratios for diagnostic trials. 
In Section 3, we propose a two-stage method to incorporate the idea of internal pilot data to estimate adaptively the optimal sampling ratio. The method can be applied in trials that evaluate one marker or compare two markers. We show that, although the optimal ratio is updated during a diagnostic trial, the analysis at the end of the trial can be carried out in the same fashion as in the traditional trial without affecting the nominal type I error rate. Section 4 illustrates the increase in power and the savings on the overall required SS by using the proposed method through a cancer example. Section 5 investigates benefits of the proposed procedure through extensive simulation studies. Some discussion is provided in Section 6. The data that are analysed in the paper can be obtained from http://wileyonlinelibrary.com/journal/rss-datasets

2. Optimal sampling ratio

Suppose that we have $N$ subjects with $m$ cases and $n$ controls. Each subject is measured by diagnostic test $l$ ($l = 1, 2$). We define the $i$th case as $X_{li}$, where $i = 1, \ldots, m$, and the $j$th control as $Y_{lj}$, where $j = 1, \ldots, n$. The joint cumulative survival functions for cases are $(X_{1i}, X_{2i}) \sim S_d(x_1, x_2)$ and the joint cumulative survival functions for controls are $(Y_{1j}, Y_{2j}) \sim S_{\bar d}(y_1, y_2)$. Their marginal survival distributions are $X_{li} \sim S_{d,l}(x)$ and $Y_{lj} \sim S_{\bar d,l}(y)$ respectively. For the threshold $c$ varying in $(-\infty, \infty)$, the sensitivity is $S_{d,l}(c) = \Pr(X_{li} > c)$, and the FPR is $S_{\bar d,l}(c) = \Pr(Y_{lj} > c)$. Consequently, the ROC curve for test $l$ is defined as $R_l(u) = S_{d,l}\{S_{\bar d,l}^{-1}(u)\}$, where the FPR $u$ falls within $[0, 1]$.

Summary measures for a single ROC curve include the area under the ROC curve, AUC, the partial AUC, pAUC, and the weighted AUC, wAUC. wAUC for marker $l$, $\Omega_l = \int_0^1 R_l(u)\,dW(u)$, was given by Wieand et al. (1989) where $W(u)$ is a probability measure. We let $W(u)$ be a point $u_0$, an FPR, to calculate the sensitivity of a test, or $W(u) = u$, where $u \in (0, 1)$, to obtain AUC. When $W(u) = (u - u_0)/(u_1 - u_0)$, where $u \in (u_0, u_1)$, $\Omega_l$ gives the partial AUC.

The statistics for comparing markers might be parametric (Mazumdar and Liu, 2003), or non-parametric (DeLong et al., 1988; Wieand et al., 1989). Let $\theta$ be the parameter in the ROC comparison, and $\hat\theta$ be the estimator. On the basis of the variance expressions for these ROC statistics, we identify the following common structure for the variance of the aforementioned ROC statistics when the sample sizes become large:

\[ \mathrm{var}(\hat\theta) = \frac{v_x}{m} + \frac{v_y}{n}, \tag{1} \]

where $v_x$ is the variance associated with measurements of case patients and $v_y$ is the variance related to control patients. In this paper we use the non-parametric statistics of DeLong et al. (1988) and Wieand et al. (1989). We present the variance expressions for these statistics in Sections 2.1.2 and 2.2. A similar variance structure for a conventional binormal ROC statistic of Mazumdar and Liu (2003) is presented in Appendix A.

Given the variance structure in equation (1), the total required SS in a diagnostic trial can be minimized by using an optimal sampling ratio when the variance is fixed. In other words, the power for comparing two markers can be maximized by using this optimal sampling ratio. Suppose that the total required SS in the diagnostic trial is $N = m + n$; the sampling ratio is $r = m/n$. Let the variance of $\hat\theta$ be a fixed constant, $a$. Since $m = rn = Nr/(1 + r)$, it follows that

\[ \frac{v_x}{m} + \frac{v_y}{n} = \frac{(1 + r)(v_x/r + v_y)}{N} = a. \]

The total required SS can then be expressed as

\[ N = \frac{(1 + r)(v_x/r + v_y)}{a}. \]

To minimize $N$, we take the first derivative with respect to $r$ and equate it to 0. We obtain $v_y/a - v_x r^{-2}/a = 0$. By solving this equation, the optimal sampling ratio is obtained as

\[ r^* = \sqrt{v_x/v_y}. \tag{2} \]
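The minimization leading to equation (2) can be checked numerically. The following is a minimal sketch, not part of the original derivation: the function names and the values of `v_x`, `v_y` and `a` are hypothetical and chosen only for illustration.

```python
import math


def optimal_ratio(v_x, v_y):
    """Case-to-control ratio r* = sqrt(v_x / v_y), as in equation (2)."""
    return math.sqrt(v_x / v_y)


def total_sample_size(r, v_x, v_y, a):
    """Total N = (1 + r)(v_x / r + v_y) / a that gives var(theta_hat) = a."""
    return (1.0 + r) * (v_x / r + v_y) / a


# Hypothetical variance components and target variance (illustration only).
v_x, v_y, a = 0.09, 0.04, 0.001

r_star = optimal_ratio(v_x, v_y)
n_opt = total_sample_size(r_star, v_x, v_y, a)    # N at the optimal ratio
n_equal = total_sample_size(1.0, v_x, v_y, a)     # N at the equal ratio
```

Under these assumed inputs the optimal ratio needs a smaller total sample size than the equal ratio, and moving the ratio away from `r_star` in either direction increases the required total.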

It is worth noting that the optimal sampling ratio is analogous to the Neyman allocation ratio for clinical trials which has been widely used to reduce the overall SS for a fixed power. However, as will be seen from Sections 2.1 and 2.2, $v_x$ and $v_y$ in diagnostic trials take more complicated forms than those used in clinical trials which are commonly the variances of response variables in treatment and control groups. Interested readers are referred to Jennison and Turnbull (2000) and Rosenberger and Lachin (2002).

2.1. Optimal sampling ratio for continuous markers

The difference $\Delta = \Omega_1 - \Omega_2$ was used in Wieand et al. (1989) to compare the wAUCs for continuous data. Here the estimator $\hat\Omega_{l,m,n}$ of $\Omega_l$, for $l = 1, 2$, is obtained by substituting the empirical function estimators $\hat S_{d,l}$ for $S_{d,l}$ and $\hat S_{\bar d,l}$ for $S_{\bar d,l}$ in $\Omega_l$:

\[ \hat\Omega_{l,m,n} = \int_0^1 \hat S_{d,l}\{\hat S_{\bar d,l}^{-1}(u)\}\,dW(u). \]

The resulting $\Delta$-statistic is given by $\hat\Delta_{m,n} = \hat\Omega_{1,m,n} - \hat\Omega_{2,m,n}$. Hereafter the subscripts $m$ and $n$ in $\hat\Delta_{m,n}$ will be omitted unless necessary and the notation $\hat\Delta$ will be used. We shall need differentiability of the ROC functions for our main theorem. The following assumption guarantees this property.

Assumption 1. $S_{d,l}$ and $S_{\bar d,l}$ have continuous positive derivatives on $\mathbb{R}$. Let $S'_{d,l}$ and $S'_{\bar d,l}$ denote their derivatives.

Let

\[ w_{i,l} = \int_0^1 [R_l(u) - I\{X_{li} \geq S_{\bar d,l}^{-1}(u)\}]\,dW(u) \]

and

\[ v_{j,l} = \int_0^1 \bigl(R'_l(u)[I\{Y_{lj} \geq S_{\bar d,l}^{-1}(u)\} - u]\bigr)\,dW(u), \]

for $l = 1, 2$, where $R'_l(u) = S'_{d,l}\{S_{\bar d,l}^{-1}(u)\}/S'_{\bar d,l}\{S_{\bar d,l}^{-1}(u)\}$. The variances of $w_{i,l}$ and $v_{j,l}$ are

\[ \mathrm{var}(w_{i,l}) = \int_0^1\!\!\int_0^1 R_l(s \wedge t)\,dW(s)\,dW(t) - \left\{\int_0^1 R_l(s)\,dW(s)\right\}^2 \]

and

\[ \mathrm{var}(v_{j,l}) = \int_0^1\!\!\int_0^1 R'_l(s)\,R'_l(t)\,(s \wedge t)\,dW(s)\,dW(t) - \left\{\int_0^1 R'_l(s)\,s\,dW(s)\right\}^2. \]

Let $w_i = w_{i,1} - w_{i,2}$ and $v_j = v_{j,1} - v_{j,2}$. Further denote $\bar w_m = \sum_{i=1}^m w_i/m$ and $\bar v_n = \sum_{j=1}^n v_j/n$. Wieand et al. (1989) and Tang et al. (2008) studied the $\Delta$-statistic and showed that, under assumption 1, $\hat\Delta$ is $\bar w_m + \bar v_n + \Omega_1 - \Omega_2 + T^*_{m,n}$, where $T^*_{m,n}$ is a small order term with $T^*_{m,n}(m + n)^{-1/2}$ converging to 0 in probability, as $m, n \to \infty$ (Wieand et al. (1989), page 591). They also showed that

\[ \mathrm{var}(w_i) = \sum_{l=1}^2 \mathrm{var}(w_{i,l}) - 2\int_0^1\!\!\int_0^1 [S_d\{S_{\bar d,1}^{-1}(s), S_{\bar d,2}^{-1}(t)\} - R_1(s)\,R_2(t)]\,dW(s)\,dW(t), \tag{3} \]

and

\[ \mathrm{var}(v_j) = \sum_{l=1}^2 \mathrm{var}(v_{j,l}) - 2\int_0^1\!\!\int_0^1 R'_1(s)\,R'_2(t)\,[S_{\bar d}\{S_{\bar d,1}^{-1}(s), S_{\bar d,2}^{-1}(t)\} - st]\,dW(s)\,dW(t). \tag{4} \]

2.1.1. Optimal sampling ratio for evaluating one continuous marker

We start with one marker, say marker 1. It follows from the approximation of $\hat\Delta$ that $\hat\Omega_1$ is asymptotically equivalent to $\sum_{i=1}^m w_{i,1}/m + \sum_{j=1}^n v_{j,1}/n + \Omega_1$. The $w_{i,1}$s are independent, identically distributed random variables corresponding to measurements of cases and the $v_{j,1}$s are independent, identically distributed random variables related to measurements of controls. Following the general expression (2), we can see that the optimal sampling ratio for evaluating marker 1 on the basis of wAUC is given by $r_1^* = \sqrt{\mathrm{var}(w_{i,1})/\mathrm{var}(v_{j,1})}$. This ratio includes existing results for AUC by Hanley and Hajian-Tilaki (1997) and for the sensitivity by Janes and Pepe (2006).

wAUC $\hat\Omega_1$ estimates AUC when $W(u) = u$ for $0 < u < 1$. Consequently, the optimal ratio becomes

\[ \sqrt{\frac{E\{I(X_{1i} > Y_{1j})\,I(X_{1i} > Y_{1l})\} - E\{I(X_{1i} > Y_{1j})\}^2}{E\{I(X_{1i} > Y_{1j})\,I(X_{1k} > Y_{1j})\} - E\{I(X_{1i} > Y_{1j})\}^2}}, \]

for $i, k = 1, \ldots, m$, and $j, l = 1, \ldots, n$. This can be written in terms of placement values, $\sqrt{[\mathrm{var}\{S_{\bar d,1}(X_{1i})\}/\mathrm{var}\{S_{d,1}(Y_{1j})\}]}$, as shown in Janes and Pepe (2006).

When $W(u) = I\{u = u_0\}$ for $0 < u_0 < 1$, $\hat\Omega_1$ estimates the sensitivity at the FPR $u_0$ and the optimal ratio can be shown to reduce to

\[ r_{s,1}^* = \sqrt{\frac{R_1(u_0) - R_1^2(u_0)}{R'_1(u_0)^2 u_0 - R'_1(u_0)^2 u_0^2}}, \]

or

\[ \frac{\sqrt{[R_1(u_0)\{1 - R_1(u_0)\}/\{u_0(1 - u_0)\}]}}{R'_1(u_0)}. \]

This has been derived in Janes and Pepe (2006).

2.1.2. Optimal sampling ratio for comparing two continuous markers

Since the $w_i$s are random variables corresponding to measurements of case patients and the $v_j$s are also random variables related to measurements of control subjects, expression (2) gives the optimal ratio for comparing the difference between wAUCs:

\[ r^* = \sqrt{\mathrm{var}(w_i)/\mathrm{var}(v_j)}, \tag{5} \]

where the variances are given in equations (3) and (4).

Since $\hat\Delta$ compares AUCs, partial AUCs or sensitivities at a particular FPR, we discuss the optimal ratios for these special cases by specifying corresponding weight functions. When we let the weight function be $W(u) = u$, for $0 < u < 1$, $\hat\Delta$ compares the AUCs. The optimal ratio in equation (5) implies that the following ratio between the case and the control maximizes the power for comparing the AUCs $A$: $r_A^* = \sqrt{v_x^A/v_y^A}$, where

\[ v_x^A = \sum_{l=1}^2 [E\{I(X_{li} > Y_{lj})\,I(X_{li} > Y_{ll})\} - E\{I(X_{li} > Y_{lj})\}^2] - 2[E\{I(X_{1i} > Y_{1j})\,I(X_{2i} > Y_{2l})\} - E\{I(X_{1i} > Y_{1j})\}\,E\{I(X_{2i} > Y_{2l})\}] \tag{6} \]

and

\[ v_y^A = \sum_{l=1}^2 [E\{I(X_{li} > Y_{lj})\,I(X_{lk} > Y_{lj})\} - E\{I(X_{li} > Y_{lj})\}^2] - 2[E\{I(X_{1i} > Y_{1j})\,I(X_{2k} > Y_{2j})\} - E\{I(X_{1i} > Y_{1j})\}\,E\{I(X_{2k} > Y_{2j})\}], \tag{7} \]

as shown in Appendix A.

When $W(u) = I\{u = u_0\}$ for $0 < u_0 < 1$, $\hat\Delta$ compares the sensitivities of two markers at the FPR $u_0$. The optimal ratio in equation (5) reduces to

\[ r_s^* = \sqrt{\frac{\sum_{l=1}^2 \{R_l(u_0) - R_l^2(u_0)\} - 2A}{\sum_{l=1}^2 [R'_l(u_0)^2 u_0 - \{R'_l(u_0)\,u_0\}^2] - 2B}}, \]

where $A = \Pr\{X_{1i} > S_{\bar d,1}^{-1}(u_0), X_{2i} > S_{\bar d,2}^{-1}(u_0)\} - R_1(u_0)\,R_2(u_0)$ and $B = R'_1(u_0)\,R'_2(u_0)[\Pr\{Y_{1j} > S_{\bar d,1}^{-1}(u_0), Y_{2j} > S_{\bar d,2}^{-1}(u_0)\} - u_0^2]$.
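As an illustration of the single-marker sensitivity case in Section 2.1.1, the sketch below evaluates $r_{s,1}^*$ under a binormal model. The binormal form $R(u) = \Phi\{a + b\,\Phi^{-1}(u)\}$, with $a = \mu/\sigma_X$ and $b = \sigma_Y/\sigma_X$ for cases distributed as $N(\mu, \sigma_X^2)$ and controls as $N(0, \sigma_Y^2)$, is an assumption made here for illustration, and the function names are hypothetical.

```python
import math
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution


def binormal_roc(u, a, b):
    """ROC value R(u) = Phi(a + b * Phi^{-1}(u)) under the assumed binormal model."""
    return STD.cdf(a + b * STD.inv_cdf(u))


def binormal_roc_slope(u, a, b):
    """Derivative R'(u) = b * phi(a + b * q) / phi(q), where q = Phi^{-1}(u)."""
    q = STD.inv_cdf(u)
    return b * STD.pdf(a + b * q) / STD.pdf(q)


def optimal_ratio_sensitivity(u0, a, b):
    """Optimal case-to-control ratio for the sensitivity at FPR u0:
    sqrt[R(1 - R) / {u0 (1 - u0)}] / R'(u0)."""
    r = binormal_roc(u0, a, b)
    slope = binormal_roc_slope(u0, a, b)
    return math.sqrt(r * (1.0 - r) / (u0 * (1.0 - u0))) / slope


# Hypothetical setting: unit variances, case mean shifted by 1, FPR of 0.5.
ratio = optimal_ratio_sensitivity(0.5, a=1.0, b=1.0)
```

For an uninformative marker (`a = 0`, `b = 1`) the ROC curve is the diagonal with slope 1, so the sketch returns a ratio of 1, which is a convenient sanity check.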

2.2. Optimal sampling ratio for ordinal markers

The variance of the $\hat\Delta$-statistic involves the first derivatives of the ROC curves. The optimal ratio in equation (5) cannot be readily applied to ordinal data which often occur in radiology. We thus consider the non-parametric statistic by DeLong et al. (1988) to obtain the optimal ratio for comparing two ordinal markers which are usually two imaging modalities in radiology. Let $\Omega_l^A = P(X_{li} > Y_{lj}) + P(X_{li} = Y_{lj})/2$ for marker $l$, and $\hat\Omega_l^A$ be its estimator. DeLong's statistic estimates $\Delta^D = \Omega_1^A - \Omega_2^A$ and is given as

\[ \hat\Delta^D = \hat\Omega_1^A - \hat\Omega_2^A = \frac{1}{mn}\sum_{j=1}^n\sum_{i=1}^m (\psi_{ij}^{(1)} - \psi_{ij}^{(2)}), \]

where $\psi_{ij}^{(l)}$ equals 1 for $Y_{lj} < X_{li}$, $\tfrac{1}{2}$ for $Y_{lj} = X_{li}$ and 0 for $Y_{lj} > X_{li}$, for marker $l$, $l = 1, 2$. DeLong et al. (1988) showed that the large sample variance of $\hat\Delta^D$ has the form $\mathrm{var}(\hat\Delta^D) = v_x^D/m + v_y^D/n$, with

\[ v_x^D = \sum_{l=1}^2 \{E(\psi_{ij}^{(l)}\psi_{il}^{(l)}) - E(\psi_{ij}^{(l)})^2\} - 2\{E(\psi_{ij}^{(1)}\psi_{il}^{(2)}) - E(\psi_{ij}^{(1)})\,E(\psi_{il}^{(2)})\} \]

from the cases, and

\[ v_y^D = \sum_{l=1}^2 \{E(\psi_{ij}^{(l)}\psi_{kj}^{(l)}) - E(\psi_{ij}^{(l)})^2\} - 2\{E(\psi_{ij}^{(1)}\psi_{kj}^{(2)}) - E(\psi_{ij}^{(1)})\,E(\psi_{kj}^{(2)})\} \]

from the controls. Therefore, it follows from equation (2) that the ratio $r_D^* = \sqrt{v_x^D/v_y^D}$ maximizes the power for comparing two ordinal markers. For the problem of evaluating a single ordinal marker, the optimal ratio is reduced to

\[ \sqrt{\frac{E(\psi_{ij}^{(1)}\psi_{il}^{(1)}) - E(\psi_{ij}^{(1)})^2}{E(\psi_{ij}^{(1)}\psi_{kj}^{(1)}) - E(\psi_{ij}^{(1)})^2}}. \]
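One way to estimate $r_D^*$ from data is to plug in sample variances of the per-case and per-control averages of the kernel $\psi$, in the spirit of the structural components of DeLong et al. (1988). The sketch below is an illustrative assumption about how such a plug-in estimate could be coded, not the authors' implementation; all function names and the toy ordinal scores are hypothetical.

```python
import math


def psi(x, y):
    """Kernel of DeLong's statistic: 1 if y < x, 1/2 if y == x, 0 if y > x."""
    if y < x:
        return 1.0
    return 0.5 if y == x else 0.0


def sample_variance(values):
    """Unbiased sample variance."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


def components(cases, controls):
    """Per-case and per-control averages of the kernel for one marker."""
    per_case = [sum(psi(x, y) for y in controls) / len(controls) for x in cases]
    per_control = [sum(psi(x, y) for x in cases) / len(cases) for y in controls]
    return per_case, per_control


def estimated_optimal_ratio(cases1, controls1, cases2=None, controls2=None):
    """Plug-in estimate of sqrt(v_x / v_y) for one marker, or for the
    difference between two markers measured on the same subjects."""
    per_case, per_control = components(cases1, controls1)
    if cases2 is not None:
        per_case2, per_control2 = components(cases2, controls2)
        per_case = [p - q for p, q in zip(per_case, per_case2)]
        per_control = [p - q for p, q in zip(per_control, per_control2)]
    return math.sqrt(sample_variance(per_case) / sample_variance(per_control))


# Toy ordinal scores (hypothetical): 2 cases and 3 controls.
single = estimated_optimal_ratio([2, 3], [1, 2, 3])
paired = estimated_optimal_ratio([2, 3], [1, 2, 3], [3, 2], [1, 3, 2])
```

Taking the variance of the within-subject difference of components reproduces the "sum of marginal variances minus twice the covariance" structure of $v_x^D$ and $v_y^D$. The estimate is undefined when the control-side variance is zero, for example when both markers give identical scores.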

3. Two-stage procedure to obtain the optimal ratio

We may assume a parametric model to obtain the variances and resulting optimal ratios derived in the preceding section. When a parametric model is correctly specified, the optimal ratio can be calculated from equation (2) for comparing ROC summary measures, and the SS to obtain a specified power can be subsequently derived. However, if the parametric model is misspecified, the SS calculated may not give the appropriate power. Fig. 1 shows the optimal ratios for comparing the AUCs and pAUCs with the case and control having different variances. The case and control observations are from the bivariate normal distributions with $(X_1, X_2) \sim N\{(2, 2), \Sigma_x\}$, and $(Y_1, Y_2) \sim N\{(0, 0), \Sigma_y\}$, where $\Sigma_x$ has diagonal elements 1 and off-diagonal elements 0.1, and $\Sigma_y$ has diagonal elements $\sigma_Y^2$ and off-diagonal elements $0.1\sigma_Y^2$. We see that the optimal ratio decreases as $\sigma_Y$ increases from 0.8 to 1.3. This indicates that the variances of the case and the control play an important role in the optimal ratio. When the variance for the control is larger than that for the case, the optimal case-to-control ratio becomes smaller than 1, indicating that sampling more controls than cases yields a better power to detect a difference between markers. Thus, the misspecification of parametric models at the planning stage may lead to an incorrect optimal ratio.

For a fixed sample two-sided hypothesis test, to detect the difference between ROC summary measures, the required SSs $m$ and $n$ with power $1 - \beta$ and the significance level of $\alpha$ are given by

\[ m = rn = \frac{(z_{1-\alpha/2} + z_\beta)^2 (v_x + r v_y)}{\Delta_1^2}, \tag{8} \]

[Fig. 1 image not reproduced: two panels, (a) and (b), each plotting the optimal ratio (vertical axis, ticks from 0.6 to 1.6) against $\sigma_Y$ (horizontal axis, ticks from 0.8 to 1.3).]

Fig. 1. Optimal sampling ratio for comparing (a) the AUCs or (b) pAUCs: the case and control observations are from the bivariate normal distributions with $(X_1, X_2) \sim N\{(2, 2), \Sigma_x\}$ and $(Y_1, Y_2) \sim N\{(0, 0), \Sigma_y\}$, where $\Sigma_x$ has diagonal elements 1 and off-diagonal elements 0.1, and $\Sigma_y$ has diagonal elements $\sigma_Y^2$ and off-diagonal elements $0.1\sigma_Y^2$; the pAUCs are obtained over the FPR between 0 and 0.6

where $\Delta_1$ is the difference between ROC summary measures under the alternative hypothesis to be detected. The total required SS is $N = m + n$.

Proschan (2005) introduced the concept of internal pilot data which often refers to accumulated data after a trial has been carried out for a certain period of time. To correct for the model misspecification at the beginning of the trial, we propose a two-stage procedure to use internal pilot data after some observations are available during the trial. Suppose that the total required SS $N$ is fixed. Without loss of generality, we use a two-sided test in the procedure proposed. The procedure is given in the following steps.

Step 1: specify a parametric model to obtain $v_{x,0}$ and $v_{y,0}$, and the resulting initial optimal ratio $r_0^* = \sqrt{v_{x,0}/v_{y,0}}$.

Step 2: use the ratio $r_0^*$ together with $v_{x,0}$ and $v_{y,0}$ in the following SS formula to calculate initial SS $m_0$ and $n_0$ with power $1 - \beta$ and significance level $\alpha$, $m_0 = (z_{\alpha/2} + z_\beta)^2 (v_{x,0} + r_0^* v_{y,0})/\Delta_1^2$, and $n_0 = N - m_0$, where $\Delta_1$ is the difference between ROC summary measures under the alternative hypothesis.

Step 3: after sufficient marker measurements are available on $m_1$ cases and $n_1$ controls at the first stage, the variance expressions of either the $\Delta$-statistic or DeLong's statistic are recalculated by using available data. These variance estimators, $\hat v_{x,1}$ and $\hat v_{y,1}$, are applied in equation (2) to recalculate the optimal ratio $\hat r^* = \sqrt{\hat v_{x,1}/\hat v_{y,1}}$.

Step 4: continue the trial by recruiting $M_2$ cases and $N_2$ controls, where $M_2$ and $N_2$ are given by

\[ M_2 = \frac{N\hat r^*}{1 + \hat r^*} - m_1, \qquad N_2 = \frac{N}{1 + \hat r^*} - n_1. \tag{9} \]

It was shown in Proschan (2005) that using the internal pilot data for comparing population means in clinical trials maintains the nominal type I error rate. The reason is that the sample variance that is obtained at the end of the first stage does not give information for the sample mean at the end of the trial. We show that it is also true in our case as $m, n \to \infty$. Suppose that $\hat\Delta_{m,n}$ is the estimated $\Delta$ at the end of the stage with $m$ cases and $n$ controls. The variance estimators at the first stage are $\hat v_{x,1} = \sum_{i=1}^{m_1}(w_i - \bar w_{m_1})^2/m_1$ and $\hat v_{y,1} = \sum_{j=1}^{n_1}(v_j - \bar v_{n_1})^2/n_1$, where $w_i$ and $v_j$ are given in Section 2.1. We first state the results for $\bar w_m$ and $\bar v_n$ in the following theorem, and then state the result for the $\Delta$-statistic in the subsequent corollary. The proof is provided in Appendix A.

Theorem 1. Let $H_0: \Omega_1 = \Omega_2$. Assume that $m, n \to \infty$, $m_1/m \to \lambda_1$, $n_1/n \to \lambda_2$ and $m/n \to \lambda$, where $0 < \lambda < \infty$ and $0 < \lambda_1, \lambda_2 < 1$. Then,

\[ (\bar w_m, \hat v_{x,1})^T \sqrt{m} \xrightarrow{d} N\{(0, \sigma_{v,x}^2), \Sigma_w\}, \tag{10} \]

where $\sigma_{v,x}^2 = \mathrm{var}(\hat v_{x,1}\sqrt{m})$ and $\Sigma_w = \mathrm{diag}\{\mathrm{var}(\bar w_m\sqrt{m}), \mathrm{var}(\hat v_{x,1}\sqrt{m})\}$. Also, under assumption 1,

\[ (\bar v_n, \hat v_{y,1})^T \sqrt{m} \xrightarrow{d} N\{(0, \sigma_{v,y}^2), \Sigma_v\}, \tag{11} \]

where $\sigma_{v,y}^2 = \mathrm{var}(\hat v_{y,1}\sqrt{m})$ and $\Sigma_v = \mathrm{diag}\{\mathrm{var}(\bar v_n\sqrt{m}), \mathrm{var}(\hat v_{y,1}\sqrt{m})\}$.

Theorem 1 implies that $\bar w_m$ is asymptotically independent of $\hat v_{x,1}$, and $\bar v_n$ is asymptotically independent of $\hat v_{y,1}$. We also observe that both $\bar w_m$ and $\hat v_{x,1}$ are obtained on different subjects from $\bar v_n$ and $\hat v_{y,1}$. Thus, we can obtain the following corollary for $\hat\Delta_{m,n}$ by ignoring the small order term $T^*_{m,n}$ in the approximation of $\hat\Delta$.

Corollary 1. Under the regularity conditions in theorem 1, $\hat\Delta_{m,n}$ is asymptotically independent of $\hat v_{x,1}$ and $\hat v_{y,1}$ as $m, n \to \infty$.

Therefore, the variance estimated at the first stage does not give information for the $\Delta$-statistic at the end for large SSs. Thus, the resulting optimal ratio by using data from the first stage does not reveal information about the estimated difference between two ROC statistics obtained at the end of the second stage. Consequently, although the optimal ratio is updated during the trial, the analysis at the end of the trial can be carried out in the same fashion as in the trial without updating the optimal ratio. This is important in maintaining the proper type I error rate.

4. Application to the cancer diagnostic trial

We applied our method to the cancer diagnostic trial from Goddard and Hinberg (1990). Measurements from the blood samples are highly skewed for all biomarkers. We compared a new biomarker D and the reference biomarker A to illustrate the increment in power and the SS savings by using the procedure proposed. We assumed a contrast of $\Delta_1 = 0.05$ between AUCs and the type I error rate 0.05 for calculating power and SS from a two-sided alternative. The overall SS $N$ is 353 by summing the numbers of cases and controls. At the first stage, we accrued data on $m_1 = 60$ cancer and $n_1 = 60$ non-cancer patients and obtained the variance estimates $\hat v_{x,1} = 0.082$ and $\hat v_{y,1} = 0.035$, which resulted in the optimal case–control ratio $\hat r^* = 1.53$, from equation (2). Using this optimal ratio in the expression (9) in step 4 of the procedure proposed, the numbers of the cases and controls to be recruited in the second stage were calculated to be 153 and 80 respectively. The power by using the optimal ratio was then 50.9% from the equation

\[ 1 - \beta = \Phi\left[\Delta_1 \sqrt{\frac{N\hat r^*}{(1 + \hat r^*)(\hat v_{x,1} + \hat v_{y,1}\hat r^*)}} - z_{\alpha/2}\right]. \tag{12} \]

This power is about 7 percentage points higher than the power of 43.8% obtained from equation (12) when $\hat r^*$ is replaced with the original ratio of 0.62.

We also investigated the savings on the overall SS by using the procedure proposed. Using the original power 43.8% with the estimated optimal ratio $\hat r^* = 1.53$, the overall SS was calculated to be 292 with 177 cancer patients and 115 non-cancer patients. This offers savings of 61 patients over the original ratio.

5. Simulation studies
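The arithmetic of the application in Section 4 can be retraced with a short sketch of equations (9) and (12). The function names are hypothetical, the rounded ratio 1.53 reported in the text is used as the input, and the results should come close to the figures quoted there rather than match them digit for digit.

```python
import math
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution


def second_stage_sizes(n_total, r_hat, m1, n1):
    """Equation (9): cases and controls still to recruit (before rounding)."""
    m2 = n_total * r_hat / (1.0 + r_hat) - m1
    n2 = n_total / (1.0 + r_hat) - n1
    return m2, n2


def power(delta1, n_total, r, v_x, v_y, alpha=0.05):
    """Equation (12): approximate power of the two-sided test at ratio r."""
    z_crit = STD.inv_cdf(1.0 - alpha / 2.0)
    scale = math.sqrt(n_total * r / ((1.0 + r) * (v_x + v_y * r)))
    return STD.cdf(delta1 * scale - z_crit)


def total_size_for_power(target, delta1, r, v_x, v_y, alpha=0.05):
    """Equation (12) solved for the total sample size N."""
    z_sum = STD.inv_cdf(1.0 - alpha / 2.0) + STD.inv_cdf(target)
    return (z_sum / delta1) ** 2 * (1.0 + r) * (v_x + v_y * r) / r


# Inputs reported in Section 4.
v_x, v_y, delta1, n_total = 0.082, 0.035, 0.05, 353
r_hat = 1.53  # rounded value of sqrt(0.082 / 0.035) reported in the text

m2, n2 = second_stage_sizes(n_total, r_hat, m1=60, n1=60)
power_optimal = power(delta1, n_total, r_hat, v_x, v_y)
power_original = power(delta1, n_total, 0.62, v_x, v_y)
n_same_power = total_size_for_power(power_original, delta1, r_hat, v_x, v_y)
```

The sketch follows equation (12) in ignoring the rejection probability in the opposite tail, so it is an approximation of the two-sided power.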

In this section, we demonstrate the performance of our method for maximizing power or minimizing total SSs when comparing summary statistics of diagnostic tests in extensive simulation studies. We consider both continuous data and ordinal data.

5.1. Simulation studies based on continuous data

The biomarker results in the example used in Section 4 are highly skewed, and a log-normal distribution was used by Goddard and Hinberg (1990) as a possible approximation to the distribution of results. Thus, we consider bivariate log-normal distributions in the simulation studies. In addition, we simulate data from both bivariate normal distributions which are commonly used for symmetrically distributed marker results and bivariate exponential distributions which can approximate survival biomarker results.

The bivariate normal models have the forms $(X_1, X_2) \sim N\{(\mu_1, \mu_2), \Sigma_X\}$ and $(Y_1, Y_2) \sim N\{(0, 0), \Sigma_Y\}$, where the diagonal elements of $\Sigma_X$ and $\Sigma_Y$ are 1 and 9 respectively, and the correlation parameter $\rho$ is the same for two matrices. We choose $\rho = 0.1$ and $\rho = 0.25$ in our simulations. AUC is set to be 0.70 for marker 1, and 0.75 or 0.80 for marker 2. pAUC with the FPR in the range $(0, 0.6)$ is set to be 0.30 for marker 1, and 0.35 or 0.40 for marker 2.

The bivariate log-normal models have the forms $\exp(X_1, X_2)$ and $\exp(Y_1, Y_2)$ for cases and controls respectively. The AUCs and pAUCs remain the same as in the normal models. The log-normal distribution may also demonstrate the robustness of the aforementioned non-parametric methods. The performance of the methods is expected to be similar for the normal and log-normal distributions because the non-parametric estimators should remain invariant under monotone transformations.

According to the algorithm in Gumbel (1960), the bivariate exponential random variables take the form $H(x, y) = H_1(x)\,H_2(y)[1 + 4\rho\{1 - H_1(x)\}\{1 - H_2(y)\}]$, where $H_l$, $l = 1, 2$, are univariate exponential functions, and $\rho$ is in $[-0.25, 0.25]$. We set $\rho$ to 0.1 or 0.25 here. The marginal survival functions are $\exp(-\beta_{l1} x)$ and $\exp(-\beta_{l2} y)$, so we could generate data from these two distributions. In the simulation, AUC is set to 0.70 for marker 1, and 0.75 or 0.80 for marker 2. pAUC with the FPR in the range $(0, 0.6)$ is set to 0.30 for marker 1, and 0.35 or 0.45 for marker 2.

We compare the proposed two-stage procedure with the equal case–control ratio and the optimal ratio. We use DeLong's statistic for comparing the AUCs and the $\Delta$-statistic for comparing the pAUCs. In our simulation, we first assume that our samples were from bivariate normal distributions; then we use equation (8) to calculate the initial total required SS. With the type I error rate 0.05 and power 80%, the initial total required SSs are $N = 1421$, or $N = 326$ to detect the difference of two pairs of AUCs of $(0.70, 0.75)$ and $(0.70, 0.80)$ respectively, with $\rho = 0.1$. When $\rho = 0.25$, the total required SSs, $N = 1207$, or $N = 278$, are needed to detect the difference in these pairs. There are three different sampling ratios:

(a) the proposed two-stage optimal ratio;
(b) the optimal ratio of 0.5 for the normal and log-normal distributions and the optimal ratio of 1.5 for the exponential distributions;
(c) the equal sampling ratio.

To implement the method proposed, we let $m_1 = n_1 = N/4$. By substituting non-parametric variance estimates $\hat v_{x,1}$ and $\hat v_{y,1}$, the resulting optimal ratio is estimated by $\hat r^* = \sqrt{\hat v_{x,1}/\hat v_{y,1}}$, and $M_2$ and $N_2$ are calculated by using equation (9). We then generate $M_2$ new observations for cases and $N_2$ observations for controls. Consequently, the null hypothesis of equal AUCs or pAUCs is rejected in favour of the alternative if the absolute value of the calculated Z-statistic is greater than or equal to $z_{0.025}$. The simulated power is then calculated as the percentage of times out of 5000 simulation runs that the null hypothesis is rejected.
The simulated powers are presented in Table 1, which illustrates that the simulated powers of the two-stage method proposed are close to those of the optimal ratio and are greater than those of the equal sampling ratio in the normal settings. Since the optimal ratio for the exponential distribution specified is close to 1.5, we see that most of the powers of the method proposed are greater than those of fixed ratios.

Table 1. Simulated power for comparing AUCs or pAUCs by using the two-stage method proposed and fixed ratios, over 5000 simulations†

                      Powers (%) for comparing AUCs             Powers (%) for comparing pAUCs
                      AUC for              Fixed ratio          pAUC for             Fixed ratio
ρ     Distribution    marker 2  Two-stage  Equal   Optimal      marker 2  Two-stage  Equal   Optimal
0.10  BN              0.75      80.0       77.0    79.5         0.35      33.3       31.4    32.8
                      0.80      80.4       74.6    80.2         0.45      88.1       86.2    88.9
      LN              0.75      79.1       74.5    78.3         0.35      34.0       31.9    32.1
                      0.80      80.5       74.8    79.4         0.45      89.1       85.3    88.0
      BE              0.75      81.0       80.4    82.2         0.35      84.0       83.4    84.6
                      0.80      81.6       80.0    82.7         0.45      85.0       84.0    84.4
0.25  BN              0.75      82.2       78.0    81.7         0.35      37.3       34.8    37.5
                      0.80      81.0       77.6    80.1         0.45      91.8       89.4    92.1
      LN              0.75      82.0       78.7    81.5         0.35      37.0       34.2    36.7
                      0.80      83.5       78.1    82.6         0.45      92.7       89.8    92.3
      BE              0.75      83.7       82.6    83.3         0.35      91.2       90.3    91.0
                      0.80      83.6       82.8    84.4         0.45      90.9       90.7    90.8

†AUC for marker 1 is 0.70, and pAUC for marker 1 is 0.30. BN, bivariate normal; LN, bivariate log-normal; BE, bivariate exponential. ρ is the correlation coefficient of two markers. The optimal ratios for the bivariate normal and log-normal distributions are close to 0.5, and the optimal ratios for the bivariate exponential distribution are close to 1.5.

We also conduct simulation studies to illustrate that the method proposed reduces the total SS compared with the equal ratio. The aforementioned bivariate normal distribution is applied to simulate test results. We first calculate the initial total SS $N$ with the equal ratio, type I error rate 0.05 and power 80%. At the end of stage I, with $m_1 = n_1$ simulated test results from two groups, we update the case/control ratio with the estimated optimal ratio from the interim data, and recalculate the total SS that is needed to achieve 80% power on the basis of the estimated ratio. Additional test results are then generated according to the updated SS in two groups, and the Z-statistic is estimated. The null hypothesis of equal AUCs is rejected in favour of the alternative if the absolute value of the calculated Z-statistic is greater than or equal to $z_{0.025}$. The simulated power is given by the percentage of times out of 5000 simulation runs that the null hypothesis is rejected. The simulated power and the average updated total SS with $m_1 = n_1 = (N/5, N/7)$ are presented in Table 2, which illustrates that the two-stage method proposed reduces the total SS compared with the equal ratio. The simulated power of the two-stage method proposed is close to the nominal power for all parameterizations. In addition, the simulated power and updated SS vary little with different sizes at stage I.

Table 2. Average updated total SS and simulated power for comparing AUCs by using the proposed two-stage method over 5000 simulations†

       Results for m1 = n1 = N/5                      Results for m1 = n1 = N/7
       AUC for    Initial  Updated  Power             AUC for    Initial  Updated  Power
ρ      marker 2   SS       SS       (%)               marker 2   SS       SS       (%)
0.10   0.75       1744     1333     80.9              0.75       1744     1335     80.4
       0.80       405      311      80.6              0.80       405      313      79.1
0.25   0.75       1527     1160     80.3              0.75       1527     1161     80.2
       0.80       357      273      81.5              0.80       357      275      79.8

†The AUC for marker 1 is 0.70. ρ is the correlation coefficient of two markers.

We also evaluate the performance of the two-stage procedure to see whether the procedure maintains the nominal type I error rate. We use $N = 200, 400, 500$. We consider the parametric distributions and the three different sampling ratios that were used in the previous simulation. We assume equal AUCs or pAUCs with the AUCs being $(0.70, 0.75, 0.80)$, and the pAUCs being $(0.30, 0.35, 0.40)$. The nominal type I error rate is 0.05 in our simulation.
The simulated type I error rates with 10 000 simulation runs are shown in Table 3. All these rates are close to the nominal level when the sample size goes to 500.

Table 3. Type I error rates for comparing the AUCs or pAUCs by using the two-stage method proposed, over 10000 simulations†

                      Error rates (%) for comparing the AUCs    Error rates (%) for comparing the pAUCs
ρ     Distribution    AUCs   N = 200  N = 400  N = 500          pAUCs  N = 200  N = 400  N = 500
0.1   BN              0.70   4.5      5.0      5.0              0.30   4.8      5.1      5.0
                      0.75   5.1      5.0      4.9              0.35   5.0      4.9      5.0
                      0.80   4.9      5.1      4.9              0.40   5.2      5.1      5.5
      LN              0.70   4.9      5.0      5.0              0.30   4.6      5.2      5.1
                      0.75   5.1      4.9      5.1              0.35   4.6      5.1      5.0
                      0.80   5.0      4.4      5.0              0.40   5.0      5.1      4.9
      BE              0.70   5.0      5.1      5.0              0.30   5.2      4.9      5.0
                      0.75   5.0      4.9      4.9              0.35   5.3      5.0      5.1
                      0.80   5.2      5.1      4.9              0.40   4.7      4.9      5.1
0.25  BN              0.70   4.9      4.8      4.7              0.30   5.1      5.0      4.9
                      0.75   5.1      5.0      5.0              0.35   5.2      5.3      5.2
                      0.80   5.2      5.1      5.0              0.40   4.9      5.1      5.1
      LN              0.70   5.0      5.1      5.0              0.30   5.1      5.3      5.2
                      0.75   4.9      4.7      4.9              0.35   4.5      5.0      4.8
                      0.80   4.8      3.9      5.0              0.40   4.6      4.8      5.0
      BE              0.70   4.2      5.0      5.2              0.30   5.0      5.0      4.8
                      0.75   5.3      5.0      4.9              0.35   5.1      4.7      5.0
                      0.80   4.2      5.1      4.9              0.40   4.9      4.7      5.0

†BN, bivariate normal distribution; LN, bivariate log-normal distribution; BE, bivariate exponential distribution. N is the total required SS and ρ is the correlation coefficient of two markers.

5.2. Simulation studies based on ordinal data

We also conduct simulation studies to evaluate the simulated power of the proposed method on ordinal test results. We first use the aforementioned bivariate log-normal distributions and bivariate exponential distributions to simulate continuous results. We then use the 20th, 40th, 60th and 80th percentiles of the distributions to categorize the simulated continuous data as follows. A test result is recoded as 1 if it is less than the 20th percentile, 2 if it is between the 20th and 40th percentiles, 3 if it is between the 40th and 60th percentiles, 4 if it is between the 60th and 80th percentiles, and 5 if it is greater than the 80th percentile. The rest of the simulated settings are identical to those in the previous section on evaluating the power for continuous data. The results in Table 4 indicate that the simulated power by using the method proposed is similar to that of the optimal ratios and is higher than for those parameterizations using the equal ratio.

6. Discussion

The optimal sampling ratio in diagnostic trials can maximize the test power or minimize the overall SS. The optimal sampling ratio that is discussed in this paper is analogous to the optimal allocation ratio in assigning treatments to patients in clinical trials. The optimal allocation ratio has been used in clinical trials for decades, but the importance of the optimal ratio in diagnostic trials has not been widely recognized. Implementation requires the calculation of complicated variances of frequently used ROC statistics. This paper discusses a common variance structure for ROC statistics and thereby introduces optimal sampling ratios in comparative diagnostic trials based on these statistics. Two popular non-parametric ROC statistics are used to illustrate the explicit forms of the optimal ratios because their variance expressions can be written as the sum of separate terms; one relates to the cases, and the other relates to the controls.

If preliminary studies are available before carrying out a comparative diagnostic trial, the variance can be estimated by using pilot data to obtain the optimal ratio for comparing specified ROC summary measures. The ratio can then be used to recruit patients in the trial, and recalculating the ratio may not be necessary during the trial. However, when medical practitioners do not have preliminary data for the markers and are not certain about the distributions of the marker results, the distribution assumption that is used for obtaining the optimal ratio may be far from the true underlying distributions for the marker results. This may result in less power or larger overall SSs than using the true optimal ratio. The two-stage procedure proposed is then particularly useful to ensure that the optimal ratio can be recalculated by using internal pilot data during the trial. The procedure proposed performs well in a large-scale simulation study. We also demonstrate that the procedure proposed maintains the nominal type I error rate in the simulation.

Table 4. Simulated power for ordinal data for comparing AUCs by using the two-stage method proposed and fixed ratios, over 5000 simulations†

                                                                Fixed ratio power (%)
ρ      Distribution    AUC for marker 2    Two-stage power (%)    Equal    Optimal
0.10   LN              0.75                81.2                   75.3     79.7
                       0.80                82.0                   77.3     83.2
       BE              0.75                87.1                   84.7     88.6
                       0.80                86.6                   84.5     86.4
0.25   LN              0.75                80.9                   77.6     80.0
                       0.80                80.0                   78.6     80.2
       BE              0.75                89.0                   87.1     88.6
                       0.80                88.1                   87.2     88.9

†The AUC for marker 1 is 0.70. LN, bivariate log-normal distribution; BE, bivariate exponential distribution. ρ is the correlation coefficient of two markers.

We use an example in cancer diagnostic studies to illustrate the application of our method on maximizing the test power and saving overall SSs. The results indicate that, compared with the original sampling ratio, using the proposed two-stage procedure for a fixed overall sample size increases the test power. Alternatively, for a fixed test power, the procedure proposed reduces the overall SS by nearly 25%.

In some rare diseases, it may not be possible to recruit the required number of cases. Suppose that only 135 cancer patients can be recruited in the aforementioned cancer diagnostic trial. If the calculated optimal ratio of 1.53 is maintained, then 89 non-cancer patients should be in the trial. This leads to a total SS of $N = 224$. Using the power calculation formula (12) gives a power of 35.2%, which sacrifices 8% power while reducing the SS by 129. This indicates that, for a fixed number of cases, recruiting more controls may increase the power if the budget of a trial allows. This can be seen from the variance expression (1) since, when $m$ is fixed in equation (1), the variance decreases as $n$ increases. Thus, with the constraint of total 353 subjects and 135 cases, the original sampling ratio of 1.14 (135/118) gives the maximum power.

The characteristics of subjects are often matched in case–control studies to minimize the confounding effects. Janes and Pepe (2008) illustrated that a ROC summary estimate without adjusting for covariates may be biased.
If covariate information is available, matching should be considered for deriving the optimal sampling ratio. Future research on this topic is warranted.

Acknowledgements

The authors thank the Associate Editor, the Joint Editor and a referee for their constructive comments. The authors also thank their colleague Anand Vidyashankar for many useful suggestions that led to an improvement in this paper. The project described here was supported by award R15CA150698 from the National Cancer Institute under the American Recovery and Reinvestment Act of 2009 and by award H98230-11-1-0196 from the National Security Agency.

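The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Cancer Institute or the National Institutes of Health.

Appendix A: Variance expressions of receiver operating characteristic statistics, variance derivation and proof of proposition 1

A.1. Variance expressions for a parametric receiver operating characteristic statistic

When measurements of markers have bivariate normal distributions, Mazumdar and Liu (2003) provided expressions for the variances, $v_x$ and $v_y$. Suppose that $(X_{1i}, X_{2i}) \sim N\{(\mu_{1d}, \mu_{2d}), \Sigma_d\}$ for $i = 1, \ldots, m$ and $(Y_{1j}, Y_{2j}) \sim N\{(\mu_{1\bar d}, \mu_{2\bar d}), \Sigma_{\bar d}\}$ for $j = 1, \ldots, n$, where

\[
\Sigma_d = \begin{pmatrix} \sigma_{1d}^2 & \rho\sigma_{1d}\sigma_{2d} \\ \rho\sigma_{1d}\sigma_{2d} & \sigma_{2d}^2 \end{pmatrix}
\]

and

\[
\Sigma_{\bar d} = \begin{pmatrix} \sigma_{1\bar d}^2 & \rho\sigma_{1\bar d}\sigma_{2\bar d} \\ \rho\sigma_{1\bar d}\sigma_{2\bar d} & \sigma_{2\bar d}^2 \end{pmatrix}.
\]

The statistic considered in Mazumdar and Liu (2003) is the partial AUC estimator given by

\[
\hat\theta_p^{bn} = \int_{c_1}^{c_2} \Phi(\alpha_1 - \beta_1 x)\,\phi(x)\,dx - \int_{c_1}^{c_2} \Phi(\alpha_2 - \beta_2 x)\,\phi(x)\,dx,
\]

where $\alpha_1 = (\mu_{1d} - \mu_{1\bar d})/\sigma_{1d}$, $\beta_1 = \sigma_{1\bar d}/\sigma_{1d}$, $\alpha_2 = (\mu_{2d} - \mu_{2\bar d})/\sigma_{2d}$ and $\beta_2 = \sigma_{2\bar d}/\sigma_{2d}$. Let

\[
\Gamma_1(r, s; c_1, c_2) = \frac{1}{\sqrt{s^2+1}}\,\phi\!\left(\frac{r}{\sqrt{s^2+1}}\right) \Phi(x)\,\Big|_{c_1'}^{c_2'}
\]

and

\[
\Gamma_2(r, s; c_1, c_2) = -\frac{1}{s^2+1}\,\phi\!\left(\frac{r}{\sqrt{s^2+1}}\right) \left\{\phi(x) + \frac{rs}{\sqrt{s^2+1}}\,\Phi(x)\right\}\Big|_{c_1'}^{c_2'},
\]

with $c_1' = c_1\sqrt{s^2+1} + rs/\sqrt{s^2+1}$ and $c_2' = c_2\sqrt{s^2+1} + rs/\sqrt{s^2+1}$. Let

\[
f_{l1} = \frac{1}{\sigma_{ld}}\,\Gamma_1\!\left(\frac{\mu_{ld} - \mu_{l\bar d}}{\sigma_{ld}}, -\frac{\sigma_{l\bar d}}{\sigma_{ld}}; c_1, c_2\right)
\]

and $f_{l2} = -f_{l1}$, for $l = 1, 2$. In addition, let

\[
f_{l3} = \frac{1}{2\sigma_{ld}^3}\left\{(\mu_{l\bar d} - \mu_{ld})\,\Gamma_1\!\left(\frac{\mu_{ld} - \mu_{l\bar d}}{\sigma_{ld}}, -\frac{\sigma_{l\bar d}}{\sigma_{ld}}; c_1, c_2\right) + \sigma_{l\bar d}\,\Gamma_2\!\left(\frac{\mu_{ld} - \mu_{l\bar d}}{\sigma_{ld}}, -\frac{\sigma_{l\bar d}}{\sigma_{ld}}; c_1, c_2\right)\right\}
\]

and

\[
f_{l4} = -\frac{1}{2\sigma_{ld}\sigma_{l\bar d}}\,\Gamma_2\!\left(\frac{\mu_{ld} - \mu_{l\bar d}}{\sigma_{ld}}, -\frac{\sigma_{l\bar d}}{\sigma_{ld}}; c_1, c_2\right).
\]

The variances $v_x$ and $v_y$ for $\hat\theta_p^{bn}$ can be written as

\[
v_x = \sigma_{1d}^2\,(f_{11}^2 + 2f_{13}^2\sigma_{1d}^2) + \sigma_{2d}^2\,(f_{21}^2 + 2f_{23}^2\sigma_{2d}^2) - 2\rho\sigma_{1d}\sigma_{2d}\,(f_{11}f_{21} + f_{13}f_{23})
\]

and

\[
v_y = \sigma_{1\bar d}^2\,(f_{12}^2 + 2f_{14}^2\sigma_{1\bar d}^2) + \sigma_{2\bar d}^2\,(f_{22}^2 + 2f_{24}^2\sigma_{2\bar d}^2) - 2\rho\sigma_{1\bar d}\sigma_{2\bar d}\,(f_{12}f_{22} + f_{14}f_{24}).
\]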
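As a numerical check on the partial AUC expression above, the following sketch evaluates each integral of the form $\int_{c_1}^{c_2}\Phi(\alpha - \beta x)\,\phi(x)\,dx$ by Simpson's rule. The function names, the step count and the parameter values are illustrative assumptions; the paper does not prescribe a numerical method.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def binormal_pauc_term(alpha, beta, c1, c2, steps=2000):
    """Simpson's rule for the integral of Phi(alpha - beta*x) * phi(x) over [c1, c2]."""
    if steps % 2:  # Simpson's rule needs an even number of sub-intervals
        steps += 1
    h = (c2 - c1) / steps

    def integrand(x):
        return _STD_NORMAL.cdf(alpha - beta * x) * _STD_NORMAL.pdf(x)

    total = integrand(c1) + integrand(c2)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * integrand(c1 + i * h)
    return total * h / 3


def pauc_difference(marker1, marker2, c1, c2):
    """Difference of the two integrals; each marker is an (alpha, beta) pair."""
    a1, b1 = marker1
    a2, b2 = marker2
    return binormal_pauc_term(a1, b1, c1, c2) - binormal_pauc_term(a2, b2, c1, c2)


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only for illustration.
    print(pauc_difference((1.2, 1.0), (0.8, 1.0), -1.0, 1.0))
```

One way to sanity-check the integrator is to widen the limits to cover essentially the whole real line, where the integral has the closed form $\Phi\{\alpha/\sqrt{(1+\beta^2)}\}$.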

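A.2. Derivation of $v_x^A$ and $v_y^A$

We can show that

\[
\int_0^1\!\!\int_0^1 S_d\{S_{\bar d,1}^{-1}(s), S_{\bar d,2}^{-1}(t)\}\,ds\,dt
\]

can be expressed as

\[
\int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} S_d(y_1, y_2)\,dS_{\bar d,1}(y_1)\,dS_{\bar d,2}(y_2).
\]

Let $S_{\bar d,1}^{-1}(s) = y_1$ and $S_{\bar d,2}^{-1}(t) = y_2$; then we have

\[
\int_0^1\!\!\int_0^1 S_d\{S_{\bar d,1}^{-1}(s), S_{\bar d,2}^{-1}(t)\}\,ds\,dt = E\{I(X_{1i} > Y_{1j})\,I(X_{2i} > Y_{2l})\}.
\]

Similarly, $v_y$ becomes

\[
v_y = \sum_{l=1}^{2}\left[\int_0^1\!\!\int_0^1 R_l'(s)\,R_l'(t)\,(s \wedge t)\,ds\,dt - \left\{\int_0^1 R_l'(s)\,s\,ds\right\}^2\right] - 2\int_0^1\!\!\int_0^1 R_1'(s)\,R_2'(t)\,[S_{\bar d}\{S_{\bar d,1}^{-1}(s), S_{\bar d,2}^{-1}(t)\} - st]\,ds\,dt.
\]

It follows that

\[
\int_0^1\!\!\int_0^1 R_1'(s)\,R_2'(t)\,S_{\bar d}\{S_{\bar d,1}^{-1}(s), S_{\bar d,2}^{-1}(t)\}\,ds\,dt = \int_0^1\!\!\int_0^1 \frac{S_{d,1}'\{S_{\bar d,1}^{-1}(s)\}\,S_{d,2}'\{S_{\bar d,2}^{-1}(t)\}}{S_{\bar d,1}'\{S_{\bar d,1}^{-1}(s)\}\,S_{\bar d,2}'\{S_{\bar d,2}^{-1}(t)\}}\,S_{\bar d}\{S_{\bar d,1}^{-1}(s), S_{\bar d,2}^{-1}(t)\}\,ds\,dt.
\]

Let $S_{\bar d,1}^{-1}(s) = y_1$ and $S_{\bar d,2}^{-1}(t) = y_2$; then it follows that

\[
\begin{aligned}
\int\!\!\int S_{d,1}'(y_1)\,S_{d,2}'(y_2)\,S_{\bar d}(y_1, y_2)\,dy_1\,dy_2
&= E\{I(X_{1i} < Y_{1j})\,I(X_{2k} < Y_{2j})\} \\
&= E[\{1 - I(X_{1i} > Y_{1j})\}\{1 - I(X_{2k} > Y_{2j})\}] \\
&= 1 - E\{I(X_{1i} > Y_{1j})\} - E\{I(X_{2k} > Y_{2j})\} + E\{I(X_{1i} > Y_{1j})\,I(X_{2k} > Y_{2j})\}.
\end{aligned}
\]

Because

\[
\int_0^1\!\!\int_0^1 R_1'(s)\,R_2'(t)\,st\,ds\,dt
\]

can also be written as

\[
1 - \Pr(X_{1i} > Y_{1j}) - \Pr(X_{2k} > Y_{2j}) + E\{I(X_{1i} > Y_{1j})\}\,E\{I(X_{2k} > Y_{2j})\},
\]

the expressions for $v_x$ and $v_y$ are simplified to equations (6) and (7) respectively.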
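The indicator expectations used in this derivation have direct empirical analogues. The sketch below averages the indicators over all triples of cases $i$, $k$ and controls $j$, and can be used to check the expansion of $E[\{1 - I(X_{1i} > Y_{1j})\}\{1 - I(X_{2k} > Y_{2j})\}]$ numerically. The function name, variable names and simulated data are hypothetical and are not taken from the paper.

```python
import random


def indicator_averages(x1, x2, y1, y2):
    """Average indicator terms over cases i, k and controls j.

    x1[i], x2[i]: marker 1 and marker 2 results for case i.
    y1[j], y2[j]: marker 1 and marker 2 results for control j.
    Returns (p1, p2, p12, q12), the averages of
      I(x1[i] > y1[j])                               (empirical AUC of marker 1, ignoring ties),
      I(x2[k] > y2[j])                               (empirical AUC of marker 2, ignoring ties),
      I(x1[i] > y1[j]) * I(x2[k] > y2[j]),
      {1 - I(x1[i] > y1[j])} * {1 - I(x2[k] > y2[j])}.
    """
    m, n = len(x1), len(y1)
    gt1 = gt2 = both_gt = neither = 0
    for j in range(n):
        for i in range(m):
            a = x1[i] > y1[j]
            for k in range(m):
                b = x2[k] > y2[j]
                gt1 += a
                gt2 += b
                both_gt += a and b
                neither += (not a) and (not b)
    total = m * m * n
    return gt1 / total, gt2 / total, both_gt / total, neither / total


if __name__ == "__main__":
    rng = random.Random(1)
    # Assumed toy data: cases shifted upwards by 1 relative to controls.
    x1 = [rng.gauss(1.0, 1.0) for _ in range(20)]
    x2 = [rng.gauss(1.0, 1.0) for _ in range(20)]
    y1 = [rng.gauss(0.0, 1.0) for _ in range(25)]
    y2 = [rng.gauss(0.0, 1.0) for _ in range(25)]
    p1, p2, p12, q12 = indicator_averages(x1, x2, y1, y2)
    print(q12, 1 - p1 - p2 + p12)  # the two values agree up to rounding
```

The triple loop costs $O(m^2 n)$ operations, which is acceptable for a check on small samples but would need vectorizing for realistic trial sizes.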

A.3. Proof of theorem 1

Recall that $\bar w_m = \sum_{i=1}^{m} w_i/m$ and $\bar v_n = \sum_{j=1}^{n} v_j/n$ are the sample means at the end of the trial and $E(\bar w_m) = E(\bar v_n) = 0$, where $m$ and $n$ are the SSs at the end of the trial. The variance estimators at the end of the first stage are $\hat v_{x,1} = \sum_{i=1}^{m_1} (w_i - \bar w_{m_1})^2/m_1$ and $\hat v_{y,1} = \sum_{j=1}^{n_1} (v_j - \bar v_{n_1})^2/n_1$, where $m_1$ and $n_1$ are the SSs at the end of the first stage. Let $A_m = \bar w_m\sqrt{m}$ and $B_m = \hat v_{x,1}\sqrt{m}$. We shall show that

\[
X_m := \begin{pmatrix} A_m \\ B_m - \sqrt{m}\,\sigma_{v,x}^2 \end{pmatrix} \xrightarrow{d} N_2(0, \Sigma_w), \tag{13}
\]

where $\Sigma_w$ is a diagonal matrix. For this, using the Cramér–Wold device, consider $l^T X_m$, where $l = (l_1, l_2)^T$. Since $B_m$ can be expressed as

\[
B_m = \sqrt{m}\sum_{i=1}^{m_1} w_i^2/m_1 - \sqrt{m}\,\bar w_{m_1}^2,
\]

it follows that

\[
\begin{aligned}
l^T X_m &= \sum_{i=1}^{m} \frac{l_1 w_i}{\sqrt{m}} + \sum_{i=1}^{m_1} \frac{l_2 w_i^2}{m_1}\sqrt{m} - l_2\,\bar w_{m_1}^2\sqrt{m} - l_2\sqrt{m}\,\sigma_{v,x}^2 \\
&= \sqrt{\frac{m_1}{m}}\,\frac{1}{\sqrt{m_1}}\sum_{i=1}^{m_1}\left(l_1 w_i + l_2\frac{m}{m_1}w_i^2 - l_2\frac{m\,\sigma_{v,x}^2}{m_1}\right) + \sqrt{\frac{m - m_1}{m}}\,\frac{1}{\sqrt{m - m_1}}\sum_{i=m_1+1}^{m} l_1 w_i - l_2\,\bar w_{m_1}^2\sqrt{m} \\
&= T_m(1) + T_m(2) - T_m(3).
\end{aligned} \tag{14}
\]

Since the $w_i$s are bounded random variables, $T_m(1)$ and $T_m(2)$ have finite second moments. Also, $T_m(1)$ and $T_m(2)$ are independent since $T_m(1)$ is based on the random variables $\{w_i : i = 1, \ldots, m_1\}$ and $T_m(2)$ is based on the random variables $\{w_i : i = m_1 + 1, \ldots, m\}$. Hence, by the central limit theorem and Slutsky's theorem (Serfling, 1980), it follows that, as $m \to \infty$,

\[
T_m(1) + T_m(2) \xrightarrow{d} N[0, \operatorname{var}(\lambda_1 l_1 w_1 + l_2 w_1^2) + \operatorname{var}\{(1 - \lambda_1) l_1 w_1\}].
\]

Also, under hypothesis $H_0$, since $E(w_1^3) = 0$, the limiting variance can be shown to reduce to $l_1^2\sigma_1^2 + l_2^2\sigma_2^2$, where $\sigma_1^2 = \{(1 - \lambda_1)^2 + 1\}\operatorname{var}(w_1)$ and $\sigma_2^2 = \operatorname{var}(w_1^2)$. Now, returning to the last term on the right-hand side of equation (14), note that $T_m(3) = m^{-1/2}\,l_2\,(\bar w_{m_1}\sqrt{m})^2$ converges to 0 in probability, as $m \to \infty$, by the central limit theorem. This completes the proof of expression (13) and hence expression (10).

We now turn to the proof of expression (11). Under assumption 1, $R_l(u)$ is continuously differentiable, and it follows that

\[
v_j = \int_0^1 \left( R_1'(u)\,[I\{Y_{1j} \ge S_{\bar d,1}^{-1}(u)\} - u] - R_2'(u)\,[I\{Y_{2j} \ge S_{\bar d,2}^{-1}(u)\} - u] \right) dW(u).
\]

Now the proof can be completed along the lines of the proof of expression (10). This completes the proof of theorem 1.
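The step showing that $T_m(3)$ vanishes can be spelled out as follows. This is a sketch under an assumption that is not stated in this excerpt, namely that the stage I fraction converges, $m_1/m \to \lambda_1 \in (0, 1)$; it also uses $E(w_1) = 0$, which is implied by $E(\bar w_m) = 0$.

```latex
% Assumption (not stated in this excerpt): m_1/m -> \lambda_1 in (0,1) as m -> \infty.
\begin{align*}
T_m(3) &= l_2\,\bar w_{m_1}^2\sqrt{m}
        = m^{-1/2}\, l_2\, (\bar w_{m_1}\sqrt{m})^2, \\
\bar w_{m_1}\sqrt{m} &= \sqrt{\frac{m}{m_1}}\;\sqrt{m_1}\,\bar w_{m_1}
        \xrightarrow{d} N\{0, \operatorname{var}(w_1)/\lambda_1\}
        \quad\text{(central limit theorem and Slutsky's theorem)}, \\
(\bar w_{m_1}\sqrt{m})^2 &= O_p(1)
        \;\Longrightarrow\; T_m(3) = O_p(m^{-1/2}) \xrightarrow{p} 0 .
\end{align*}
```

In words, the squared term stays bounded in probability while the factor $m^{-1/2}$ shrinks, so the product is asymptotically negligible.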

References

DeLong, E. R., DeLong, D. M. and Clarke-Pearson, D. L. (1988) Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics, 44, 837–845.
Etzioni, R., Kooperberg, C., Pepe, M., Smith, R. and Gann, P. H. (2003) Combining biomarkers to detect disease with application to prostate cancer. Biostatistics, 4, 523–538.
Etzioni, R., Pepe, M., Longton, G., Hu, C. and Goodman, G. (1999) Incorporating the time dimension in receiver operating characteristic curves: a case study of prostate cancer. Med. Decsn Makng, 19, 242–251.
Goddard, M. J. and Hinberg, I. (1990) Receiver operator characteristic (ROC) curves and non-normal data: an empirical study. Statist. Med., 9, 325–337.
Gumbel, E. J. (1960) Bivariate exponential distributions. J. Am. Statist. Ass., 55, 698–707.
Hanley, J. A. and Hajian-Tilaki, K. O. (1997) Sampling variability of nonparametric estimates of the areas under receiver operating characteristic curves: an update. Acad. Radiol., 4, 49–58.
Hendrick, R. E., Cole, E. B., Pisano, E. D., Acharyya, S., Marques, H., Cohen, M. A., Jong, R. A., Mawdsley, G. E., Kanal, K. M., D'Orsi, C. J., Rebner, M. and Gatsonis, C. (2008) Accuracy of soft-copy digital mammography versus that of screen-film mammography according to digital manufacturer: ACRIN DMIST retrospective multireader study. Radiology, 247, 38–48.
Janes, H. and Pepe, M. (2006) The optimal ratio of cases to controls in a case-control for estimating the classification accuracy of a biomarker. Biostatistics, 7, 456–468.
Janes, H. and Pepe, M. S. (2008) Matching in studies of classification accuracy: implications for analysis, efficiency, and assessment of incremental value. Biometrics, 64, 1–9.
Jennison, C. and Turnbull, B. W. (2000) Group Sequential Methods with Applications to Clinical Trials. New York: Chapman and Hall.
Mazumdar, M. and Liu, A. (2003) Group sequential design for comparative diagnostic accuracy studies. Statist. Med., 22, 727–739.
Pepe, M. S., Etzioni, R., Feng, Z., Potter, J. D., Thompson, M. L., Thornquist, M., Winget, M. and Yasui, Y. (2001) Phases of biomarker development for early detection of cancer. J. Natn. Cancer Inst., 93, 1054–1061.
Proschan, M. (2005) Two-stage sample size re-estimation based on a nuisance parameter: a review. J. Biopharm. Statist., 15, 559–574.
Rosenberger, W. F. and Lachin, J. M. (2002) Randomization in Clinical Trials: Theory and Practice. New York: Wiley.
Serfling, R. J. (1980) Approximation Theorems of Mathematical Statistics. New York: Wiley.
Tang, L., Emerson, S. S. and Zhou, X. (2008) Nonparametric and semiparametric group sequential methods for comparing accuracy of diagnostic tests. Biometrics, 64, 1137–1145.
Wieand, S., Gail, M. H., James, B. R. and James, K. L. (1989) A family of non-parametric statistics for comparing diagnostic markers with paired or unpaired data. Biometrika, 76, 585–592.
Zhou, X., McClish, D. K. and Obuchowski, N. A. (2002) Statistical Methods in Diagnostic Medicine. New York: Wiley.
Zou, K., Tempany, C., Fielding, J. and Silverman, S. (1998) Original smooth receiver operating characteristic curve estimation from continuous data: statistical methods for analyzing the predictive value of spiral CT of ureteral stones. Acad. Radiol., 5, 680–687.
