This article was downloaded by: [University of Nebraska, Lincoln] On: 04 April 2015, At: 11:55 Publisher: Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

Journal of Biopharmaceutical Statistics Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/lbps20

Decision Rules for Subgroup Selection Based on a Predictive Biomarker

Johannes Krisam & Meinhard Kieser

Institute of Medical Biometry and Informatics, University of Heidelberg, Heidelberg, Germany

Published online: 06 Jan 2014.

To cite this article: Johannes Krisam & Meinhard Kieser (2014) Decision Rules for Subgroup Selection Based on a Predictive Biomarker, Journal of Biopharmaceutical Statistics, 24:1, 188-202, DOI: 10.1080/10543406.2013.856018

To link to this article: http://dx.doi.org/10.1080/10543406.2013.856018


Journal of Biopharmaceutical Statistics, 24: 188–202, 2014 Copyright © Taylor & Francis Group, LLC ISSN: 1054-3406 print/1520-5711 online DOI: 10.1080/10543406.2013.856018

DECISION RULES FOR SUBGROUP SELECTION BASED ON A PREDICTIVE BIOMARKER

Johannes Krisam and Meinhard Kieser

Institute of Medical Biometry and Informatics, University of Heidelberg, Heidelberg, Germany

When investigating a new therapy, it is often plausible that the treatment is more effective (or effective only) in a subgroup as compared to the total patient population. In this situation, the target population for the proof of efficacy is commonly selected in a data-dependent way, for example, based on the results of a pilot study or a planned interim analysis. The performance of the applied selection rule is crucial for the success of a clinical trial or even a drug development program. We consider the situation in which the selection of the patient population is based on a biomarker and where the diagnostic that evaluates the biomarker may be perfect, that is, with 100% sensitivity and specificity, or not. We develop methods that allow an evaluation of the operating characteristics of rules for selecting the target population, thus enabling the choice of an appropriate strategy. In particular, the proposed procedures can be used to calculate the sample size required to achieve a specified selection probability. Furthermore, we derive optimal selection rules by modeling the uncertainty about parameters by prior distributions. Throughout, the sensitivity and specificity of the biomarker have a strong impact on the results. It is therefore essential to evaluate the rules for patient selection before applying them, bearing in mind that the diagnostic that evaluates the applied biomarker may be imperfect.

Key Words: Adaptive designs; Sample size calculation; Subgroup selection.

Received December 7, 2012; Accepted September 26, 2013

Address correspondence to Meinhard Kieser, Institute of Medical Biometry and Informatics, University of Heidelberg, D-69120 Heidelberg, Germany; E-mail: [email protected]

1. INTRODUCTION

In recent years, there has been a rapid increase in interest in personalized therapies. Among other things, advances in the understanding of disease mechanisms have led to the conclusion that an apparently homogeneous patient population is in fact often heterogeneous, thus resulting in differences in therapy response. Furthermore, targeted therapies have been developed that are tailored to a specific mechanism of action and that may not work for all patients suffering from a disease. In both situations, it is plausible that subgroups of patients exist for which a new therapy may be especially effective. In the classical approach, the patients with a potentially enhanced therapeutic effect are identified in pilot Phase II studies and efficacy is then demonstrated in subsequent pivotal Phase III studies. As an alternative, adaptive seamless two-stage designs have been proposed for the combination of Phases II and III in a single study in the situation that the treatment effect may be especially pronounced for a subset of patients (see, e.g., Wang et al., 2007; Brannath et al., 2009; Wang et al., 2009; Hung et al., 2011; Jenkins et al., 2011; Friede et al., 2012). Here, the results of a planned interim analysis are used to select the subgroup and/or the total population for the proof of efficacy in the second stage of the trial.

Whatever approach is pursued, a false selection of the target population has extremely serious consequences, as it may lead to a failed study or even to an erroneous stop of drug development. It is therefore astonishing that, to our knowledge, no systematic evaluations of the characteristics of decision rules for the selection of target populations have been presented in the literature so far.

We consider the situation where the subgroup with a potentially increased treatment benefit is identified by a predictive biomarker. As Carroll (2007) pointed out, “there is a widespread belief that a biomarker selection strategy will result in smaller, more efficient and lower risk development.” As one of the crucial assumptions frequently made to support this opinion, he quoted the supposed perfectness of the diagnostic evaluating the biomarker, that is, sensitivity and specificity equal to 1. However, in practice this assumption usually does not hold, and we therefore consider both the situation of perfect and that of imperfect identification of the subgroup of interest.

The outline of this article is as follows. In section 2, the notation is introduced and some basic distributional results we use in the following are presented.
The characteristics of two classes of patient selection rules are investigated in section 3, namely, rules that depend on the difference between the treatment effects in the total population and the subgroup, and rules that depend on both the treatment effect in the total population and that in the subgroup. In particular, the question of choosing the sample size required to achieve a specified correct selection probability is addressed. We provide a recommendation for the minimal performance of the biomarker in terms of sensitivity and specificity required to ensure a prespecified correct selection probability. In section 4, optimal decision rules are derived for the case of uncertainty about the treatment effects. We conclude with a discussion in section 5.

2. NOTATION AND SOME BASIC RESULTS

We assume that we investigate a total population $G_0$ and a subgroup $G_1 \subset G_0$ that is to be identified by a predictive biomarker and whose complement is denoted by $G_2 = G_0 \setminus G_1$. Let $i = T, C$ denote the treatment and control group, $j = 1, 2$ the subpopulation $G_j$, and $X^{ij}$ the outcome in subpopulation $j = 1, 2$ and group $i$, with

$$X_k^{T1},\ k = 1, \ldots, n_1 \ \sim N(\mu_{T1}, 1) \ \text{i.i.d.}, \qquad X_k^{C1},\ k = 1, \ldots, n_1 \ \sim N(\mu_{C1}, 1) \ \text{i.i.d.},$$

$$X_l^{T2},\ l = 1, \ldots, n - n_1 \ \sim N(\mu_{T2}, 1) \ \text{i.i.d.}, \qquad X_l^{C2},\ l = 1, \ldots, n - n_1 \ \sim N(\mu_{C2}, 1) \ \text{i.i.d.}$$

Furthermore, let the variables $(X_m^{T})_{m = 1, \ldots, n}$ and $(X_m^{C})_{m = 1, \ldots, n}$ denote the outcome in the total population. For simplicity of notation and without loss of generality, we set the variance of the outcome equal to 1 and assume balanced allocation to the treatment and control group. The prevalence of the subgroup within the investigated patient population is denoted by $\pi$. Define $\Delta_1 = \mu_{T1} - \mu_{C1}$, $\Delta_2 = \mu_{T2} - \mu_{C2}$, and accordingly $\Delta_0 = \pi \Delta_1 + (1 - \pi) \Delta_2$. In the case of a perfect biomarker, that is, sensitivity sens and specificity spec both equal to 1, the estimators

$$\hat\Delta_0 = \bar X^{T} - \bar X^{C}, \qquad \hat\Delta_1 = \bar X^{T1} - \bar X^{C1}$$

for $\Delta_0$ and $\Delta_1$ have the following bivariate normal distribution (see Appendix A for derivation):

$$\begin{pmatrix} \hat\Delta_0 \\ \hat\Delta_1 \end{pmatrix} \sim N\left( \begin{pmatrix} \Delta_0 \\ \Delta_1 \end{pmatrix}, \begin{pmatrix} 2/n & 2/n \\ 2/n & 2/(\pi n) \end{pmatrix} \right).$$

Thus, the correlation between $\hat\Delta_0$ and $\hat\Delta_1$ amounts to $\sqrt{\pi}$. Since we are particularly interested in the difference of the effect sizes between the subgroup and the total population, $\delta = \Delta_1 - \Delta_0$, we also consider the estimator $\hat\delta = \hat\Delta_1 - \hat\Delta_0$, for which

$$\hat\delta \sim N\left( \delta, \frac{2(1 - \pi)}{\pi n} \right)$$

holds true. If the biomarker is imperfect, that is, sens or spec is not equal to 1, in general some patients belonging to $G_1$ or $G_2$ are assigned to the “false” subgroup. The proportion of individuals classified as biomarker-positive then changes from $\pi$ to $\tilde\pi = \pi \cdot \text{sens} + (1 - \pi)(1 - \text{spec})$. Concerning the estimators of the treatment group differences, this leads to (see Appendix A for derivation)

$$\begin{pmatrix} \hat\Delta_0 \\ \hat\Delta_1 \end{pmatrix} \sim N\left( \begin{pmatrix} \Delta_0 \\ q \Delta_1 + (1 - q) \Delta_0 \end{pmatrix}, \begin{pmatrix} 2/n & 2/n \\ 2/n & 2/(\tilde\pi n) \end{pmatrix} \right)$$

with $q = (\text{sens} + \text{spec} - 1) \cdot \pi / \tilde\pi$. Furthermore, $\text{corr}(\hat\Delta_0, \hat\Delta_1) = \sqrt{\tilde\pi}$, and the estimator $\hat\delta$ is biased by the factor $q$:

$$\hat\delta \sim N\left( q \delta, \frac{2(1 - \tilde\pi)}{\tilde\pi n} \right).$$

Note that $q$ is smaller than 1 if and only if spec is smaller than 1. As a consequence, in the case of spec = 1, a sensitivity smaller than 1 does not result in a bias of the estimators $\hat\Delta_1$ and $\hat\delta$, but their variances increase.
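The quantities of this section are simple to evaluate numerically. The following Python sketch is an illustration only: the function names `positive_fraction`, `bias_factor`, and `delta_hat_distribution` are hypothetical and not taken from the article, and the code assumes unit outcome variance and balanced allocation as stated above. It computes the proportion classified as biomarker-positive, the factor $q$, and the mean and standard deviation of the estimated effect difference.

```python
from math import sqrt


def positive_fraction(prev, sens, spec):
    """Proportion classified biomarker-positive: prev*sens + (1 - prev)*(1 - spec)."""
    return prev * sens + (1.0 - prev) * (1.0 - spec)


def bias_factor(prev, sens, spec):
    """Factor q that attenuates the estimated difference in treatment effects."""
    return (sens + spec - 1.0) * prev / positive_fraction(prev, sens, spec)


def delta_hat_distribution(delta, prev, n, sens=1.0, spec=1.0):
    """Mean and standard deviation of the estimated difference (n patients per group)."""
    pos = positive_fraction(prev, sens, spec)
    q = bias_factor(prev, sens, spec)
    return q * delta, sqrt(2.0 * (1.0 - pos) / (pos * n))


# Illustrative scenario: true difference 0.3, prevalence 0.25, n = 100 per group.
for sens, spec in [(1.0, 1.0), (0.8, 1.0), (0.8, 0.8), (0.6, 0.6)]:
    mean, sd = delta_hat_distribution(0.3, 0.25, 100, sens, spec)
    q = bias_factor(0.25, sens, spec)
    print(f"sens={sens}, spec={spec}: q={q:.3f}, mean={mean:.3f}, sd={sd:.3f}")
```

With spec = 1 the factor stays at 1 for any sensitivity, while the standard deviation grows, which mirrors the remark at the end of this section.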


3. CHARACTERISTICS OF DECISION RULES


3.1. Decision Rules Depending on the Difference in Treatment Effects Between Total Population and Subgroup

We consider the situation in which we base the selection of the target population on the observed difference in treatment effects between the subgroup and the total population, that is, on $\hat\delta = \hat\Delta_1 - \hat\Delta_0$: If $\hat\delta$ exceeds a prespecified threshold value $c > 0$, the efficacy of the drug is subsequently investigated for the subgroup only by solely enrolling biomarker-positive patients in the second phase of the trial. If $\hat\delta$ is smaller than or equal to $c$, the total population is selected as the target population. In order to describe the characteristics of this decision rule, we have to define when a decision is correct. This can be achieved by introducing a relevance threshold $\tau$. If the true difference in effect size between the two populations, $\delta = \Delta_1 - \Delta_0$, exceeds $\tau$, the correct decision is to select the subgroup as target population; otherwise, selection of the total population is correct. Let $\mathcal{Y}_1 = \mathbb{R}$ be the set of possible realizations of our estimator $\hat\delta$. We now establish a set of decision rules $\mathcal{D}_1 = \{ d_c \mid c \in \mathbb{R} \}$ with $d_c: \mathcal{Y}_1 \to \{0, 1\}$, such that

$$d_c(\tilde y) = 1_{\{\tilde y > c\}}(\tilde y), \qquad 1_{\{y > b\}}(y) = \begin{cases} 1 & \text{if } y > b \\ 0 & \text{else,} \end{cases}$$

where the outcome “1” leads us to restrict further investigation to the subgroup only and “0” means that we consider the total population subsequently. For a given selection rule $d_c$, the probability for selecting the subgroup is then given by

$$P\left( d_c(\hat\delta) = 1 \right) = P(\hat\delta > c) = \Phi\left( q \left( \Delta_1 - \Delta_0 - c/q \right) \sqrt{\frac{n \tilde\pi}{2 (1 - \tilde\pi)}} \right),$$

where $\Phi$ denotes the distribution function of the standard normal distribution.

Figure 1 shows the probability for selecting the total population or the subgroup, respectively, depending on the difference $\delta = \Delta_1 - \Delta_0$ for the threshold $c = 0.1$ and the scenario that the prevalence of the subgroup patients amounts to $\pi = 0.25$ and the sample size per group in the total population is $n = 100$. We choose $\tau = 0.1$ as the relevance threshold. Values for sensitivity and specificity of 1.0, 0.8, and 0.6, indicating different levels of diagnostic accuracy of the biomarker, are considered. For specificity equal to 1, the probability for a correct selection is greater than 0.5 for all differences in effect size. As expected, for a difference in treatment effects larger than the relevance threshold of 0.1, the probability of correctly selecting the subgroup increases with increasing difference in treatment effects between the populations and uniformly decreases with decreasing sensitivity of the biomarker. If the specificity drops to 0.8, the difference in effect size between the subgroup and the total population has to be greater than 0.2, 0.233, and 0.3 for a sensitivity of 1.0, 0.8, and 0.6, respectively, to result in a correct selection of the subgroup with a probability that is higher than 0.5. Conversely, this means that for sample sizes similar to the considered one, a correct selection of the subgroup occurs with an appropriate probability only in the case of a substantial advantage in efficacy as compared to the total population. In the extreme case of a specificity of only 0.6, the probability for choosing the subgroup is smaller than that for choosing the total population even if the efficacy in the subgroup is much higher: For sensitivity values below or equal to 0.8, even an advantage in standardized treatment effect of 0.5 does not lead to a selection probability above 0.5.

Figure 1 Probability of selecting the subgroup or the total population, respectively, depending on the difference in treatment effect between subgroup and total population (threshold for decision rule $c = 0.1$, prevalence of patients in subgroup $\pi = 0.25$, sample size per group $n = 100$; sens and spec denote the sensitivity and specificity, respectively).

The considerable impact of sensitivity and specificity is also reflected in the sample size required to achieve a specified correct selection probability $\lambda > 0.5$. For $\delta > c/q > 0$, the required sample size per group amounts to

$$n = \frac{2 z_\lambda^2 (1 - \tilde\pi)}{(\delta - c/q)^2 \, q^2 \, \tilde\pi},$$

where $z_\lambda$ is the $\lambda$-quantile of the standard normal distribution. As an example, let us assume that the difference in effect size between the subgroup and the total population amounts to 0.3, the prevalence of the subgroup to 0.25, the decision threshold $c$ to 0.1, and the relevance threshold $\tau$ to 0.1. We would like to achieve a selection probability of at least 0.8 for selecting the subgroup when applying the decision rule considered earlier. For a perfect biomarker, the required sample size per group amounts to $n = 107$; if sensitivity and specificity equal 0.8, the sample size per group increases to $n = 3223$ patients. For the extreme situation of sensitivity and specificity equal to 0.6, however, it is not even possible to achieve a selection probability of 0.8, since $c/q = 0.9$ exceeds $\delta = 0.3$. Thus, the selection probability is always below 0.5 and decreases with increasing sample size.

In the following, we determine minimal values for sensitivity and specificity that are required to obtain a selection probability above 0.5 for given treatment effect difference $\delta$, subgroup prevalence $\pi$, and decision and relevance thresholds $c$ and $\tau$, respectively. In the case of $\delta > \tau$, that is, when selection of the subgroup is the correct decision, we achieve a selection probability higher than 0.5 if and only if $\delta > c/q$. For a biomarker with $\text{sens} = \kappa \cdot \text{spec}$, $\kappa > 0$, the above inequality results in the requirement

$$\text{spec} > \frac{\pi (\delta/c - 1) + 1}{\pi (\kappa + 1)(\delta/c - 1) + 1}, \qquad \text{sens} = \kappa \cdot \text{spec}.$$
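The formulas of this subsection can be checked with a few lines of Python. The sketch below is an illustration under the normal model of section 2, not the authors' implementation; the function names `prob_select_subgroup`, `required_n`, and `min_spec_for_half` are hypothetical, and only the standard library (`statistics.NormalDist`) is used. Returning `None` when $q\delta \le c$ is a design choice of this sketch to flag that no sample size can push the selection probability above 0.5.

```python
from math import ceil, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def accuracy_terms(prev, sens, spec):
    """Return (proportion classified positive, attenuation factor q)."""
    pos = prev * sens + (1 - prev) * (1 - spec)
    return pos, (sens + spec - 1) * prev / pos


def prob_select_subgroup(delta, c, prev, n, sens=1.0, spec=1.0):
    """P(estimated difference > c) for true difference delta, n per group."""
    pos, q = accuracy_terms(prev, sens, spec)
    return STD_NORMAL.cdf((q * delta - c) * sqrt(n * pos / (2 * (1 - pos))))


def required_n(delta, c, prev, target, sens=1.0, spec=1.0):
    """Smallest n per group reaching selection probability `target` (> 0.5)."""
    pos, q = accuracy_terms(prev, sens, spec)
    if q * delta <= c:
        return None  # selection probability cannot exceed 0.5 for any n
    z = STD_NORMAL.inv_cdf(target)
    # (delta - c/q)^2 * q^2 equals (q*delta - c)^2
    return ceil(2 * z**2 * (1 - pos) / ((q * delta - c) ** 2 * pos))


def min_spec_for_half(delta, c, prev, kappa=1.0):
    """Lower bound on spec (with sens = kappa*spec) for selection probability > 0.5."""
    r = delta / c - 1
    return (prev * r + 1) / (prev * (kappa + 1) * r + 1)


print(required_n(0.3, 0.1, 0.25, 0.8))            # perfect biomarker
print(required_n(0.3, 0.1, 0.25, 0.8, 0.8, 0.8))  # sens = spec = 0.8
print(required_n(0.3, 0.1, 0.25, 0.8, 0.6, 0.6))  # sens = spec = 0.6
print(min_spec_for_half(0.3, 0.1, 0.25), min_spec_for_half(0.3, 0.1, 0.75))
```

For the example scenario above, the first three calls give 107, 3223, and `None`, in line with the sample sizes quoted in the text.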

It should be noted that these minimal values for sens and spec do not depend on the number of enrolled patients per group $n$. For example, for $\kappa = 1$ the required minimal value for sensitivity and specificity amounts to 0.75 for a treatment effect difference of $\delta = 0.3$, $c = 0.1$, and a prevalence of $\pi = 0.25$, and to 0.625 for $\pi = 0.75$. By transformation of the preceding sample size formula, the minimal values for sensitivity and specificity required to achieve a given selection probability $\lambda > 0.5$ can be obtained, with $\text{sens} = \kappa \cdot \text{spec}$ and $z_\lambda$ the $\lambda$-quantile of the standard normal distribution:

$$\text{spec} = \Big[ \sqrt{2 n z_\lambda^2 \, \pi \kappa \delta \, \big( c - \pi (\kappa + 1) c + \pi \delta \big) + z_\lambda^4 (\pi \kappa - 1 + \pi)^2} + n \big( c (\pi - 1) - \pi \delta \big) \big( \pi (\kappa + 1)(c - \delta) - c \big) + (2 \pi - 1) z_\lambda^2 (\pi \kappa - 1 + \pi) \Big] \cdot \Big[ n \big( c - \pi (\kappa + 1)(c - \delta) \big)^2 + 2 z_\lambda^2 (\pi \kappa - 1 + \pi)^2 \Big]^{-1}.$$

Note that these values now depend on $n$. Let us again consider the case $\delta = 0.3$, $c = 0.1$, and $\pi = 0.25$ ($\pi = 0.75$). For a sample size of $n = 100$ enrolled patients per group and equal sensitivity and specificity (i.e., $\kappa = 1$), the minimal required value to achieve a correct selection probability of 0.7 amounts to 0.92 for $\pi = 0.25$ and 0.72 for $\pi = 0.75$.

3.2. Decision Rules Depending on the Treatment Effects in the Total Population and the Subgroup

Jenkins et al. (2011) considered a more complex decision rule where the set of possible actions is extended. Depending on the observed treatment effects in the total population and the subgroup, the subsequent evaluation of efficacy is performed in both populations, or in only the total population or the subgroup, respectively, or efficacy is not investigated further in either population due to futility. The related decision rule is based on the estimated treatment effects $\hat\Delta_0$ and $\hat\Delta_1$ in the total population or subgroup, respectively. Let $\mathcal{Y}_2 = \mathbb{R}^2$ be the set of possible realizations of $(\hat\Delta_0, \hat\Delta_1)$, and let $\mathcal{D}_2 = \{ d_{c_0, c_1} \mid (c_0, c_1) \in \mathbb{R}^2 \}$ denote the set of decision rules with $d_{c_0, c_1}: \mathcal{Y}_2 \to \{0, 1\}^2$, such that

$$d_{c_0, c_1}(\tilde y_0, \tilde y_1) = \left( 1_{\{\tilde y_0 > c_0\}}(\tilde y_0), \ 1_{\{\tilde y_1 > c_1\}}(\tilde y_1) \right).$$

The outcome “(0,0)” stands for stopping for futility, “(1,0)” implies that the total population is selected as target population, “(0,1)” means that the subgroup is selected, and “(1,1)” denotes the situation that efficacy is investigated subsequently for both the total population and the subgroup. The threshold values $c_0$ and $c_1$ are chosen to reflect the considerations on when an observed effect is promising enough to justify further investigation of a (sub)population. It should be noted that Jenkins et al. (2011) considered the situation of survival endpoints and expressed the thresholds in terms of the hazard ratio. They chose $c_0 = 1.25$ and $c_1 = 1.67$ but did not investigate the characteristics of this decision rule. In the following, we illustrate how our methods can be used in the planning phase to choose adequate thresholds and sample sizes. Since the distribution of $(\hat\Delta_0, \hat\Delta_1)$ is known, the respective decision probabilities

$$P\left( d_{c_0, c_1}(\hat\Delta_0, \hat\Delta_1) = (0, 0) \right) = P(\hat\Delta_0 \le c_0, \ \hat\Delta_1 \le c_1)$$

$$P\left( d_{c_0, c_1}(\hat\Delta_0, \hat\Delta_1) = (1, 0) \right) = P(\hat\Delta_0 > c_0, \ \hat\Delta_1 \le c_1)$$

$$P\left( d_{c_0, c_1}(\hat\Delta_0, \hat\Delta_1) = (0, 1) \right) = P(\hat\Delta_0 \le c_0, \ \hat\Delta_1 > c_1)$$

$$P\left( d_{c_0, c_1}(\hat\Delta_0, \hat\Delta_1) = (1, 1) \right) = P(\hat\Delta_0 > c_0, \ \hat\Delta_1 > c_1)$$

can be calculated for given values. As in the previous section, we define the correctness or incorrectness of a selection by introducing relevance thresholds: If the true treatment effect in the total population $\Delta_0$ lies above a prespecified constant $\tau_0$, it is correct to select the patients from $G_0$ for the second stage of the trial. Analogously, selection of the subgroup is correct if the treatment effect in the subgroup $\Delta_1$ exceeds a prespecified threshold $\tau_1$. As an example, we consider the situation of a treatment effect of $\Delta_0 = 0.2$ in the total population and of $\Delta_1 = 0.5$ in the subgroup and use the relevance thresholds $\tau_0 = 0.1$ and $\tau_1 = 0.3$. We apply the selection rule with threshold values for the estimated treatment effects of $c_0 = 0.1$ and $c_1 = 0.4$ for the total population and the subgroup, respectively; that is, the treatment effects for both populations lie above the threshold values. Figure 2 shows the probabilities for the occurrence of the four possible actions depending on the sample size per group.

Figure 2 Probability of selecting both the subgroup and the total population, the subgroup or the total population only, or stopping for futility, respectively, depending on sample size per group $n$ (treatment effects $\Delta_0 = 0.2$, $\Delta_1 = 0.5$, thresholds for decision rule $c_0 = 0.1$, $c_1 = 0.4$, prevalence of patients in subgroup $\pi = 0.25$; sens and spec denote the sensitivity and specificity, respectively).
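The four decision probabilities of this subsection can be computed from the bivariate normal distribution given in section 2. The Python sketch below is one possible way to do this with the standard library only; the function name `action_probabilities`, the dictionary keys, and the use of Simpson's rule are choices of this illustration, not the authors' method. It relies on the fact that, with covariance $2/n$ and $\mathrm{Var}(\hat\Delta_0) = 2/n$, the subgroup estimate given the total-population estimate is normal with regression slope 1, and it assumes that fewer than all patients are classified as biomarker-positive.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

PHI = NormalDist().cdf


def action_probabilities(delta0, delta1, c0, c1, prev, n,
                         sens=1.0, spec=1.0, steps=2000):
    """Probabilities of the four actions for thresholds c0 (total) and c1 (subgroup)."""
    pos = prev * sens + (1 - prev) * (1 - spec)   # proportion classified positive
    q = (sens + spec - 1) * prev / pos            # attenuation factor
    mu0, mu1 = delta0, q * delta1 + (1 - q) * delta0
    sd0, sd1 = sqrt(2 / n), sqrt(2 / (pos * n))
    sd_cond = sqrt(sd1**2 - sd0**2)               # requires pos < 1

    def integrand(t):
        # standard normal density of the standardized total-population estimate
        # times P(subgroup estimate > c1 | total-population estimate = mu0 + sd0*t)
        cond_mean = mu1 + sd0 * t
        return exp(-0.5 * t * t) / sqrt(2 * pi) * (1 - PHI((c1 - cond_mean) / sd_cond))

    lo, hi = max((c0 - mu0) / sd0, -8.0), 8.0
    both = 0.0
    if lo < hi:
        h = (hi - lo) / steps                     # Simpson's rule, steps must be even
        total = integrand(lo) + integrand(hi)
        for k in range(1, steps):
            total += (4 if k % 2 else 2) * integrand(lo + k * h)
        both = total * h / 3
    p_total = 1 - PHI((c0 - mu0) / sd0)           # P(total-population estimate > c0)
    p_sub = 1 - PHI((c1 - mu1) / sd1)             # P(subgroup estimate > c1)
    return {"both": both,
            "total_only": p_total - both,
            "subgroup_only": p_sub - both,
            "futility": 1 - p_total - p_sub + both}


for n in (25, 50, 100):
    probs = action_probabilities(0.2, 0.5, 0.1, 0.4, prev=0.25, n=n)
    print(n, {k: round(v, 3) for k, v in probs.items()})
```

Calling the function over a grid of sample sizes and accuracy values gives curves of the kind summarized in Figure 2.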


For a perfect biomarker, the probability of correctly selecting both the total population and the subgroup ranges between 0.4 and 0.6 for the considered sample sizes and increases with increasing sample size. The probability of choosing only the total population amounts to roughly 0.2 for all considered sample sizes up to 100 per group. The probability of stopping for futility is notable and decreases only slightly with increasing sample size. In particular, with a probability of about 0.2 the drug’s efficacy is investigated further neither in the total population nor in the subgroup even though remarkable treatment effects exist in both populations. For sensitivity and specificity equal to 0.8, the probability of considering only the total population further on is higher as compared to the situation of a perfect biomarker and increases with increasing sample size. In contrast, the probability of selecting both the total population and the subgroup is lower and decreases with increasing sample size. This effect becomes even more pronounced for the situation of sensitivity and specificity equal to 0.6. The probability of selecting only the total population is then much higher than the probability for any of the other three possible actions; it further increases with increasing sample size and amounts to about 0.6 for a sample size of 100 per group. Furthermore, the probability for a futility decision is also increased and is now higher than the probability for selecting both populations or the subgroup only.

4. OPTIMIZATION OF DECISION RULES

Up to now, we investigated the characteristics of selection rules for specified values of the treatment effects in the total population and in the subgroup or their difference, respectively. However, in practice there is usually some uncertainty about these quantities. The question arises of how the threshold values of decision rules can be selected in an optimal way in such a situation.
Generally, this task can be accomplished by quantifying the (un)certainty about the involved parameters by prior distributions, by choosing a loss function that specifies the penalty for false decisions, and by determining the decision rule such that the expected loss is minimized. More concretely, let $\mathcal{A}$ be the set of possible actions that may be taken when selecting the target population and let $\Theta$ be the set of prior random variables. We choose a loss function $L: \mathcal{A} \times \Theta \to \mathbb{R}$, where $L(a, \theta)$ measures the loss sustained if the action $a$ is taken under the prior random variable $\theta$. Let $p(\theta)$ denote the density function of $\theta$ and $\hat Y$ the random variable defining the decision. Let $p(\tilde y \mid \theta)$ denote the density function of $\hat Y$ given $\theta$ and $p(\tilde y, \theta) = p(\tilde y \mid \theta) \, p(\theta)$, $\mathcal{Y}$ the set of realizations of $\hat Y$, and $\mathcal{D} = \{ d \mid d: \mathcal{Y} \to \mathcal{A} \}$ the set of decision rules that lead to an action $a$ given a realization $\tilde y$. The so-called Bayes risk $r(d)$ is then the expected loss for a given decision rule $d$:

$$r(d) = E\left[ L(d(\tilde y), \theta) \right] = \int_{\Theta} \int_{\mathcal{Y}} L(d(\tilde y), \theta) \, p(\tilde y, \theta) \, d\tilde y \, d\theta.$$

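An optimal decision rule is one that minimizes the Bayes risk. As an example, we consider in the following the decision rule introduced in section 3.1. Here, the subgroup is selected if the observed treatment effect in the subgroup exceeds that in the total population by more than $c$; otherwise, the total population is selected. For the difference in treatment effects $\delta$, we assume a normal prior with expectation $m$ and variance $w$. Furthermore, we assume that it is desirable to select the subgroup only if the actual difference in effect size is greater than the prespecified relevance threshold $\tau$, and that the total population is preferred as target population otherwise. A frequently used function that quantifies the penalty for false decisions is the quadratic loss function

$$L: \{0, 1\} \times \mathbb{R} \to \mathbb{R}_+, \qquad L(a, \delta) = (\delta - \tau)^2 \left[ 1_{\{a = 0, \ \delta \in (\tau, \infty)\}}(a, \delta) + 1_{\{a = 1, \ \delta \in (-\infty, \tau]\}}(a, \delta) \right],$$

where

$$1_{\{a = i, \ \delta \in I\}}(a, \delta) = \begin{cases} 1 & \text{if } a = i, \ \delta \in I \\ 0 & \text{else,} \end{cases} \qquad a \in \{0, 1\}, \ I \subset \mathbb{R}.$$

It should be noted that our approach using a loss function also covers the alternative approach of using a utility function $U$ that rewards correct decisions in a quadratic order, since, by a symmetry argument, it yields the same optimal threshold $c^*$. The same is true for a combined loss-utility function $L - U$, which is minimized at $c^*$ as well, since both $L$ and $-U$ themselves are minimized at $c^*$. For the situation of a perfect biomarker, we have

$$p(\tilde y, \delta) = p(\tilde y \mid \delta) \cdot p(\delta) \propto \exp\left( -\frac{(\tilde y - \delta)^2}{2v} - \frac{(\delta - m)^2}{2w} \right),$$

where $v$ denotes the variance of $\hat\delta$. For a given decision rule $d_c$ we thus obtain the Bayes risk

$$\begin{aligned} r(d_c) &\propto \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} L(d_c(\tilde y), \delta) \, e^{-\frac{(\tilde y - \delta)^2}{2v} - \frac{(\delta - m)^2}{2w}} \, d\tilde y \, d\delta \\ &= \int_{-\infty}^{\tau} \int_{c}^{\infty} (\delta - \tau)^2 \, e^{-\frac{(\tilde y - \delta)^2}{2v} - \frac{(\delta - m)^2}{2w}} \, d\tilde y \, d\delta + \int_{\tau}^{\infty} \int_{-\infty}^{c} (\delta - \tau)^2 \, e^{-\frac{(\tilde y - \delta)^2}{2v} - \frac{(\delta - m)^2}{2w}} \, d\tilde y \, d\delta \\ &= \int_{-\infty}^{0} \int_{c - \tau}^{\infty} \delta^2 \, e^{-\frac{(\tilde y - \delta)^2}{2v} - \frac{(\delta - (m - \tau))^2}{2w}} \, d\tilde y \, d\delta + \int_{0}^{\infty} \int_{-\infty}^{c - \tau} \delta^2 \, e^{-\frac{(\tilde y - \delta)^2}{2v} - \frac{(\delta - (m - \tau))^2}{2w}} \, d\tilde y \, d\delta, \end{aligned}$$

which is minimized at

$$c^* = -\frac{(m - \tau) \, v}{w} + \tau = -\frac{2 (m - \tau)(1 - \pi)}{\pi n w} + \tau$$

(for derivation see Appendix B). As a consequence, $d_{c^*}$ is the optimal decision rule with respect to the Bayes risk. In the case of an imperfect biomarker, $\hat\delta$ is biased by the factor $q$ and has variance $\tilde v = 2 (1 - \tilde\pi) / (\tilde\pi n)$. The optimal threshold is then given by

$$c^* = -\frac{(m - \tau) \, \tilde v}{q w} + q \tau = -\frac{2 (m - \tau)(1 - \tilde\pi)}{q \, n \, w \, \tilde\pi} + q \tau$$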
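The closed-form threshold can be cross-checked against a direct numerical evaluation of the Bayes risk. The following Python sketch is an illustration under the assumptions of this section (normal prior, quadratic loss, threshold rule $d_c$); the function names `optimal_threshold` and `bayes_risk` are hypothetical, and the one-dimensional Simpson integration over $\delta$, with the inner integral over $\tilde y$ replaced by a normal tail probability, is a convenience of this sketch rather than the derivation in Appendix B.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

PHI = NormalDist().cdf


def accuracy_terms(prev, sens, spec):
    """Return (proportion classified positive, attenuation factor q)."""
    pos = prev * sens + (1 - prev) * (1 - spec)
    return pos, (sens + spec - 1) * prev / pos


def optimal_threshold(m, w, tau, prev, n, sens=1.0, spec=1.0):
    """Closed-form threshold c* for a normal prior N(m, w) and relevance threshold tau."""
    pos, q = accuracy_terms(prev, sens, spec)
    var = 2 * (1 - pos) / (pos * n)
    return -(m - tau) * var / (q * w) + q * tau


def bayes_risk(c, m, w, tau, prev, n, sens=1.0, spec=1.0, steps=2000):
    """Expected quadratic loss of the rule 'select subgroup if estimate > c'."""
    pos, q = accuracy_terms(prev, sens, spec)
    sd = sqrt(2 * (1 - pos) / (pos * n))
    sd_prior = sqrt(w)

    def integrand(delta):
        p_subgroup = 1 - PHI((c - q * delta) / sd)   # P(select subgroup | delta)
        p_wrong = p_subgroup if delta <= tau else 1 - p_subgroup
        prior = exp(-0.5 * ((delta - m) / sd_prior) ** 2) / (sd_prior * sqrt(2 * pi))
        return (delta - tau) ** 2 * p_wrong * prior

    lo, hi = m - 8 * sd_prior, m + 8 * sd_prior
    h = (hi - lo) / steps                            # Simpson's rule, steps must be even
    total = integrand(lo) + integrand(hi)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(lo + k * h)
    return total * h / 3


c_star = optimal_threshold(m=0.1, w=0.04, tau=0.05, prev=0.25, n=100)
for c in (c_star - 0.05, c_star, c_star + 0.05):
    print(f"c = {c:+.3f}: risk = {bayes_risk(c, 0.1, 0.04, 0.05, 0.25, 100):.6f}")
```

In this illustrative setting the risk evaluated at the closed-form threshold should be smaller than at the two neighboring thresholds, which is a quick plausibility check of the formula.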

(See Appendix B for details of the derivation.) Overall, we remark that an optimal decision rule depends heavily on various parameters such as sample size, specificity, sensitivity, the aim of selection represented by the relevance threshold $\tau$, and the prior distribution representing the degree of uncertainty. As an example, we consider the situation of a normal prior for the difference in effects with expectation 0.1 and variance 0.04. It is assumed that selection of the subgroup is desirable if the treatment effect in the subgroup is higher by 0.05 than in the total population. The chosen prior distribution indicates that we are optimistic that the advantage of the treatment effect in the subgroup as compared to the total population is actually greater than this value. Figure 3 shows the optimal decision threshold depending on the sample size per group for the same combinations of sensitivity and specificity as considered previously. It can be seen that for spec = 1 the optimal threshold approaches the value of $\tau$ as the sample size increases. However, for smaller sample sizes the optimal threshold lies considerably below $\tau$. As sensitivity and specificity decrease, the optimal threshold $c^*$ decreases and moves away from the value of $\tau$. In fact, the optimal threshold approaches $q\tau$ for increasing sample size, which is lower than $\tau$ whenever specificity is lower than 1. Whether $c^*$ approaches the constant $q\tau$ from above or from below depends solely on the sign of the quantity $m - \tau$, which makes the threshold quite sensitive to these two prespecified parameters. The situation can be described as follows: In the case $m > \tau$, we are in an optimistic setting; that is, we expect the treatment effect difference to be larger than the threshold $\tau$. This leads to a low threshold $c^*$. In contrast, the pessimistic setting $m < \tau$ leads to a higher threshold. It should also be noted that $c^*$ approaches $q\tau$ as the subgroup prevalence $\pi$ increases. Therefore, a relatively large subgroup yields a rather cautious decision threshold $c^*$, which is close to the constant $q\tau$.

Figure 3 Optimal decision threshold depending on sample size per group $n$ (normal prior with mean $m = 0.1$ and variance $w = 0.04$ for difference in treatment effect between subgroup and total population, prevalence of patients in subgroup $\pi = 0.25$, and relevance threshold $\tau = 0.05$; sens and spec denote the sensitivity and specificity, respectively).

5. DISCUSSION

Selecting an appropriate patient population for the proof of efficacy of a new therapy plays a crucial role for the success of clinical trials and drug development programs. In this article, we considered the common situation that the decision of investigating efficacy in the complete patient population and/or a subgroup is


made in a data-dependent way, for example, based on a pilot study or on data of a planned interim analysis. We derived methods that allow an evaluation of the performance of rules for selecting the target population in the planning phase, thus enabling the choice of an appropriate strategy. In particular, the sample size or the minimal values for sensitivity and specificity of the biomarker, respectively, required to achieve a specified selection probability can be calculated. Although we restricted our considerations to two specific classes of decision rules, the methods can also be used to deal with other types of selection strategies. The classes of decision rules we considered in section 3.1 are variants of the set of $\varepsilon$-selection rules proposed by Friede et al. (2012) for adaptive subgroup selection based on interim test statistics. These rules depending on the treatment effect difference between subgroup and total population are implemented in the ADDPLAN PE module for subgroup selection in adaptive patient enrichment designs (Aptiv Solutions, 2013). It should be mentioned that application of a rule that is based on the difference in effects, such as those presented in section 3.1, may not be satisfactory in the case where the treatment under investigation shows clinically relevant treatment effects both in the total population and in the subgroup. If such a scenario is anticipated, more complex nonbinary decision rules based on the absolute treatment effect estimates, as provided in section 3.2, are preferable in order to avoid unfavorable interim decisions. In practice, there is usually some uncertainty about the value of parameters that influence the characteristics of selection rules, for example, the actual difference in treatment effects between the total population and the subgroup. In this situation, decision-theoretic methods can be used to derive selection rules that are optimal in some sense. We considered optimal decision rules minimizing the Bayes risk.
As an example, we derived optimal rules for quadratic loss functions and the situation that selection of the target population is based on the difference in treatment effects between the total population and a subgroup and that a normal prior is assumed for this parameter. Of course, other loss functions and other prior distributions may be used as well. For our example, we also applied a uniform prior, which led to very similar results. Furthermore, optimal decision rules can also be derived for more complex selection strategies by pursuing our approach. However, the involved computations are then even more demanding and it may no longer be possible to obtain analytical solutions. A reviewer pointed out that “the problem of an imperfect bioassay may be relevant if the subgroup is defined by a strong biological hypothesis that leads to a clear biological definition of a hypothetical ‘true’ subgroup for which better bioassays can and will be developed in the future.” Such a situation is quite common in practice. For example, it was estimated in a report prepared by a panel of the American Society of Clinical Oncology/College of American Pathologists (ASCO/CAP) that 20% of testing for overexpression of the human epidermal growth factor receptor 2 (HER2) might be incorrect (Wolff et al., 2007). Furthermore, other previous investigations also showed quite high false positive and false negative rates (Perez et al., 2006; Reddy et al., 2006). In contrast, for a recent HER2 testing method, values for sensitivity and specificity of 98.7% and 99.3%, respectively, were demonstrated (Dekker et al., 2012). We agree with the reviewer that the later use of a more precise bioassay may bear difficulties if the later assay deviates substantially from the assay used in the trial. It may then be difficult to


generalize the (higher) efficacy found in the biomarker-positive subgroup of the trial to the “true” subgroup and hence to the new biomarker-positive group.

In all our investigations, a strong impact of the sensitivity and specificity of the biomarker was observed. This holds true for the performance characteristics of the decision rules, for the required sample sizes, and for the optimal strategies derived under uncertainty about parameter values. When planning a data-dependent selection of the target population, it is therefore important to bear in mind that the diagnostic used to evaluate the biomarker may be imperfect and to take the consequences into consideration. Although there are increasing efforts to determine the accuracy of biomarkers (see, e.g., Dekker et al., 2012), in general only rough estimates of sensitivity and specificity may be available in the planning stage of a clinical trial. One could then consider a plausible range of values in order to define an appropriate decision rule.

Overall, it is essential to thoroughly evaluate the rules for patient selection in the planning phase of a clinical trial before applying them. Considerations such as those presented in this article may help to define an appropriate decision rule that actually shows the desired performance.
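To make the idea of scanning a plausible range of sensitivity and specificity concrete, the following Python sketch evaluates the quantities from Appendices A and B on a small grid. It is only an illustration under assumptions made here, not a reproduction of the article's numerical examples: the function names are hypothetical, the example values (prevalence 0.5, 100 patients per group, standard deviation 1, prior mean 0.2, prior variance 0.1) are arbitrary, and the rule is taken to be "select the subgroup if the estimated effect difference exceeds c".

```python
import math


def normal_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def imperfect_biomarker(prevalence, sens, spec, n, sigma):
    """Return (pi_tilde, q, v_tilde) following Appendix A.

    pi_tilde: proportion classified as biomarker-positive
    q:        attenuation factor of the effect difference
    v_tilde:  variance of the estimated effect difference
    Assumes sens + spec > 1, so that q is positive.
    """
    pi_tilde = prevalence * sens + (1.0 - prevalence) * (1.0 - spec)
    q = (sens + spec - 1.0) * prevalence / pi_tilde
    v_tilde = 2.0 * sigma**2 * (1.0 - pi_tilde) / (pi_tilde * n)
    return pi_tilde, q, v_tilde


def selection_probability(effect_diff, c, prevalence, sens, spec, n, sigma):
    """P(estimated difference > c) for a true difference effect_diff.

    Assumption: the subgroup is selected if the estimate exceeds c.
    """
    _, q, v_tilde = imperfect_biomarker(prevalence, sens, spec, n, sigma)
    return 1.0 - normal_cdf((c - q * effect_diff) / math.sqrt(v_tilde))


def optimal_threshold(prior_mean, prior_var, prevalence, sens, spec, n, sigma):
    """Threshold c* = -m * v_tilde / (q * w) from Appendix B (tau = 0)."""
    _, q, v_tilde = imperfect_biomarker(prevalence, sens, spec, n, sigma)
    return -prior_mean * v_tilde / (q * prior_var)


if __name__ == "__main__":
    # Illustrative grid of (sensitivity, specificity) pairs.
    for sens, spec in [(1.0, 1.0), (0.9, 0.9), (0.8, 0.9), (0.7, 0.8)]:
        prob = selection_probability(0.3, 0.1, 0.5, sens, spec, 100, 1.0)
        c_star = optimal_threshold(0.2, 0.1, 0.5, sens, spec, 100, 1.0)
        print(f"sens={sens:.2f} spec={spec:.2f} "
              f"P(select)={prob:.3f} c*={c_star:.4f}")
```

With sensitivity and specificity both equal to 1, the functions reduce to the perfect-biomarker case, which gives a simple sanity check before exploring less accurate assays.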

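APPENDIX A: DISTRIBUTION OF TREATMENT EFFECT ESTIMATORS

Perfect Biomarker

In the case of a perfect biomarker we have

$$
\operatorname{Var}(\hat\Delta_1) = \frac{2\sigma^2}{\pi n}, \qquad \operatorname{Var}(\hat\Delta_0) = \frac{2\sigma^2}{n},
$$

$$
\begin{aligned}
\operatorname{Cov}(\hat\Delta_0, \hat\Delta_1)
&= \operatorname{Cov}\left(\bar X^{T} - \bar X^{C},\; \bar X^{T1} - \bar X^{C1}\right) \\
&= \operatorname{Cov}\left(\bar X^{T}, \bar X^{T1}\right) + \operatorname{Cov}\left(\bar X^{C}, \bar X^{C1}\right) \\
&= \frac{1}{n n_1}\left[\operatorname{Cov}\left(\sum_{i=1}^{n} X_i^{T}, \sum_{j=1}^{n_1} X_j^{T1}\right) + \operatorname{Cov}\left(\sum_{i=1}^{n} X_i^{C}, \sum_{j=1}^{n_1} X_j^{C1}\right)\right] \\
&= \frac{1}{n n_1}\left[\operatorname{Cov}\left(\sum_{j=1}^{n_1} X_j^{T1}, \sum_{j=1}^{n_1} X_j^{T1}\right) + \operatorname{Cov}\left(\sum_{j=1}^{n_1} X_j^{C1}, \sum_{j=1}^{n_1} X_j^{C1}\right)\right] \\
&= \pi\left[\operatorname{Var}\left(\bar X^{T1}\right) + \operatorname{Var}\left(\bar X^{C1}\right)\right] \\
&= \frac{\sigma^2}{n} + \frac{\sigma^2}{n} = \frac{2\sigma^2}{n},
\end{aligned}
$$

i.e., $\operatorname{corr}(\hat\Delta_0, \hat\Delta_1) = \sqrt{\pi}$.

Since $(\hat\Delta_0, \hat\Delta_1)$ is bivariate normally distributed, it follows that

$$
\hat\Delta_1 - \hat\Delta_0 \sim N\left(\Delta_1 - \Delta_0,\; \operatorname{Var}(\hat\Delta_1) + \operatorname{Var}(\hat\Delta_0) - 2\operatorname{Cov}(\hat\Delta_0, \hat\Delta_1)\right) = N\left(\Delta_1 - \Delta_0,\; \frac{2\sigma^2(1-\pi)}{\pi n}\right).
$$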
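The correlation and the variance of the difference stated for the perfect biomarker can be checked by simulation. The following Python sketch is only an illustration under assumptions made here: 40 patients per group, a prevalence of 0.25, a standard deviation of 1, arbitrary treatment effects of 0.5 and 0.1 in biomarker-positive and biomarker-negative patients, and the convention that the first n1 observations in each arm are the biomarker-positive ones. The function names are hypothetical.

```python
import math
import random
import statistics


def simulate_estimators(n, prevalence, delta_pos, delta_neg, sigma, reps, seed=2014):
    """Simulate the effect estimates in the total population and in the subgroup."""
    rng = random.Random(seed)
    n1 = round(prevalence * n)  # biomarker-positive patients per arm
    total_effects, subgroup_effects = [], []
    for _ in range(reps):
        # The first n1 patients of each arm are taken to be biomarker-positive.
        treat = [rng.gauss(delta_pos if j < n1 else delta_neg, sigma) for j in range(n)]
        ctrl = [rng.gauss(0.0, sigma) for _ in range(n)]
        total_effects.append(statistics.fmean(treat) - statistics.fmean(ctrl))
        subgroup_effects.append(statistics.fmean(treat[:n1]) - statistics.fmean(ctrl[:n1]))
    return total_effects, subgroup_effects


def correlation(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


N, PREVALENCE, SIGMA = 40, 0.25, 1.0
total, sub = simulate_estimators(N, PREVALENCE, 0.5, 0.1, SIGMA, reps=4000)
differences = [s - t for s, t in zip(sub, total)]

print("empirical corr:", correlation(total, sub),
      "theory:", math.sqrt(PREVALENCE))
print("empirical Var(diff):", statistics.variance(differences),
      "theory:", 2 * SIGMA**2 * (1 - PREVALENCE) / (PREVALENCE * N))
```

Up to Monte Carlo error, the empirical correlation should be close to the square root of the prevalence and the empirical variance of the difference close to the closed-form value.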

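Imperfect Biomarker

For given sensitivity and specificity, sens and spec, $\tilde n_1 = \left(\pi\,\mathrm{sens} + (1-\pi)(1-\mathrm{spec})\right) n$ individuals are classified as biomarker-positive in the treatment and the control group, respectively. In both the treatment group and the control group there are $\pi\,\mathrm{sens}\,n$ true-positive and $(1-\pi)(1-\mathrm{spec})\,n$ false-positive patients, and therefore

$$
\pi_+ = \frac{\pi\,\mathrm{sens}}{\pi\,\mathrm{sens} + (1-\pi)(1-\mathrm{spec})}
$$

is the positive predictive value. With $\tilde\pi = \tilde n_1/n$ and $q = (\mathrm{sens} + \mathrm{spec} - 1)\cdot\pi/\tilde\pi$, the expectation of the estimator $\hat{\tilde\Delta}_1$ is given by

$$
\begin{aligned}
\pi_+ \cdot \Delta_1 + (1-\pi_+)\cdot \Delta_2
&= \frac{\pi\,\mathrm{sens}\,\Delta_1 + (1-\mathrm{spec})(\Delta_0 - \pi\Delta_1)}{\tilde\pi} \\
&= \frac{\pi(\mathrm{sens} + \mathrm{spec} - 1)}{\tilde\pi}\cdot\Delta_1 + \frac{\tilde\pi - \pi(\mathrm{sens} + \mathrm{spec} - 1)}{\tilde\pi}\cdot\Delta_0 \\
&= q\cdot\Delta_1 + (1-q)\cdot\Delta_0,
\end{aligned}
$$

and its variance is given by $2\sigma^2/(\tilde\pi n)$. The covariance between the two estimators is calculated as done earlier and is again equal to $2\sigma^2/n$. Hence, the correlation is given by $\sqrt{\tilde\pi}$. Similarly to the case of a perfect biomarker, we get

$$
\hat{\tilde\Delta}_1 - \hat\Delta_0 \sim N\left(q(\Delta_1 - \Delta_0),\; \frac{2\sigma^2(1-\tilde\pi)}{\tilde\pi n}\right).
$$

APPENDIX B: DERIVATION OF OPTIMAL THRESHOLD

Perfect Biomarker

The Bayes risk is given by

$$
\begin{aligned}
r(d_c) &\propto \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} L(d_c(\tilde y), \theta)\, e^{-\frac{(\tilde y-\theta)^2}{2v} - \frac{(\theta-m)^2}{2w}}\, d\tilde y\, d\theta \\
&= \int_{-\infty}^{\tau}\int_{c}^{\infty} (\theta-\tau)^2\, e^{-\frac{(\tilde y-\theta)^2}{2v} - \frac{(\theta-m)^2}{2w}}\, d\tilde y\, d\theta
 + \int_{\tau}^{\infty}\int_{-\infty}^{c} (\theta-\tau)^2\, e^{-\frac{(\tilde y-\theta)^2}{2v} - \frac{(\theta-m)^2}{2w}}\, d\tilde y\, d\theta \\
&= \int_{-\infty}^{0}\int_{c-\tau}^{\infty} \theta^2\, e^{-\frac{(\tilde y-\theta)^2}{2v} - \frac{(\theta-(m-\tau))^2}{2w}}\, d\tilde y\, d\theta
 + \int_{0}^{\infty}\int_{-\infty}^{c-\tau} \theta^2\, e^{-\frac{(\tilde y-\theta)^2}{2v} - \frac{(\theta-(m-\tau))^2}{2w}}\, d\tilde y\, d\theta .
\end{aligned}
$$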
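The Bayes risk integral can also be evaluated numerically, which gives an independent check on the optimal threshold derived in the remainder of this appendix. The Python sketch below is an illustration under assumptions made here: the relevance threshold is set to zero, the loss is the squared true effect difference (called `theta` in the code) whenever the decision is wrong, the inner integral over the observed difference is replaced by a normal distribution function, and the outer integral is approximated with the trapezoidal rule. The function names and the values m = 0.2, v = 0.05, w = 0.1 are hypothetical.

```python
import math


def normal_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bayes_risk(c, m, v, w, steps=2000):
    """Bayes risk for threshold c, up to a constant factor (tau = 0).

    m, w: mean and variance of the normal prior for theta
    v:    variance of the observed difference given theta
    """
    lo, hi = m - 8.0 * math.sqrt(w), m + 8.0 * math.sqrt(w)
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        theta = lo + k * h
        # Probability that the observed difference exceeds c, given theta.
        p_select = 1.0 - normal_cdf((c - theta) / math.sqrt(v))
        # Selecting is wrong if theta < 0; not selecting is wrong otherwise.
        p_wrong = p_select if theta < 0.0 else 1.0 - p_select
        prior = math.exp(-(theta - m) ** 2 / (2.0 * w))
        weight = 0.5 if k in (0, steps) else 1.0  # trapezoidal end points
        total += weight * theta**2 * p_wrong * prior
    return total * h


def grid_minimizer(m, v, w, c_min=-0.5, c_max=0.5, n_grid=101):
    """Return the grid point with the smallest approximate Bayes risk."""
    grid = [c_min + i * (c_max - c_min) / (n_grid - 1) for i in range(n_grid)]
    return min(grid, key=lambda c: bayes_risk(c, m, v, w))


M, V, W = 0.2, 0.05, 0.1
c_hat = grid_minimizer(M, V, W)
print("grid minimizer:", c_hat, "closed form -m*v/w:", -M * V / W)
```

Under these assumptions the grid minimizer should lie within about one grid step of the closed-form value, which is a useful plausibility check when other loss functions or priors are substituted and no analytical solution is available.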

Without loss of generality we assume that $\tau = 0$. Since we aim at minimizing the Bayes risk, we calculate its partial derivative with respect to $c$ by using the Fundamental Theorem of Calculus and obtain

$$
\begin{aligned}
\frac{\partial}{\partial c}\, r(d_c)
&\propto -\int_{-\infty}^{0} \theta^2\, e^{-\frac{(c-\theta)^2}{2v} - \frac{(\theta-m)^2}{2w}}\, d\theta
 + \int_{0}^{\infty} \theta^2\, e^{-\frac{(c-\theta)^2}{2v} - \frac{(\theta-m)^2}{2w}}\, d\theta \\
&= \frac{1}{(v+w)^3}\, e^{-\frac{m^2 v + c^2 w}{2vw}} \cdot \Bigg[ 2vw(v+w)(mv+cw) \\
&\qquad + 2\, e^{\frac{(mv+cw)^2}{2vw(v+w)}} \sqrt{2vw(v+w)}\, \left((mv+cw)^2 + v^2 w + v w^2\right) I\!\left(\frac{mv+cw}{\sqrt{2vw(v+w)}}\right) \Bigg],
\end{aligned}
$$

which is equal to zero at $c^* = -mv/w$. The expression “I” denotes the Gaussian integral $I(z) = \int_0^z e^{-t^2}\, dt$. Since $mv + cw$ and $I\left((mv+cw)/\sqrt{2vw(v+w)}\right)$ are the only expressions in the preceding term that may not be positive, and since both of them are positive if and only if $c > -mv/w$ and negative if and only if $c < -mv/w$, the same property holds true for the derivative of $r(d_c)$. Therefore, $c^*$ is the single critical point of $r(d_c)$. Furthermore, $c^*$ is a minimum point since

$$
\frac{\partial^2}{\partial c^2}\, r(d_c)\,(c^*) \propto \frac{4\, e^{-\frac{m^2(v+w)}{2w^2}}\, v w^2}{(v+w)^2} > 0 .
$$

As a consequence, $c^*$ is the global minimum point of the Bayes risk.

Imperfect Biomarker

We again assume without loss of generality $\tau = 0$. Then

$$
p(\tilde y, \theta) \propto e^{-\frac{(\tilde y - q\theta)^2}{2\tilde v} - \frac{(\theta-m)^2}{2w}} = e^{-\frac{(\tilde y/q - \theta)^2}{2 q^{-2}\tilde v} - \frac{(\theta-m)^2}{2w}},
$$

and substituting $z = \tilde y/q$ in the integration yields $c^* = -m\tilde v/(qw)$ as optimal decision threshold.

ACKNOWLEDGMENTS

We thank the guest editors and the referees for their careful review and helpful suggestions.

FUNDING

This work was supported by the program “Mathematics for Innovations in Industry and Services” of the German Federal Ministry of Education and Research (BMBF) under grant 05M13VHC.

REFERENCES

Addplan, Inc., an Aptiv Solutions Company. (2013). ADDPLAN PE Version 6.0 User Manual, Rev. 3. Cologne, Germany: Aptiv Solutions.

Brannath, W., Zuber, E., Branson, M., Bretz, F., Gallo, P., Posch, M., Racine-Poon, A. (2009). Confirmatory adaptive designs with Bayesian decision tools for a targeted therapy in oncology. Statistics in Medicine 28:1445–1463.

Carroll, K. J. (2007). Biomarkers in drug development: Friend or foe? A personal reflection gained working within oncology. Pharmaceutical Statistics 6:253–260.

Dekker, T. J., Borg, S. T., Hooijer, G. K., Meijer, S. L., Wesseling, J., Boers, J. E., Schuuring, E., Bart, J., van Gorp, J., Mesker, W. E., Kroep, J. R., Smit, V. T., van de Vijver, M. J. (2012). Determining sensitivity and specificity of HER2 testing in breast cancer using a tissue micro-array approach. Breast Cancer Research 14:R93.

Friede, T., Parsons, N., Stallard, N. (2012). A conditional error function approach for subgroup selection in adaptive clinical trials. Statistics in Medicine 31:4309–4320.

Hung, H. M. J., Wang, S. J., O’Neill, R. (2011). Flexible design clinical trial methodology in regulatory applications. Statistics in Medicine 30:1519–1527.

Jenkins, M., Stone, A., Jennison, C. (2011). An adaptive seamless phase II/III design for oncology trials with subpopulation selection using correlated survival endpoints. Pharmaceutical Statistics 10:347–356.

Perez, E. A., Suman, V. J., Davidson, N. E., Martino, S., Kaufman, P. A., Lingle, W. L., Flynn, P. J., Ingle, J. N., Visscher, D., Jenkins, R. B. (2006). HER2 testing by local, central, and reference laboratories in specimens from the North Central Cancer Treatment Group N9831 Intergroup Adjuvant Trial. Journal of Clinical Oncology 24:3032–3038.

Reddy, J. C., Reimann, J. D., Anderson, S. M., Klein, P. M. (2006). Concordance between central and local laboratory HER2 testing from a community-based clinical study. Clinical Breast Cancer 7:153–157.

Wang, S. J., O’Neill, R. T., Hung, H. M. J. (2007). Approaches to evaluation of treatment effect in randomized clinical trials with genomic subset. Pharmaceutical Statistics 6:227–244.

Wang, S. J., Hung, H. M. J., O’Neill, R. T. (2009). Adaptive patient enrichment designs in therapeutic trials. Biometrical Journal 51:358–374.

Wolff, A. C., Hammond, M. E., Schwartz, J. N., Hagerty, K. L., Allred, D. C., Cote, R. J., Dowsett, M., Fitzgibbons, P. L., Hanna, W. M., Langer, A., McShane, L. M., Paik, S., Pegram, M. D., Perez, E. A., Press, M. F., Rhodes, A., Sturgeon, C., Taube, S. E., Tubbs, R., Vance, G. H., Van de Vijver, M. J., Wheeler, T. M., Hayes, D. F. (2007). American Society of Clinical Oncology/College of American Pathologists guideline recommendations for human epidermal growth factor receptor 2 testing in breast cancer. Journal of Clinical Oncology 25:118–145.
