ORIGINAL ARTICLE

An Efficient Estimator for the Expected Value of Sample Information

Nicolas A. Menzies, PhD

Background. Conventional estimators for the expected value of sample information (EVSI) are computationally expensive or limited to specific analytic scenarios. I describe a novel approach that allows efficient EVSI computation for a wide range of study designs and is applicable to models of arbitrary complexity.

Methods. The posterior parameter distribution produced by a hypothetical study is estimated by reweighting existing draws from the prior distribution. EVSI can then be estimated using a conventional probabilistic sensitivity analysis, with no further model evaluations and with a simple sequence of calculations (Algorithm 1). A refinement to this approach (Algorithm 2) uses smoothing techniques to improve accuracy. Algorithm performance was compared with the conventional EVSI estimator (2-level Monte Carlo integration) and an alternative developed by Brennan and Kharroubi (BK), in a cost-effectiveness case study.

Results. Compared with the conventional estimator, Algorithm 2 exhibited a root mean square error (RMSE) 8%–17% lower, with far fewer model evaluations (3–4 orders of magnitude). Algorithm 1 produced results similar to those of the conventional estimator when study evidence was weak but overestimated EVSI when study evidence was strong. Compared with the BK estimator, the proposed algorithms reduced RMSE by 18%–38% in most analytic scenarios, with 40 times fewer model evaluations. Algorithm 1 performed poorly in the context of strong study evidence. All methods were sensitive to the number of samples in the outer loop of the simulation.

Conclusions. The proposed algorithms remove two major challenges for estimating EVSI: the difficulty of estimating the posterior parameter distribution given hypothetical study data and the need for many model evaluations to obtain stable and unbiased results. These approaches make EVSI estimation feasible for a wide range of analytic scenarios.

Key words: value of information; EVSI; research design; decision theory. (Med Decis Making XXXX;XX:XXX–XXX)


Received 16 September 2014 from the Department of Global Health and Population and the Center for Health Decision Science, Harvard University, Boston, MA (NAM). Financial support for this study was provided in part by a grant from the HIV Modeling Consortium. The funding agreement ensured the author's independence in designing the study, interpreting the data, and writing and publishing the report. Revision accepted for publication 9 March 2015. Supplementary material for this article is available on the Medical Decision Making Web site at http://mdm.sagepub.com/supplemental.

Address correspondence to Nicolas A. Menzies, Department of Global Health and Population, Harvard T. H. Chan School of Public Health, 665 Huntington Avenue, Boston, MA 02115; telephone: (617) 432-0492; e-mail: [email protected].

© The Author(s) 2015. DOI: 10.1177/0272989X15583495


A common approach to estimating the potential value of new research draws on statistical decision theory and holds that the expected value of information (VOI) is equal to the expected improvement in social welfare (or other valued goal) generated by the information.1 By collecting more information before a decision is made, a decision maker can reduce the probability that a suboptimal policy will be chosen and thereby can improve outcomes on expectation. Realistically, additional research will only dispel some of the uncertainty attached to a particular decision, typically providing probabilistic information on a subset of the factors influencing the decision. The expected value of sample information (EVSI) places a monetary value on the expected improvement in outcomes that would be produced by this information.

Published approaches for computing EVSI place great burdens on an analysis. Ades and colleagues2 describe approaches for calculating EVSI in various situations, yet except for a small number of specific cases (characterized by linear relationships between parameters and net benefits and by independence of parameter subsets), these approaches require 2 nested Monte Carlo integration steps to estimate results, with the innermost loop requiring the ability to simulate from the posterior parameter distribution given study data. Drawing samples from the posterior distribution can be straightforward if this distribution is available in closed form, and this can be achieved by careful selection of conjugate prior-likelihood combinations.3,4 However, this approach will be inappropriate for many analyses. Sampling from the posterior distribution can also be achieved using Markov chain Monte Carlo (MCMC) simulation,5 yet this approach adds substantially to computation time.

The need for 2-level Monte Carlo integration poses additional challenges, especially for complex models. Monte Carlo integration generally requires large sample sizes to obtain accurate results, and the need for large sample sizes at both levels of the analysis requires the model to be evaluated a great number of times: the total number of model evaluations is the product of the number of draws in each integration step. This number becomes larger still if MCMC methods are required to obtain samples from the posterior distribution.

Brennan, Kharroubi, and colleagues5,6 have described alternative methods that replace the second Monte Carlo integration step with Laplace approximation. This method requires information on the mode of the posterior distribution as well as partial derivatives at a number of locations. If the posterior is not available in closed form, numerical optimization methods are required to estimate the mode and partial derivatives. This reduces the benefits of the method, as the number of model evaluations rises substantially. In addition, this method requires parameter distributions to be smooth, differentiable, unimodal functions. While these conditions will be met in the majority of analyses, they exclude discrete parametric distributions and nonparametric distributions based on empirical data,6 as well as analyses featuring numerically calibrated models.

Strong and others7 take a different approach to avoiding the second Monte Carlo integration step, simulating hypothetical data from the parameter sets produced by the first Monte Carlo step (as with the 2-step method) and then regressing the net benefit estimated for each parameter set against a summary statistic calculated from the simulated data, using a nonparametric approach. While the appropriate choice of summary statistic may not be obvious in more complicated examples, this approach can be implemented without further model evaluation apart from that required for the outer Monte Carlo loop, and using readily available regression software.

Other recent work has focused on solutions for particular types of EVSI problem. For example, Ades and others8 proposed a method for multiarm cluster-randomized trials with binary outcomes, and Brennan and Kharroubi9 demonstrated the utility of their method for time-to-event data modeled with a Weibull distribution.

Given the computational challenges of estimating EVSI, it is not surprising that few applied analyses have been reported. A 2013 review of the VOI literature found that half of all papers were methodological in scope and that real-life applications of EVSI remained scarce.10

The algorithms described in the following section represent a novel approach to obtaining EVSI estimates with low computational burden. These approaches can be implemented with models of arbitrary complexity and are compatible with any combination of prior and likelihood. Similar to the approach of Brennan and Kharroubi,9 the approach used here achieves efficiency gains by replacing the inner Monte Carlo integration step. To do so, the proposed approach draws on information already available from the parameter sets run as part of the outer Monte Carlo integration step, and no further model evaluations are required. For this reason, these algorithms can be implemented using the results of a conventional probabilistic sensitivity analysis. The performance of the proposed algorithms is compared with both the conventional estimator and the approach of Brennan and Kharroubi, in the context of a case study cost-effectiveness analysis.

METHODS

Framework for Estimating Value of Information

The notation used by Ades and others2 is adopted to provide a mathematical description of EVSI. Net benefit is denoted B(t, θ), a function of the policy (t) and other factors (θ) that affect the policy outcome.

Decision making with no new evidence. Based on current information, a rational decision maker will choose the policy (t*) that maximizes B(t, θ) on expectation:

$$t^* = \arg\max_t \, E_\theta B(t, \theta). \quad (1)$$

If B_0 is used to denote the net benefit obtained by applying this decision rule, the expected value of B_0 is given by

$$E B_0 = \max_t \, E_\theta B(t, \theta). \quad (2)$$
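In a probabilistic sensitivity analysis (PSA), these two quantities reduce to column means of the simulated net benefits. A minimal sketch in R, where the matrix `B` is a hypothetical PSA output (one row per parameter draw, one column per policy) rather than an object defined in the paper:

```r
# B: N x T matrix of net benefits from an existing PSA (hypothetical input).
EB_by_policy <- colMeans(B)          # expectation over parameter uncertainty
t_star <- which.max(EB_by_policy)    # Equation 1: maximize the expectation
EB_0   <- max(EB_by_policy)          # Equation 2 (note: max of means, not mean of maxima)
```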

Decision making with study evidence. If a policy is selected after observing the results of new research, this may change the relative ranking of policy options. The distribution of study data (D) will be determined by θ as well as the study design used to collect data. The relationship between any new study and the model parameters may be complex; however, all that is assumed here is that D can be represented by a likelihood for the parameters or some function of the parameters. In the simplest situations, new study data might only provide information about a single study parameter. At the other extreme, D could update the joint distribution of all model parameters.* For full generality, parameters affected by D are not distinguished from parameters unaffected by D, although this distinction will apply in many specific applications.

*This might occur if D represents a likelihood not around specific parameters but rather for a modeled outcome that might be influenced by all parameters, as when a model is calibrated to observed data. This situation could also arise if the joint prior distribution for θ exhibits no statistically independent subsets, such that better information on one parameter indirectly affects the probability distribution of other parameters.

The realization of values for D can be thought of as involving 2 steps: first a value for θ is drawn from the prior distribution p(θ), and then a value for D is drawn from the predictive distribution p(D|θ) (note that θ and D may be scalar, vector, or otherwise). This is equivalent to a draw from p(D), because p(D|θ)p(θ) is equal to p(D, θ). If D is observed before a decision is made, the optimal policy chosen in light of these new data (t*(D)) can be calculated as

$$t^*(D) = \arg\max_t \, E_{\theta|D} B(t, \theta). \quad (3)$$

For a given value of D, the expected net benefit produced by the optimal policy is given by

$$\max_t \, E_{\theta|D} B(t, \theta). \quad (4)$$

If B_S is used to denote the net benefit obtained by applying this decision rule, the expected value of B_S can be calculated by taking the expectation of B_S over the distribution of D:

$$E B_S = E_D \max_t \, E_{\theta|D} B(t, \theta). \quad (5)$$

The outer expectation of Equation 5 (E_D) can also be understood as taking the expectation over the predictive distribution p(D|θ) and then taking the expectation over the prior distribution p(θ) (i.e., E_θ E_{D|θ}), and this 2-step calculation is used in the novel approaches for computing EB_S described in the next section.

Expected value of sample information. Once EB_S has been estimated, EVSI can be calculated directly by comparing EB_S to EB_0, the expected net benefit with no new information:

$$\text{EVSI} = E B_S - E B_0. \quad (6)$$

Novel Approach to Calculating EVSI

The central difficulty with computing EVSI comes with the calculation of the innermost expectation of EB_S (E_{θ|D} B(t, θ) in Equation 5). It is this value that is being estimated by the inner Monte Carlo integration step described earlier and that Brennan and Kharroubi estimate using Laplace approximation. The proposed approach relies on the fact that many samples from the prior distribution p(θ) are already needed in order to implement the outermost expectation of Equation 5, and that a numerical approximation of the posterior distribution p(θ|D) can be obtained by reweighting these parameter sets according to the data likelihood p(D|θ), using Bayes theorem:

$$p(\theta|D) = \frac{p(\theta)\, p(D|\theta)}{p(D)}. \quad (7)$$

A numerical approximation of the posterior distribution of B(t, θ) given D can be obtained by applying the same reweighting approach to the values of B(t, θ) calculated for each parameter set. If N samples are drawn from the prior distribution p(θ) and B(t, θ) is calculated for each of these samples, the expectation E_{θ|D} B(t, θ) can be estimated directly:

$$E_{\theta|D} B(t, \theta) = E\!\left[\, B(t, \theta)\, \frac{p(D|\theta)}{p(D)} \right] \approx \sum_{n=1}^{N} B(t, \theta_n)\, w_n(D), \quad (8)$$

where the weights w_n(D) are proportional to the likelihood p(D|θ):

$$w_n(D) = \frac{p(D|\theta_n)}{\sum_{n'=1}^{N} p(D|\theta_{n'})}. \quad (9)$$

As p(D) is not estimated directly, the weights are obtained by dividing p(D|θ) by a normalization constant chosen so that the weights sum to 1, ensuring that p(θ|D) is a proper density. This normalization constant, ∑_{n=1}^{N} p(D|θ_n), represents the Monte Carlo estimator for p(D). If E_{θ|D} B(t, θ) is estimated by resampling from the N parameter samples using w_n(D) as sampling weights, this procedure conforms exactly to the sampling importance resampling (SIR) approach described by Rubin11 for estimating posterior distributions.†

†This resampling approach may reduce computational burden if the number of parameter sets is very large and many of the weights w are close to zero. However, this approach will also introduce a small amount of Monte Carlo error, with the magnitude of this error depending on the size of the resample.

The 2 algorithms described below operationalize this approach. The first algorithm is a direct application of Equations 7–9 and can be implemented with minimal programming difficulty once the relationship between model parameters and study data has been operationalized (i.e., a predictive distribution for the study data given θ and the likelihood function produced by study data). The second algorithm extends this approach using smoothing techniques to obtain a better approximation of B(t, θ) and therefore a better estimate of EVSI.

Algorithm 1. This algorithm proceeds by evaluating a sample of draws from the prior parameter distribution, as with a conventional probabilistic sensitivity analysis.

A. Draw a large sample of parameter sets from the prior distribution p(θ) and evaluate the model to estimate net benefit B(t, θ_n) for each parameter set (θ_n for n ∈ [1, N]) and each policy t.

B. Estimate EB_S:

B1. For a single parameter set θ_n, draw a sample (D_n) from the predictive distribution of research results p(D|θ_n) and identify the optimal policy t*(D_n):

$$t^*(D_n) = \arg\max_t \frac{\sum_{k \in K} B(t, \theta_k)\, p(D_n|\theta_k)}{\sum_{k \in K} p(D_n|\theta_k)} = \arg\max_t \sum_{k \in K} w_k(D_n)\, B(t, \theta_k), \quad (10)$$

where K represents the set of parameter sets.‡

‡Note that the set K is equivalent to N but is distinguished in Equation 10, as n is held fixed while k is the index over which the summation is applied.

B2. Estimate B(t*(D_n), θ_n), the net benefit that would be obtained conditional on D_n and θ_n being realized. This is equivalent to the net benefit for policy t*(D_n) and parameter set θ_n, already calculated as part of step A.

B3. Repeat steps B1 and B2 for all n, and then average the results across all n:

$$\widehat{EB}_S = \frac{\sum_{n=1}^{N} B(t^*(D_n), \theta_n)}{N}. \quad (11)$$

C. Estimate EB_0 using conventional methods:

$$\widehat{EB}_0 = \max_t \frac{\sum_{n=1}^{N} B(t, \theta_n)}{N}. \quad (12)$$

D. Finally, estimate EVSI using the results from Equations 11 and 12:

$$\widehat{\text{EVSI}} = \widehat{EB}_S - \widehat{EB}_0. \quad (13)$$
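As a concrete illustration, these steps can be coded in a few lines of R. The sketch below uses a deliberately simple toy problem (a single Normal parameter, two policies with net benefits linear in that parameter, and an assumed Normal study likelihood); none of these inputs come from the paper's case study. The double use of the prior sample makes the loop O(N²), which is tolerable here because no model evaluations occur inside it.

```r
## Minimal sketch of Algorithm 1 on a hypothetical toy problem.
set.seed(1)
N        <- 10000
theta    <- rnorm(N, mean = 0.7, sd = 0.1)     # step A: prior draws
B        <- cbind(T0 = 50000 * theta,          # step A: net benefit per draw,
                  T1 = 54000 * theta - 2000)   #   for 2 hypothetical policies
sd_study <- 0.05                               # assumed study standard error

D    <- rnorm(N, mean = theta, sd = sd_study)  # step B1: predictive draws of study data
EB_S <- numeric(N)
for (n in 1:N) {
  lik <- dnorm(D[n], mean = theta, sd = sd_study)  # likelihood of D[n] for each draw
  w   <- lik / sum(lik)                            # Equation 9: normalized weights
  t_n <- which.max(colSums(w * B))                 # Equation 10: optimal policy given D[n]
  EB_S[n] <- B[n, t_n]                             # step B2: value already computed in step A
}
EB_S_hat <- mean(EB_S)                             # Equation 11
EB_0_hat <- max(colMeans(B))                       # Equation 12
EVSI_hat <- EB_S_hat - EB_0_hat                    # Equation 13
```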

As shown in the examples provided in the next section, Algorithm 1 can provide a reasonable approximation for EVSI in some analytic scenarios. However, this approach will likely perform worse as the strength of the evidence provided by the proposed study design increases. This problem relates to the use of weighted draws from the prior distribution to approximate the posterior distribution. Although the initial sample of parameter sets might be large, the weighting applied to these parameter sets decreases the precision of inferences made about the posterior distribution. This reduced precision can be quantified by computing the effective sample size (ESS), using w from Equation 9§:

$$\mathrm{ESS} = \frac{1}{\sum_i w_i^2}. \quad (14)$$

§This formula for the effective sample size derives from the more general formula $\mathrm{ESS} = \left(\sum_i w_i\right)^2 / \sum_i w_i^2$ described by Kish22 for weighted probability samples. In the present case the weights sum to 1, so the numerator $\left(\sum_i w_i\right)^2$ is also equal to 1.

As the evidence supplied by a study design becomes stronger (as will be achieved by increasing the sample size or reducing measurement error), the resulting likelihood becomes narrowly concentrated in a small region of parameter space, with the consequence that many weights will be close to zero and the effective sample size smaller. For example, if 1000 samples are drawn from a parameter distributed with a Beta(1,1) distribution (i.e., uniform over the unit interval) and a binomial likelihood is assumed for study evidence, a study with a sample size of 200 would reduce the ESS of the posterior sample to around 100, a tenth of the original sample size.

In Equation 10, as study evidence becomes stronger and ESS declines, w_n(D_n) will increase toward 1 and all other weights will decline. Consequently, the maximand in Equation 10, $\sum_{k \in K} w_k(D_n) B(t, \theta_k)$, will converge to B(t, θ_n). This is essentially an overfitting problem and produces an upward bias in EB_S-hat and therefore an upward bias in the EVSI estimate. This issue is equivalent to having an inadequate number of samples in the inner loop of the conventional 2-level Monte Carlo estimator, as explored by Oakley and others12 in the context of estimating EVPPI. In the limit, as all weights apart from w_n asymptote to zero, EB_S-hat asymptotes to $\sum_{n=1}^{N} \max_t B(t, \theta_n)/N$, and the EVSI estimate becomes equivalent to the EVPI estimator.‖

‖An option for avoiding this problem would be to exclude θ_n from the maximand in Equation 10, essentially creating a jackknife estimator for t*(D_n). While this approach would resolve the overfitting problem, it would also reduce the precision with which t*(D_n) is estimated. Simulation experiments showed this jackknife approach to perform worse (i.e., produce an EVSI estimator with greater RMSE) than the version shown in Equation 10.
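Before trusting Algorithm 1's output for a given design, the ESS in Equation 14 offers a cheap diagnostic; a sketch, reusing the normalized weight vector `w` computed inside the Algorithm 1 loop above for one simulated dataset D[n]:

```r
# Effective sample size of the reweighted prior sample (Equation 14).
# The weights already sum to 1, so Kish's numerator equals 1.
ess <- 1 / sum(w^2)
```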

Algorithm 2. To respond to the issues that might be caused by low ESS, Algorithm 2 uses smoothing techniques to obtain a better estimate of E_{θ|D} B(t, θ). This approach is motivated by the fact that for most decision models, the outcome of interest will change smoothly as a function of the parameters. If true, more precise estimates of E_{θ|D} B(t, θ) can be obtained by modeling the relationship between net benefits and the parameters informed by new research. Such meta-modeling techniques have been used previously for other VOI research, with a number of authors using Gaussian processes to emulate complex simulation models13,14 and recent methodological papers using meta-modeling to compute the value of partial perfect information.15,16 Strong and others7 also use spline regression via generalized additive models as part of their EVSI estimation approach. Algorithm 2 follows the same major steps as Algorithm 1.

A. Draw a large sample of parameter sets (θ_n for n ∈ [1, N]) from the prior parameter distribution p(θ) and evaluate the model to estimate net benefit B(t, θ_n) for each parameter set n and each policy t.

B. Estimate EB_S:

B1. For each strategy, fit a smooth relationship (B̃(t, γ)) predicting expected net benefit as a function of γ, the subset of parameters informed by new research.

B2. For a single parameter set θ_n, draw a sample (D_n) from the predictive distribution of research results p(D|θ_n) and identify the optimal policy t*(D_n). The formula for identifying t*(D_n) is the same as Equation 10 except that B(t, θ_k) has been replaced by B̃(t, γ_k), the prediction equation estimated in step B1:

$$t^*(D_n) = \arg\max_t \frac{\sum_{k \in K} \tilde{B}(t, \gamma_k)\, p(D_n|\theta_k)}{\sum_{k \in K} p(D_n|\theta_k)} = \arg\max_t \sum_{k \in K} w_k(D_n)\, \tilde{B}(t, \gamma_k), \quad (15)$$

where K represents the set of parameter sets.

B3. Estimate B̃(t*(D_n), γ_n), the net benefit that would be obtained by strategy t*(D_n) conditional on θ_n being realized, as predicted by the prediction equation estimated in step B1.

B4. Repeat steps B2 and B3 for each n in N, and then average the results across N:

$$\widehat{EB}_S = \frac{\sum_{n=1}^{N} \tilde{B}(t^*(D_n), \gamma_n)}{N}. \quad (16)$$

C. Estimate EB_0. To do so, the prediction equation estimated in step B1 is used to predict net benefits for each strategy and parameter set. EB_0-hat is calculated as the highest predicted net benefit produced by any of the strategies, when averaged over all parameter sets.** This matches the result that would be obtained for EB_S-hat if the likelihood was uninformative and thus all weights in Equation 15 were equal to 1/N:

$$\widehat{EB}_0 = \max_t \frac{\sum_{n=1}^{N} \tilde{B}(t, \gamma_n)}{N}. \quad (17)$$

**For step C, it is also possible to use the conventional estimator for EB_0-hat shown in Equation 12. However, by using the results of the prediction equation for both EB_S-hat and EB_0-hat, a portion of any systematic errors associated with the prediction equation will be netted out when EB_0-hat is subtracted from EB_S-hat in step D, producing a lower variance estimator for EVSI.

D. Finally, estimate EVSI:

$$\widehat{\text{EVSI}} = \widehat{EB}_S - \widehat{EB}_0.$$
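Continuing the toy example from the Algorithm 1 sketch, the only new ingredient is the smooth prediction step B1. The sketch below uses spline regression via the mgcv package, one of the options discussed below (the paper's case study used Gaussian process regression instead); all variable names carry over from the earlier sketch and remain hypothetical.

```r
## Minimal sketch of Algorithm 2, reusing theta, B, D, sd_study, and N from above.
library(mgcv)

# Step B1: fit a smooth prediction of net benefit for each strategy as a
# function of the parameter(s) the proposed study would inform (here, theta).
y0 <- B[, "T0"]; y1 <- B[, "T1"]
B_tilde <- cbind(T0 = fitted(gam(y0 ~ s(theta))),
                 T1 = fitted(gam(y1 ~ s(theta))))

EB_S <- numeric(N)
for (n in 1:N) {
  lik <- dnorm(D[n], mean = theta, sd = sd_study)  # same weights as Algorithm 1
  w   <- lik / sum(lik)
  t_n <- which.max(colSums(w * B_tilde))           # Equation 15: smoothed maximand
  EB_S[n] <- B_tilde[n, t_n]                       # step B3: predicted net benefit
}
EVSI_hat <- mean(EB_S) - max(colMeans(B_tilde))    # Equations 16 and 17, then step D
```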

This approach can be understood as a generalization of the approach described by Strong and others15 for estimating EVPPI. As study evidence becomes increasingly strong (in Equation 15, as w_n(D_n) asymptotes to 1 and all other weights asymptote to zero), the EVSI estimator produced by Algorithm 2 becomes equivalent to the EVPPI estimator described by Strong and others. This is a desirable property, since EVSI should approach EVPPI as the strength of study evidence becomes overwhelming. This property (asymptoting to EVPPI) can be contrasted with the issue with Algorithm 1, which asymptotes to EVPI as study evidence becomes overwhelming. The overfitting problem in Algorithm 1 is resolved by using B̃(t, γ_k) instead of B(t, θ_k) in Equation 15, where B̃(t, γ_k) provides an unbiased approximation of the expected net benefit for each strategy for given values of the parameters in γ.

Options for estimating the relationship B̃(t, γ) include regression splines, locally weighted polynomial (LOESS) regression, and Gaussian process regression. The reduction in estimation error achieved with Algorithm 2 will depend on the accuracy of the smoothing approach.†† The analyses described below were undertaken using the R programming language, in which Gaussian process regression can be implemented using the kernlab package.17 Of the possible smoothing approaches, Gaussian process regression was chosen as it is well suited to problems involving multiple parameters, including the case study described below.18 However, Gaussian process regression can become slow as the number of observations (in this case, parameter sets) becomes large. Spline regression may be preferred for EVSI analyses where only individual parameters will be directly informed by the proposed study design(s), as it will perform well with large numbers of observations and is supported by many software packages. In R, spline regression can be accomplished via the mgcv package.19 Further details on the implementation of the smoothing approach are provided in the online technical appendix.

††Any reasonable smoothing approach should produce an improvement over Algorithm 1. However, a smoother that overfits the training data will produce an upward bias in the EVSI estimate. At the extreme (if the smoother is flexible enough to fit all the training data), the EVSI estimate will be equal to the EVPI estimate. A smoother that is insufficiently flexible will lead to an underestimate of EVSI.
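Where several parameters are informed jointly, the smoothing step can be carried out with Gaussian process regression, as in the case study. A sketch of the substitution using kernlab, in which `G` stands for a hypothetical N × p matrix whose columns are the parameters informed by the proposed study:

```r
library(kernlab)

# Gaussian process alternative for step B1 (hypothetical inputs G and B).
fit_gp     <- gausspr(x = G, y = B[, "T0"])  # regression on the informed parameters
B_tilde_T0 <- predict(fit_gp, G)             # smoothed net benefits for strategy T0
```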

Case Study Methods

Hypothetical cost-effectiveness analysis. Performance of the proposed algorithms was compared with the conventional 2-level Monte Carlo approach in a hypothetical cost-effectiveness analysis, based on a case study developed by Brennan and Kharroubi.5,6 The analysis compared 2 competing strategies for treating a hypothetical disease: treatment with drug T0 or T1. For each drug, a fraction of patients respond to treatment, experiencing a utility improvement of fixed duration. A fraction of patients experience side effects, experiencing a utility decrement of fixed duration. Costs are composed of a one-time drug cost for each patient plus inpatient costs incurred by a fraction of patients admitted to hospital. Inpatient costs are calculated as a per-day cost multiplied by the number of hospital days. Table 1 gives mean estimates and measures of uncertainty for model parameters.

Following the original example, all priors in this example were assumed to be Normal, and the proposed study designs (described below) were assumed to produce Normal likelihoods.‡‡ All parameters were assumed to be independent. The willingness-to-pay threshold (λ) was assumed to be $100,000 per quality-adjusted life-year (QALY) saved, and for simplicity outcomes were not discounted. Net benefits were estimated by subtracting costs from the monetized value of QALY improvements20:

$$B(T0, \theta) = \lambda(\theta_5 \theta_6 \theta_7 + \theta_8 \theta_9 \theta_{10}) - (\theta_1 + \theta_2 \theta_3 \theta_4), \quad (18)$$

$$B(T1, \theta) = \lambda(\theta_{14} \theta_{15} \theta_{16} + \theta_{17} \theta_{18} \theta_{19}) - (\theta_{11} + \theta_{12} \theta_{13} \theta_4). \quad (19)$$

‡‡While Normal priors and likelihoods are technically incorrect for parameters defined over bounded regions (e.g., probabilities, utilities), this specification was used to maintain consistency with the Brennan and Kharroubi example. A benefit of these choices is that a closed-form solution is available for the posterior parameter distribution given study data. This closed-form posterior is not necessary for the proposed algorithms but facilitates the implementation of the conventional approach.

Table 1  Parameters for Case Study Cost-Effectiveness Analysis

                                                 Prior Mean         Prior SD      SD of Patient-Level Values
Parameter Description                           T0       T1        T0     T1           T0       T1
Drug cost (θ1; θ11)                         $10,000  $15,000      $10    $10        $5000    $5000
Percentage hospitalized (θ2; θ12)               10%       8%       2%     2%          25%      25%
Days in hospital (θ3; θ13)                      5.2      6.1      1.0    1.0          4.0      4.0
Hospital per day cost (θ4)                    $4000    $4000    $2000  $2000        $2000    $2000
Percentage responding (θ5; θ14)                 70%      80%      10%    10%          20%      20%
Utility change if respond (θ6; θ15)            0.30     0.30     0.10   0.05         0.20     0.20
Duration of response, years (θ7; θ16)           3.0      3.0      0.5    1.0          1.0      2.0
Percentage with side effects (θ8; θ17)          25%      20%      10%     5%          20%      10%
Utility change if side effects (θ9; θ18)      –0.10    –0.10     0.02   0.02         0.10     0.10
Duration of side effects, years (θ10; θ19)     0.50     0.50      0.2    0.2          0.8      0.8
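Because Table 1 and Equations 18 and 19 fully specify the uncorrelated-parameter model, the probabilistic sensitivity analysis behind the summary results that follow can be sketched directly in R. The sketch uses 100,000 prior draws rather than the 1 million used in the text, so reproduced figures will carry slightly more Monte Carlo error:

```r
set.seed(2)
N      <- 1e5      # the text used 1 million prior draws
lambda <- 100000   # willingness-to-pay per QALY

# Independent Normal priors from Table 1 (theta 1-10 for T0; theta 11-19 for T1;
# theta4, the hospital per-day cost, is shared by both strategies).
mu <- c(10000, 0.10, 5.2, 4000, 0.70, 0.30, 3.0, 0.25, -0.10, 0.50,
        15000, 0.08, 6.1,       0.80, 0.30, 3.0, 0.20, -0.10, 0.50)
sd <- c(   10, 0.02, 1.0, 2000, 0.10, 0.10, 0.5, 0.10,  0.02, 0.20,
           10, 0.02, 1.0,       0.10, 0.05, 1.0, 0.05,  0.02, 0.20)
th <- sapply(seq_along(mu), function(i) rnorm(N, mu[i], sd[i]))  # N x 19 matrix

# Equations 18 and 19: net benefit for each strategy.
B_T0 <- lambda * (th[, 5] * th[, 6] * th[, 7] + th[, 8] * th[, 9] * th[, 10]) -
        (th[, 1] + th[, 2] * th[, 3] * th[, 4])
B_T1 <- lambda * (th[, 14] * th[, 15] * th[, 16] + th[, 17] * th[, 18] * th[, 19]) -
        (th[, 11] + th[, 12] * th[, 13] * th[, 4])

mean(B_T1 > B_T0)   # probability T1 is optimal; reported as 0.543 in the text
```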

In this example, Strategy T0 produces expected costs of $12,100 (95% posterior interval: 10,000, 14,800) and an expected QALY gain of 0.62 (0.18, 1.18), for net benefits of $49,700 (6000, 106,400) per patient.§§ Strategy T1 produces expected costs of $17,000 (15,000, 19,600) and an expected QALY gain of 0.71 (0.22, 1.34), for net benefits of $54,100 (4500, 117,200). Based on these results, T1 is expected to produce incremental costs of $4900 (2900, 6800), incremental QALYs of 0.09 (–0.66, 0.86), and incremental net benefit of $4400 (–71,100, 81,500), compared with T0, and would be the preferred strategy in the absence of any further information. There is substantial decision uncertainty in this problem, with strategy T1 optimal with P = 0.543.

§§These results were computed using a Monte Carlo simulation with 1 million draws from the prior. As the model is not linear, these results will differ somewhat from results calculated from the mean values of the parameters.

In addition, a modification of this example was considered in which subsets of parameters were assumed to be correlated. Following Brennan and Kharroubi,5 parameters θ5, θ7, θ14, and θ16 were assumed to be correlated with a correlation coefficient of 0.6, and parameters θ6 and θ15 were assumed to be independent of this subset but correlated with each other, also with a correlation coefficient of 0.6. In this modified example T1 is preferred to T0, with incremental costs of $4900 (2800, 6800), incremental QALYs of 0.10 (–0.42, 0.69), and incremental net benefit of $5300 (–46,800, 64,300).

Study designs for collecting new information. Five hypothetical data collection exercises were considered:

Exercise 1: A clinical trial collecting information on the fraction of patients who respond to drug treatment (directly informing parameters θ5 and θ14)
Exercise 2: A study of the utility improvements for those who respond to drug treatment (directly informing parameters θ6 and θ15)
Exercise 3: A study combining Exercises 1 and 2 (directly informing parameters θ5, θ6, θ14, and θ15)
Exercise 4: A study of the duration of response to therapy, for those who respond (directly informing parameters θ7 and θ16)
Exercise 5: A study combining Exercises 1, 2, and 4 (directly informing parameters θ5, θ6, θ7, θ14, θ15, and θ16)

For each of these 5 data collection exercises, 5 different study sample sizes were considered (10, 25, 50, 100, 200), producing 25 hypothetical study designs. These study designs were applied to the 2 examples described above (uncorrelated and correlated parameters), for a total of 50 analytic scenarios. The different EVSI estimators were compared in the context of these 50 scenarios.

Analyses for comparing EVSI estimators. For the conventional approach to computing EVSI (2-level Monte Carlo sampling), an equal number of parameter samples was used in each Monte Carlo integration step, and results were estimated for each scenario using 1000, 10,000, and 100,000 parameter samples per level (requiring 1 million, 100 million, and 10 billion model evaluations, respectively). For the 2 proposed approaches (Algorithms 1 and 2), results were estimated for each scenario with 1000, 2000, 5000, and 10,000 parameter samples. Results from the conventional approach using 100,000 parameter samples per level were used as the gold standard.

A second set of analyses was undertaken to decompose estimation errors into systematic bias and zero-mean variance, focusing on Exercise 1 and study sample sizes of 10 and 200. These analyses compared the conventional 2-level Monte Carlo approach and both proposed algorithms, as well as the Brennan and Kharroubi5 estimator, hereafter termed BK. Estimators were evaluated for parameter samples of 1000, 2000, 5000, and 10,000. For the conventional approach, these parameter sample sizes were assumed for both levels of the calculation, requiring 1, 4, 25, and 100 million model evaluations, respectively.


[Figure 1: EVSI ($ thousands) plotted against sample size for new research (0–200), one line per data collection exercise (Exercises 1–5); panels show results for 1,000 × 1,000, 10,000 × 10,000, and 100,000 × 100,000 model evaluations.]

Figure 1  Expected value of sample information (EVSI) estimates for 5 hypothetical study designs and different study sample sizes, estimated using the conventional 2-level Monte Carlo sampling approach. Dotted lines represent the gold standard EVSI results produced by the conventional estimator implemented with 100,000 samples in both inner and outer loops (lower left panel of this figure).

For the BK estimator, the model needs to be evaluated 2m + 1 times for each parameter sample, where m represents the total number of parameters. In the case study m = 19, and so the model was evaluated 39,000, 78,000, 195,000, and 390,000 times for the different parameter sample sizes evaluated. For the proposed algorithms, the model is computed once for each parameter sample, requiring 1000, 2000, 5000, and 10,000 evaluations for the different sample size comparisons. Each outcome was re-estimated 1000 times, and root mean square error (RMSE), bias, and standard deviation were calculated from these results, reported as a fraction of the gold standard EVSI value. Results from the conventional estimator implemented with 10 million samples in the outer loop and 100,000 samples in the inner loop were used as the gold standard.
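For these designs, the Normal prior-Normal likelihood structure makes the machinery required by the proposed algorithms explicit. The sketch below illustrates Exercise 1 for one parameter set; it assumes the study's standard error is the patient-level standard deviation from Table 1 divided by the square root of the study sample size, which is one natural reading of the design but is not spelled out in this section. It reuses the matrix `th` from the case-study sketch above.

```r
n_study <- 50                    # one of the proposed study sample sizes
se5  <- 0.20 / sqrt(n_study)     # assumed standard error for theta5 (Table 1 SD = 0.20)
se14 <- 0.20 / sqrt(n_study)     # assumed standard error for theta14

k   <- 1                         # index of the parameter set playing the role of "truth"
D5  <- rnorm(1, mean = th[k, 5],  sd = se5)    # predictive draw of the study estimate
D14 <- rnorm(1, mean = th[k, 14], sd = se14)

# Likelihood of the simulated study results under every prior parameter set;
# these supply the weights of Equation 9 for both proposed algorithms.
lik <- dnorm(D5, th[, 5], se5) * dnorm(D14, th[, 14], se14)
w   <- lik / sum(lik)
```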

RESULTS

Case Study Results for Multiple Data Collection Exercises

Figure 1 presents EVSI estimates computed using the conventional 2-level Monte Carlo approach, based on the 25 hypothetical study designs described previously, assuming uncorrelated parameters. Each line represents EVSI for a different data collection exercise, with results plotted as a function of study sample size. Each panel presents results for a different number of parameter set samples used to estimate EVSI. Compared with the gold standard (Figure 1, lower left panel), EVSI estimates calculated using 1000 samples in each level of the analysis mostly reproduce general trends, with EVSI increasing in study sample size and EVSI becoming increasingly large for Exercises 1, 2, 4, 3, and 5, in that order. However, these estimates appear noisier than the gold standard. In some cases the EVSI estimates decline as study sample size increases, although by definition EVSI should be strictly nondecreasing in study sample size. In contrast, EVSI estimates obtained with 10,000 samples in each level of the analysis closely match the results of the gold standard.

Figure 2 presents EVSI estimates obtained with Algorithm 1. Results estimated with 1000 and 2000 parameter sets (top panels) appear noisy and biased upward compared with the gold standard.


[Figure 2: EVSI ($ thousands) plotted against sample size for new research (0–200) for Exercises 1–5, with one panel each for 1,000, 2,000, 5,000, and 10,000 model evaluations.]

Figure 2  Expected value of sample information (EVSI) estimates for 5 hypothetical study designs and different study sample sizes, estimated using Algorithm 1. Dotted lines represent the gold standard EVSI results produced by the conventional estimator implemented with 100,000 samples in both inner and outer loops (lower left panel of Figure 1).

In addition, for some data collection exercises (Exercises 1 and 3) there appears to be an ongoing upward trajectory as the study sample size increases from 50 to 200, in contrast to the gold standard, which appears to asymptote to a fixed value. Only when 10,000 samples are used do the large majority of results appear to match the gold standard, yet even here there appears to be a small upward bias for Exercise 3 when study evidence is strong.

Figure 3 presents EVSI estimates obtained with Algorithm 2. Compared with Algorithm 1, the results produced by Algorithm 2 appear less noisy for any given number of parameter samples, and results obtained with a prior sample as small as 1000 reproduce the general trends of the gold standard. With a prior sample of 10,000, the results closely match the gold standard.

For the conventional approach, parameter samples of 1000 at each level of the analysis produced RMSE of 8.6% (across all 25 study designs), and parameter samples of 10,000 at each level produced RMSE of 3.3%. For Algorithm 1, parameter samples of 1000, 2000, 5000, and 10,000 produced RMSE values of 15.2%, 12.6%, 9.0%, and 3.1%, respectively. For Algorithm 2, parameter samples of 1000, 2000, 5000, and 10,000 produced RMSE values of 7.4%, 8.4%, 6.5%, and 2.0%, respectively.

Similar findings were produced in the analyses that assumed correlated parameter sets (detailed results and figures shown in the technical appendix). Across all 50 analytic scenarios, the conventional estimator exhibited RMSE of 8.9% and 2.6% for parameter samples of 1000 × 1000 and 10,000 × 10,000, respectively. For Algorithm 1, RMSE was estimated as 15.1%, 11.0%, 7.0%, and 3.0% for parameter samples of 1000, 2000, 5000, and 10,000, respectively, and for Algorithm 2 the matching RMSE values were 5.5%, 6.2%, 5.0%, and 1.7%.

Disaggregation of Estimation Errors into Variance and Bias

Table 2 presents the results of the second set of analyses, reestimating results for 2 analytic scenarios (Exercise 1 with uncorrelated parameters and study sample sizes of 10 and 200) multiple times to disentangle the contributions of bias and random error.


[Figure 3: EVSI ($ thousands) plotted against sample size for new research (0–200) for Exercises 1–5, with one panel each for 1,000, 2,000, 5,000, and 10,000 model evaluations.]

Figure 3  Expected value of sample information (EVSI) estimates for 5 hypothetical study designs and different study sample sizes, estimated using Algorithm 2. Dotted lines represent the gold standard EVSI results produced by the conventional estimator implemented with 100,000 samples in both inner and outer loops (lower left panel of Figure 1).

All results are presented as a percentage of the EVSI value ($2457 for N = 10 and $3146 for N = 200) calculated using the conventional 2-level Monte Carlo approach with a large number of parameter samples (10 million in the outer loop and 100,000 in the inner loop).

When study evidence is weak (study sample size = 10), all methods appear to have minimal bias (<1% of total EVSI) when a large sample of parameter sets (N = 10,000) is used. Some estimators appear systematically biased when fewer parameter sets are used, with both the conventional and BK estimators biased downward for N = 1000 and Algorithm 1 biased upward for all values of N less than 5000. The same general findings hold when study evidence is stronger (study sample size = 200), although in this case Algorithm 1 is strongly affected, with EVSI substantially overestimated for all values of N.

The variance of each estimator drops progressively as N increases, matching the expected decline in Monte Carlo error with increasing sample size. Comparing between the estimators, the BK estimator exhibits a standard deviation that is substantially higher than those of the other estimators (15%–35% higher than the conventional estimator for a given analytic scenario). For Algorithm 1 the standard deviation is very similar to that of the conventional estimator, and for Algorithm 2 it is 10%–20% lower than that of the conventional estimator.

For RMSE, there is a clear advantage to higher numbers of parameter samples, with a 10-fold increase in parameter samples reducing RMSE by a factor of 3 or more for all estimators. Compared with the conventional estimator, the BK estimator exhibits substantially higher RMSE, due to the greater variance of this estimator. Algorithm 1 exhibits RMSE similar to the conventional estimator when study evidence is weak but higher RMSE values when study evidence is strong, particularly where the number of parameter samples is low, due to systematic upward bias. RMSE results for Algorithm 2 are consistently lower than those for the conventional algorithm, representing a 13% reduction when averaged over the 8 comparisons.


Table 2  Estimates of Systematic Bias, Standard Deviation (SD), and RMSE as a Percentage of the True EVSI Value, for 4 Competing Approaches to Calculating EVSI and Different Numbers of Parameter Samples

                          Sample Size for New Research = 10     Sample Size for New Research = 200
Approach        N          Bias      SD      RMSE                Bias      SD      RMSE
Conventional    1000      –3.2a     28.4     28.5               –1.8      23.7     23.8
                2000      –0.7      20.1     20.1               –0.6      15.9     15.9
                5000      –1.0      12.4     12.4               –0.8      10.0     10.1
                10,000    –0.3       8.8      8.8               –0.3       7.1      7.1
BK estimator    1000      –4.5a     35.2     35.5               –3.5a     27.2     27.4
                2000      –0.6      25.5     25.5               –0.4      19.6     19.6
                5000       0.2      16.2     16.2                0.0      12.6     12.6
                10,000    –0.4      11.8     11.8               –0.4       9.1      9.1
Algorithm 1     1000       2.8a     28.7     28.9               36.7a     22.9     43.3
                2000       2.3a     20.1     20.1               20.7a     15.4     25.8
                5000       0.1      12.4     12.5                8.2a      9.8     12.8
                10,000     0.4       8.8      8.8                4.8a      7.0      8.5
Algorithm 2     1000       0.1      24.3     24.3                0.8      21.1     21.1
                2000       1.4      16.7     16.7                1.9a     14.5     14.6
                5000       0.2      10.3     10.3                0.6       9.0      9.1
                10,000     0.6       7.3      7.3                0.8       6.4      6.5

Note: For the conventional approach, N represents the number of samples in both inner and outer loops, for N² total samples (e.g., 1 million samples for N = 1000). For Algorithms 1 and 2, N represents the total number of parameter samples. All estimates are compared with the gold standard represented by the conventional 2-level Monte Carlo sampling approach implemented with 10 million samples in the outer loop and 100,000 samples in the inner loop. BK = Brennan and Kharroubi; EVSI = expected value of sample information; RMSE = root mean square error; SD = standard deviation.
a. Bias is statistically significant at the P < 0.01 level via 2-sample t test.

It is important to note that in these comparisons, the conventional algorithm requires N² model evaluations, as opposed to N model evaluations for Algorithms 1 and 2. When the conventional approach is implemented with the same number of model evaluations as the proposed algorithms for N = 10,000 (i.e., with 100 samples in both inner and outer loops), the resulting estimator exhibits a 10%–20% downward bias and a large standard deviation, with RMSE approximately 10 times higher than the proposed algorithms. Similarly, for this case study the BK estimator requires 39 model evaluations for every parameter sample, and in analyses where the total number of model evaluations was restricted to 10,000, the resulting estimator exhibited RMSE 6–9 times higher than the proposed algorithms, although still smaller than the conventional approach under the same conditions.

DISCUSSION

This paper describes 2 new algorithms for estimating EVSI and compares their performance against alternative estimators in a hypothetical case study.

The great strength of Algorithm 1 is its simplicity: this approach requires minimal programming ability beyond that required to operationalize the relationship between parameters and study data, and it can produce reasonable EVSI estimates in situations where study evidence is not strong. However, if study evidence is strong or if the number of parameter samples available is small, Algorithm 1 can produce EVSI estimates that are biased upward.

Compared with Algorithm 1, Algorithm 2 requires an additional step, calculating a smooth prediction equation for net benefit as a function of the quantities informed by new research. However, this relationship only needs to be estimated once and so imposes little additional computational burden. Based on the results of the case study, Algorithm 2 produces EVSI estimates with minimal bias and greater precision than the conventional estimator in a wide range of scenarios. The major limitation of Algorithm 2 is the need to choose and tune a prediction function. This is an important step, since if the prediction function performs poorly, EVSI results will be biased downward. The analyst will need to confirm that the prediction function is operating successfully. This step can be accomplished by standard regression diagnostics and is not discussed in detail here.


For both proposed algorithms, no further model evaluations are needed in addition to those obtained for the outer loop of the computation. This can be compared with the conventional algorithm, which requires many extra model evaluations in an additional inner loop. As a consequence, both proposed algorithms reduce the number of model evaluations required to compute EVSI by several orders of magnitude and can be implemented using the results of a traditional probabilistic sensitivity analysis.

A second major advantage of the proposed algorithms is the approach used to approximate the posterior parameter distribution given study data. With the conventional estimator, new samples need to be obtained from the posterior parameter distribution. This can be accomplished by choosing the prior and data likelihood so that the posterior distribution can be estimated in closed form or by estimating the posterior numerically using MCMC methods. For convenience, a conjugate prior-likelihood combination was adopted for the case study, yet this approach restricts the analyst to a fixed set of prior-likelihood relationships that may not be realistic. The other option, estimating the posterior using MCMC methods, greatly increases computational burden and may be infeasible for all but the simplest models. In contrast, for both proposed algorithms, the posterior distribution is approximated via a simple reweighting of existing parameter sets. No additional requirements are placed on the operationalization of prior or likelihood, and computational burden is low.

A consequence of this approach is greater flexibility in the types of study designs that can be evaluated. For example, the proposed algorithms allow EVSI calculations for study designs that would produce multiple likelihoods pertaining to multiple different model parameters or for study designs that relate to model outcomes and other complex functions of model parameters. This flexibility also applies to the prior distribution, with the proposed approach accommodating prior distributions based on resampling of empirical data, multimodal distributions, and prior distributions created by numerical calibration.

The proposed algorithms were also compared with an EVSI estimator (BK) proposed by Brennan and Kharroubi,5,6 based on second-order Laplace approximation. As described by these authors, the BK estimator provides a number of advantages over the conventional 2-level estimator, reducing computation time and allowing feasible EVSI estimation in a wider set of analytic scenarios. In the results of the case study, the proposed algorithms perform favorably in comparison to the BK estimator, with substantially lower RMSE across the majority of analytic scenarios, despite requiring approximately 40 times fewer model evaluations to achieve these results. In addition, the proposed algorithms allow for feasible EVSI estimation in a wider variety of analytic scenarios.

The analyst faces many challenges in trying to estimate the value of new research. Constructing a computer model that describes how policy changes influence valued outcomes can be difficult, and quantifying the uncertainty associated with these relationships can be similarly difficult. Value of information analyses can be very sensitive to the way the decision problem is posed, and to produce valid results the analyst must have a good idea of what policy options are being considered, the criteria that will be used to judge them, and how and when new information might influence policy making.21 Understanding the size of the population that might benefit from research findings can be difficult yet can have a great influence on the monetary value estimated for new research. The approaches described in this paper provide no assistance in resolving these problems, which will need to be addressed within the context of a given VOI analysis. However, these approaches may resolve some of the additional analytic challenges that have traditionally accompanied VOI analyses, allowing efficient estimation of EVSI in most situations where such an analysis would be appropriate. It is hoped that reducing these computational barriers will allow greater attention to be paid to the other challenges noted above and will allow more routine estimation of EVSI results to inform resource allocation across the research portfolio.

ACKNOWLEDGMENTS

I thank Milton Weinstein, Joshua Salomon, and Thomas McGuire for feedback on the ideas that led to this work, and I further thank Joshua Salomon for comments on an earlier version of this manuscript. I also acknowledge the input of 2 anonymous reviewers in suggesting a number of improvements to the description and implementation of the algorithms described in this paper.

REFERENCES

1. Raiffa H, Schlaifer R. Applied Statistical Decision Theory. New York: Wiley Interscience; 1967.
2. Ades AE, Lu G, Claxton K. Expected value of sample information calculations in medical decision modeling. Med Decis Mak. 2004;24(2):207–27.
3. Claxton K, Neumann PJ, Araki S, Weinstein MC. Bayesian value-of-information analysis: an application to a policy model of Alzheimer's disease. Int J Technol Assess Health Care. 2001;17(1):38–55.
4. Chilcott J, Brennan A, Booth A, Karnon J, Tappenden P. The role of modeling in prioritising and planning clinical trials. Health Technol Assess. 2003;7(23):1–125.
5. Brennan A, Kharroubi SA. Efficient computation of partial expected value of sample information using Bayesian approximation. J Health Econ. 2007;26(1):122–48.
6. Kharroubi SA, Brennan A, Strong M. Estimating expected value of sample information for incomplete data models using Bayesian approximation. Med Decis Mak. 2011;31(6):839–52.
7. Strong M, Brennan A, Oakley J. Fast efficient computation of expected value of sample information from a probabilistic sensitivity analysis sample: a non-parametric regression approach. Trials. 2013;14(suppl 1):O25.
8. Ades AE, Sculpher M, Sutton A, et al. Bayesian methods for evidence synthesis in cost-effectiveness analysis. Pharmacoeconomics. 2006;24(1):1–19.
9. Brennan A, Kharroubi SA. Expected value of sample information for Weibull survival data. Health Econ. 2007;16(11):1205–25.
10. Steuten L, van de Wetering G, Groothuis-Oudshoorn K, Retèl V. A systematic and critical review of the evolving methods and applications of value of information in academia and practice. Pharmacoeconomics. 2013;31(1):25–48.
11. Rubin D. Using the SIR algorithm to simulate posterior distributions. Bayesian Stat. 1988;3:395–402.
12. Oakley JE, Brennan A, Tappenden P, Chilcott J. Simulation sample sizes for Monte Carlo partial EVPI calculations. J Health Econ. 2010;29(3):468–77.
13. Rojnik K, Naversnik K. Gaussian process metamodeling in Bayesian value of information analysis: a case of the complex health economic model for breast cancer screening. Value Health. 2008;11(2):240–50.
14. Oakley JE. Decision-theoretic sensitivity analysis for complex computer models. Technometrics. 2009;51(2):121–9.
15. Strong M, Oakley JE, Brennan A. Estimating multiparameter partial expected value of perfect information from a probabilistic sensitivity analysis sample: a nonparametric regression approach. Med Decis Mak. 2014;34(3):311–26.
16. Madan J, Ades AE, Price M, et al. Strategies for efficient computation of the expected value of partial perfect information. Med Decis Mak. 2014;34(3):327–42.
17. Karatzoglou A, Smola A, Hornik K, Zeileis A. kernlab—an S4 package for kernel methods in R. J Stat Softw. 2004;11(9):1–20.
18. Rasmussen CE, Williams CKI. Gaussian Processes for Machine Learning. Cambridge, MA: MIT Press; 2006.
19. Wood S. Fast stable restricted maximum likelihood and marginal likelihood estimation of semiparametric generalized linear models. J R Stat Soc. 2011;73(1):3–36.
20. Stinnett AA, Mullahy J. Net health benefits: a new framework for the analysis of uncertainty in cost-effectiveness analysis. Med Decis Mak. 1998;18(2 suppl):S68–80.
21. Eckermann S, Willan AR. Expected value of information and decision making in HTA. Health Econ. 2007;16(2):195–209.
22. Kish L. Survey Sampling. New York: Wiley; 1965.

