Biostatistics (2016), 17, 1, pp. 149–164 doi:10.1093/biostatistics/kxv031 Advance Access publication on August 28, 2015

Personalized screening intervals for biomarkers using joint models for longitudinal and survival data

DIMITRIS RIZOPOULOS*
Department of Biostatistics, Erasmus University Medical Center, 3000 CE Rotterdam, The Netherlands
[email protected]

JEREMY M. G. TAYLOR
Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA

JOOST VAN ROSMALEN
Department of Biostatistics, Erasmus University Medical Center, 3000 CE Rotterdam, The Netherlands

EWOUT W. STEYERBERG
Department of Public Health, Erasmus University Medical Center, 3000 CE Rotterdam, The Netherlands

JOHANNA J. M. TAKKENBERG
Department of Cardiothoracic Surgery, Erasmus University Medical Center, 3000 CE Rotterdam, The Netherlands

*To whom correspondence should be addressed.

SUMMARY

Screening and surveillance are routinely used in medicine for early detection of disease and close monitoring of progression. Motivated by a study of patients who received a human tissue valve in the aortic position, in this work we are interested in personalizing screening intervals for longitudinal biomarker measurements. Our aim in this paper is 2-fold: first, to appropriately select the model to use at the time point at which the patient is still event-free, and second, based on this model, to select the optimal time point at which to plan the next measurement. To achieve these two goals, we combine information theory measures with optimal design concepts for the posterior predictive distribution of the survival process given the longitudinal history of the subject.

Keywords: Decision making; Information theory; Personalized medicine; Random effects.

1. INTRODUCTION

Decision making in medicine has become increasingly complex for patients and practitioners. This has resulted from factors such as the shift away from physician authority toward shared decision making, unfiltered information on the Internet, new technology providing additional data, numerous treatment options


Fig. 1. Illustration of a patient's longitudinal profile up to time t, and the conditional survival function, for a patient for whom we wish to plan an extra measurement. Physicians are interested in events occurring within the time interval (t, t + Δt]. The constant c(t) defines the threshold above which the physician requires extra information before making a decision.

with associated risks and benefits, and results from new clinical studies. Within this context, medical screening procedures are routinely performed for several diseases. The most prominent examples can be found in cancer research, where, for example, mammographic screening is performed for the detection of breast cancer in asymptomatic patients, and prostate-specific antigen levels are used to monitor the progression of prostate cancer in men who have been treated for the disease. In general, the aim of screening procedures is to optimize the benefits, i.e. early detection of disease or deterioration of the condition of a patient, while also balancing the respective costs.

In this paper, we are interested in optimizing screening intervals for asymptomatic or symptomatic patients who are followed prospectively (the latter possibly after a medical procedure). The setting we focus on is best explained via Figure 1. More specifically, this figure depicts a hypothetical patient who has been followed up to time point t and has provided a set of longitudinal biomarker measurements. Using this information, physicians are interested in events occurring within a medically relevant time interval (t, t + Δt]. If the probability of surviving beyond t + Δt is lower than the pre-specified constant c(t), then the physician is supposed to take immediate action to improve the prospects of this patient. However, when the chance of surviving until t + Δt is higher than c(t), the physician may not initiate treatment or require an invasive test, and instead wishes to gain a better understanding of the disease progression using additional biomarker measurements. The decision to request an additional measurement requires careful weighing of the potential benefits and costs to the patient and the health care system. Hence, we are interested in carefully planning the timing of the next measurement in order to maximize the information we can gain on disease progression, while at the same time minimizing costs and patient burden.
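To make this decision rule concrete, the toy R sketch below encodes the two possible actions of Figure 1. It is only an illustration: pi_hat(), toy_pi(), and the threshold variable c_t are hypothetical placeholders for an estimated conditional survival probability and the constant c(t), not part of the methodology developed later.

```r
## Toy illustration of the decision rule depicted in Figure 1. pi_hat(u, t)
## stands for an estimate of Pr{T* > u | T* > t, history}, and c_t for the
## threshold c(t); all names here are hypothetical.
screen_decision <- function(pi_hat, t, Dt, c_t) {
  if (pi_hat(t + Dt, t) < c_t) {
    "take immediate action"                 # survival chance below c(t)
  } else {
    "plan an extra biomarker measurement"   # gather more information first
  }
}

## Example: a made-up survival estimate that decays with the horizon
toy_pi <- function(u, t) exp(-0.03 * (u - t))
screen_decision(toy_pi, t = 5, Dt = 2, c_t = 0.8)  # "plan an extra ..."
```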


Statistical methods for optimizing medical screening strategies have flourished in recent years. These methods have been primarily based on Markov and multistate models (Parmigiani, 1993, 1998, 1999, 2002; Lee and Zelen, 1998; Schaefer and others, 2004). The appealing feature of these models is that they can simultaneously handle different medical options and select the one that provides the most gains at an acceptable cost (cost-effectiveness). Here we will instead work under the framework of joint models for longitudinal and time-to-event data (Tsiatis and Davidian, 2004; Rizopoulos, 2012). An advantageous feature of joint models is that they utilize random effects and therefore have an inherent subject-specific nature. This allows one to better tailor decisions to individual patients, i.e. to personalize screening, rather than using the same screening rule for all. In addition, contrary to the majority of screening approaches, which are based on a dichotomized version of the last available biomarker measurement to indicate the need for a further medical procedure (e.g. prostate-specific antigen greater than 4.0 ng/mL), joint models utilize the whole longitudinal history to reach a decision without requiring dichotomization of the longitudinal outcome.

Our methodological developments couple joint models with information theory quantities and optimal design concepts. In particular, our aim in this paper is 2-fold: first, to appropriately select the joint model to use at time t, i.e. the time point at which the subject of interest is still event-free, and second, based on this model, to select the optimal time point u > t at which to plan the next measurement. We measure optimality by the amount of information gained for the survival process given the history of the subject of interest, which includes both baseline information and his/her accumulated longitudinal measurements.

The motivation for this research comes from a study conducted by the Department of Cardio-Thoracic Surgery of the Erasmus Medical Center in the Netherlands. This study includes 285 patients who received a human tissue valve in the aortic position in the hospital from 1987 until 2008 (Bekkers and others, 2011). Aortic allograft implantation has been widely used for a variety of aortic valve or aortic root diseases. Major advantages ascribed to allografts are the excellent hemodynamic characteristics as a valve substitute; the low rate of thrombo-embolic complications and, therefore, the absence of the need for anticoagulant treatment; and the resistance to endocarditis. A major disadvantage of using human tissue valves, however, is the susceptibility to degeneration and the concomitant need for re-interventions. The durability of an aortic allograft is age-dependent, leading to a high lifetime risk of re-operation, especially for young patients. Re-operations on the aortic root are complex, with substantial re-operative risks and mortality rates in the range of 4–12%. A timely, well-planned re-operation has lower mortality risks compared with urgent re-operative surgery. It is therefore of great interest for cardiologists and cardio-thoracic surgeons to plan echocardiographic assessments in the best possible manner, so that they are informed about the future prospects of a patient with a human tissue valve, in order to optimize medical care, carefully plan re-operation, and minimize valve-related morbidity and mortality.

The rest of the paper is organized as follows. Section 2 briefly presents the joint modeling framework and the information theory quantities that can be used to select the joint model at a particular follow-up time, and Section 3 presents the corresponding quantities for planning the next longitudinal measurement. Section 4 illustrates the use of these information measures in the Aortic Valve study, and Section 5 presents the results of a cost-effectiveness simulation study.

2. CHOOSING THE MODEL TO USE AT TIME t

We start with a short introduction of the joint modeling framework we will use in the following developments. Let D_n = {T_i, δ_i, y_i; i = 1, ..., n} denote a sample from the target population, where T_i^* denotes the true event time for the ith subject, C_i the censoring time, T_i = min(T_i^*, C_i) the corresponding observed event time, and δ_i = I(T_i^* ≤ C_i) the event indicator, with I(·) being the indicator function that takes the value 1 when T_i^* ≤ C_i and 0 otherwise. In addition, we let y_i denote the n_i × 1 longitudinal response vector for the ith subject, with element y_{il} denoting the value of the longitudinal outcome taken at time point t_{il}, l = 1, ..., n_i.
To accommodate different types of longitudinal responses, we postulate a generalized linear mixed-effects model. In particular, the conditional distribution of y_i given a vector of random effects b_i is assumed to be a member of the exponential family, with linear predictor given by

$$k[E\{y_i(t) \mid b_i\}] = \eta_i(t) = x_i^\top(t)\beta + z_i^\top(t)b_i, \qquad (2.1)$$

where k(·) denotes a known one-to-one monotonic link function, y_i(t) denotes the value of the longitudinal outcome for the ith subject at time point t, and x_i(t) and z_i(t) denote the time-dependent design vectors for the fixed effects β and the random effects b_i, respectively. The random effects are assumed to follow a multivariate normal distribution with mean zero and variance–covariance matrix D. For the survival process, we assume that the risk for an event depends on a function of the subject-specific linear predictor


η_i(t) and/or the random effects:

$$h_i(t \mid \mathcal{H}_i(t), w_i) = \lim_{\Delta t \to 0} \Pr\{t \le T_i^* < t + \Delta t \mid T_i^* \ge t, \mathcal{H}_i(t), w_i\}/\Delta t$$
$$= h_0(t)\exp\bigl[\gamma^\top w_i + f\{\mathcal{H}_i(t), b_i, \alpha\}\bigr], \qquad t > 0, \qquad (2.2)$$

where H_i(t) = {η_i(s), 0 ≤ s < t} denotes the history of the underlying longitudinal process up to t, h_0(·) denotes the baseline hazard function, and w_i is a vector of baseline covariates with corresponding regression coefficients γ. The function f(·), parameterized by vector α, specifies which components/features of the longitudinal outcome process are included in the linear predictor of the relative risk model. Some examples, motivated by the literature (Rizopoulos and Ghosh, 2011; Rizopoulos, 2012; Taylor and others, 2013; Rizopoulos and others, 2014), are:

$$f\{\mathcal{H}_i(t), b_i, \alpha\} = \alpha\,\eta_i(t),$$
$$f\{\mathcal{H}_i(t), b_i, \alpha\} = \alpha\,\frac{d\eta_i(t)}{dt},$$
$$f\{\mathcal{H}_i(t), b_i, \alpha\} = \alpha \int_0^t \eta_i(s)\, ds,$$
$$f\{\mathcal{H}_i(t), b_i, \alpha\} = \alpha^\top b_i.$$
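To make these functionals concrete, the toy R sketch below evaluates the value, slope, and cumulative-effect forms for a simple linear subject-specific trajectory. All numbers and names are made up for illustration; they are not estimates from any model in this paper.

```r
## Toy evaluation of three of the f(.) functionals for a linear subject-specific
## trajectory eta_i(t) = (beta0 + b_i0) + (beta1 + b_i1) * t; values are made up.
eta <- function(t) (1.5 - 0.2) + (0.3 + 0.1) * t
t0 <- 5
f_value <- eta(t0)                                      # current level eta_i(t)
f_slope <- (eta(t0 + 1e-5) - eta(t0 - 1e-5)) / 2e-5     # slope d eta_i(t) / dt
f_area  <- integrate(eta, lower = 0, upper = t0)$value  # integral of eta_i(s) ds
c(f_value, f_slope, f_area)   # each would be multiplied by its alpha in (2.2)
```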

These formulations of f(·) postulate that the hazard of an event at time t may be associated with the underlying level of the biomarker at the same time point, the slope of the longitudinal profile at t, the accumulated longitudinal process up to t, or the random effects alone. Finally, the baseline hazard function h_0(·) is modeled flexibly using a B-splines approach, i.e.

$$\log h_0(t) = \gamma_{h_0,0} + \sum_{q=1}^{Q} \gamma_{h_0,q} B_q(t, \mathbf{v}), \qquad (2.3)$$
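A minimal R sketch of this penalized B-spline construction is given below, using the base splines package; the number and placement of knots and the follow-up range are illustrative assumptions only.

```r
## Sketch of the penalized B-spline log-baseline hazard of (2.3): a rich cubic
## B-spline basis plus a second-order difference penalty (Eilers and Marx, 1996).
library(splines)
t_grid <- seq(0, 20, length.out = 200)          # follow-up times (years), assumed
inner  <- seq(0, 20, length.out = 17)           # generous knot set (~15 internal)
knots  <- c(rep(0, 3), inner, rep(20, 3))       # repeated boundary knots
B  <- splineDesign(knots, x = t_grid, ord = 4, outer.ok = TRUE)
Q  <- ncol(B)                                   # number of basis functions
D2 <- diff(diag(Q), differences = 2)            # second-order difference matrix
P  <- crossprod(D2)                             # penalty matrix: gamma' P gamma
## log h0(t) is modeled as B %*% gamma, with gamma penalized through P
```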

where B_q(t, v) denotes the qth basis function of a B-spline with knots v_1, ..., v_Q and γ_{h_0} the vector of spline coefficients. To avoid the task of choosing the appropriate number and position of the knots, we include a relatively high number of knots (e.g. 15–20) and appropriately penalize the B-spline regression coefficients γ_{h_0} for smoothness using a differences penalty (Eilers and Marx, 1996). For the estimation of the joint model's parameters, we use a Bayesian approach; more details can be found in Section 1 of supplementary material available at Biostatistics online.

As motivated in Section 1, our aim is to decide the appropriate time point at which to plan the next measurement for a new subject j from the same population (i.e. with j ∉ {i = 1, ..., n}), whose survival chance at a particular time point t lies within the interval [c(t), 1] (see Figure 1). In general, and as we have just seen, defining a joint model entails choosing an appropriate model for the longitudinal outcome (i.e. baseline covariates and functional form of the time effect), an appropriate model for the time-to-event (i.e. baseline covariates), and how to link the two outcomes using the function f(·). Hence, a relevant question is which model to use for deciding when to plan the next measurement of subject j. Standard approaches for model selection within the Bayesian framework include the deviance information criterion (DIC) and (pseudo-) Bayes factors. However, a potential problem with these methods is that they provide an overall assessment of a model's predictive ability, whereas we are interested in the model that best predicts future events given the fact that subject j was event-free up to time point t. To identify the model to use at t, we follow similar arguments as in Commenges and others (2012), and focus on the conditional density function of the survival outcome given the longitudinal responses and survival up to t. More specifically, we let M = {M_1, ..., M_K} denote a set of K joint models fitted to the original dataset D_n. Because the choice of the optimal model in M is based on a finite sample, an important issue we need to address is overfitting. Hence, to obtain an objective estimate of the predictive ability of each model


based on the available data D_n, we will use the cross-validatory posterior predictive conditional density of the survival outcome. For model M_k, this is defined as p{T_i^* | T_i^* > t, Y_i(t), D_{n\i}, M_k}, where Y_i(t) = {y_i(t_{il}); 0 ≤ t_{il} ≤ t, l = 1, ..., n_i} and D_{n\i} = {T_{i'}, δ_{i'}, y_{i'}; i' = 1, ..., i − 1, i + 1, ..., n} denotes the version of the dataset that excludes the data of the ith subject. Similarly, we can define the same distribution under model M^*, which denotes the model under which the data have been generated (M^* is not required to be in M). Using conventional quantities of information theory (Cover and Thomas, 1991), we select the model M_k in the set M that minimizes the cross-entropy:

$$\mathrm{CE}_k(t) = E\bigl(-\log\bigl[p\{T_i^* \mid T_i^* > t, \mathcal{Y}_i(t), \mathcal{D}_{n\setminus i}, M_k\}\bigr]\bigr),$$

where the expectation is taken with respect to p{T_i^* | T_i^* > t, Y_i(t), D_{n\i}, M^*}. An estimate of CE_k(t) that accounts for censoring can be obtained using the available information in the sample at hand. We term this estimate the cross-validated Dynamic Conditional Likelihood:

$$\mathrm{cvDCL}_k(t) = \frac{1}{n_t}\sum_{i=1}^{n} -I(T_i > t)\,\log p\{T_i, \delta_i \mid T_i > t, \mathcal{Y}_i(t), \mathcal{D}_{n\setminus i}, M_k\}, \qquad (2.4)$$

where n_t = Σ_i I(T_i > t). In turn, a consistent estimate of p{T_i, δ_i | T_i > t, Y_i(t), D_{n\i}, M_k} can be obtained from the MCMC sample of model M_k by utilizing a similar computation as for the conditional predictive ordinate statistic (Chen and others, 2008; conditioning on M_k is assumed but is dropped in the following expressions):

$$\bigl[p\{T_i, \delta_i \mid T_i > t, \mathcal{Y}_i(t), \mathcal{D}_{n\setminus i}\}\bigr]^{-1} = \frac{p\{\mathcal{D}_{n\setminus i}, T_i > t, \mathcal{Y}_i(t)\}}{p\{T_i, \delta_i, T_i > t, \mathcal{Y}_i(t), \mathcal{D}_{n\setminus i}\}}$$
$$= \int \frac{p(\mathcal{D}_{n\setminus i} \mid \theta)\, p\{T_i > t, \mathcal{Y}_i(t) \mid \theta\}\, p(\theta)}{p(\mathcal{D}_n)}\, d\theta$$
$$= \int \frac{p\{T_i > t, \mathcal{Y}_i(t) \mid \theta\}}{p\{T_i, \delta_i, \mathcal{Y}_i(t) \mid \theta\}}\, \frac{p(\mathcal{D}_n \mid \theta)\, p(\theta)}{p(\mathcal{D}_n)}\, d\theta$$
$$= \int \frac{1}{p\{T_i, \delta_i \mid T_i > t, \mathcal{Y}_i(t), \theta\}}\, p(\theta \mid \mathcal{D}_n)\, d\theta, \qquad (2.5)$$

where θ here collects both the parameters of the joint model and the random effects. Hence, based on the MCMC sample from [θ | D_n] and identity (2.5), we can obtain a Monte Carlo estimate of (2.4) as

$$\widehat{\mathrm{cvDCL}}(t) = \frac{1}{n_t}\sum_{i=1}^{n} I(T_i > t)\,\log\Biggl(\frac{1}{G}\sum_{g=1}^{G}\frac{1}{p\{T_i, \delta_i \mid T_i > t, \mathcal{Y}_i(t), \theta^{(g)}\}}\Biggr), \qquad (2.6)$$

where θ^{(g)} denotes the gth realization from the MCMC sample {θ^{(g)}; g = 1, ..., G}. Calculation of p{T_i, δ_i | T_i > t, Y_i(t), θ^{(g)}} follows in a similar manner as the calculations we will illustrate in the following section, and due to space limitations we do not show them here.
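As a minimal illustration of (2.6), the R sketch below computes the Monte Carlo estimate from per-subject conditional likelihood evaluations. The inputs are assumptions: an n × G matrix of conditional likelihoods obtained from the MCMC output, the observed event times, and the landmark time t; none of these names belong to an existing package.

```r
## Monte Carlo estimate (2.6) of cvDCL(t). condLik[i, g] is assumed to hold
## p{T_i, delta_i | T_i > t, Y_i(t), theta^(g)} for MCMC draw g; Ti holds the
## observed event times. Only subjects still event-free at t contribute.
cvDCL_hat <- function(condLik, Ti, t) {
  keep <- Ti > t
  n_t  <- sum(keep)
  inv_mean <- rowMeans(1 / condLik[keep, , drop = FALSE])  # inner average over g
  sum(log(inv_mean)) / n_t
}
```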


3. SCHEDULING PATTERNS BASED ON INFORMATION THEORY

3.1 Planning of the next measurement

Having chosen the appropriate model to use at time t, our aim next is to decide when to plan the following measurement of subject j. Similarly to the previous section, the quantity that we will use for making this decision is the posterior predictive distribution. This distribution effectively combines the two sources of information we have available, namely the observed data of the jth subject {T_j^* > t, Y_j(t)} and the original dataset D_n we used to fit the joint model, and it still maintains a cross-validatory flavor because j ∉ {i = 1, ..., n}. We let y_j(u) denote the measurement of the longitudinal outcome at the future time point u > t. For the selection of the optimal u, we would like to maximize the information we gain by measuring y_j(u) at this time point, provided that the patient was still event-free up to u. That is, if the event occurs before taking the measurement for subject j, then there is no gain in information. To achieve this, we suitably adapt and extend concepts from Bayesian optimal designs (Verdinelli and Kadane, 1992; Clyde and Chaloner, 1996) to our setting. More specifically, we propose to select the optimal time u by maximizing the utility function (conditioning on the baseline covariates w_j is assumed in the following expressions but is dropped for notational simplicity):

$$U(u \mid t) = E\biggl[\lambda_1 \log \frac{p(T_j^* \mid T_j^* > u, \{\mathcal{Y}_j(t), y_j(u)\}, \mathcal{D}_n)}{p\{T_j^* \mid T_j^* > u, \mathcal{Y}_j(t), \mathcal{D}_n\}} + \lambda_2 I(T_j^* > u)\biggr], \qquad (3.1)$$

where Y_j(t) = {y_j(t_{jl}); 0 ≤ t_{jl} ≤ t, l = 1, ..., n_j} and the expectation is taken with respect to the joint predictive distribution [T_j^*, y_j(u) | T_j^* > t, Y_j(t), D_n]. The first term in (3.1) is the expected Kullback–Leibler divergence between the posterior predictive conditional distributions with and without this extra measurement, namely

$$\mathrm{EKL}(u \mid t) = E_{y}\biggl[E_{T^* \mid y}\biggl\{\log \frac{p(T_j^* \mid T_j^* > u, \{\mathcal{Y}_j(t), y_j(u)\}, \mathcal{D}_n)}{p\{T_j^* \mid T_j^* > u, \mathcal{Y}_j(t), \mathcal{D}_n\}}\biggr\}\biggr]$$
$$= \int \biggl[\int \log \frac{p(T_j^* \mid T_j^* > u, \{\mathcal{Y}_j(t), y_j(u)\}, \mathcal{D}_n)}{p\{T_j^* \mid T_j^* > u, \mathcal{Y}_j(t), \mathcal{D}_n\}}\; p(T_j^* \mid T_j^* > t, \{\mathcal{Y}_j(t), y_j(u)\}, \mathcal{D}_n)\, dT_j^*\biggr] \times p\{y_j(u) \mid T_j^* > t, \mathcal{Y}_j(t), \mathcal{D}_n\}\, dy_j(u). \qquad (3.2)$$

The larger this divergence is, the more information we expect to gain by measuring the longitudinal outcome at u. However, when the true event actually occurs before u, i.e. T_j^* < u, the density p(T_j^* | T_j^* > t, {Y_j(t), y_j(u)}, D_n) becomes p(T_j^* | T_j^* > t, Y_j(t), D_n), and hence EKL(u | t) = 0. On the other hand, when T_j^* > u, then, heuristically, the further u is set from the last time point at which a longitudinal measurement was taken, the more information we would expect to gain regarding the shape of the longitudinal profile of subject j. The value of u that maximizes this information gain will depend on the characteristics of the subject's profile.

Another aspect we need to account for is the "cost" of waiting. In particular, as we just saw, EKL(u | t) penalizes against selecting u > T_j^*; however, the fact that we still need to wait up to time u when u < T_j^* also means that we inevitably increase the risk that the patient will experience the event. Because we would not like to wait up to a point at which it would be too late for the physician to intervene, we need to further penalize for waiting up to u. This explains the inclusion of the second term in (3.1). The expectation of this second term gives the conditional survival probability π_j(u | t) = Pr{T_j^* > u | T_j^* > t, Y_j(t), D_n}, which has been used in the context of dynamic individualized predictions (Yu and others, 2008; Rizopoulos, 2011, 2012; Taylor and others, 2013); that is, U(u | t) = λ_1 EKL(u | t) + λ_2 π_j(u | t). The purpose of the non-negative constants λ_1 and λ_2 is to weigh the contribution of the conditional survival probability against the information gain, and to take into account that these two terms have different units. In practical applications, elicitation of these constants for trading information units with probabilities can be difficult. To overcome this, we can suitably utilize the equivalence between compound


and constrained optimal designs (Cook and Wong, 1994; Clyde and Chaloner, 1996). More specifically, it can be shown that for any λ_1 and λ_2, there exists a constant κ ∈ [0, 1] for which maximization of (3.1) is equivalent to maximization of EKL(u | t) subject to the constraint that π_j(u | t) ≥ κ. Under this equivalent formulation, elicitation of κ is relatively easier; for example, a logical choice is to set κ equal to the constant c(t), introduced in Figure 1, for deciding whether the physician is supposed to intervene or not. In turn, a choice for c(t) can be based on medical grounds (e.g. physicians are not willing to allow survival probabilities to drop below a specific threshold value) or on time-dependent receiver operating characteristic analysis. Then, the optimal value of u can be found by maximizing EKL(u | t) in the interval (t, t_up], where t_up = {u : π_j(u | t) = κ}. In practice, it may also be reasonable to assume that there is an upper limit t_max on the time interval the physician is willing to wait before obtaining the next measurement, in which case t_up = min[{u : π_j(u | t) = κ}, t_max].

3.2 Estimation

Estimation of (3.1) proceeds by suitably utilizing the conditional independence assumptions presented in Section 1 of supplementary material available at Biostatistics online. More specifically, the density of the joint distribution [T_j^*, y_j(u) | T_j^* > t, Y_j(t), D_n] can be written as the product of the predictive distributions

$$p(T_j^* \mid T_j^* > t, \{\mathcal{Y}_j(t), y_j(u)\}, \mathcal{D}_n) = \int p(T_j^* \mid T_j^* > t, \{\mathcal{Y}_j(t), y_j(u)\}, \theta)\, p(\theta \mid \mathcal{D}_n)\, d\theta,$$

and

$$p\{y_j(u) \mid T_j^* > t, \mathcal{Y}_j(t), \mathcal{D}_n\} = \int p\{y_j(u) \mid T_j^* > t, \mathcal{Y}_j(t), \theta\}\, p(\theta \mid \mathcal{D}_n)\, d\theta.$$

Analogously, the first term on the right-hand side of each of the above expressions can be written as an expectation over the corresponding posterior distribution of the random effects of subject j,

$$p\{y_j(u) \mid T_j^* > t, \mathcal{Y}_j(t), \theta\} = \int p\{y_j(u) \mid b_j, \theta\}\, p\{b_j \mid T_j^* > t, \mathcal{Y}_j(t), \theta\}\, db_j,$$

and

$$p(T_j^* \mid T_j^* > t, \{\mathcal{Y}_j(t), y_j(u)\}, \theta) = \int p(T_j^* \mid T_j^* > t, b_j, \theta)\, p\{b_j \mid T_j^* > t, \{\mathcal{Y}_j(t), y_j(u)\}, \theta\}\, db_j,$$

respectively. In a similar manner, the predictive distribution [T_j^* | T_j^* > u, {Y_j(t), y_j(u)}, D_n] is written as

$$p(T_j^* \mid T_j^* > u, \{\mathcal{Y}_j(t), y_j(u)\}, \mathcal{D}_n) = \int\!\!\int p(T_j^* \mid T_j^* > u, b_j, \theta)\, p(b_j \mid T_j^* > u, \{\mathcal{Y}_j(t), y_j(u)\}, \theta)\, p(\theta \mid \mathcal{D}_n)\, db_j\, d\theta,$$

with

$$p(T_j^* \mid T_j^* > u, b_j, \theta) = \frac{h_j\bigl(T_j^* \mid \mathcal{H}_j(T_j^*, b_j, \theta)\bigr) \exp\bigl\{-\int_0^{T_j^*} h_j(s \mid \mathcal{H}_j(s, b_j, \theta))\, ds\bigr\}}{\exp\bigl\{-\int_0^{u} h_j(s \mid \mathcal{H}_j(s, b_j, \theta))\, ds\bigr\}}.$$


The hazard function is given by (2.2), in which we have explicitly noted that the history of the longitudinal process H_j(t, b_j, θ) is a function of both the random effects and the parameters, because it is estimated from the mixed model (2.1). By combining the above equations, we can construct a Monte Carlo simulation scheme to obtain an estimate of (3.1) for a specific value of u. This scheme contains the following steps, which should be repeated q = 1, ..., Q times:

Step 1: Simulate $\tilde\theta$, $\ddot\theta$, and $\{\breve\theta^{(\ell)}, \ell = 1, \ldots, L\}$ independently from the posterior distribution of the parameters [θ | D_n].

Step 2: Simulate $\tilde b_j$ from the posterior of the random effects $[b_j \mid T_j^* > t, \mathcal{Y}_j(t), \tilde\theta]$.

Step 3: Simulate $\tilde y_j(u)$ from $[y_j(u) \mid \tilde b_j, \tilde\theta]$.

Step 4: Simulate $\ddot b_j$ and $\{\breve b_j^{(\ell)}, \ell = 1, \ldots, L\}$ independently from the posteriors of the random effects $[b_j \mid T_j^* > t, \{\mathcal{Y}_j(t), \tilde y_j(u)\}, \ddot\theta]$ and $[b_j \mid T_j^* > u, \{\mathcal{Y}_j(t), \tilde y_j(u)\}, \breve\theta^{(\ell)}]$, respectively.

Step 5: Simulate $\ddot T_j^*$ from $[T_j^* \mid T_j^* > t, \ddot b_j, \ddot\theta]$.

Step 6: If $\ddot T_j^* > u$, compute $\mathrm{EKL}^{(q)}(u \mid t) = \log\bigl(L^{-1}\sum_{\ell=1}^{L} A_n^{(\ell)}/A_d^{(\ell)}\bigr)$, where

$$A_n^{(\ell)} = h_j\bigl(\ddot T_j^* \mid \mathcal{H}_j(\ddot T_j^*, \breve b_j^{(\ell)}, \breve\theta^{(\ell)})\bigr) \exp\Bigl\{-\int_0^{\ddot T_j^*} h_j\bigl(s \mid \mathcal{H}_j(s, \breve b_j^{(\ell)}, \breve\theta^{(\ell)})\bigr)\, ds\Bigr\}$$

and

$$A_d^{(\ell)} = \exp\Bigl\{-\int_0^{u} h_j\bigl(s \mid \mathcal{H}_j(s, \breve b_j^{(\ell)}, \breve\theta^{(\ell)})\bigr)\, ds\Bigr\};$$

otherwise set EKL^(q)(u | t) = 0.
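A small numerical R sketch of Step 6 for a single outer iteration q is given below; haz is assumed to be a list of L vectorized hazard functions s ↦ h_j(s | H_j(s, b̆^(ℓ), θ̆^(ℓ))), one per inner draw, and all names are illustrative rather than package functions.

```r
## Step 6 sketch: approximate EKL^(q)(u | t) = log( mean_l A_n^(l) / A_d^(l) )
## by numerical integration of each drawn hazard function.
ekl_q <- function(haz, T_ddot, u) {
  if (T_ddot <= u) return(0)                       # event before u: no gain
  ratios <- vapply(haz, function(h) {
    An <- h(T_ddot) * exp(-integrate(h, 0, T_ddot)$value)
    Ad <- exp(-integrate(h, 0, u)$value)
    An / Ad
  }, numeric(1))
  log(mean(ratios))
}
```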

The sample mean over the Monte Carlo samples provides the estimate of EKL(u | t), i.e. $\widehat{\mathrm{EKL}}(u \mid t) = Q^{-1}\sum_{q=1}^{Q} \mathrm{EKL}^{(q)}(u \mid t)$. The posterior distributions of the random effects in Steps 2 and 4 are not of a known form, and therefore the realizations $\tilde b_j$, $\ddot b_j$, and $\{\breve b_j^{(\ell)}, \ell = 1, \ldots, L\}$ are obtained using a Metropolis–Hastings algorithm, with proposal distribution a multivariate Student's-t distribution with mean $\hat b = \arg\max_b \{\log p(b \mid T_j^* > t, \mathcal{Y}_j(t), \hat\theta)\}$, variance–covariance matrix $V = \{-\partial^2 \log p(b \mid T_j^* > t, \mathcal{Y}_j(t), \hat\theta)/\partial b\, \partial b^\top|_{b = \hat b}\}^{-1}$, and four degrees of freedom, where $\hat\theta$ denotes the posterior means based on [θ | D_n]. Step 3 just entails simulating a realization from the mixed-effects model. In Step 5, again, the distribution $[T_j^* \mid T_j^* > t, \ddot b_j, \ddot\theta]$ is not of standard form, and hence we simulate the realization $\ddot T_j^*$ using the inversion method, i.e. first we simulate v ∼ U(0, 1), and then using a line-search method we find the $\ddot T_j^*$ for which $S_j(\ddot T_j^*, \ddot b_j, \ddot\theta)/S_j(t, \ddot b_j, \ddot\theta) = v$, with S_j(·) denoting the subject-specific survival function (Equation (7) in supplementary material available at Biostatistics online). Finally, in Step 6 it is sufficient to compute the numerator of (3.2) to obtain the optimal u. An estimate of π_j(u | t) can be obtained from a separate Monte Carlo scheme, i.e.

$$\hat\pi_j(u \mid t) = \frac{1}{G}\sum_{g=1}^{G} \frac{\exp\bigl\{-\int_0^{u} h_j\bigl(s \mid \mathcal{H}_j(s, \mathring b_j^{(g)}, \mathring\theta^{(g)})\bigr)\, ds\bigr\}}{\exp\bigl\{-\int_0^{t} h_j\bigl(s \mid \mathcal{H}_j(s, \mathring b_j^{(g)}, \mathring\theta^{(g)})\bigr)\, ds\bigr\}},$$

with $\{\mathring\theta^{(g)}; g = 1, \ldots, G\}$ a sample from [θ | D_n], and $\mathring b_j^{(g)}$ a sample from $[b_j \mid T_j^* > t, \mathcal{Y}_j(t), \mathring\theta^{(g)}]$. Based on the Monte Carlo sample of EKL(u | t), a 95% credible interval can be constructed using the corresponding Monte Carlo sample percentiles. Using this simulation scheme, we take as optimal u the value $\hat u = \arg\max_{u \in \{u_1, \ldots, u_{\max}\}} \{\mathrm{EKL}(u \mid t)\}$, subject to the constraint that $\hat\pi_j(u \mid t) \ge \kappa$, with $u_1, \ldots, u_{\max}$ denoting a grid of values in (t, t_up].
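The two remaining computational pieces, Step 5's inversion method and the final constrained grid search, can be sketched in R as follows. Here surv(), EKL_hat(), and pi_hat() are hypothetical stand-ins for the subject-specific survival function and the two Monte Carlo estimates above; they are not functions of an existing package.

```r
## Step 5: inversion method. Solve S_j(T)/S_j(t) = v, v ~ U(0,1), for T;
## uniroot() performs the one-dimensional line search.
simulate_event_time <- function(surv, t, upper = 50) {
  v <- runif(1)
  uniroot(function(T) surv(T) / surv(t) - v,
          interval = c(t, upper), extendInt = "downX")$root
}

## Final step: maximize the estimated EKL(u | t) over a grid in (t, t_up],
## subject to the constraint that the estimated pi_j(u | t) >= kappa.
choose_next_time <- function(t, t_up, kappa, EKL_hat, pi_hat, n_grid = 5) {
  u_grid <- seq(t, t_up, length.out = n_grid + 1)[-1]   # equidistant grid
  ekl <- vapply(u_grid, function(u) EKL_hat(u, t), numeric(1))
  ok  <- vapply(u_grid, function(u) pi_hat(u, t) >= kappa, logical(1))
  ekl[!ok] <- -Inf                                      # drop infeasible points
  u_grid[which.max(ekl)]
}
```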


4. ANALYSIS OF THE AORTIC VALVE DATASET

We return to the Aortic Valve dataset introduced in Section 1. Our aim is to use the existing data to build joint models that can be utilized for planning the visiting patterns of future patients from the same population. In our study, two surgical implantation techniques were employed, with 77 (27%) patients receiving a sub-coronary implantation and the remaining 208 patients a root replacement. These patients were followed prospectively over time with annual telephone interviews and biennial standardized echocardiographic assessment of valve function until July 8, 2010. Echo examinations were scheduled at 6 months and 1 year postoperatively, and biennially thereafter, and at each examination echocardiographic measurements were taken. By the end of follow-up, 1262 assessments had been recorded, with an average of 4.3 measurements per patient (SD 2.4 measurements); 59 (20.7%) patients had died, and 73 (25.6%) patients required a re-operation on the allograft. Here we are interested in the composite event re-operation or death, which was observed for 125 (43.9%) patients.

For our analysis, we will focus on aortic gradient, which is the primary biomarker of interest for these patients. We start by defining a set of joint models for the square root transform of aortic gradient. A preliminary analysis suggested that many patients exhibit nonlinear profiles for aortic gradient. To account for this feature, we postulate the linear mixed model

$$y_i(t) = (\beta_0 + b_{i0}) + (\beta_1 + b_{i1})B_n(t, 1) + (\beta_2 + b_{i2})B_n(t, 2) + \beta_3\,\mathrm{Age}_i + \beta_4\,\mathrm{Female}_i + \varepsilon_i(t),$$

where B_n(t, {1, 2}) denotes the B-spline basis for a natural cubic spline with boundary knots at baseline and 19 years and one internal knot placed at 3.7 years (i.e. the median of the observed follow-up times), and Female denotes the dummy variable for females. For the survival process, we consider five relative risk models, each with a different association structure between the two outcomes:

$$M_1:\; h_i(t) = h_0(t)\exp\{\gamma_1\,\mathrm{Age}_i + \gamma_2\,\mathrm{Female}_i + \alpha_1 \eta_i(t)\},$$
$$M_2:\; h_i(t) = h_0(t)\exp\{\gamma_1\,\mathrm{Age}_i + \gamma_2\,\mathrm{Female}_i + \alpha_2 \eta_i'(t)\},$$
$$M_3:\; h_i(t) = h_0(t)\exp\{\gamma_1\,\mathrm{Age}_i + \gamma_2\,\mathrm{Female}_i + \alpha_1 \eta_i(t) + \alpha_2 \eta_i'(t)\},$$
$$M_4:\; h_i(t) = h_0(t)\exp\Bigl\{\gamma_1\,\mathrm{Age}_i + \gamma_2\,\mathrm{Female}_i + \alpha_1 \int_0^t \eta_i(s)\, ds\Bigr\},$$
$$M_5:\; h_i(t) = h_0(t)\exp(\gamma_1\,\mathrm{Age}_i + \gamma_2\,\mathrm{Female}_i + \alpha^\top b_i),$$

where $\eta_i'(t) = d\eta_i(t)/dt$ and the baseline hazard is approximated using penalized B-splines.
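As an illustration of how such a joint model can be fitted in practice, the sketch below follows the general workflow of the JMbayes R package named in Section 6. The dataset and variable names (aovalve, aovalve.id, AoGradient, Time, event, id) are hypothetical assumptions; dns(), the derivative of the natural-spline basis used for the slope term of M2, is provided by JMbayes.

```r
## Hedged sketch of fitting model M2 with JMbayes; data objects are made up.
library(JMbayes)    # attaches nlme and survival
library(splines)
lmeFit <- lme(sqrt(AoGradient) ~ ns(time, knots = 3.7, Boundary.knots = c(0, 19)) +
                Age + Female,
              random = ~ ns(time, knots = 3.7, Boundary.knots = c(0, 19)) | id,
              data = aovalve)                       # longitudinal submodel
coxFit <- coxph(Surv(Time, event) ~ Age + Female,
                data = aovalve.id, x = TRUE)        # survival submodel
dForm <- list(fixed  = ~ 0 + dns(time, knots = 3.7, Boundary.knots = c(0, 19)),
              random = ~ 0 + dns(time, knots = 3.7, Boundary.knots = c(0, 19)),
              indFixed = 2:3, indRandom = 2:3)      # slope term eta_i'(t)
jointFitM2 <- jointModelBayes(lmeFit, coxFit, timeVar = "time",
                              param = "td-extra", extraForm = dForm)
```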

For all parameters, we took standard prior distributions. In particular, for the vector of fixed effects β, the regression coefficients γ, the spline coefficients γ_{h_0}, and the association parameters α, we used independent univariate diffuse normal priors. For the variance of the error terms σ², we took an inverse-Gamma prior, while for D we assumed an inverse Wishart prior. Tables 1 and 2 in Section 2 of supplementary material available at Biostatistics online present the posterior means and 95% credible intervals of the parameters of the joint models fitted to the Aortic Valve data. With respect to the parameters of the longitudinal component, we do not observe great differences between the fitted joint models. For the relative risk submodels, we see that when the terms η_i(t) and η_i'(t) are included in the model separately, both of them are related to the risk of re-operation/death; however, when both of them are included in the model, the slope term η_i'(t) no longer seems to be associated with the hazard of the event. We continue by investigating the performance of the fitted joint models with respect to predicting future events after different follow-up times. Table 1 shows the DIC and the value of the estimated cvDCL(t) × n_t for t = 5, 7, 9, 11, and 13 years. We observe that models M2 and M5 have


Table 1. DIC and the estimate of cvDCL(t) evaluated at five time points for the fitted joint models

                           M1         M2         M3         M4         M5
DIC                     7237.26    7186.18    7195.57    7268.45    7186.34
cvDCL(t = 5) × n_t      −387.67    −348.96    −380.82    −441.69    −356.31
cvDCL(t = 7) × n_t      −342.75    −309.69    −336.95    −391.78    −315.26
cvDCL(t = 9) × n_t      −289.11    −260.44    −284.13    −332.06    −264.95
cvDCL(t = 11) × n_t     −233.56    −208.11    −229.27    −270.28    −212.06
cvDCL(t = 13) × n_t     −177.10    −156.88    −173.58    −206.40    −160.03

Table 2. The values of EKL(u | t) (i.e. the first term of (3.1)) and π_j(u | t) (i.e. the second term of (3.1)) for Patient 7, computed in a dynamic manner after each of his longitudinal measurements (i.e. time point t), and for u set in the interval (t, t_up]

   t    t_up − t      u     EKL(u | t)   π_j(u | t)
  3.5      5         4.5      −2.363       0.983
                     5.5      −2.354       0.964
                     6.5      −2.306       0.943
                     7.5      −2.279       0.919
                     8.5      −2.065       0.892    *
  4        5         5.0      −2.366       0.982
                     6.0      −2.422       0.963
                     7.0      −2.210       0.941
                     8.0      −2.214       0.915
                     9.0      −2.106       0.886    *
  5        5         6.0      −2.416       0.981
                     7.0      −2.470       0.959
                     8.0      −2.261       0.933
                     9.0      −2.300       0.904
                    10.0      −2.146       0.872    *
  5.7      5         6.7      −2.455       0.979
                     7.7      −2.410       0.956
                     8.7      −2.311       0.928
                     9.7      −2.352       0.897
                    10.7      −2.207       0.861    *
  6.7      5         7.7      −2.426       0.976
                     8.7      −2.389       0.949
                     9.7      −2.328       0.917
                    10.7      −2.304       0.882
                    11.7      −2.235       0.841    *
  7.7      5         8.7      −2.403       0.973
                     9.7      −2.394       0.941
                    10.7      −2.307       0.906
                    11.7      −2.272       0.865
                    12.7      −2.171       0.819    *
  9.1      4.3       9.9      −2.355       0.971
                    10.8      −2.335       0.939
                    11.7      −2.201       0.903
                    12.5      −2.130       0.863
                    13.4      −2.096       0.817    *
 11.3      3.1      11.9      −2.202       0.969
                    12.5      −2.044       0.935
                    13.1      −2.003       0.897
                    13.8      −1.908       0.856    *
                    14.4      −1.920       0.810
 13.5      1.9      13.9      −1.919       0.968
                    14.3      −1.855       0.933
                    14.7      −1.742       0.897
                    15.1      −1.655       0.858    *
                    15.4      −1.719       0.817
 15.1      1.4      15.4      −1.769       0.968
                    15.7      −1.731       0.934
                    15.9      −1.632       0.899
                    16.2      −1.560       0.862
                    16.5      −1.536       0.824    *

The last column denotes which time point (*) is selected to plan the next measurement.

almost identical DIC values, but according to the estimated cvDCL(t) model M2 provides the best predictions for future events.

To illustrate the use of the methodology developed in Section 3 for selecting the next screening time point, we focus on the continuous marker and two patients from the Aortic Valve dataset, namely Patient 81, a 39.8-year-old male, and Patient 7, a 46.8-year-old male; neither had an event (Patient 81 was still event-free 11.6 years after the operation, while Patient 7 was still event-free after 20.1 years). The longitudinal trajectories of these patients are shown in Figure 1 of supplementary material available at Biostatistics online. More specifically, Patient 7 exhibits a slightly increasing trajectory of aortic gradient, indicating that he remained in relatively good condition during follow-up, whereas Patient 81 showed an increasing profile of aortic gradient indicative of a deterioration of his condition. We should note that, in accordance with a realistic use of (3.1) for planning the next measurement, the joint models presented above have been fitted to the version of the Aortic Valve dataset that excluded these two patients.

Tables 2 and 3 show the values of EKL(u | t) (i.e. the first term of (3.1)) and of π_j(u | t) (i.e. the second term of (3.1)) for these two patients, computed in a dynamic manner, namely after each of their longitudinal measurements (i.e. time point t). The upper limit t_up of the interval (t, t_up] was set as the minimum of t_max = 5 years and the time point u for which π_j(u | t) was equal to κ = 0.8. Then the optimal u was taken as the point that maximizes EKL(u | t) in a grid of five equidistant values in the interval (t, t_up]. All calculations have been based on joint model M2. For Patient 7, who remained relatively stable during the first period of his follow-up, we observe that EKL(u | t) always suggests waiting the full 5 years up to measurement seven; however, after the seventh measurement, because his profile started to increase slightly and because of his relatively older age, the length of the interval (t, t_up] for finding the optimal u decreases, which in turn suggests waiting less time after each visit before requesting the next


Table 3. The values of EKL(u | t) (i.e. the first term of (3.1)) and π_j(u | t) (i.e. the second term of (3.1)) for Patient 81, computed in a dynamic manner after each of his longitudinal measurements (i.e. time point t), and for u set in the interval (t, t_up]

   t    t_up − t      u     EKL(u | t)   π_j(u | t)
  0.3      5         1.3      −2.488       0.989
                     2.3      −2.205       0.977
                     3.3      −2.227       0.963
                     4.3      −2.197       0.948
                     5.3      −2.153       0.933    *
  1.3      5         2.3      −2.176       0.987
                     3.3      −2.127       0.973
                     4.3      −1.963       0.957
                     5.3      −2.088       0.940
                     6.3      −1.868       0.922    *
  3.3      5         4.3      −2.143       0.978
                     5.3      −1.436       0.956    *
                     6.3      −2.015       0.932
                     7.3      −1.913       0.905
                     8.3      −1.634       0.875
  5.3      3.9       6.1      −2.143       0.968
                     6.9      −1.603       0.935    *
                     7.6      −1.879       0.900
                     8.4      −1.967       0.864
                     9.2      −1.750       0.827
  7.1      2.1       7.5      −1.888       0.969    *
                     7.9      −1.993       0.937
                     8.4      −1.963       0.906
                     8.8      −1.901       0.875
                     9.2      −2.104       0.844
 10.6      1.7      11.0      −0.243       0.970
                    11.3      −0.156       0.940    *
                    11.6      −1.397       0.910
                    12.0      −0.542       0.880
                    12.3      −2.184       0.849

The last column denotes which time point (*) is selected to plan the next measurement.

echocardiography. For Patient 81, on the other hand, we observe that for the first two visits, where he exhibited relatively low levels of aortic gradient, EKL(u | t) suggests waiting a longer period before requesting the next measurement. In accordance, t_up is set to 5 years because the probability that he would survive more than 5 years was higher than κ = 0.8. After the third measurement, however, when it starts to become clear that the condition of this patient is deteriorating, EKL(u | t) recommends waiting less time before requesting the next echocardiography. Accordingly, from the fourth measurement onwards the length of the interval (t, t_up] decreases, and at the last measurement it is set to 1.7 years, which also indicates the deterioration of the patient.

5. COST-EFFECTIVENESS SIMULATION STUDY

The analysis of the Aortic Valve dataset illustrated that the utility function U(u | t) effectively adapts screening intervals depending on the characteristics of the patients' longitudinal trajectories.


However, because in this dataset the data have already been collected under a fixed-planning design, we cannot directly assess the advantages of planning measurements using the framework of Section 3. To this end, we have performed a simulation study to evaluate the cost-effectiveness of U(u | t). In the context of screening, and assuming that the monetary cost of each examination remains (roughly) the same during follow-up, we would prefer a procedure that requires the fewest possible screenings. On the other hand, of course, we would also like to get as close as possible to the optimal time point at which a medical action is required. Under these considerations, the setting of our simulation study is motivated by the Aortic Valve dataset. In particular, we assume that patients undergo a valve transplantation and we follow their progression after this operation using the longitudinal outcome (aortic gradient). The aim is to effectively plan re-operations. We assume that the patient population has a level of homogeneity such that the decision of the cardiologists to request a re-operation can be based solely on the aortic gradient levels and how these are translated to survival probabilities (no need to correct for any baseline covariates or other longitudinal outcomes). In addition, we assume that at any particular follow-up time t, the cardiologist will request a re-operation if the conditional survival probability π_j(t + Δt | t) falls under a specific threshold κ. We used Δt = 5 and κ = 0.8. The goal is to select the screening strategy that follows each individual patient both efficiently (i.e. with the fewest measurements possible) and effectively (i.e. requesting re-operation as soon as π_j(t + Δt | t) = κ). For each iteration of the simulation study, we proceeded as follows:

I. We simulate data for 450 patients from a joint model that has the same structure as model M3 fitted to the Aortic Valve dataset (the parameter values we used can be found in Section 3 of supplementary material available at Biostatistics online). From those patients, we randomly chose 400 for the training dataset and 50 for the test dataset. Similarly to the Aortic Valve dataset, all patients have measurements at baseline, 6 months, 1 year, and biennially thereafter. For the patients in the test dataset, we only keep their first three measurements. The reason why we fix the screening times of the first three measurements is that it is not medically advisable to request a re-operation within a year of the first operation. Then the model is fitted to the training dataset.

II. For each patient in the test dataset, we first find her true optimal time point t_j^opt at which to plan the re-operation. This is the time point at which her true conditional survival function π_j^true(t + Δt | t), computed using the true values of the parameters and random effects, equals κ. To locate this point, we calculate π_j^true(u | t) for a fine grid of values {u, t} with u > t and t ≥ 1. Note that, because we use the true values of the random effects, and under the conditional independence assumptions (Section 1 of supplementary material available at Biostatistics online), no longitudinal measurements are required for the calculation of π_j^true(u | t).

III. From year 1 onwards, our aim is to compare two strategies, namely (S1) with fixed visits every 2 years (as in the Aortic Valve dataset), and (S2) with the personalized screening of Section 3 (a sketch of this screening loop is given after this list). For each patient and any time point t up to which we have longitudinal measurements, the sth strategy (s = 1, 2) proposes to take the next measurement at time t_j^s (e.g. S1 sets the first t_j^s at 3 years for all patients). Then, for each strategy, we simulate a longitudinal measurement using the true values of the parameters and random effects at the corresponding t_j^s. Using this measurement, we estimate π_j^s(t_j^s + Δt | t_j^s) based on the joint model fitted to the training dataset. For each patient j and strategy s, we keep repeating this procedure until π_j^s(t_j^s + Δt | t_j^s) ≤ κ, and when this condition is satisfied we set the optimal re-operation time point for the sth strategy as t_j^{opt,s} = t_j^s.
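The R sketch below outlines the screening loop of strategy S2 for one test patient; next_time() stands for the personalized choice of Section 3 (the constrained maximization of EKL), pi_hat() for the model-based conditional survival estimate, and add_measurement() for simulating and appending the new biomarker value. All of these names are hypothetical.

```r
## Screening loop of strategy S2 (personalized): keep planning measurements at
## the utility-maximizing times until the 5-year conditional survival estimate
## drops to the threshold kappa, then request the re-operation.
follow_patient_S2 <- function(t_start, next_time, pi_hat, add_measurement,
                              kappa = 0.8, Dt = 5) {
  t <- t_start
  repeat {
    if (pi_hat(t + Dt, t) <= kappa) return(t)  # t_j^{opt,2}: re-operation time
    t <- next_time(t)                          # personalized next visit time
    add_measurement(t)                         # simulate y_j at the new time
  }
}
```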

Fig. 2. Simulation results based on 500 datasets comparing the fixed-planning and personalized screening strategies. Left panel: absolute error between the optimal intervention time point and the time point suggested by the two screening strategies. Middle panel: number of screenings after year 1 required by the two strategies. Right panel: positive distances between the true event time of each patient and the intervention point suggested by each strategy; below the boxplots the percentage of positive differences is shown (i.e. the percentage of patients for whom the intervention time was earlier than the true event time).

The above scheme has been repeated 500 times, and the results are summarized in Figure 2. The left panel of Figure 2 shows boxplots of the absolute errors |t_j^opt − t_j^{opt,1}| and |t_j^opt − t_j^{opt,2}| for the fixed and personalized screening strategies, respectively, from all 500 × 50 = 25 000 test subjects. Analogously, the middle panel of Figure 2 shows boxplots of the number of screenings for the two strategies, again from all test subjects. We can clearly observe that the personalized strategy requires fewer screenings and reaches the optimal intervention time point with less absolute error. As an extra comparison between the two strategies, the right panel of Figure 2 shows boxplots of the errors (T_j^* − t_j^{opt,1}) and (T_j^* − t_j^{opt,2}) for the cases (percentages shown under the boxplots) for which T_j^* > t_j^{opt,s}, s = 1, 2, with T_j^* denoting the true event time. Again we observe that the personalized screening strategy performs somewhat better than the classic fixed-planning design.

6. DISCUSSION

In this paper, we have presented two measures based on information theory that can be used to dynamically select models in time and to optimally schedule longitudinal biomarker measurements. Contrary to standard screening procedures that require all patients to adhere to the same screening intervals, the combination of the framework of joint models for longitudinal and time-to-event data with these two measures allows one to tailor screening to the needs of individual patients and to dynamically adapt it during follow-up. The advantageous feature of our approach is that the optimal u is found by conditioning on both the baseline characteristics w_j and the accumulated longitudinal information Y_j(t) of subject j. This allows one to better tailor screening to the subject's individual characteristics compared with approaches that only use the baseline covariates and the last available longitudinal measurement (Markov assumption). We should note that, because the timing of each next visit depends on the collected longitudinal responses of each subject, the recorded data should be analyzed under an approach that accounts for this feature (e.g. a full likelihood-based method).


In this work, we have concentrated on a single endpoint. Nevertheless, using recent extensions of joint models (Andrinopoulou and others, 2014), the same ideas can be relatively easily extended to multiple endpoints (competing risks) or to intermediate states combined with one or several endpoints (multistate settings). In such settings, the first term of (3.1) will remain the same, but the second term π_j(u | t) will need to be replaced by the corresponding event-specific cumulative incidence functions.

Finally, to facilitate the computation of the estimates of cvDCL(t) and EKL(u | t), we have implemented them in functions cvDCL() and dynInfo(), respectively, available in package JMbayes (version 0.7-3) for the R programming language (freely available at http://cran.r-project.org/package=JMbayes). An example of the use of these functions can be found in supplementary material available at Biostatistics online.

SUPPLEMENTARY MATERIAL

Supplementary material is available at http://biostatistics.oxfordjournals.org.

ACKNOWLEDGMENTS

The authors would like to thank Professor Daniel Commenges for useful discussions with regard to the development of cvDCL(t). Conflict of Interest: None declared.

FUNDING

This research was partially supported by the Netherlands Organisation for Scientific Research VIDI grant 016.146.301 (D. Rizopoulos) and US National Institutes of Health grant CA129102 (J. M. G. Taylor).

REFERENCES

ANDRINOPOULOU, E. R., RIZOPOULOS, D., TAKKENBERG, J. AND LESAFFRE, E. (2014). Joint modeling of two longitudinal outcomes and competing risk data. Statistics in Medicine 33, 3167–3178.

BEKKERS, J., KLIEVERIK, L., RAAP, G., TAKKENBERG, J. AND BOGERS, A. (2011). Re-operations for aortic allograft root failure: experience from a 21-year single-center prospective follow-up study. European Journal of Cardio-Thoracic Surgery 40, 35–42.

CHEN, M.-H., HUANG, L., IBRAHIM, J. AND KIM, S. (2008). Bayesian variable selection and computation for generalized linear models with conjugate priors. Bayesian Analysis 3, 585–614.

CLYDE, M. AND CHALONER, K. (1996). The equivalence of constrained and weighted designs in multiple objective design problems. Journal of the American Statistical Association 91, 1236–1244.

COMMENGES, D., LIQUET, B. AND PROUST-LIMA, C. (2012). Choice of prognostic estimators in joint models by estimating differences of expected conditional Kullback–Leibler risks. Biometrics 68, 380–387.

COOK, D. AND WONG, W. K. (1994). On the equivalence of constrained and compound optimal designs. Journal of the American Statistical Association 89, 687–692.

COVER, T. AND THOMAS, J. (1991). Elements of Information Theory. New York: Wiley.

EILERS, P. AND MARX, B. (1996). Flexible smoothing with B-splines and penalties. Statistical Science 11, 89–121.

LEE, S. AND ZELEN, M. (1998). Scheduling periodic examinations for the early detection of disease: applications to breast cancer. Journal of the American Statistical Association 93, 1271–1281.

PARMIGIANI, G. (1993). On optimal screening ages. Journal of the American Statistical Association 88, 622–628.

PARMIGIANI, G. (1998). Designing observation times for interval censored data. Sankhya: The Indian Journal of Statistics 60, 446–458.

PARMIGIANI, G. (1999). Decision models in screening for breast cancer. In: Bernardo, J. M., Berger, J. O., Dawid, A. P. and Smith, A. F. M. (editors), Bayesian Statistics 6. Oxford: Oxford University Press, pp. 525–546.

PARMIGIANI, G. (2002). Modeling in Medical Decision Making: A Bayesian Approach. New York: Wiley.

RIZOPOULOS, D. (2011). Dynamic predictions and prospective accuracy in joint models for longitudinal and time-to-event data. Biometrics 67, 819–829.

RIZOPOULOS, D. (2012). Joint Models for Longitudinal and Time-to-Event Data, with Applications in R. Boca Raton: Chapman & Hall/CRC.

RIZOPOULOS, D. AND GHOSH, P. (2011). A Bayesian semiparametric multivariate joint model for multiple longitudinal outcomes and a time-to-event. Statistics in Medicine 30, 1366–1380.

RIZOPOULOS, D., HATFIELD, L., CARLIN, B. AND TAKKENBERG, J. (2014). Combining dynamic predictions from joint models for longitudinal and time-to-event data using Bayesian model averaging. Journal of the American Statistical Association 109, 1385–1397.

SCHAEFER, A., BAILEY, M., SHECHTER, S. AND ROBERTS, M. (2004). Modeling medical treatment using Markov decision processes. In: Brandeau, M., Sainfort, F. and Pierskalla, W. (editors), Operations Research and Health Care. Berlin: Springer, pp. 593–612.

TAYLOR, J. M. G., PARK, Y., ANKERST, D., PROUST-LIMA, C., WILLIAMS, S., KESTIN, L., BAE, K., PICKLES, T. AND SANDLER, H. (2013). Real-time individual predictions of prostate cancer recurrence using joint models. Biometrics 69, 206–213.

TSIATIS, A. A. AND DAVIDIAN, M. (2004). Joint modeling of longitudinal and time-to-event data: an overview. Statistica Sinica 14, 809–834.

VERDINELLI, I. AND KADANE, J. B. (1992). Bayesian designs for maximizing information and outcome. Journal of the American Statistical Association 87, 510–515.

YU, M., TAYLOR, J. M. G. AND SANDLER, H. (2008). Individualized prediction in prostate cancer studies using a joint longitudinal-survival-cure model. Journal of the American Statistical Association 103, 178–187.

[Received March 4, 2015; revised June 25, 2015; accepted for publication August 2, 2015]
