Research Article Received 19 March 2013,

Accepted 17 January 2014

Published online 13 February 2014 in Wiley Online Library

(wileyonlinelibrary.com) DOI: 10.1002/sim.6111

Meta-analysis of single-arm survival studies: a distribution-free approach for estimating summary survival curves with random effects Christophe Combescure,a * † Yohann Foucherb and Daniel Jacksonc In epidemiologic studies and clinical trials with time-dependent outcome (for instance death or disease progression), survival curves are used to describe the risk of the event over time. In meta-analyses of studies reporting a survival curve, the most informative finding is a summary survival curve. In this paper, we propose a method to obtain a distribution-free summary survival curve by expanding the product-limit estimator of survival for aggregated survival data. The extension of DerSimonian and Laird’s methodology for multiple outcomes is applied to account for the between-study heterogeneity. Statistics I 2 and H 2 are used to quantify the impact of the heterogeneity in the published survival curves. A statistical test for between-strata comparison is proposed, with the aim to explore study-level factors potentially associated with survival. The performance of the proposed approach is evaluated in a simulation study. Our approach is also applied to synthesize the survival of untreated patients with hepatocellular carcinoma from aggregate data of 27 studies and synthesize the graft survival of kidney transplant recipients from individual data from six hospitals. Copyright © 2014 John Wiley & Sons, Ltd. Keywords:

meta-analysis; survival curves; random effects

1. Introduction Developments in the statistical techniques for meta-analysis of survival studies have focused mainly in the combination of studies comparing two arms with the aim to assess a pooled measure of the intervention’s effect [1–8]. However, a measure of the intervention’s effect, such as the hazard ratio that is commonly pooled in meta-analyses, may not tell the whole story [9], and pooled survival probabilities in each arm may be useful complementary information to the intervention’s effect. Moreover, many studies in epidemiology aim to assess the risk of a time-dependent outcome in a specific population. Therefore, methods to obtain a summary survival curve in a single population meet a real need in clinical research. Few methods have been proposed to obtain a summary survival curve from published survival curves. The simplest approach is to pool the survival probabilities reported in the studies at the same time point. The survival probabilities, treated as proportions, can be combined at each time point using either fixed effect univariate methods or, probably better, the DerSimonian and Laird method (random effects) for a single outcome [1, 10]. Statistics, for instance I 2 , can be derived to measure the impact of heterogeneity between studies [11]. Repeating this estimation at various time points allows the assessment of a summary survival curve, but studies ending before a given time point are omitted in the combination

a CRC

Copyright © 2014 John Wiley & Sons, Ltd.

Statist. Med. 2014, 33 2521–2537

2521

& Division of Clinical Epidemiology, Department of Health and Community Medicine, University of Geneva & University Hospitals of Geneva, Geneva, Switzerland b EA 4275: Biostatistics, Clinical Research and Subjective Measures in Health Sciences, Labex Transplantex ITUN & Inserm U1064, Nantes University, Nantes, France c MRC Biostatistics Unit, Institute of Public Health, Cambridge, U.K. *Correspondence to: Christophe Combescure, Division of Clinical Epidemiology, University Hospital of Geneva, Rue Gabrielle Perret-Gentil 4 1211, Geneva, Switzerland. † E-mail: [email protected]

C. COMBESCURE, Y. FOUCHER AND D. JACKSON

of survival rates at this time. Moreover, the correlations between the survival probabilities at multiple times from a study are not used because data at each time point are analyzed independently. This may have undesirable consequences, such as an increasing pooled survival function. To account for the correlations between the survival probabilities, Dear et al. [12] proposed a linear regression model for survival data aggregated at multiple times, in which the time is the main predictor and the estimated survival probabilities are the responses. However the assumptions made are strong, and, in particular, the linearity assumption may not be supported by the data. As an improvement on this methodology, Arends et al. [13] proposed mixed-effect regression models. However, the resulting summary survival curve is not necessarily non-increasing [13]. Additionally, it is difficult to check the validity of such parametric models. Alternatively, Fiocco et al. [14, 15] proposed Poisson-correlated gamma-frailty models with more flexible assumptions on the shape of the survival (piecewise exponential distribution). In this method, the between-study heterogeneity is accounted by the introduction of frailty components. An autoregressive correlation structure is assumed between the frailty components [15]. The advantage of parametric approaches is to propose a joint analysis of survival proportions reported at multiple time points and in different studies. However, parametric approaches make assumptions about the shape of survival curves or about the distribution of the random effects that may or may not be appropriate for the data. In this paper, we propose a method to obtain a distribution-free summary survival curve assuming random effects. Summary survival probabilities are derived using a product-limit estimator: the conditional survival probabilities extracted from published survival curves are combined for various intervals in time, and the summary survival probabilities are the product of these pooled estimates. No assumption on the shape of the survival curve is needed. To account for the between-study heterogeneity in the estimation of the pooled conditional survival probabilities, we use a recent extension of DerSimonian and Laird’s methodology for multiple outcomes [16]. The between-study variance can be estimated without making assumptions about the distribution of the random effects or on their correlations. Statistics I 2 and H 2 for multivariate meta-analysis are used to quantify the impact of the heterogeneity in the published survival curves [17]. A statistical test for between-strata comparison is proposed, with the aim to explore studylevel factors potentially associated with survival. Our approach allows also the estimation of the mean and median survival times. The performance of the proposed estimator, tentatively named MetaSurv for convenience, is investigated using a simulation study. Data from a recently published meta-analysis [18] were also re-analyzed to illustrate the usefulness of this approach, and we tested the approach with individual data of kidney transplant recipients [19].

2. Method 2.1. Principle of the method The product-limit estimator is a popular technique for the assessment of survival probabilities in the analysis of individual data [20]. The survival probability at time tj , denoted by S.tj /, is obtained by multiplying the conditional survival probabilities up to time tj : S.tj / D p1 p2 : : :pj

(1)

where pi .i D 1; : : :; j / is the probability that a patient alive at ti 1 is still alive at ti . We use this type of estimator to perform the meta-analysis of aggregated survival data from several studies. The conditional survival probabilities estimated in each study are calculated from the survival estimates extracted from the published survival curves at a set of time points .t1 ; : : :; tJ /. The conditional survival probabilities are pooled by applying the extension of DerSimonian and Laird’s univariate methodology for multiple outcomes [16]. Finally, the summary survival probabilities are obtained by substituting the pooled estimates in Equation (1). By analogy with the contribution of a right-censored patient to the Kaplan–Meier estimator, a study ending at tj is accounted for the estimation of the pooled pi for i 6 j and thus in the estimation of the summary survival probabilities at times greater than tj . 2.2. Estimation of summary survival probabilities

2522

The estimated survival probabilities extracted from the studies of the meta-analysis are denoted by SOk .tj /, where k represents the study .k D 1; : : :; K/ and j ranges from 0 to Jk . The time origin is denoted by t0 ; and Sk .t0 / are assumed equal to 1. Jk depends on the length of follow-up in study k Copyright © 2014 John Wiley & Sons, Ltd.

Statist. Med. 2014, 33 2521–2537

C. COMBESCURE, Y. FOUCHER AND D. JACKSON

and can vary across studies. Let J D maxfJk g be the last time point observed in the studies of the meta-analysis. We assume that the numbers of at-risk patients during the interval Œtj 1 I tj , denoted by Nkj , are known in all studies and at all times tj . Despite recommendations [21], these numbers may be not reported, but they can be approximated from published survival curves [4, 8]. The conditional survival probabilities, denoted by pkj , are the probabilities that a patient alive at tj 1 in the study k is still alive at time tj . Their estimates are derived from the estimated survival probabilities: pOkj D SO .tj /=SO .tj 1 / for j > 1. The DerSimonian and Laird methodology assumes that the estimates to be pooled are normally distributed or that the number of studies is sufficient for the central limit theorem to imply that the weighted average is approximately normally distributed. For this reason, it is recommended to apply a transformation on the probability before pooling, for instance the arcsine transformation with continuity correction of 0.25 [22–24]: s ! Nkj pOkj C 0:25 O kj D arcsin Nkj C 0:50 Other corrections can be applied [24]. The estimate of kj is assumed approximately normally distributed with variance:  1   Var O kj D 0:25 Nkj C 0:5 (2) If the event is rare or the sample size is small, the normal approximation may not be very accurate despite the continuity correction. The vector of O kj in study k is denoted by O k . The estimated within-study covariance matrix of O k , denoted by VO k , is diagonal because the conditional survival probabilities are not correlated. The terms on the diagonal, the within-study variances, are given by Equation (2). The between-study covariance matrix is denoted by D. The pooled estimate of  k obtained by the DerSimonian and Laird’s method, denoted by O DL , is approximately normally distributed with mean and covariance matrix given by " K #1 " K # 1 1   X X O DL D (3) VOk C DO VOk C DO O k kD1

VO

kD1

DL

"

K  1 X D VOk C DO

#1 (4)

kD1

where DO is an estimate of D. In Equations (3) and (4), the size of the vectors  k is assumed equal, that is, the studies are assumed to have the same length of follow-up, but in practice, this size varies. To ease the computation, when studies have missing entries of  k , the variance terms in V k can be replaced by a very large value, 1012 for instance [16]. The purpose of this change is only to make Equations (3) and (4) computable when studies have different durations of follow-up: the missing terms are replaced by non-informative values, and Equations (3) and (4) can be used. In contrast to V k , the matrix D is not diagonal because the between-study covariance of conditional survival probabilities may be non-zero. For instance, if the survival is globally better in a study than in another study, then the conditional survival probabilities in this study are likely higher than in the other study. The matrix D can be estimated by equating the expectation of a matrix Q to its observed value. This matrix Q is a multidimensional extension of the Cochran Q statistic commonly used in meta-analysis of a single outcome. This matrix and the resulting method for estimation are described by Jackson et al. [16] and are not reproduced here. The estimates for a model with fixed effect, denoted by O FE , are obtained by solving Equations (3) and O  0. Finally, the vector of the pooled conditional survival probabilities is obtained from O DL (4) with D (either fixed or random effects) by the back-transformation:   pO DL D sin2 O DL and the pooled survival probabilities are obtained by their product (either fixed or random effects): (5)

sD1

Copyright © 2014 John Wiley & Sons, Ltd.

Statist. Med. 2014, 33 2521–2537

2523

j   Y pOsDL SO tj D

C. COMBESCURE, Y. FOUCHER AND D. JACKSON

As a direct consequence of Equation (5), the estimates of the summary survival probabilities are nonincreasing over time. The 95% confidence interval of the pooled survival probabilities can be obtained by a similar way to Greenwood’s formula [25], by application of the delta method [26]. In this approach, the logarithm of the estimated summary survival probability is assumed approximately normally distributed. However, when the pooled survival is close to one, this assumption may not be supported. Alternatively, a bootstrap procedure may be used to avoid this issue. Details are given in Appendix A. 2.3. Pooled mean and median survival times The pooled mean survival time can be calculated as the area under the summary survival curve, which is defined by the pooled survival probabilities. As the pooled survival probabilities are assessed at certain time points, we propose interpolating the summary survival curve between two successive times using a linear function. Thus, the mean survival time  is calculated as O D

J X

 h    i 0:5 tj  tj 1 SO DL tj C SO DL tj 1

j D1

Note that this formula assumes that the estimated probability of survival at the final time point is very small, ideally zero, so that the total area under the survival curve is well approximated. A 95% confidence interval can be obtain by bootstrapping the transformed pooled conditional survival DL probabilities using a multivariate normal distribution with mean O DL and covariance VO . We propose calculating the pooled median survival time, denoted by tmed , using a linear interpolation of the summary survival curve between two successive time points. Denoting tm the time for which SO DL .tm / < 0:5 and SO DL .tm1 / > 0:5, the pooled median survival time is calculated as tmed D tm  .tm  tm1 /

SO DL .tm /  0:5 SO DL .tm /  SO DL .tm1 /

A 95% confidence interval can also be obtained by a bootstrap procedure. 2.4. Heterogeneity The total heterogeneity test statistic qtot for the meta-analysis is qtot D

K X 

O k  O FE

T

 1  VO k O k  O FE

(6)

kD1

where O FE is the pooled estimate obtained from Equation (3) assuming fixed effects [17]. Under the null hypothesis that  k are equal for all k, asymptotically qtot follows a chi-square distribution with degrees of freedom, denoted by v, equal to 1 0 J J X X D .Kj  1/ D @ Kj A  J j D1

j D1

where Kj is the number of studies reporting the survival probabilities at time tj and J is the number of time points. Statistics I 2 and H 2 for multivariate can be used to quantify the impact of  meta-analysis  the heterogeneity [17]: H 2 D qtot = and I 2 D H 2  1 =H 2 . We can also test for a potential categorical heterogeneity factor. We assume we have a study-level heterogeneity factor with G categories and that each stratum of the factor contains at least two studies with a length of follow-up equal to tJ , the last time point, so that we can apply our method using random effects to all strata. The heterogeneity test statistic for the meta-analysis in the g th strata of studies, denoted by qg , is

2524

qg D

X  T 1   O k  O FE VO k O k  O FE g g k2Kg

Copyright © 2014 John Wiley & Sons, Ltd.

Statist. Med. 2014, 33 2521–2537

C. COMBESCURE, Y. FOUCHER AND D. JACKSON

where Kg is the set of studies in the g th strata and O FE g are the pooled estimates estimated from the th studies of the g strata and assuming fixed effects. The total heterogeneity test statistic for the stratified meta-analysis is the sum over g of qg . As in meta-analyses with a single outcome [27], the between-strata heterogeneity test statistic, denoted by qB , can be calculated using qtot and qg : qB D qtot 

G X

qg

(7)

gD1

Under the null hypothesis that all O FE g are equal for all g, the statistic qB follows asymptotically a chisquare distribution. As the length of the vectors O FE g is J for any strata g, the number of degrees of freedom of qB is  B D J.G  1/. Details of the proof of Equation (7) and the number of degrees of freedom of qB are given in Appendix B. An R package, entitled MetaSurv, to perform these analyses is available from the authors and on www.divat.fr.

3. Simulation study

Copyright © 2014 John Wiley & Sons, Ltd.

Statist. Med. 2014, 33 2521–2537

2525

A simulation study was performed to analyze the accuracy of the proposed approach under various scenarios with 10 000 meta-analyses of 10 studies per scenario. We explored 55 scenarios with various sample hsizes (from i 50 to 300 per study by steps of 25) and five Weibull survival functions S .t / D exp  .t =/ with parameters .; / equal to (1.5, 5.0), (1.5, 9.0), (1.5, 20.0), (0.5, 25.0), and (2.5, 25.0). The corresponding survival curves are shown in Supporting information Figure 1. These survival functions illustrate situations with various survival probabilities at 10 time points. The survival function with .; / D .2:5; 25:0/ corresponds to a situation where events are rare. For each simulation in each scenario, the individual times-to-event for all studies was randomly generated from the same Weibull distribution. The simulation scheme therefore corresponds to a fixed effect situation. The length of follow-up was randomly generated to simulate studies with different lengths of follow-up. Among the 10 studies of each meta-analysis, the follow-up of two randomly selected studies was set to 10 time units, and the length of follow-up of the right other studies was randomly selected between five to 10 time units from a uniform distribution. The Kaplan–Meier survival estimates and the numbers of at-risk patients were then extracted for each time point to the end of follow-up. Additional scenarios were explored with censored data. The time of censoring was generated from exponential distributions, independently of the time-to-event, with parameters adjusted so that the censoring mechanism resulted in approximate censoring rates of 25% (moderate censoring) and 50% (strong censoring) over 10 units of time. The censoring distribution used was the same in all studies of a given meta-analysis. The arcsine transformed pooled conditional survival probabilities at times 1 to 10 (in steps of 1) were assessed by applying the proposed MetaSurv approach, with random effects to simulate a reallife situation where the data structure is unknown. The pooled survival probability at all 10 time points was calculated with the 95% confidence interval obtained by the extension of Greenwoods’ formula, as described in Section 2.2 and Appendix A. For each scenario, the median absolute bias (estimated minus true survival probability at the last time point), the coverage percentage of the 95% confidence interval and the mean squared error for the survival probability at the last time point, the global heterogeneity statistics (median qtot statistic, percentage of simulations with a significant heterogeneity test, and median I 2 / as described in Section 2.2 were assessed. Data were simulated with R version 2.15.1, and the package MetaSurv was used for these analyses. The results of the simulation study are shown in Table I. The median absolute bias was close to 0 for sample size of 50 except in the situation of rare events. In this situation, a negative bias appeared especially for small sample sizes (less than 100 per study): the underestimation of 10 units of time survival, in median, reached 0.027 for a sample size of 50 when data were not censored and 0.044 when 50% of patients were censored. This bias may be caused by the arcsine transformation applied to the conditional survival probabilities. By applying a correction of 0.25 (Section 2.2), the corrected conditional survival probabilities are lower than 1 even if no events are observed. In the case of rare events (or small sample size), numerous observed conditional survival probabilities may be equal to 1, and the bias due to the correction may have an important impact on the pooled estimates. A direct consequence of this bias is the slightly low percentage of coverage in this situation (around 93%). The coverage was similar for the various censoring mechanisms. In the other situations, the coverage percentage was around 98%

1.4 1.2 1.0 0.8 0.6 0.4

Observed Pooled estimates (fixed effect) Pooled estimates (random effects)

0.2

Arc−sine transformed conditional surv. prob.

C. COMBESCURE, Y. FOUCHER AND D. JACKSON

0

10

20

30

40

Follow−up (months)

Figure 1. Pooled estimates of the arcsine transformed conditional survival probabilities for the untreated patients with hepatocellular carcinoma with fixed and random effects in the 27 studies of the meta-analysis using the MetaSurv method.

for any sample size and censoring, higher than the nominal value 95%. Despite the observed bias that increases with the rate of censoring, the coverage percentages are not affected greatly by the extent of the censoring. This may be explained by the width of 95% confidence intervals, which increase with the rate of censoring: as shown in Table I, the mean squared errors increased with the rate of censoring. O  0/, the coverage percentage was around When the MetaSurv method was applied with fixed effect .D the nominal value. The coverage percentages were found to be higher with random effects because heterogeneity was found in some of the simulated meta-analysis, and 95% confidence intervals for these meta-analyses were broader and contained the true value of the survival probability more often. The qtot statistic for heterogeneity was around 63.0 in most of the simulated scenarios but decreased in the case of rare events. The null hypothesis of the equality of k across studies was rejected in less than 10% of the simulations in all scenarios (with a nominal significance level of 0.05), with the exact coverage depending on the scenario. The median I 2 ranged between 0.0% and 4.3%. The heterogeneity statistics were similar for the various censoring mechanisms but depended more on the shape of the true survival curve. We also examined the accuracy of estimates of the restricted mean survival time (RMST). The summary RMST was estimated by the area under the summary survival curve (MetaSurv method with random effects). The results are shown in Supporting information Table 1. The median bias was close to zero except when the risk of an event was very low. This bias is caused by the one observed in the survival probabilities, which are underestimated because of the continuity correction. However, it remained small compared with the true values of RMST. The coverage percentages were above 95% in almost all scenarios, but when the risk of an event was very low and when the sample size was small, the coverage percentages decreased dramatically. This low performance may be explained by the size of the 95% confidence interval, which was very small when the RMST is close to the maximal value (10). The coverage percentage increased to 91% when the sample size was 300 per study. Our approach should not be used when sample sizes of studies are small and when the risk of event is very low.

4. Applications 4.1. Survival of untreated patients in randomized clinical trials of hepatocellular carcinoma

2526

In the current guidelines for the management of patients with hepatocellular carcinoma, mortality risk estimates are a vital aid for decision making [28]. However, the variability of the mortality risk over studies is also important. In a recent meta-analysis of 30 studies, Cabibbo et al. [18] assessed the mortality at Copyright © 2014 John Wiley & Sons, Ltd.

Statist. Med. 2014, 33 2521–2537

50 75 100 200 300 50 75 100 200 300 50 75 100 200 300 50 75 100 200 300 50 75 100 200 300

 D 5:0I  D 1:5

Statist. Med. 2014, 33 2521–2537

0.002 0.001 0.001 0.001 0.000 0.003 0.003 0.003 0.001 0.001 0.009 0.004 0.003 0.002 0.001 0.009 0.005 0.004 0.002 0.001 0.027 0.016 0.011 0.004 0.002

Bias

97.2 96.7 96.6 97.0 96.4 97.2 96.9 97.1 97.3 97.3 97.9 97.4 97.2 97.3 97.3 98.2 98.2 97.7 97.6 97.8 93.3 95.9 96.8 97.3 97.4

Cov. (%) 0.0004 0.0003 0.0002 0.0001 0.0001 0.0012 0.0009 0.0006 0.0003 0.0002 0.0011 0.0007 0.0006 0.0003 0.0002 0.0009 0.0006 0.0005 0.0002 0.0001 0.0012 0.0006 0.0004 0.0001 0.0001

MSE

qtot 63.3 62.7 62.5 61.6 61.6 63.8 63.5 62.9 61.9 61.7 55.4 60.6 62.3 63.2 62.8 60.3 62.9 63.3 62.8 62.1 21.7 28.2 32.8 43.3 48.1

No censoring

8.9 7.2 6.5 5.7 5.4 8.9 9.0 7.8 6.3 5.7 0.3 3.5 6.1 8.3 7.0 3.4 6.8 8.0 7.5 6.5 0.0 0.0 0.0 0.0 0.1

q (%) 2.2 1.6 0.8 0.0 0.0 2.9 2.4 1.7 0.0 0.0 0.0 0.0 0.5 2.1 1.3 0.0 1.8 2.0 1.5 0.6 0.0 0.0 0.0 0.0 0.0

I2 0.001 0.002 0.002 0.001 0.001 0.007 0.004 0.004 0.002 0.001 0.013 0.007 0.004 0.002 0.001 0.017 0.009 0.005 0.002 0.002 0.028 0.016 0.011 0.004 0.002

Bias 97.1 97.2 96.9 96.6 96.6 97.0 97.1 97.2 96.9 96.7 98.1 97.5 97.5 97.2 97.0 98.5 98.2 97.8 97.5 97.8 92.9 96.2 96.7 97.1 97.3

Cov. (%) 0.0006 0.0004 0.0004 0.0002 0.0001 0.0019 0.0012 0.0009 0.0004 0.0003 0.0015 0.0010 0.0007 0.0004 0.0002 0.0013 0.0008 0.0006 0.0003 0.0002 0.0013 0.0006 0.0004 0.0002 0.0001

MSE 63.5 63.5 63.3 62.5 62.4 64.5 64.2 63.4 62.0 62.0 53.2 59.8 62.4 63.9 62.9 57.7 61.3 62.5 62.8 62.4 21.4 28.0 32.8 43.3 48.1

qtot

Moderate censoring

8.4 8.4 8.1 6.8 5.9 9.4 9.6 8.3 6.7 5.9 0.1 2.1 5.3 8.8 7.5 1.4 4.6 6.5 8.2 7.4 0.0 0.0 0.0 0.0 0.1

q (%) 2.5 2.5 2.4 0.8 0.8 4.3 3.4 2.3 0.2 0.0 0.0 0.0 0.6 3.0 1.6 0.0 0.0 0.7 1.5 1.0 0.0 0.0 0.0 0.0 0.0

I2 0.007 0.003 0.000 0.002 0.002 0.012 0.007 0.005 0.002 0.002 0.021 0.009 0.006 0.003 0.002 0.039 0.022 0.013 0.004 0.003 0.044 0.025 0.017 0.006 0.003

Bias 96.0 96.4 96.7 96.7 96.3 97.7 97.2 97.1 96.6 96.7 98.5 97.7 97.6 97.1 96.9 98.2 98.6 98.5 97.9 97.7 92.5 96.3 97.4 97.5 97.4

Cov. (%) 0.0009 0.0007 0.0006 0.0004 0.0003 0.0032 0.0021 0.0017 0.0008 0.0005 0.0023 0.0014 0.0010 0.0005 0.0004 0.0032 0.0017 0.0011 0.0005 0.0003 0.0029 0.0013 0.0008 0.0003 0.0002

MSE

60.7 62.1 63.0 63.8 64.1 63.9 64.8 64.3 63.1 62.6 49.9 57.9 62.0 64.0 63.6 52.2 57.0 59.5 62.8 63.1 16.8 22.7 27.5 39.9 46.2

qtot

Strong censoring

6.7 7.7 7.8 8.1 8.6 7.0 9.5 10.1 7.5 6.9 0.0 0.9 3.7 9.1 9.0 0.3 1.4 3.0 6.8 7.2 0.0 0.0 0.0 0.0 0.0

q (%)

0.9 1.7 2.4 3.0 3.0 3.1 4.3 3.6 1.8 1.0 0.0 0.0 0.1 3.2 2.8 0.0 0.0 0.0 1.5 1.9 0.0 0.0 0.0 0.0 0.0

I2

Bias, median of the absolute difference between the pooled estimate and the true survival probability; Cov., coverage percentage (percentage of simulations in which the true value was contained in the 95% confidence interval); MSE, mean squared error; qtot , the median value of the qtot heterogeneity statistic; q (%), the percentage of simulations with a significant heterogeneity test; I 2 , the median value of the heterogeneity statistic I 2 .

 D 25:0I  D 2:5

 D 25:0I  D 0:5

2527

Copyright © 2014 John Wiley & Sons, Ltd.

 D 20:0I  D 1:5

 D 9:0I  D 1:5

N

Weibull model

Table I. Results of the simulation study.

C. COMBESCURE, Y. FOUCHER AND D. JACKSON

C. COMBESCURE, Y. FOUCHER AND D. JACKSON

2528

1 and 2 years of patients in placebo or inactive treatment arms of RCTs of palliative treatments for hepatocellular carcinoma and analyzed their heterogeneity. The data from this meta-analysis were re-analyzed by applying our approach, using both random and fixed effects. All calculations were performed using the R package MetaSurv proposed by the authors. Three studies were excluded from the analysis because the survival curve was not reported [29], the picture of the survival curve was unreadable [30], or the reported survival curve was erroneous [31]. Twenty-seven studies were included in the analysis. The survival probabilities were read from the survival curves at close time points during the first 6 months (each month) because the survival decreased dramatically over this period and were read at more spaced intervals (each 3 months) after 6 months to limit the number of observed conditional survival probabilities equal to 1. For this purpose, the pictures of the curves were digitalized using the R package ReadImage, and the probabilities were extracted using the package digitize, as proposed by Poisot [32]. Then, the numbers of at-risk patients during the different time intervals were assessed from the total sample size and the survival probabilities. In seven studies, all patients died (no censored data) before the end of the study. In 11 other studies, the duration of follow-up for the censored patients was reported. For these 18 studies, the numbers of atrisk patients for all the time intervals could be derived without extrapolation. Five studies reported the number of patients alive at different times, and the method proposed by Williamson et al. [8] was used to assess the number of at-risk patients during the different intervals. For four studies, the numbers of at-risk patients were extrapolated using the method proposed by Parmar et al. [4]. We give an example of the approximation of the number of at-risk patient during the interval 1–2 months. A study of the meta-analysis [33] reported respectively 295 and 272 at-risk patients at 1 and 2 months. The survival proportions read from the curve were respectively 0.980 and 0.917. Applying the formula given by Williamson et al. [8], the approximated number of at-risk patients during the interval 1–2 months was .295 C 272/0:980=.0:980 C 0:917/ D 292:9. The sample size of studies ranged from 11 to 303 patients, with a median of 37 patients. The follow-up ranged from 4 to 60 months but was longer than 48 months in only two studies. The method MetaSurv was applied over a follow-up of 48 months with fixed and random effects. The estimates of the arcsine transformed conditional survival probabilities are shown in Figure 1 and Table II. The estimates assuming fixed and random effects were similar for the first months of the follow-up but differed between 9 and 20 months. These discrepancies can be explained by the weight allocated to the studies in fixed and random effects models. The studies with the lowest conditional survival probabilities were also the studies with a low number of at-risk patients in these intervals, and consequently with high variance VO k of the estimated O k . In the DerSimonian and Laird methodology, the estimates O DL are a sum over k of O k weighted by the sum of the within-study and between-study variance matrices (Equation (3)). In presence of heterogeneity, the random effects weights tend to be more similar than those from a fixed effect model. A consequence is that small studies were given a greater weight in the random effects model than in the fixed effect model. Another consequence of accounting for the between-study variance is the increased standard errors of the pooled estimates of the arcsine transformed conditional survival probabilities (Table II). The summary survival estimates derived from the estimate of O k using Equation (5) are given in Table II. The 95% confidence intervals were obtained by the extension of Greenwood’s formula. The summary survival estimates were systematically lower in the random effects model than in the fixed effect model. The difference reached 0.083 for a follow-up of 15 months. The 95% confidence intervals were also wider, and considerable heterogeneity was detected: qtot D 747:9; H 2 D 2:4 and I2 D 58:3%, so the random effects approach was deemed preferable in this case. The estimate of the between-study covariance matrix D is shown in Supporting information Table 2. The survival curves extracted from studies and the summary survival curve with random effects are shown in Figure 2. The heterogeneity between the survival curves was readily apparent. The median survival time (with 95% confidence interval) was 5.4 months [4.5; 6.7]. As the summary survival estimate was close to zero at the end of the follow-up, the mean survival time was also assessed: 8.5 months [6.8; 10.4]. To illustrate an analysis that investigates the reasons for the heterogeneity, a leave-one-out sensitivity analysis and between-strata comparisons were conducted. The leave-one-out sensitivity analysis indicated that this heterogeneity was not caused by a single study: the I 2 statistic ranged from 54.6% to 59.9% when studies were removed one by one. Additionally, the pooled survival curves were calculated using the MetaSurv method and stratifying the studies on the region (Asia vs. North America/Europe). Because our approach using random effects can be applied only if there are at least two studies in each stratum at each time point, the heterogeneity analysis was conducted for a follow-up

Copyright © 2014 John Wiley & Sons, Ltd.

Statist. Med. 2014, 33 2521–2537

C. COMBESCURE, Y. FOUCHER AND D. JACKSON

Table II. Pooled estimates of the arcsine transformed conditional survival probabilities and summary survival estimates from the 27 studies of the meta-analysis for the untreated patients with hepatocellular carcinoma, using the MetaSurv method. Fixed effect Follow-up (months) 1 2 3 4 5 6 9 12 15 18 21 24 27 30 33 36 39 42 45 48

O

FE

(SE)

Random effects

Summary survival [95% CI]

1.334 (0.012) 1.239 (0.012) 1.210 (0.013) 1.242 (0.014) 1.228 (0.015) 1.226 (0.016) 1.090 (0.017) 1.099 (0.021) 1.115 (0.026) 1.164 (0.031) 1.175 (0.036) 1.195 (0.041) 1.188 (0.047) 1.220 (0.058) 1.098 (0.076) 1.159 (0.087) 1.337 (0.107) 1.218 (0.113) 1.243 (0.121) 1.307 (0.133)

0.945 [0.935; 0.956] 0.845 [0.828; 0.862] 0.739 [0.719; 0.760] 0.662 [0.641; 0.685] 0.587 [0.565; 0.611] 0.520 [0.497; 0.544] 0.409 [0.386; 0.433] 0.324 [0.302; 0.349] 0.262 [0.240; 0.285] 0.221 [0.199; 0.244] 0.188 [0.167; 0.211] 0.163 [0.142; 0.186] 0.140 [0.120; 0.163] 0.123 [0.104; 0.147] 0.098 [0.078; 0.123] 0.082 [0.062; 0.108] 0.078 [0.058; 0.104] 0.068 [0.049; 0.096] 0.061 [0.042; 0.089] 0.057 [0.038; 0.085]

O

DL

(SE)

1.312 (0.030) 1.225 (0.029) 1.172 (0.031) 1.220 (0.030) 1.186 (0.031) 1.166 (0.031) 1.028 (0.036) 1.030 (0.039) 1.030 (0.044) 1.107 (0.044) 1.102 (0.044) 1.176 (0.047) 1.169 (0.056) 1.178 (0.063) 1.115 (0.086) 1.168 (0.090) 1.286 (0.108) 1.144 (0.117) 1.078 (0.134) 1.277 (0.134)

Summary survival [95% CI] 0.935 [0.906; 0.964] 0.827 [0.776; 0.882] 0.702 [0.635; 0.777] 0.620 [0.550; 0.698] 0.532 [0.461; 0.614] 0.450 [0.378; 0.536] 0.330 [0.264; 0.411] 0.242 [0.183; 0.321] 0.178 [0.128; 0.247] 0.142 [0.100; 0.203] 0.113 [0.077; 0.168] 0.096 [0.064; 0.145] 0.082 [0.053; 0.125] 0.070 [0.045; 0.109] 0.056 [0.035; 0.091] 0.048 [0.029; 0.078] 0.044 [0.026; 0.074] 0.036 [0.021; 0.064] 0.028 [0.014; 0.055] 0.026 [0.013; 0.052]

SE, standard error.

0.6 0.4 0.0

0.2

Survival

0.8

1.0

Summary survival curve (random effects)

0

10

20

30

40

Time (months)

Figure 2. Curves of the overall survival for the untreated patients with hepatocellular carcinoma in the 27 studies of the meta-analysis. The grey lines represent the survival in each study, and the black square the end of the follow-up. The thick lines represent the summarized survival curves with the 95% confidence bands (dashed lines) obtained using our approach MetaSurv with random effect.

Copyright © 2014 John Wiley & Sons, Ltd.

Statist. Med. 2014, 33 2521–2537

2529

of 33 months. The results are summarized in Table III, and the summary survival curves are represented in Figure 3. The survival was greater in North American/European studies, and the mean and median survival times were twice greater. The stratification by the region substantially decreased the heterogeneity. However, the heterogeneity still persisted in both strata. Multicentric studies were more frequent in

2530

Copyright © 2014 John Wiley & Sons, Ltd.

27 8 19 9 10

0.241 [0.183; 0.319] 0.112 [0.059; 0.214] 0.316 [0.243; 0.412] 0.237 [0.110; 0.509] 0.362 [0.276; 0.473]

1 year 0.097 [0.065; 0.144] 0.028 [0.011; 0.074] 0.143 [0.098; 0.208] 0.087 [0.029; 0.262] 0.173 [0.116; 0.258]

2 years 8.9 [7.4; 10.5] 5.4 [3.8; 7.3] 10.8 [9.1; 12.7] 9.3 [6.4; 13.4] 11.7 [9.6; 14.0]

Mean 5.4 [4.5; 6.6] 2.9 [2.1; 4.4] 7.1 [5.7; 8.9] 6.5 [4.8; 10.4] 7.6 [5.7; 10.4]

Median

Summary survival time (months) [95% CI]

2.6 1.7 2.1 2.3 1.8

61.2 41.7 52.1 55.9 45.0

0.35

Meta-analysis of single-arm survival studies: a distribution-free approach for estimating summary survival curves with random effects.

In epidemiologic studies and clinical trials with time-dependent outcome (for instance death or disease progression), survival curves are used to desc...
408KB Sizes 0 Downloads 2 Views