Brief Report

Nested Case-Control Studies in Cohorts with Competing Events

Martin Wolkewitz,a,b Ben S. Cooper,c,d Mercedes Palomar-Martinez,e,f Pedro Olaechea-Astigarraga,g Francisco Alvarez-Lerma,h and Martin Schumachera

Abstract: In nested case-control studies, incidence density sampling is the time-dependent matching procedure used to approximate hazard ratios. The cumulative incidence function can also be estimated if information from the full cohort is used. In the presence of competing events, however, the cumulative incidence function depends on the hazard of the disease of interest and on the hazard of the competing events. Using hospital-acquired infection as an example (full cohort), we propose a sampling method for nested case-control studies to estimate subdistribution hazard ratios. With further information on the full cohort, the cumulative incidence function for the event of interest can then be estimated as well.

(Epidemiology 2014;25: 122–125)

Submitted 10 September 2012; accepted 13 August 2013; posted 14 November 2013.

From the aInstitute of Medical Biometry and Medical Informatics, University of Freiburg, Freiburg, Germany; bFreiburg Center for Data Analysis and Modelling, University of Freiburg, Freiburg, Germany; cCentre for Clinical Vaccinology and Tropical Medicine, Nuffield Department of Clinical Medicine, University of Oxford, Oxford, United Kingdom; dMahidol-Oxford Tropical Medicine Research Unit, Faculty of Tropical Medicine, Mahidol University, Bangkok, Thailand; eHospital Universitari Arnau de Vilanova, Lleida, Spain; fUniversitat Autónoma de Barcelona, Barcelona, Spain; gService of Intensive Care Medicine, Hospital de Galdakao-Usansolo, Bizkaia, Spain; and hService of Intensive Care Medicine, Parc de Salut Mar, Barcelona, Spain.

Supplemental digital content is available through direct URL citations in the HTML and PDF versions of this article (www.epidem.com). This content is not peer-reviewed or copy-edited; it is the sole responsibility of the author.

Correspondence: Martin Wolkewitz, Institute of Medical Biometry and Medical Informatics, Stefan-Meier-Str. 26, 79104 Freiburg, Germany. E-mail: [email protected].

Copyright © 2013 by Lippincott Williams & Wilkins. ISSN: 1044-3983/14/2501-0122. DOI: 10.1097/EDE.0000000000000029

The nested case-control design is the most widely used method for sampling from epidemiologic cohorts when investigators need to collect additional data in a reduced sample.1 Using incidence density sampling, the potential impact of exposures on disease occurrence can be studied by hazard ratios in a reduced data set.1,2 Furthermore, the cumulative incidence function can also be estimated if information from the full cohort is used.3 However, the observation of the disease of interest is often preceded by other "competing" events (or risks).4,5 There are two statistical approaches to deal with competing risks data: the event-specific hazard approach, which addresses the etiological point of view, and the subdistribution hazard approach, which is linked to the cumulative incidence function5,6; the latter is suitable for prediction. It is well known that the covariate effect on the event-specific hazard can be very different from the effect on the cumulative incidence function.4,6 The reason is that the cumulative incidence function of the event of interest depends on all event-specific hazards, not only on the hazard of the event of interest.4,6,7

Nested case-control studies in cohorts with competing events have been described.8–11 In contrast to these articles, we propose a sampling method to approximate the proportional hazards model for the subdistribution of a competing event.6 Based on this, we use cohort information to estimate the cumulative incidence function.

This methodology is illustrated by an example of hospital infection. Competing events for the occurrence of nosocomial infections are discharge from the hospital and death in the hospital without a nosocomial infection. Competing events are often ignored in the analysis of nosocomial infections, which may lead to wrong conclusions when studying risk factors12 or interventions for nosocomial infections.13 We study the situation in which information is available for the full cohort. This allows us to know the "true" quantities that we want to approximate with fewer observations by using the nested case-control approach. A description of the cohort study data and detailed event-specific results are given in the eAppendix (http://links.lww.com/EDE/A741).
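A small numerical sketch may help to see why an exposure effect on the event-specific hazard can differ from its effect on the cumulative incidence function. It assumes constant hazards, for which the cumulative incidence of the event of interest has the closed form α1/(α1 + α2) × (1 − exp(−(α1 + α2)t)). All hazard values and hazard ratios below are hypothetical and are not taken from the study data.

```python
import math


def cif_constant_hazards(t, alpha_event, alpha_competing):
    """Cumulative incidence of the event of interest by time t when both
    event-specific hazards are constant (a simplifying assumption)."""
    total = alpha_event + alpha_competing
    return alpha_event / total * (1.0 - math.exp(-total * t))


# Hypothetical daily hazards in the unexposed group.
base_infection = 0.01
base_discharge = 0.10

# Hypothetical event-specific hazard ratios for the exposed group:
# a moderate increase in the infection hazard, a strong decrease in the
# discharge hazard (exposed patients stay longer and remain at risk).
hr_infection = 1.5
hr_discharge = 0.35

risk_low = cif_constant_hazards(30, base_infection, base_discharge)
risk_high = cif_constant_hazards(
    30, base_infection * hr_infection, base_discharge * hr_discharge
)

print(f"30-day risk, unexposed: {risk_low:.3f}")
print(f"30-day risk, exposed:   {risk_high:.3f}")
print(f"risk ratio: {risk_high / risk_low:.2f} vs. infection HR {hr_infection}")
```

In this sketch the ratio of the 30-day risks is larger than the infection hazard ratio, because the exposure also lowers the competing discharge hazard.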

METHODS

Subdistribution Hazard Approach

Nonparametric Estimation of the Cumulative Incidence Function: Full Cohort

Nosocomial infection is the event of interest, and discharge and death without nosocomial infection are competing events. According to Andersen et al,14 the marginal survival probability in a competing risks framework with k competing events with hazards α1(u), ..., αk(u) is defined as

$$S(t) = \exp\left(-\sum_{h=1}^{k}\int_{0}^{t}\alpha_h(u)\,du\right) \tag{1}$$

Epidemiology  •  Volume 25, Number 1, January 2014

Then, the cumulative incidence function for event i = 1, …, k is defined as

$$F_i(t) = \int_{0}^{t} S(u-)\,\alpha_i(u)\,du \tag{2}$$

These formulas show that the cumulative incidence function depends on all competing hazards. We used the Aalen-Johansen estimator15 for the nonparametric estimation of the cumulative incidence function of nosocomial infections (Figure 1), comparing patients with Acute Physiology and Chronic Health Evaluation (APACHE II) scores >15 versus ≤15.

Semiparametric Estimation: Full Cohort

Analogous to the event-specific approach (see eAppendix, http://links.lww.com/EDE/A741), we used a proportional hazards model to calculate the subdistribution hazard ratio of infection.6 To do this, we fit a Fine-Gray6 model with subdistribution hazard $\lambda^{\mathrm{sub}}_{I}(t; X) = \lambda^{\mathrm{sub}}_{I0}(t)\exp(X(t)\beta_0)$ and obtain subdistribution hazard ratios exp(β0) for nosocomial infections. The Fine-Gray approach sets the event times of the competing event (discharge) to infinity (time until potential censoring) before fitting a proportional hazards model.5,6,16 This principal idea is displayed in Figure 2.

Again, we compare patients with APACHE II scores >15 versus ≤15: the subdistribution hazard ratio is 4.0 (95% confidence interval = 3.3–4.9). The corresponding cumulative incidence functions are displayed in Figure 1 (dark gray).

Nested Case-control Approach

Our aim was to approximate the Fine-Gray6 model of the full cohort. To do this, we adopt the principal idea of this model and set the event times of the competing events (discharge or death) in the cohort (source) data to "infinity" (time until potential censoring). This is displayed in Figure 2 as gray lines to show that these admissions remain in the new risk set. Then, we performed incidence density sampling (after breaking ties) with the modified data. As shown in Figure 2, controls must be disease-free at the time of diagnosis of the case to which they are matched. However, owing to the modified time, each infected case now has more eligible controls because discharged patients are still "at risk." For instance, the patient who acquired an infection on day 10 had only three potential controls in the traditional incidence density sampling, whereas there are seven potential controls in the subdistribution sampling: all admissions who cross the corresponding vertical line.
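The subdistribution sampling idea described in the Methods (competing-event times are moved to the potential censoring time before incidence density sampling) can be sketched as follows. This is a minimal illustration, not the authors' software: the `Subject` record, its field names, and the seven-subject cohort are hypothetical, ties are assumed to have been broken already, and the conditional logistic regression step is not shown.

```python
import random
from dataclasses import dataclass


@dataclass
class Subject:
    ident: int
    time: float         # time of first event or censoring
    status: int         # 0 = censored, 1 = event of interest, 2 = competing event
    censor_time: float  # potential (administrative) censoring time


def exit_times(cohort, subdistribution):
    """Time at which each subject leaves the risk set.

    With subdistribution sampling, subjects with a competing event stay
    in the risk set until their potential censoring time."""
    return {
        s.ident: s.censor_time if (subdistribution and s.status == 2) else s.time
        for s in cohort
    }


def sample_matched_sets(cohort, n_controls=1, subdistribution=True, seed=None):
    """Draw n_controls per case from those still at risk at the case's event time."""
    rng = random.Random(seed)
    leaves_at = exit_times(cohort, subdistribution)
    cases = sorted((s for s in cohort if s.status == 1), key=lambda s: s.time)
    matched_sets = []
    for case in cases:
        eligible = [
            s.ident
            for s in cohort
            if s.ident != case.ident and leaves_at[s.ident] > case.time
        ]
        controls = rng.sample(eligible, min(n_controls, len(eligible)))
        matched_sets.append(
            {"case": case.ident, "time": case.time,
             "controls": controls, "n_eligible": len(eligible)}
        )
    return matched_sets


# Hypothetical cohort: two infections, three competing events, two censored.
cohort = [
    Subject(1, 10, 1, 30), Subject(2, 4, 2, 30), Subject(3, 7, 2, 30),
    Subject(4, 15, 2, 30), Subject(5, 20, 1, 30), Subject(6, 8, 0, 8),
    Subject(7, 25, 0, 25),
]

ids_sets = sample_matched_sets(cohort, subdistribution=False, seed=1)
sub_sets = sample_matched_sets(cohort, subdistribution=True, seed=1)
for ids, sub in zip(ids_sets, sub_sets):
    print(f"case {ids['case']} at t={ids['time']}: "
          f"{ids['n_eligible']} eligible (IDS), {sub['n_eligible']} eligible (Sub-IDS)")
```

Subjects who later become cases remain eligible as controls for earlier cases, as in traditional incidence density sampling; the only change is the longer stay in the risk set for subjects with a competing event.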

FIGURE 1.  Cumulative incidence function of the event of interest (infection), for two categories of Acute Physiology and Chronic Health Evaluation (APACHE II) score. Black, nonparametric Aalen-Johansen estimates. Dark gray, model-based from the Fine-Gray model using data from the full cohort. Light gray, derived from nested case-control data using additional cohort information. [Plot not reproduced. x-axis: days from ICU admission (0 to 30). y-axis: risk of nosocomial infection (cumulative incidence function), probability (0.00 to 0.15). Curves: Aalen-Johansen, Fine-Gray, and nested case-control, each for APACHE >15 and APACHE ≤15.]
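The nonparametric estimate of the cumulative incidence function (Equations 1 and 2) can be sketched in a few lines. The code below is an illustrative implementation of the Aalen-Johansen estimator for one event type, written for this example only; the ten-patient data set is hypothetical. It also shows the naive alternative of treating competing events as censored, which reduces to one minus the Kaplan-Meier estimator.

```python
def aalen_johansen(times, statuses, event=1):
    """Cumulative incidence of `event` at each distinct event time.

    statuses: 0 = censored, any other value = an event type.
    Returns a list of (time, cumulative incidence) pairs."""
    data = sorted(zip(times, statuses))
    n_at_risk = len(data)
    surv = 1.0  # all-cause survival just before the current time, S(t-)
    cif = 0.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        d_event = d_any = n_leaving = 0
        while i < len(data) and data[i][0] == t:
            status = data[i][1]
            n_leaving += 1
            if status != 0:
                d_any += 1
            if status == event:
                d_event += 1
            i += 1
        if d_any > 0:
            cif += surv * d_event / n_at_risk  # S(t-) times hazard increment
            surv *= 1.0 - d_any / n_at_risk    # update all-cause survival
            curve.append((t, cif))
        n_at_risk -= n_leaving
    return curve


# Hypothetical data: 10 patients, no censoring.
# 1 = nosocomial infection, 2 = discharge or death without infection.
times = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
statuses = [2, 2, 1, 2, 2, 2, 1, 2, 2, 2]

aj_curve = aalen_johansen(times, statuses, event=1)

# Naive analysis: competing events recoded as censored (1 - Kaplan-Meier).
naive_statuses = [s if s == 1 else 0 for s in statuses]
naive_curve = aalen_johansen(times, naive_statuses, event=1)

print("Aalen-Johansen plateau:", round(aj_curve[-1][1], 4))
print("Naive 1 - KM plateau:  ", round(naive_curve[-1][1], 4))
```

Without censoring, the Aalen-Johansen plateau equals the crude proportion of patients with the event (2 of 10 here), whereas the naive estimate is larger because patients with a competing event are wrongly treated as if they could still become infected.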

The estimated odds ratio from the conditional logistic regression model approximates the subdistribution hazard ratio exp(β0) from the Fine-Gray6 model. The results are comparable with those from the full cohort (Table). As with the event-specific cumulative hazards (see eAppendix, http://links.lww.com/EDE/A741), we used cohort information (modified number at risk for event times) to calculate the cumulative infection subdistribution hazards for each exposure category.3 With this, we approximated the cumulative incidence function of infection (Figure 1). Note that the crude risks of nosocomial infections in each score category (3.5% and 13.2%) correspond to the cumulative incidence function on the plateau of Figure 1 because administrative censoring is very low (about 0.001%).

FIGURE 2.  Illustration of sampling methods: incidence density and subdistribution sampling. Each horizontal line represents one hypothetical cohort member. Those who experienced an event of interest are marked with a filled black dot and those who experienced a competing event are marked with a circle; censored individuals are marked with a cross. The gray line displays the time until potential censoring for individuals with a competing event. A vertical dotted line is drawn to show who is a potential matching candidate for the patient who experienced the event of interest at time 10. [Plot not reproduced. Panels: "Incidence density sampling" and "Subdistribution incidence density sampling". x-axis: follow-up time (0 to 20). y-axis: patients.]

DISCUSSION

We propose a modified sampling technique to study the impact of risk factors on the event of interest in terms of a comparison of cumulative incidence functions. The mathematical justification for this approach is based on a straightforward and consecutive combination of established methodology: the Fine-Gray model with adaptation,6,17 incidence density sampling,1,2 and the use of cohort information for absolute risk estimation.3 Statistical software is available.3,16–18

Interpreting results of a competing risks analysis is challenging, but conclusions can easily be misleading if competing events are ignored.13 In our example, the cumulative incidence functions would be highly overestimated if the competing event is ignored: infection risk 30 days after admission would be 35% for those with low APACHE II scores and 45% for those with high APACHE II scores compared with only 3.5% and 13.2%, respectively, when we account for the competing risks.

Borgan8 proposed a method that uses the cumulative hazards (only) of the event of interest but assumed that the exposure has no effect on the competing events; in our example, this would mean that the APACHE II score has no effect on the discharge hazard. This also clearly leads to biased results because there is indeed an effect: infection risk 30 days after admission is 5% (low APACHE II score) and 7% (high APACHE II score). Additional knowledge on the cumulative hazards of the competing event is needed to overcome this problem,8 but that would require further nested case-control studies on the competing event or subcohorting. In contrast to this approach, we propose a direct sampling method.

Our approach has limitations. First, proportionality of event-specific hazards does not imply proportionality of subdistribution hazards and vice versa. In our example, the fit in both models was acceptable. This was very much in line with the results of Grambauer et al,19 who showed that the subdistribution hazard approach gives a summary analysis even if misspecified. Second, we dichotomized a continuous variable (APACHE II score). We dichotomized only for illustrative purposes because this score is associated with the hazards for infection, discharge, and death. However, we emphasize that the proposed sampling method works well with exposures on a continuous scale and in a multivariate setting (data not shown). Third, in our cohort, the potential censoring times were available because of administrative censoring. If this is not the case, we recommend imputation of these values before sampling; methodology and software are available.16,17

Researchers who are planning a nested case-control study in a cohort with competing events should ask which model they actually want to study: the etiology model with event-specific hazards or the prediction model with subdistribution hazards. We recommend the study of both to obtain a complete picture of direct and indirect effects and to derive correct conclusions.20 If there is one event of interest (in our example, nosocomial infections), it is enough to combine the other competing events (eg, discharge [alive or dead]) because the cumulative incidence function depends on the sum of all competing cumulative hazards (Equation 2). However, before performing separate nested case-control studies for the event of interest and the competing event, one might consider possibilities of re-using controls21,22 or choosing a case-cohort design.

TABLE.  Upper Part: Results (Hazard Ratios) from the Event-specific Regression and the Fine and Gray6 Model (Subdistribution Hazard Ratios) Using the Full Cohort. Lower Part: Results from the Nested Case-control 1:1 Studies for Event "Infection" Using Incidence Density Sampling (IDS) and Subdistribution IDS (Sub-IDS)

APACHE Score >15 vs. ≤15                   Infection HR (95% CI)   Death HR (95% CI)   Discharge HR (95% CI)
Full cohort
  Event-specific regression                1.56 (1.28–1.92)        5.37 (4.46–6.47)    0.35 (0.32–0.37)
  Fine and Gray6 model                     4.02 (3.30–4.89)
Nested case-control 1:1
  Incidence density sampling (IDS)         1.53 (1.16–2.03)
  Subdistribution IDS (Sub-IDS)            4.03 (2.94–5.54)

We used conditional logistic regression (averaged over 1000 runs). CI indicates confidence interval; HR, hazard ratio.

REFERENCES

1. Gail MH, Benichou J. Encyclopedia of Epidemiologic Methods. Chichester, UK: John Wiley & Sons Inc; 2000.
2. Vandenbroucke JP, Pearce N. Case-control studies: basic concepts. Int J Epidemiol. 2012;41:1480–1489.
3. Langholz B. Use of cohort information in the design and analysis of case-control studies. Scand J Stat. 2007;34:120–136.

4. Andersen PK, Geskus RB, de Witte T, Putter H. Competing risks in epidemiology: possibilities and pitfalls. Int J Epidemiol. 2012;41:861–870.
5. Lau B, Cole SR, Gange SJ. Competing risk regression models for epidemiologic data. Am J Epidemiol. 2009;170:244–256.
6. Fine J, Gray RJ. A proportional hazards model for the subdistribution of a competing risk. J Am Stat Assoc. 1999;94:496–509.
7. Grambauer N, Schumacher M, Dettenkofer M, Beyersmann J. Incidence densities in a competing events analysis. Am J Epidemiol. 2010;172:1077–1084.
8. Borgan Ø. Estimation of covariate-dependent Markov transition probabilities from nested case-control data. Stat Methods Med Res. 2002;11:183–202.
9. Lubin JH. Extensions of analytic methods for nested and population-based incident case-control studies. J Chronic Dis. 1986;39:379–388.
10. Lubin JH. Case-control methods in the presence of multiple failure times and competing risks. Biometrics. 1985;41:49–54.
11. Flanders WD, Louv WC. The exposure odds ratio in nested case-control studies with competing risks. Am J Epidemiol. 1986;124:684–692.
12. Wolkewitz M, Di Termini S, Cooper B, Meerpohl J, Schumacher M. Paediatric hospital-acquired bacteraemia in developing countries. Lancet. 2012;379:1484; author reply 1484–1485.
13. Wolkewitz M, Harbarth S, Beyersmann J. Daily chlorhexidine bathing and hospital-acquired infection. N Engl J Med. 2013;368:2330.
14. Andersen PK, Abildstrom SZ, Rosthøj S. Competing risks as a multi-state model. Stat Methods Med Res. 2002;11:203–215.
15. Aalen OO, Johansen S. An empirical transition matrix for non-homogeneous Markov chains based on censored observations. Scand J Stat. 1978;5:141–150.
16. Beyersmann J, Allignol A, Schumacher M. Competing Risks and Multistate Models with R. New York: Springer; 2011.
17. Ruan PK, Gray RJ. Analyses of cumulative incidence functions via nonparametric multiple imputation. Stat Med. 2008;27:5709–5724.
18. Richardson DB. An incidence density sampling program for nested case-control analyses. Occup Environ Med. 2004;61:e59.
19. Grambauer N, Schumacher M, Beyersmann J. Proportional subdistribution hazards modeling offers a summary analysis, even if misspecified. Stat Med. 2010;29:875–884.
20. Latouche A, Allignol A, Beyersmann J, Labopin M, Fine JP. A competing risks analysis should report results on all cause-specific hazards and cumulative incidence functions. J Clin Epidemiol. 2013;66:648–653.
21. Støer NC, Samuelsen SO. Comparison of estimators in nested case-control studies with multiple outcomes. Lifetime Data Anal. 2012;18:261–283.
22. Salim A, Yang Q, Reilly M. The value of reusing prior nested case-control data in new studies with different outcome. Stat Med. 2012;31:1291–1302.
