Pharmacoepidemiology and Drug Safety 2014; 23: 140–151
Published online 18 October 2013 in Wiley Online Library (wileyonlinelibrary.com) DOI: 10.1002/pds.3539

ORIGINAL REPORT

Variable selection on large case-crossover data: application to a registry-based study of prescription drugs and road traffic crashes†

Marta Avalos1,2*, Ludivine Orriols1,2, Hélène Pouyes2,3, Yves Grandvalet4, Frantz Thiessard1,2, Emmanuel Lagarde1,2 and on behalf of the CESIR research group

1 Univ. Bordeaux, ISPED, Centre INSERM U897-Epidemiologie-Biostatistique, F-33000 Bordeaux, France
2 INSERM, ISPED, Centre INSERM U897-Epidemiologie-Biostatistique, F-33000 Bordeaux, France
3 Univ. de Pau et des Pays de l'Adour, F-64012 Pau, France
4 Univ. de Tech. de Compiègne & CNRS, Heudiasyc UMR 7253, F-60203 Compiègne, France

ABSTRACT
Purpose In exploratory analyses of pharmacoepidemiological data from large populations with a large number of exposures, a problem that is both conceptual and computational is how to screen hypotheses using probabilistic reasoning, selecting the drug classes or individual drugs that most warrant further hypothesis testing.
Methods We report the use of a shrinkage technique, the Lasso, in the exploratory analysis of the data on prescription drugs and road traffic crashes resulting from the case-crossover matched-pair interval approach described by Orriols and colleagues (PLoS Med 2010; 7:e1000366). To prevent false-positive results, we consider a bootstrap-enhanced version of the Lasso. To highlight the most stable results, we extensively examine sensitivity to the choice of referent window.
Results Antiepileptics, benzodiazepine hypnotics, anxiolytics, antidepressants, antithrombotic agents, mineral supplements, drugs used in diabetes, antiparkinsonian treatment, and several cardiovascular drugs showed suspected associations with road traffic accident involvement or accident responsibility.
Conclusion These results, in relation to other findings in the literature, provide new insight and may generate new hypotheses on the association between prescription drug use and impaired driving ability.
Copyright © 2013 John Wiley & Sons, Ltd.
key words: case-only design; injury; multiple exposures; paired data; pharmacoepidemiology; road safety
Received 30 January 2013; Revised 23 August 2013; Accepted 25 September 2013

*Correspondence to: M. Avalos, ISPED, Univ. Bordeaux Segalen, F-33076 Bordeaux, France. E-mail: [email protected]
† We confirm that this manuscript has not been published elsewhere and is not under consideration by another journal. No prior posting (including internet) or presentations exist.

INTRODUCTION
Disentangling the impact of medicinal drugs on road traffic crashes is a complex issue for several reasons: (i) the large variety of pharmaceutical classes, with varying prevalence of use in the general population; (ii) the confounding effect of underlying health conditions; (iii) the potential medicinal benefits of drugs, which may lead to improved rather than impaired driving ability; (iv) adaptive behaviors: while on medication, people may pay more attention to compensate for changes in perceived risk; (v) the dose, cumulative dose, and duration of drug consumption (prevalent, intermittent, incident); (vi) the co-consumption and interaction of drugs,…1,2 As a result, the epidemiologic literature examining the associations between medicinal drugs and impaired driving is relatively small. Some studies have yielded inconsistent results, and a large proportion of them focus on hypnotics and anxiolytics (particularly benzodiazepines).3–5 The case-crossover design, introduced in epidemiology 20 years ago to assess whether a given transient exposure with a transient effect may have triggered an immediate short-term, acute event,6–8 is frequently used in the road safety and pharmacoepidemiology fields.9–19 This observational design uses only cases: the association between event onset and risk factors is estimated by comparing exposure during the period of time just prior to the event onset (case window) to the same subject’s


exposure during one or more referent windows. As a result, this design inherently eliminates the bias in control selection and removes the confounding effects of time-invariant factors. While the choice of referent windows has received considerable attention in the literature, because of the sensitivity of the case-crossover analysis to the effects of time-varying risk factors,14,20–22 as has the adjustment for exposure trend bias,23–25 few studies have addressed the variable selection issue. It is generally agreed that prior knowledge from the scientific literature should guide model selection in epidemiological studies. However, in large epidemiologic studies such as the registry-based study on the impact of medicinal drugs on the risk of road traffic crashes described in Orriols et al., 2010,15 hundreds of covariates potentially relevant for regression models are available. With such a large number of candidates, it is impractical to review each one manually and decide whether to select it or eliminate it from the model.
Conventional automatic selection methods such as stepwise selection are usually applied, despite acknowledged drawbacks (instability and omission of important predictors in final models with small sample sizes, underestimated standard errors for regression coefficients, etc.).26–28 Shrinkage methods, such as the Lasso,29 have emerged and gained popularity among statisticians as an alternative to conventional methods, but they are underused in the epidemiological literature.27,30 In particular, we have recently adapted the Lasso and related techniques to conditional logistic regression, the standard statistical tool used to analyze case-crossover studies.31 While that paper focused on methodological issues, the main objective of the present paper is to illustrate the use of the Lasso in exploratory analysis of the case-crossover matched-pair interval study of prescription drugs and road traffic crashes described in Orriols et al., 2010,15 focusing on practical aspects. We also provide here a more detailed analysis of the pharmacoepidemiological results, relating them to findings in the literature and to results obtained using the Lasso approach in a case–control design from the same registry-based study.32
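
To make the idea concrete, the following is a minimal sketch, not the implementation used in this study. It assumes one case window and one referent window per subject, in which case the conditional logistic likelihood depends only on the within-subject exposure difference d = x_case − x_ref, and the Lasso estimate minimizes mean(log(1 + exp(−dβ))) + λ‖β‖₁. The solver (plain proximal gradient), the simulated data, the value of λ, and all function names are assumptions introduced for illustration. The bootstrap step simply reports how often each exposure is selected across resamples, a simplified stand-in for a bootstrap-enhanced Lasso.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1 (shrinks each entry toward zero)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_matched_pairs(x_case, x_ref, lam, n_iter=500):
    """L1-penalized conditional logistic regression for 1:1 matched pairs."""
    d = np.asarray(x_case, float) - np.asarray(x_ref, float)
    n, p = d.shape
    # Step size 1/L, where L = ||d||_2^2 / (4n) bounds the gradient's Lipschitz constant.
    step = 4.0 * n / max(np.linalg.norm(d, 2) ** 2, 1e-12)
    beta = np.zeros(p)
    for _ in range(n_iter):
        # 1 / (1 + exp(z)) written with tanh for numerical stability.
        w = 0.5 * (1.0 - np.tanh(0.5 * (d @ beta)))
        grad = -(d.T @ w) / n
        beta = soft_threshold(beta - step * grad, step * lam)
    return beta

def bootstrap_selection_frequency(x_case, x_ref, lam, n_boot=50, seed=0):
    """Fraction of bootstrap resamples (of subjects) in which each exposure is selected."""
    rng = np.random.default_rng(seed)
    x_case = np.asarray(x_case, float)
    x_ref = np.asarray(x_ref, float)
    n, p = x_case.shape
    counts = np.zeros(p)
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample whole pairs with replacement
        counts += lasso_matched_pairs(x_case[idx], x_ref[idx], lam) != 0
    return counts / n_boot

def simulate_pairs(n, beta, prevalence=0.3, seed=1):
    """Hypothetical data: two exposure windows per subject, one labeled as the case window."""
    rng = np.random.default_rng(seed)
    beta = np.asarray(beta, float)
    a = (rng.random((n, beta.size)) < prevalence).astype(float)
    b = (rng.random((n, beta.size)) < prevalence).astype(float)
    # Window a is the case window with probability exp(a.beta) / (exp(a.beta) + exp(b.beta)).
    p_a = 1.0 / (1.0 + np.exp((b - a) @ beta))
    a_is_case = (rng.random(n) < p_a)[:, None]
    return np.where(a_is_case, a, b), np.where(a_is_case, b, a)

# Illustrative run: only the first of five simulated exposures has a true effect.
x_case, x_ref = simulate_pairs(2000, [1.5, 0.0, 0.0, 0.0, 0.0])
beta_hat = lasso_matched_pairs(x_case, x_ref, lam=0.03)
freq = bootstrap_selection_frequency(x_case, x_ref, lam=0.03)
print("penalized coefficients:", np.round(beta_hat, 3))
print("bootstrap selection frequencies:", freq)
```

In this sketch, exposures that are selected in nearly all resamples would be the ones flagged for further hypothesis testing; in practice λ would be tuned (for example by cross-validation) rather than fixed in advance.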

METHODS
Data sources and design
Information on drug prescriptions and road traffic accidents was obtained from the following anonymized population-based registries: the national health care insurance database (which covers the whole French population and includes data on reimbursed prescription drugs), police reports, and the national police database of injurious road traffic crashes.15 Drivers involved in an injurious crash in France, between July 2005 and May 2008, were included in the study (Figure 1). Traffic crash data including information about alcohol impairment and drivers’ responsibility for the crash were collected. All drivers involved in a road traffic crash are supposed to be tested for the presence of alcohol using a breath test. When the breath test is negative (concentration
