MAIN PAPER (wileyonlinelibrary.com) DOI: 10.1002/pst.1657

Published online 6 November 2014 in Wiley Online Library

Effect of reporting bias in the analysis of spontaneous reporting data

Palash Ghosh^a,* and Anup Dewanji^b

It is well-known that a spontaneous reporting system suffers from significant under-reporting of adverse drug reactions from the source population. The existing methods do not adjust for such under-reporting for the calculation of measures of association between a drug and the adverse drug reaction under study. Often there is direct and/or indirect information on the reporting probabilities. This work incorporates the reporting probabilities into existing methodologies, specifically to Bayesian confidence propagation neural network and DuMouchel’s empirical Bayesian methods, and shows how the two methods lead to biased results in the presence of under-reporting. Considering all the cases to be reported, the association measure for the source population can be estimated by using only exposure information through a reference sample from the source population. Copyright © 2014 John Wiley & Sons, Ltd.

Keywords: adverse drug reaction; BCPNN; EBGM; reference sample; spontaneous reporting database; under-reporting

1. INTRODUCTION


In pharmacovigilance, spontaneous reporting (SR) databases play an important role in generating signals regarding the relationship between drugs and adverse drug reactions (ADRs) based on reported data. These reports are collected from all over the world with the help of clinicians and/or health professionals, who are responsible for recognizing and reporting suspected side effects, known as ADRs, once the drug is on the market. All ADR reports are stored in databases at the national centers and sent to the World Health Organization (WHO) Collaborating Center for International Drug Monitoring (the Uppsala Monitoring Center) as well. The US Food and Drug Administration (FDA) and the UK Yellow Card scheme also maintain such SR databases, consisting of medical events happening to patients taking different kinds of drugs [1]. These kinds of databases are used to provide early warnings or suspicions regarding the drug and ADR relationship, which have not been recognized prior to marketing of a drug because of limitations of clinical trials; for example, drugs may not be given to pregnant women during the trial.

It is well-known that an SR database suffers from significant under-reporting [2] of ADRs from the source population, consisting of all those people having a particular disease for which some prescribed drugs are being used. Note that the SR database arises from a subset of this source population consisting of those patients experiencing and reporting some ADRs, including the one of interest. There are several reasons for under-reporting of ADRs, often depending on the severity of the ADR, the drug associated with the ADR, the disease for which the drug has been prescribed, and so on. Presently, the commonly used methodologies to detect signals from an SR database are the Bayesian confidence propagation neural network (BCPNN) [3] and DuMouchel’s [4] empirical Bayes geometric mean (EBGM) methodology, both of which are based solely on reports in the SR database. These methodologies suffer from under-reporting, as also recognized in these works, and fail to support inference about the true drug–ADR association in the source population. Even if there is some information on the reporting pattern of the ADR, these methodologies do not adjust for it.

This work is intended to illustrate the effect of ignoring under-reporting. The reporting probabilities are incorporated into the measures used in these methodologies to modify them into measures for the source population. While the measures in BCPNN [3] and EBGM [4] relate to the SR database, the modified measures relate to the source population. The reporting probabilities help to find a relationship between these two kinds of measures, as discussed in Section 2. We also show that BCPNN and EBGM are based on the same basic idea of comparison of joint probability of drug and ADR with the corresponding marginal probabilities. In Section 3, we assume that all the cases are reported and obtain the source population association measure by using information on exposure probability. Section 4 ends with some concluding remarks.

Pharmaceut. Statist. 2015, 14 20–25

a Centre for Quantitative Medicine, Duke-NUS Graduate Medical School, Singapore
b Applied Statistics Unit, Indian Statistical Institute, Kolkata, India
*Correspondence to: Palash Ghosh, Centre for Quantitative Medicine, Duke-NUS Graduate Medical School, Singapore 169856. E-mail: [email protected]

2. UNDER-REPORTING IN BAYESIAN CONFIDENCE PROPAGATION NEURAL NETWORK AND EMPIRICAL BAYES GEOMETRIC MEAN

In this section, we incorporate the reporting probabilities in the expressions of association measures used in BCPNN and EBGM, showing the effect of under-reporting or the extent of reporting bias. These measures are based on reporting data in the form of a 2 × 2 table as given in Table I. Here, A denotes the presence (A = 1) or absence (A = 0) of the ADR under study, and D denotes the use of the specific drug (D = 1 means using the drug under study, and D = 0 means not using the drug under study). Note that the event {A = 0} means either no ADR or some ADR other than the one under study. Therefore, there is some probability of reporting for those with A = 0. This indicates a possible reason for differential reporting probability for the presence or absence of the ADR under study. Also, R denotes the reporting status, where R = 1 represents the event of reporting to the SR database, and R = 0 means ‘not reporting’. Let us define $\theta_{ij} = P(R = 1 \mid D = i, A = j)$ and $\pi_{ij} = P(D = i, A = j)$, for i, j = 0, 1, as the reporting probability and cell probability corresponding to the (i, j)th cell.

Table I. Summary of data obtained from spontaneous reporting (SR) database.

            A = 1, R = 1    A = 0, R = 1    Total
D = 1       n11             n10             n1·
D = 0       n01             n00             n0·
Total       n·1             n·0             n

A, D and R are binary variables denoting adverse drug reaction (ADR), drug and reporting status, respectively.

A commonly used association measure, known as odds ratio (OR), is worth mentioning. As discussed in Ghosh and Dewanji (2011), the observed OR $n_{11} n_{00} / (n_{01} n_{10})$ from the reporting data in Table I is a biased estimate for the OR due to under-reporting. This, however, estimates a quantity called reporting odds ratio (ROR) as given by

$$\mathrm{ROR} = \frac{P(D = 1, A = 1 \mid R = 1)\, P(D = 0, A = 0 \mid R = 1)}{P(D = 0, A = 1 \mid R = 1)\, P(D = 1, A = 0 \mid R = 1)} = \frac{\theta_{11}\theta_{00}}{\theta_{01}\theta_{10}} \times \frac{\pi_{11}\pi_{00}}{\pi_{01}\pi_{10}}. \tag{1}$$

This ROR and OR are, therefore, related through the reporting probabilities as

$$\mathrm{OR} = \mathrm{ROR} \times \frac{\theta_{01}\theta_{10}}{\theta_{11}\theta_{00}}. \tag{2}$$

Note that when the reporting status R does not depend on either or both of the ADR status A and drug status D, ROR coincides with OR. In fact it can be shown [1] that, in such case, the SR database represents the source population with respect to the exposure (or drug) distribution. Nevertheless, the ROR is based on reported data, which may not represent the corresponding source population. One can obtain OR from ROR using (2), if information on the reporting probabilities is available. The under-reporting (i.e., last part of (2)) is directly affected by all the four reporting probabilities corresponding to the four cells of Table I. In a sense, this bias measures the departure of the SR database from representing the source population in a multiplicative manner with regard to the OR. The association measures used in BCPNN and EBGM are similarly affected by the reporting probabilities as derived in Sections 2.1 and 2.2, respectively. Note that, as discussed in [1], the population here means the set of patients suffering from a particular disease for which the drug under study, or some other drug, is prescribed, which are to be compared with respect to a particular ADR.

2.1. Bayesian confidence propagation neural network

Bate et al. [3] have introduced the BCPNN methodology to generate signal from SR databases. Since 1998, the WHO has implemented BCPNN methodology for routine signal detection [5]. The BCPNN method is based on an information component (IC), a measure of association between the two binary variables A and D defined as

$$IC = \log_2 \frac{P(D = 1, A = 1)}{P(D = 1)\, P(A = 1)} = \log_2 \frac{\pi_{11}}{(\pi_{11} + \pi_{10})(\pi_{11} + \pi_{01})}. \tag{3}$$

Note that this IC reflects the observed/expected ratio in logarithm scale (base 2), where the expected quantity is under the null hypothesis of no association, which is independence in this context. So the IC is a measure of deviance from independence. This IC is estimated from the reporting data in Table I using a Bayes method. Note that this IC corresponds to the (1,1) cell of the 2 × 2 table. One can consider three other IC measures corresponding to the remaining three cells. Assuming beta priors for the three probabilities, these are estimated by the corresponding posterior means; hence, IC is estimated by using (3) even for small numbers of reports in Table I. As more reports accumulate, the posterior distributions are also updated, and the corresponding estimate of IC along with its variance estimate are improved. A signal is said to be detected if the lower limit of the 95% confidence interval of IC is greater than zero. The IC, as defined, is treated as the population IC. Note that this information component based on SR data only may be defined as

$$IC^{SR} = \log_2 \frac{P(D = 1, A = 1 \mid R = 1)}{P(D = 1 \mid R = 1)\, P(A = 1 \mid R = 1)} = \log_2 \frac{\theta_{11}\pi_{11} \left( \sum_{i,j=0}^{1} \theta_{ij}\pi_{ij} \right)}{(\theta_{11}\pi_{11} + \theta_{10}\pi_{10})(\theta_{11}\pi_{11} + \theta_{01}\pi_{01})}. \tag{4}$$

The BCPNN method, in fact, works on this $IC^{SR}$, whereas the population IC is defined differently. The reporting data in Table I estimates $IC^{SR}$ as $\log_2 [n_{11} n / (n_{1\cdot} n_{\cdot 1})]$. Note that

$$\frac{P(D = 1, A = 1 \mid R = 1)}{P(D = 1 \mid R = 1)\, P(A = 1 \mid R = 1)} = \frac{P(R = 1 \mid D = 1, A = 1)\, P(R = 1)}{P(R = 1 \mid D = 1)\, P(R = 1 \mid A = 1)} \times \frac{P(D = 1, A = 1)}{P(D = 1)\, P(A = 1)}, \tag{5}$$

resulting in the following relationship between the two information components:

$$IC = IC^{SR} - \log_2 \kappa, \tag{6}$$

where

$$\kappa = \frac{P(R = 1 \mid D = 1, A = 1)\, P(R = 1)}{P(R = 1 \mid D = 1)\, P(R = 1 \mid A = 1)} = \frac{\theta_{11} \left( \sum_{i,j=0}^{1} \theta_{ij}\pi_{ij} \right) (\pi_{11} + \pi_{10})(\pi_{11} + \pi_{01})}{(\theta_{11}\pi_{11} + \theta_{10}\pi_{10})(\theta_{11}\pi_{11} + \theta_{01}\pi_{01})}. \tag{7}$$

The reporting bias in this measure IC, given by the second term in the right hand side (RHS) of (6), depends directly on the reporting probability $P(R = 1 \mid D = 1, A = 1)$. It also depends, not so explicitly, on the reporting probabilities in the other three cells of Table I through the marginal reporting probabilities. For example,


Table II. Cross-classified data (2006–08) of Prograph–LTR combination for patients with liver transplantation from the SR database of the US FDA along with estimates of ROR, $IC^{SR}$ and $\lambda^{SR}_{11}$ (standard errors in parentheses).

               LTR    OADR    Estimated ROR    Estimated IC^SR    Estimated lambda^SR_11
Prograph       27     152     0.98 (0.25)      -0.01 (0.21)       0.99 (0.17)
Other drugs    53     293

LTR and OADR denote liver transplant rejection and other adverse drug reaction (ADR), respectively. SR, spontaneous reporting; FDA, Food and Drug Administration; ROR, reporting odds ratio; IC, information component.

Table III. Estimate of information component (IC) and $\lambda_{11}$ with different $\kappa$ for the data of Table II.

kappa   Est. IC   Est. lambda_11  |  kappa   Est. IC   Est. lambda_11  |  kappa   Est. IC   Est. lambda_11
0.1     3.31      9.93            |  0.9     0.14      1.10            |  1.7     -0.78     0.58
0.2     2.31      4.97            |  1.0     -0.01     0.99            |  1.8     -0.86     0.55
0.3     1.73      3.31            |  1.1     -0.15     0.90            |  1.9     -0.94     0.52
0.4     1.31      2.48            |  1.2     -0.27     0.83            |  2.0     -1.01     0.50
0.5     0.99      1.99            |  1.3     -0.39     0.76            |  2.1     -1.08     0.47
0.6     0.73      1.66            |  1.4     -0.50     0.71            |  2.2     -1.15     0.45
0.7     0.50      1.42            |  1.5     -0.59     0.66            |  2.3     -1.21     0.43
0.8     0.31      1.24            |  1.6     -0.69     0.62            |  2.4     -1.26     0.42

$P(R = 1) = \sum_{i,j} P(R = 1 \mid D = i, A = j)\, P(D = i, A = j)$ depends on the reporting probabilities in all the four cells. As in the case of the OR, this bias measures the departure of the SR database from representing the source population with regard to IC. The objective in this work is to investigate when the $IC^{SR}$ can be used as a representative of the source population IC. Clearly, as in the case of OR, if the reporting status (R) does not depend on any one or both of the ADR status (A) and drug status (D), then the second term in the RHS of (6) is zero, resulting in equality of the two information components. In practice, this kind of equality rarely holds, as the reporting pattern depends on various factors. There may be other factors besides ADR–drug status affecting the event of reporting to the SR database [6], but in this work, we assume the influence of those factors to be ignorable.

To illustrate how the reporting probabilities can affect the derivation of $IC^{SR}$ from IC, we consider the data [1] in Table II. Here, the drug prograph and the ADR liver transplant rejection (LTR) are compared with respect to those patients with liver transplantation in the SR database of the US FDA over the time span 2006–2008. The OADR in Table II means other ADRs. From Table II, the estimated reporting odds ratio $\widehat{\mathrm{ROR}}$ is 0.98, and the corresponding estimate of $IC^{SR}$, denoted by $\widehat{IC}^{SR}$, is -0.01. Both estimates indicate no association between the drug prograph and the ADR liver transplant rejection based on the SR data. If there is no reporting anomaly in the SR data, we could take the information component measure as the representative of the corresponding source population measure. In practice, this may not be the case. Considering different reporting probabilities to see the deviation of the IC from the $IC^{SR}$, as in Table III, it is evident that the corresponding source population estimate $\widehat{IC}$ can go either way depending on the value of $\kappa$ in (7). For example, $\widehat{IC} = 3.31$ when $\kappa = 0.1$, whereas $\widehat{IC} = -1.26$ when $\kappa = 2.4$. Therefore, any decision that has been taken based on $\widehat{IC}^{SR}$ may turn out to be wrong depending on the reporting probabilities. In other words, there may be a false positive or false negative decision based on only SR data.

2.2. Empirical Bayes geometric mean

DuMouchel [4] introduced an empirical Bayes signal detection procedure. This methodology assumes that each observed count $n_{ij}$, corresponding to D = i and A = j (Table I), for i, j = 0, 1, is a draw from a Poisson distribution with an unknown mean $\mu_{ij}$, and interest centers on the ratio $\lambda_{ij} = \mu_{ij} / E_{ij}$, where $E_{ij}$ is the expected count under the null hypothesis of no association between the drug and the ADR. It also assumes that each $\lambda$ is drawn from a common prior distribution, a mixture of two gamma distributions, instead of treating different $\lambda$s as unrelated. The ratio may be interpreted in similar fashion as relative risk with value 1 under the null hypothesis. This method provides Bayesian estimates of the $\lambda_{ij}$s and, in particular, an empirical Bayes measure given by $EBGM_{ij} = 2^{EBlog2_{ij}}$ with $EBlog2_{ij} = E[\log_2(\lambda_{ij}) \mid n_{ij}]$. Without delving into the details of this Bayesian methodology, we incorporate the reporting probabilities into the quantities $\lambda_{ij}$ of interest to investigate how these are affected. It can be easily seen that, for i, j = 0, 1, and N being the size of the source population,

$$\lambda_{ij} = \frac{\mu_{ij}}{E_{ij}} = \frac{N P(D = i, A = j)}{N P(D = i)\, P(A = j)} = \frac{P(D = i, A = j)}{P(D = i)\, P(A = j)} = \frac{\pi_{ij}}{(\pi_{i1} + \pi_{i0})(\pi_{1j} + \pi_{0j})}, \tag{8}$$

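with the corresponding quantity based on the SR data only written as

$$\lambda^{SR}_{ij} = \frac{P(D = i, A = j \mid R = 1)}{P(D = i \mid R = 1)\, P(A = j \mid R = 1)} = \frac{\theta_{ij}\pi_{ij} \left( \sum_{k,l=0}^{1} \theta_{kl}\pi_{kl} \right)}{(\theta_{i1}\pi_{i1} + \theta_{i0}\pi_{i0})(\theta_{1j}\pi_{1j} + \theta_{0j}\pi_{0j})}. \tag{9}$$

Similar probability calculation, as in Section 2.1, shows that

$$\lambda^{SR}_{ij} = \lambda_{ij} \times \frac{P(R = 1 \mid D = i, A = j)\, P(R = 1)}{P(R = 1 \mid D = i)\, P(R = 1 \mid A = j)} = \frac{\pi_{ij}}{(\pi_{i1} + \pi_{i0})(\pi_{1j} + \pi_{0j})} \times \frac{\theta_{ij} \left( \sum_{k,l=0}^{1} \theta_{kl}\pi_{kl} \right) (\pi_{i1} + \pi_{i0})(\pi_{1j} + \pi_{0j})}{(\theta_{i1}\pi_{i1} + \theta_{i0}\pi_{i0})(\theta_{1j}\pi_{1j} + \theta_{0j}\pi_{0j})}. \tag{10}$$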
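As a rough numerical companion to (8)–(10), the following sketch computes plug-in (non-Bayesian) versions of the SR-based measures from the counts of Table II and then rescales them by an assumed bias factor, as in (10) for the (1,1) cell. The function names and the name `kappa` for the bias factor are illustrative choices, not the authors' code, and the plug-in values need not agree exactly with the published Bayes estimates.

```python
import math

def sr_measures(n11, n10, n01, n00):
    """Plug-in SR-based measures from a 2x2 table (rows: drug, columns: ADR)."""
    n = n11 + n10 + n01 + n00
    n1_dot = n11 + n10          # reports involving the drug under study
    n_dot1 = n11 + n01          # reports involving the ADR under study
    ror = (n11 * n00) / (n01 * n10)
    lam_sr = (n11 * n) / (n1_dot * n_dot1)   # observed / expected for cell (1, 1)
    return ror, lam_sr, math.log2(lam_sr)

def source_population_measures(lam_sr, kappa):
    """Undo an assumed reporting-bias factor kappa: lambda = lambda_SR / kappa."""
    lam = lam_sr / kappa
    return lam, math.log2(lam)

# Table II counts: Prograph/LTR, Prograph/OADR, other drugs/LTR, other drugs/OADR
ror, lam_sr, ic_sr = sr_measures(27, 152, 53, 293)
print(f"ROR = {ror:.2f}, lambda_SR = {lam_sr:.2f}, IC_SR = {ic_sr:.2f}")

for kappa in (0.1, 0.5, 1.0, 2.0):   # hypothetical bias factors
    lam, ic = source_population_measures(lam_sr, kappa)
    print(f"kappa = {kappa:.1f}: lambda_11 = {lam:.2f}, IC = {ic:.2f}")
```

A bias factor below 1 pushes the source population measure above the SR-based one, and a factor above 1 pulls it below, which is the pattern summarized in Table III.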

The reporting bias, the second term in the RHS of (10), has a similar interpretation as given in case of OR and IC. As before, the measure $\lambda^{SR}_{ij}$ coincides with the corresponding source population measure $\lambda_{ij}$, when the reporting status (R) does not depend on any one or both of the ADR status (A) and drug status (D). From the definition of the IC in Section 2.1 and the quantities $\lambda_{ij}$ in Section 2.2, it is clear that $IC = \log_2 \lambda_{11}$ and $IC^{SR} = \log_2 \lambda^{SR}_{11}$. In other words, both the measures compare the joint probability of the ADR and the drug under study with the product of corresponding marginal probabilities. Therefore, although the methodologies of Bate et al. (1998) and DuMouchel (1999) are different, the two basic quantities (IC and $\lambda_{ij}$) are essentially the same. For the illustration with liver transplant patients, as in Section 2.1, the corresponding estimate of $\lambda^{SR}_{11}$ is $2^{-0.01} = 0.99$ (Table II), indicating no association. However, as before, the source population estimate $\hat{\lambda}_{11}$ can go either way, depending on the value of $\kappa$ (Table III).

2.3. Effect of reporting probabilities

We now investigate the effect of the different reporting probabilities on the three association measures in (2), (6), and (10), respectively. We first study the effect of changing one of the $\theta_{ij}$s, keeping the other three fixed. Clearly, the reporting bias in OR, given by $\theta_{11}\theta_{00} / (\theta_{01}\theta_{10})$ in (2), is increasing in $\theta_{11}$ and $\theta_{00}$, but decreasing in $\theta_{10}$ and $\theta_{01}$ (Table IV). It can be easily shown that the reporting bias $\kappa$ in IC, given by (7), is an increasing function of $\theta_{00}$ and a decreasing function of $\theta_{01}$ and $\theta_{10}$, when other parameters are fixed (Table IV). However, the effect of $\theta_{11}$ on $\kappa$ is non-monotonic. For the sake of comparison, we present $\log_2(\mathrm{ROR})$ and $IC^{SR}$ in Table IV. We do not report $\lambda^{SR}_{11}$ because this is directly related to $IC^{SR}$ (Section 2.2). In the first panel of Table IV, $\log_2(\mathrm{ROR})$ increases with $\theta_{11}$, whereas $IC^{SR}$ increases with $\theta_{11}$ up to a value near $\theta_{11} = 0.80$, after which it decreases. From second and third panels of Table IV, it is clear that both the measures $\log_2(\mathrm{ROR})$ and $IC^{SR}$ are decreasing functions of $\theta_{10}$ and $\theta_{01}$, respectively, when other parameters are fixed. It is evident from the fourth panel of Table IV that both the measures are increasing functions of $\theta_{00}$. The value of $\log_2(\mathrm{ROR})$ remains unchanged with interchange of the reporting probabilities $\theta_{11}$ with $\theta_{00}$, or $\theta_{01}$ with $\theta_{10}$ (see the fifth panel of Table IV), while $IC^{SR}$ may vary with such interchange of reporting probabilities, as seen from (7).
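The behaviour described in this subsection can be checked numerically. The sketch below evaluates the two bias factors, the one multiplying OR in (1)–(2) and the one in (7), for given reporting and cell probabilities; the function name, the dictionary layout and the names `theta`, `pi` and `kappa` are illustrative assumptions rather than the authors' notation or code. With all cell probabilities equal to 0.25, the population OR is 1 and the population IC is 0, so the base-2 logarithms of the two factors are the log2(ROR) and IC^SR values of Table IV.

```python
import math

def reporting_bias(theta, pi):
    """Return (OR bias factor, IC bias factor) for cell (1, 1).

    theta[(i, j)] = P(R = 1 | D = i, A = j), pi[(i, j)] = P(D = i, A = j).
    """
    p_report = sum(theta[c] * pi[c] for c in pi)            # P(R = 1)
    or_bias = theta[1, 1] * theta[0, 0] / (theta[0, 1] * theta[1, 0])
    kappa = (theta[1, 1] * p_report
             * (pi[1, 1] + pi[1, 0]) * (pi[1, 1] + pi[0, 1])
             / ((theta[1, 1] * pi[1, 1] + theta[1, 0] * pi[1, 0])
                * (theta[1, 1] * pi[1, 1] + theta[0, 1] * pi[0, 1])))
    return or_bias, kappa

# Equal cell probabilities, as in Table IV
pi = {(1, 1): 0.25, (1, 0): 0.25, (0, 1): 0.25, (0, 0): 0.25}

rows = []
for t11 in (0.50, 0.70, 0.80, 0.99):      # vary theta_11 only (first panel)
    theta = {(1, 1): t11, (1, 0): 0.30, (0, 1): 0.40, (0, 0): 0.60}
    or_bias, kappa = reporting_bias(theta, pi)
    rows.append((t11, or_bias, kappa))
    print(f"theta11 = {t11:.2f}: OR bias = {or_bias:.4f}, kappa = {kappa:.4f}, "
          f"log2(ROR) = {math.log2(or_bias):.4f}, IC_SR = {math.log2(kappa):.4f}")
```

The OR bias factor keeps growing with the (1,1) reporting probability, while the IC bias factor flattens and then declines, which is the non-monotonic effect noted above.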

3. AN ESTIMATE BASED ON SR DATABASE

Here, we assume that all the ADRs under study (called ‘cases’) are reported to the SR database [1] (i.e., 100% reporting of cases with $P[R = 1 \mid A = 1] = 1$). Although, in general, under-reporting of ADR is a severe problem, steps are being taken to improve the current scenario. For example, the Drug Safety Research Unit (http://www.dsru.org/) in Southampton, UK, uses the prescription event monitoring (PEM) which collects data on all prescriptions for the first 20,000–50,000 patients for a new drug, resulting in 100% reporting of ADR for some drug–ADR combinations in PEM data. Also, this assumption may be reasonable in case of serious ADR as health professionals are well-informed about the possible adverse effects of the drug. Note that, with this assumption, the cells (1,1) and (0,1), the first column of Table I, are free from reporting bias. But the other two cells (1,0) and (0,0), the second column of Table I corresponding to A = 0, still suffer from reporting bias. Simple probability calculations show that, in such case, $\theta_{11} = \theta_{01} = 1$ and also $P(D = 1 \mid A = 1, R = 1) = P(D = 1 \mid A = 1)$; one can also show that $\kappa$ in (7) simplifies to $P(D = 1) / P(D = 1 \mid R = 1)$.
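The simplification of the bias factor in (7) follows in a few lines once every case is assumed to be reported. The derivation below is a sketch of that step, writing $\kappa$ for the bias factor in (7) (a symbol chosen here for convenience) and using only $P(R = 1 \mid A = 1) = 1$, which also forces $P(R = 1 \mid D = 1, A = 1) = 1$.

```latex
\begin{aligned}
\kappa
  &= \frac{P(R = 1 \mid D = 1, A = 1)\, P(R = 1)}
          {P(R = 1 \mid D = 1)\, P(R = 1 \mid A = 1)}
   = \frac{P(R = 1)}{P(R = 1 \mid D = 1)} \\
  &= \frac{P(R = 1)\, P(D = 1)}{P(D = 1, R = 1)}
   = \frac{P(D = 1)}{P(D = 1 \mid R = 1)} .
\end{aligned}
```

Under this assumption the bias factor involves only the exposure distribution, in the source population and among the reports, which is why exposure information alone is enough to correct the SR-based measure.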

Table IV. Effect of reporting probabilities on the association measures based on SR data only with $\pi_{11} = \pi_{01} = \pi_{10} = \pi_{00} = 0.25$ when corresponding population measures OR and IC are 1 and 0, respectively.

Panel   (theta11 theta10 theta01 theta00)   theta11*theta00/(theta01*theta10)   kappa     log2(ROR)   IC^SR
1       0.50 0.30 0.40 0.60                 2.5000                              1.2500    1.3219      0.3219
        0.60 0.30 0.40 0.60                 3.0000                              1.2667    1.5850      0.3410
        0.70 0.30 0.40 0.60                 3.5000                              1.2727    1.8074      0.3479
        0.80 0.30 0.40 0.60                 4.0000                              1.2727    2.0000      0.3479
        0.81 0.30 0.40 0.60                 4.0500                              1.2725    2.0179      0.3477
        0.82 0.30 0.40 0.60                 4.1000                              1.2722    2.0356      0.3474
        0.90 0.30 0.40 0.60                 4.5000                              1.2692    2.1699      0.3440
        0.99 0.30 0.40 0.60                 4.9500                              1.2643    2.3074      0.3384
2       0.50 0.50 0.40 0.60                 1.5000                              1.1111    0.5850      0.1520
        0.50 0.70 0.40 0.60                 1.0714                              1.0185    0.0995      0.0265
        0.50 0.90 0.40 0.60                 0.8333                              0.9524    -0.2630     -0.0704
        0.50 1.00 0.40 0.60                 0.7500                              0.9259    -0.4150     -0.1110
3       0.50 0.30 0.50 0.60                 2.0000                              1.1875    1.0000      0.2479
        0.50 0.30 0.70 0.60                 1.4286                              1.0938    0.5146      0.1293
        0.50 0.30 0.90 0.60                 1.1111                              1.0268    0.1520      0.0381
        0.50 0.30 1.00 0.60                 1.0000                              1.0000    0.0000      0.0000
4       0.50 0.30 0.40 0.70                 2.9167                              1.3194    1.5443      0.3999
        0.50 0.30 0.40 0.80                 3.3333                              1.3889    1.7370      0.4739
        0.50 0.30 0.40 1.00                 4.1667                              1.5278    2.0589      0.6114
5       0.50 0.30 0.40 0.60                 2.5000                              1.2500    1.3219      0.3219
        0.60 0.30 0.40 0.50                 2.5000                              1.2000    1.3219      0.2630

SR, spontaneous reporting; OR, odds ratio; IC, information component; ROR, reporting odds ratio.


When $P(D = 1)$ is known, we can obtain an estimate of $\kappa$ by estimating $P(D = 1 \mid R = 1)$ using the information from the SR data. When exposure probability is unknown, we consider a reference sample of size m from the source population and observe the exposure status to estimate $P(D = 1)$. A reference sample can be drawn from some external source other than the SR database, for example, a prescription database. In the UK, a general practitioner provides primary health care and issues prescriptions for the medicines considered medically necessary to all the persons registered with him or her. This facility is provided to virtually all the persons in the UK. A patient needs to take the prescription to a pharmacist for the medication. The information in the prescription is sent to a central Prescription Pricing Authority (PPA) which arranges the reimbursement of the pharmacist [7]. From this PPA database, the source population corresponding to a particular disease can be screened. A random sample with exposure information from this source population can be treated as a reference sample. Now, under the assumption of 100% reporting of cases and using (4) and (6), we have

$$IC = \log_2 \frac{P(D = 1, A = 1 \mid R = 1)}{P(D = 1)\, P(A = 1 \mid R = 1)}, \tag{11}$$

which can be estimated by $\widehat{IC} = \log_2 [n_{11} m / (n_{\cdot 1} y)]$, where y is the number of individuals using the drug under study in the reference sample of size m. Variance of this estimate $\widehat{IC}$ can be obtained using the delta method as described in the Appendix.

For illustration, we consider data from the Netherlands Pharmacovigilance Centre Lareb (NPCL) SR database [8], consisting of 9822 reports from health professionals in The Netherlands, concerning patients older than 50 years between 1 January 1990 and 1 January 1999 (Table V). Here, the objective is to find whether the diuretics are associated with the ADR congestive heart failure (CHF) [9]. We assume that the source population consists of the patients suffering from cardiovascular disease and stroke because we have no information about the disease for which the drugs (diuretic and other drugs) have been taken. We also assume 100% reporting of cases (CHF) to the NPCL SR database. The probability $P(D = 1 \mid R = 1)$ can be estimated from Table V as (78 + 1697)/9822 = 0.18. However, the exposure probability $P(D = 1)$ cannot be estimated from the SR data because the control sample is suffering from reporting bias. The exposure probability is obtained from the study of Pharmaceutical Use and Expenditure for Cardiovascular Disease and Stroke (PUECDS) [10]. This study reported the average percent of diuretic use (ATC code beginning with C03) from hospital outlets in The Netherlands for the time period 1989–1999 as 14.8% ([10], p21). In order to estimate the population IC, we use this information to obtain the probability of exposure to diuretic in the source population as 0.148.

Table V. Cross-classified data for drug diuretic and ADR CHF from NPCL.

                    CHF
Diuretic     Present    Absent
Present      78         1697
Absent       227        7820

ADR, adverse drug reaction; CHF, congestive heart failure; NPCL, Netherlands Pharmacovigilance Centre Lareb.

Because the concerned periods for both NPCL SR database and PUECDS [10] study are similar, it is assumed that the information provided in the PUECDS study and the events captured in the NPCL database both approximately represent the same source population of interest. In other words, we assume the exposure probability in the PUECDS study to be invariant with respect to age of the patients. Using (11), the estimated source population IC is 0.79, whereas the corresponding estimate of $IC^{SR}$ is 0.5. The standard error of the estimated IC is 0.14, calculated using the delta method as shown in the Appendix, considering the exposure probability as fixed. The estimate of $\lambda_{11} = 2^{IC}$ is 1.73 with the corresponding standard error 0.24, whereas the estimate of $\lambda^{SR}_{11}$ is $2^{0.5} = 1.414$. This indicates a positive association between the drug and CHF ([1,9]). Because we have imposed the assumption of 100% reporting of cases to the NPCL SR database, the results of the analysis should be interpreted as an indication subject to validity of this assumption. Note that the exposure probability from the PUECDS study, used in this example, is based on data from hospital outlets, which may not be a good representative of the corresponding source population. Estimation of OR requires additional information on the size of the source population ([1]) although the estimate turns out to be robust against misspecification of this assumption.
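The point estimates quoted in this example can be reproduced with a few lines of arithmetic. The sketch below is a plug-in calculation from the counts of Table V and the external exposure probability 0.148; the function names are hypothetical, and the exposure probability is treated as a known constant, as in the text.

```python
import math

def ic_with_known_exposure(n11, n01, p_exposure):
    """Plug-in estimate of (11): log2 of P(D=1 | A=1, R=1) / P(D=1).

    n11 / (n11 + n01) estimates the share of reported cases exposed to the drug.
    """
    return math.log2(n11 / ((n11 + n01) * p_exposure))

def ic_sr(n11, n10, n01, n00):
    """Plug-in estimate of the SR-based information component."""
    n = n11 + n10 + n01 + n00
    return math.log2(n11 * n / ((n11 + n10) * (n11 + n01)))

# Table V: diuretic/CHF, diuretic/no CHF, no diuretic/CHF, no diuretic/no CHF
n11, n10, n01, n00 = 78, 1697, 227, 7820

ic_hat = ic_with_known_exposure(n11, n01, p_exposure=0.148)
ic_sr_hat = ic_sr(n11, n10, n01, n00)
print(f"source population IC estimate = {ic_hat:.2f}")
print(f"SR-based IC estimate          = {ic_sr_hat:.2f}")
print(f"lambda_11 estimate            = {2 ** ic_hat:.2f}")
```

Only the case column of Table V and the external exposure probability enter the source population estimate; the non-case counts, which are subject to under-reporting, are needed only for the SR-based version.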

4. DISCUSSION

The primary purpose of SR databases is to generate hypotheses regarding the relationship between drugs and ADRs and not so much about establishing a causal relationship between them [5]. However, the ultimate objective of pharmacovigilance is to detect whether a particular drug is responsible for a particular ADR. In other words, after detecting a signal from an SR database, we have to consider further studies, for example, an epidemiological study, to come up with a decision regarding the association. The main problem with SR data, which prevents us from obtaining a stronger association measure, is under-reporting of ADRs. This makes all association measures based on SR data only ‘reporting measures’, not measures of association in the corresponding source population. This work intends to bridge this gap.

We have shown how source population association measures can be obtained from the measures based on SR data using different reporting probabilities. These reporting probabilities are not easily available in practice. Nevertheless, steps may be taken to gather information on the reporting probabilities from external sources. In this regard, we have discussed the importance of PPA in Section 3. The example shown at the end of Section 3 indicates how the methodology can be useful in the existing structure. This work also hints at the necessity of linking different databases to make the reference samples readily available. More work in this direction is needed to have a greater understanding of the usefulness of the methodology.

The methodology developed in this work is based on SR data and some external information in terms of the exposure probability in the source population from which the SR data arises. These SR data relate to a national or international SR database. Note that, for a regional SR database, it may be easier to obtain this external information. However, if information from the prescription databases of different countries, for example, is available, then an estimate of the global exposure probability of the concerned drug can be obtained.

For the sake of simplicity, we have not gone into the details of the Bayesian approaches in both BCPNN and EBGM. The objective has been to derive the relationship between the source population measures and those in the SR database. As a result, when we have information on the reporting probabilities, the source population measures can be obtained from the SR database. The aim has been to assess the measures of BCPNN and EBGM in view of reporting bias and in the direction of early detection of the drug–ADR relationship due to public health concern and the related cost of delaying the detection.

Acknowledgements

The authors are thankful to the associate editor and two anonymous referees for their helpful comments which have improved the paper.

REFERENCES

[1] Ghosh P, Dewanji A. Analysis of spontaneous adverse drug reaction (ADR) reports using supplementary information. Statistics in Medicine 2011; 30:2040–2055.
[2] Roux E, Thiessard F, Fourrier A, Begaud B, Tubert-Bitter P. Evaluation of statistical association measures for the automatic signal generation in pharmacovigilance. IEEE Transactions on Information Technology in Biomedicine 2005; 9(4):518–527.
[3] Bate A, Lindquist M, Edwards IR, Olsson S, Orre R, Lansner A, De Freitas RM. A Bayesian neural network method for adverse drug reaction signal generation. European Journal of Clinical Pharmacology 1998; 54:315–321.
[4] DuMouchel W. Bayesian data mining in large frequency tables, with an application to the FDA spontaneous reporting system. American Statistician 1999; 53:177–190.
[5] Bate A, Evans SJW. Quantitative signal detection using spontaneous ADR reporting. Pharmacoepidemiology and Drug Safety 2009; 18:427–436.
[6] van Puijenbroek EP, Bate A, Leufkens HGM, Lindquist M, Orre R, Egberts ACG. A comparison of measures of disproportionality for signal detection in spontaneous reporting systems for adverse drug reactions. Pharmacoepidemiology and Drug Safety 2002; 11:3–10.
[7] Mann RD. Prescription-event monitoring-recent progress and future horizons. British Journal of Clinical Pharmacology 1998; 46:195–201.
[8] van der Heijden PGM, van Puijenbroek EP, van Buuren S, van der Hofstede JW. On the assessment of adverse drug reactions from spontaneous reporting systems: the influence of under-reporting on odds ratios. Statistics in Medicine 2002; 21:2027–2044.
[9] Heerdink ER, Leufkens HG, Herings RMC, Ottervanger JP, Stricker BHC, Bakker A. NSAIDs associated with increased risk of congestive heart failure in elderly patients taking diuretics. Archives of Internal Medicine 1998; 158:1108–1112.
[10] Dickson M, Jacobzone S. Pharmaceutical use and expenditure for cardiovascular disease and stroke: a study of 12 OECD countries. OECD Health Working Papers, No. 1, 2003.

APPENDIX

The variance of the estimate $\widehat{IC}$ obtained from (11) can be calculated by applying the delta method and using the estimated covariance matrix of $(n_{11}, n_{01}, n_{\cdot 0}, y)$, with $n_{\cdot 0} = n_{10} + n_{00}$, where $n_{ij}$, for i, j = 0, 1, are the cell frequencies as in Table I, and y is the number of exposed individuals in the reference sample of size m. In order to define the covariance matrix of $(n_{11}, n_{01}, n_{\cdot 0}, y)$, note that y and $(n_{11}, n_{01}, n_{\cdot 0})$ are independent with

$$y \sim \mathrm{Bin}(m, p_e), \qquad (n_{11}, n_{01}, n_{\cdot 0}) \sim \mathrm{Multinomial}(n, p_{11}, p_{01}, p_{\cdot 0}),$$

where $p_{11} = P[D = 1, A = 1 \mid R = 1]$, $p_{01} = P[D = 0, A = 1 \mid R = 1]$, $p_{\cdot 0} = P[A = 0 \mid R = 1]$ and $p_e = P(D = 1)$, with $p_{11} + p_{01} + p_{\cdot 0} = 1$. Note that the estimates of $p_{11}$, $p_{01}$ and $p_e$ are given by $n_{11}/n$, $n_{01}/n$ and $y/m$, respectively. Variance of $\widehat{IC}$ is approximately

$$(g')^{T} V g', \tag{12}$$

where

$$V = \begin{bmatrix} \dfrac{p_{11}(1 - p_{11})}{n} & -\dfrac{p_{11} p_{01}}{n} & 0 \\ -\dfrac{p_{11} p_{01}}{n} & \dfrac{p_{01}(1 - p_{01})}{n} & 0 \\ 0 & 0 & \dfrac{p_e(1 - p_e)}{m} \end{bmatrix}, \tag{13}$$

and the function g is given by

$$g = \log_2 \left( \frac{p_{11}}{(p_{11} + p_{01})\, p_e} \right) \tag{14}$$

with $g'$ denoting the vector of partial derivatives of g with respect to $(p_{11}, p_{01}, p_e)$. The estimate of the variance can be obtained by replacing $(p_{11}, p_{01}, p_e)$ in (12) by the corresponding estimates. For the example in Section 3, the variance is estimated by assuming $p_e$ to be known as 0.148. In this case, only the multinomial distribution of $(n_{11}, n_{01}, n_{\cdot 0})$ is considered and the first 2 × 2 submatrix of V is used in the calculation of (12) with $g'$ being the 2 × 1 vector of partial derivatives of g with respect to $(p_{11}, p_{01})$.
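The delta-method calculation in (12)–(14) can be written out directly. The following is a minimal sketch, assuming the partial derivatives of g are taken analytically; the function name and its optional arguments `p_e` and `m` (for the case where the exposure probability is estimated from a reference sample) are illustrative and not part of the paper.

```python
import math

def ic_standard_error(n11, n01, n, p_e=None, m=None):
    """Delta-method standard error of the IC estimate based on (11).

    With p_e and m left as None, the exposure probability is treated as known,
    so only the 2x2 multinomial block of V enters the calculation.
    """
    p11, p01 = n11 / n, n01 / n
    ln2 = math.log(2)
    # Partial derivatives of g = log2(p11 / ((p11 + p01) * p_e))
    g1 = (1 / p11 - 1 / (p11 + p01)) / ln2      # with respect to p11
    g2 = (-1 / (p11 + p01)) / ln2               # with respect to p01
    # Entries of the multinomial covariance block of V
    v11 = p11 * (1 - p11) / n
    v22 = p01 * (1 - p01) / n
    v12 = -p11 * p01 / n
    var = g1 * g1 * v11 + 2 * g1 * g2 * v12 + g2 * g2 * v22
    if p_e is not None and m is not None:
        g3 = -1 / (p_e * ln2)                   # with respect to p_e
        var += g3 * g3 * p_e * (1 - p_e) / m
    return math.sqrt(var)

# Section 3 example: 78 exposed cases, 227 unexposed cases, 9822 reports in total
se_fixed = ic_standard_error(78, 227, 9822)
print(f"standard error with known exposure probability = {se_fixed:.2f}")
```

Passing a reference-sample size through `m` adds the binomial term of V, so the standard error can only increase relative to the known-exposure case.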
