Original Article

Notes on testing equality and interval estimation in Poisson frequency data under a three-treatment three-period crossover trial

Statistical Methods in Medical Research 0(0) 1–19 © The Author(s) 2014 Reprints and permissions: sagepub.co.uk/journalsPermissions.nav DOI: 10.1177/0962280213519249 smm.sagepub.com

Kung-Jong Lui1 and Kuang-Chao Chang2

Abstract

When the frequency of event occurrences follows a Poisson distribution, we develop procedures for testing equality of treatments and interval estimators for the ratio of mean frequencies between treatments under a three-treatment three-period crossover design. Using Monte Carlo simulations, we evaluate the performance of these test procedures and interval estimators in various situations. We note that all test procedures developed here can perform well with respect to Type I error even when the number of patients per group is moderate. We further note that the two weighted-least-squares (WLS) test procedures derived here are generally preferable to the other two commonly used test procedures in contingency table analysis. We also demonstrate that the interval estimators based on the WLS method and those based on the Mantel-Haenszel (MH) approach both perform well and are essentially of equal precision with respect to the average length. We use a double-blind randomized three-treatment three-period crossover trial comparing salbutamol and salmeterol with a placebo with respect to the number of exacerbations of asthma to illustrate the use of these test procedures and estimators.

Keywords

Poisson distribution, equality, interval estimators, count data, crossover design, power, precision

1 Introduction

When studying treatments for non-curable chronic diseases, such as epilepsy, angina pectoris, hypertension or asthma, we may often consider the use of a crossover trial to reduce the number of patients needed relative to a parallel groups design.1–6 Research on crossover designs has been intensive, and reviews of the literature covering various aspects of crossover trials are available.3–9 Most publications on crossover trials have focused on continuous data under normality assumptions or on binary data under a random effects logistic risk model with a

1 Department of Mathematics and Statistics, College of Sciences, San Diego State University, San Diego, CA, USA
2 Department of Statistics and Information Science, Fu-Jen Catholic University, New Taipei, Taiwan, ROC

Corresponding author: Kung-Jong Lui, Department of Mathematics and Statistics, San Diego State University, San Diego, CA 92182-7720, USA. Email: [email protected]

Downloaded from smm.sagepub.com at UNIVERSITE DE MONTREAL on June 30, 2015

XML Template (2014) [9.1.2014–6:07pm] [1–19] //blrnas3/cenpro/ApplicationFiles/Journals/SAGE/3B2/SMMJ/Vol00000/130240/APPFile/SG-SMMJ130240.3d (SMM) [PREPRINTER stage]


two-period crossover trial.4–16 In practice, however, we can come across count data, such as the number of seizures in epilepsy or the number of exacerbations in asthma, under a three-period crossover trial.1,17–19 For example, consider the double-blind crossover trial comparing the number of exacerbations between salbutamol (400 μg four times daily), salmeterol (50 μg twice daily) and placebo in asthma patients.1 The trial comprised a four-week run-in period and three treatment periods of 24 weeks, each of which was followed by a four-week washout interval. There were 165 patients randomly assigned to one of the six groups distinguished by the six possible treatment-receipt sequences. Five patients (≈3%) had incomplete information on the number of exacerbations. We summarize in Table 1 the data (kindly provided by Profs Herbison and Taylor at the University of Otago) on the subtotal frequencies of exacerbations for the remaining 160 patients in the six groups at the three periods, together with the number of patients assigned to each group. We wish to study whether taking salbutamol or salmeterol can reduce the mean frequency of exacerbations as compared with the placebo, and to assess the relative treatment effects of the former to the latter. As a second example, we may consider the three-treatment three-period crossover trial, with each treatment period lasting 16 weeks, comparing 250 μg of fluticasone twice daily, 100 μg of fluticasone plus 50 μg of a long-acting beta-agonist twice daily, and 100 μg of fluticasone twice daily plus 5 or 10 mg of the leukotriene-receptor antagonist montelukast daily.17 The initial four weeks of the last two 16-week periods were considered to be the active washout from the previous period. We may wish to study whether the mean numbers of asthma non-control days differ among treatments.

Note that count data, such as the number of seizures in epilepsy or the number of exacerbations in asthma, are discrete and are often skewed to the right. The common normality assumption can be seriously violated, and hence statistical methods derived under normality for a three-treatment three-period crossover trial5 are generally not appropriate for count data. Layard and Arvesen20 focused on a simple (or AB/BA) crossover trial and proposed a Poisson regression model assuming fixed patient effects. Senn5 applied arguments based on the Poisson process and proposed a closed-form estimator for a simple crossover trial. Recently, Lui and Chang have developed asymptotic and exact test procedures and estimators based on an assumed exponential mixed effects multiplicative risk model, again for the simple crossover trial.21,22 Lui23 has further developed sample size calculation formulae for testing equality in Poisson frequency under the AB/BA crossover design. None of these publications considers methods for the three-treatment three-period crossover trial that is the focus here.

Table 1. The subtotal frequencies $(Y_{+1}^{(g)}, Y_{+2}^{(g)}, Y_{+3}^{(g)})'$ of exacerbations in asthma over $n_g$ patients with complete information at three periods in group g (= 1, 2, 3, 4, 5, 6) determined by different treatment-receipt sequences (P = Placebo, A = Salbutamol, B = Salmeterol).

Group              (Y_{+1}^{(g)}, Y_{+2}^{(g)}, Y_{+3}^{(g)})'    n_g
P-A-B (g = 1)      (5, 4, 4)'                                      27
P-B-A (g = 2)      (26, 12, 34)'                                   26
A-P-B (g = 3)      (23, 24, 3)'                                    27
A-B-P (g = 4)      (12, 1, 4)'                                     26
B-P-A (g = 5)      (9, 14, 4)'                                     28
B-A-P (g = 6)      (3, 13, 19)'                                    26


Using a random eﬀect exponential multiplicative risk model, we derive asymptotic test procedures for testing equality between treatments and interval estimators of the ratio of mean frequencies in Poisson frequency data under a crossover trial. We employ Monte Carlo simulation to evaluate and compare the performance of these test procedures with respect to Type I error and power, as well as the performance of these interval estimators with respect to the coverage probability and average length. Finally, we use the data (Table 1) comparing salbutamol and salmeterol with a placebo with respect to the number of exacerbations in asthma to illustrate the use of test procedures and interval estimators developed here.

2 Notation, model assumptions and methods

Suppose that we compare two experimental treatments A and B with a placebo (P) under a three-period crossover trial. We use the treatment-receipt sequence X–Y–Z to denote that a patient receives treatments X, Y and Z at periods 1, 2 and 3, respectively. Suppose that we randomly assign $n_g$ patients to group g (= 1 with the P–A–B treatment-receipt sequence; = 2 with P–B–A; = 3 with A–P–B; = 4 with A–B–P; = 5 with B–P–A; and = 6 with B–A–P). For patient i (= 1, 2, ..., $n_g$) assigned to group g (= 1, 2, ..., 6), we let $Y_{it}^{(g)}$ denote the frequency of event occurrences at period t (= 1, 2, 3). Furthermore, we let $X_{it1}^{(g)}$ denote the indicator function of treatment-receipt for treatment A, with $X_{it1}^{(g)} = 1$ if the corresponding patient at period t receives treatment A, and = 0 otherwise. Similarly, we let $X_{it2}^{(g)}$ denote the indicator function of treatment-receipt for treatment B, with $X_{it2}^{(g)} = 1$ if the corresponding patient at period t receives treatment B, and = 0 otherwise. We further let $1_{i1}^{(g)}(t = 2)$ and $1_{i2}^{(g)}(t = 3)$ represent the indicator functions of period by setting $1_{i1}^{(g)}(t = 2) = 1$ for period t = 2, and = 0 otherwise; and $1_{i2}^{(g)}(t = 3) = 1$ for period t = 3, and = 0 otherwise.

As commonly assumed for a crossover design, we assume with an adequate washout period that there is no carry-over effect due to the treatment administered at an earlier period on the patient response. If the assumption of no carry-over effect cannot be ensured on the basis of our subjective knowledge, as noted by Fleiss,4,24 Senn5,25 as well as Schouten and Kester,26 the crossover design should not be employed. Fleiss24 and Senn25 further contended that the simple carry-over model was not as useful as it was initially perceived to be, and that the best strategy for dealing with carry-over effects was to have an adequate washout period.

We assume that the random frequency $Y_{it}^{(g)}$ of event occurrences on patient i (i = 1, 2, ..., $n_g$) assigned to group g (g = 1, 2, ..., 6) at period t (t = 1, 2, 3) follows the Poisson distribution with mean that can be modeled as

$$E(Y_{it}^{(g)}) = \exp\left(\mu_i^{(g)} + \eta_1 X_{it1}^{(g)} + \eta_2 X_{it2}^{(g)} + \gamma_1 1_{i1}^{(g)}(t = 2) + \gamma_2 1_{i2}^{(g)}(t = 3)\right), \tag{1}$$

where $\mu_i^{(g)}$ represents the random effect due to the underlying characteristics of the ith patient assigned to group g and is assumed to independently follow an unspecified probability density function $f_g(\mu)$; $\eta_1$ and $\eta_2$, respectively, denote the effects of treatments A and B relative to a placebo; and $\gamma_1$ and $\gamma_2$, respectively, denote the effects for periods 2 and 3 versus period 1. As considered by Senn5 (p. 139–143) and Grizzle,10 we assume here that the relative treatment effect does not vary between patients (i.e. there is no patient-by-treatment interaction) and that the patient random effects $\mu_i^{(g)}$ are fixed from period to period (i.e. there is no patient-by-period interaction). When either of these interactions exists in a trial, we may sometimes be able to apply stratified analysis to


alleviate this concern. For example, if there is a patient-by-treatment interaction due to gender, we can form strata by gender and do a stratified analysis.5 We can assess the relative treatment effect for males and females separately when this effect varies between genders, or obtain a summary test or estimator when this effect is constant across genders. We refer readers to the book by Senn5 (p. 43–44), which provides a systematic and complete discussion of these and other types of interactions. As noted by Senn,5 however, none of the above interactions, including the carry-over effects, is unique to crossover trials; all can exist in the parallel groups design as well.

Given a fixed period, the ratio of two mean frequencies of event occurrences on a given patient between treatment A and a placebo under model (1) is equal to $RM_{AP} = \exp(\eta_1)$. If there is no difference in effects between treatment A and placebo, the ratio of two mean frequencies $RM_{AP} = 1$ (or equivalently, $\eta_1 = 0$). When treatment A increases the frequency of responses, $RM_{AP} > 1$. When treatment A decreases the frequency of responses, $RM_{AP} < 1$. Similarly, the ratio of two mean frequencies between treatment B and placebo for a fixed period on the same patient is $RM_{BP} = \exp(\eta_2)$. Interpretations similar to those for $RM_{AP}$ apply to $RM_{BP}$.

Conditional upon patient i in group g (= 1, 2, ..., 6), we can show that the random vector of frequencies $(Y_{i1}^{(g)}, Y_{i2}^{(g)}, Y_{i3}^{(g)})'$ of event occurrences on patient i assigned to group g at the three periods, given $Y_{i+}^{(g)} = Y_{i1}^{(g)} + Y_{i2}^{(g)} + Y_{i3}^{(g)} = y_{i+}^{(g)}$ fixed, follows the trinomial distribution with parameters $y_{i+}^{(g)}$ and $(p_1^{(g)}, p_2^{(g)}, p_3^{(g)})'$, where $(p_1^{(g)}, p_2^{(g)}, p_3^{(g)})'$ are defined by (19)–(24) in Appendix 1. We define $Y_{+t}^{(g)} = \sum_{i=1}^{n_g} Y_{it}^{(g)}$ as the total frequency of responses over the $n_g$ patients assigned to group g at period t (= 1, 2, 3). Note that because the vector of cell probabilities $(p_1^{(g)}, p_2^{(g)}, p_3^{(g)})'$ does not depend, as shown in Appendix 1, on the random effects $\mu_i^{(g)}$ under model (1), we can claim that the random vector $(Y_{+1}^{(g)}, Y_{+2}^{(g)}, Y_{+3}^{(g)})'$ of subtotal frequencies over patients in group g at the three periods follows the trinomial distribution with parameters $y_{++}^{(g)}$ ($= \sum_{i=1}^{n_g} y_{i+}^{(g)}$) and $(p_1^{(g)}, p_2^{(g)}, p_3^{(g)})'$.
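The cancellation of the patient random effect can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes that each cell probability is the period mean under model (1) divided by the sum of the three period means (the standard result for independent Poisson counts conditioned on their total), and the names `SEQUENCES` and `cell_probs` are hypothetical.

```python
import math

# Treatment received at periods 1, 2, 3 in each group g (sequences from Section 2).
SEQUENCES = {1: "PAB", 2: "PBA", 3: "APB", 4: "ABP", 5: "BPA", 6: "BAP"}


def cell_probs(sequence, eta1, eta2, gamma1, gamma2):
    """Trinomial cell probabilities (p1, p2, p3) for one treatment sequence.

    Each period mean under model (1) equals exp(mu_i) times a factor that
    depends only on treatment and period, so exp(mu_i) cancels on normalising.
    """
    treatment_effect = {"P": 0.0, "A": eta1, "B": eta2}
    period_effect = [0.0, gamma1, gamma2]
    factors = [math.exp(treatment_effect[trt] + period_effect[t])
               for t, trt in enumerate(sequence)]
    total = sum(factors)
    return [f / total for f in factors]


# Illustrative parameter values (assumptions chosen for the example only).
eta1, eta2, gamma1, gamma2 = math.log(1.2), math.log(1.5), 0.10, 0.15
p = {g: cell_probs(seq, eta1, eta2, gamma1, gamma2) for g, seq in SEQUENCES.items()}

# Period effects cancel in this cross-product ratio between groups 1 and 3,
# leaving exp(2 * eta1).
odds_ratio_11 = (p[1][1] * p[3][0]) / (p[1][0] * p[3][1])
print(odds_ratio_11, math.exp(2 * eta1))
```

The cross-product ratio at the end is the quantity that the WLS and MH estimators of Section 2.1 and 2.2 exploit: it depends on the treatment effect only, not on the period effects or the patient effects.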

2.1 Procedures for testing equality

Note that the equality $\eta_1 = \eta_2 = 0$ will imply that $(p_1^{(1)}, p_2^{(1)}, p_3^{(1)})' = (p_1^{(2)}, p_2^{(2)}, p_3^{(2)})' = \cdots = (p_1^{(6)}, p_2^{(6)}, p_3^{(6)})'$. Thus, when testing the null hypothesis $H_0: \eta_1 = \eta_2 = 0$, we can apply the commonly used Pearson's chi-squared test to study whether the distributions of cell frequencies are identical among groups.27 We will reject $H_0: \eta_1 = \eta_2 = 0$ at the $\alpha$-level if

$$\sum_{g=1}^{6} \sum_{t=1}^{3} \left[ Y_{+t}^{(g)} - (Y_{+t}^{(+)} y_{++}^{(g)})/y_{++}^{(+)} \right]^2 \Big/ \left[ (Y_{+t}^{(+)} y_{++}^{(g)})/y_{++}^{(+)} \right] > \chi^2_{\alpha}(10), \tag{2}$$

where $Y_{+t}^{(+)} = \sum_{g=1}^{6} Y_{+t}^{(g)}$, $y_{++}^{(+)} = \sum_{g=1}^{6} y_{++}^{(g)}$ and $\chi^2_{\alpha}(df)$ is the upper 100($\alpha$)th percentile of the central chi-squared distribution with df degrees of freedom. To test $H_0: \eta_1 = \eta_2 = 0$, we may also apply the asymptotic likelihood ratio test based on the above conditional multinomial distribution of $(Y_{+1}^{(g)}, Y_{+2}^{(g)}, Y_{+3}^{(g)})'$.27 We let $(p_1^{(0)}, p_2^{(0)}, p_3^{(0)})'$ denote the common value of $(p_1^{(g)}, p_2^{(g)}, p_3^{(g)})'$ (for g = 1, 2, ..., 6) under $H_0: \eta_1 = \eta_2 = 0$. Thus, we will reject $H_0: \eta_1 = \eta_2 = 0$ at the $\alpha$-level if

$$2 \left[ \sum_{g=1}^{6} \sum_{t=1}^{3} Y_{+t}^{(g)} \log(\hat{p}_t^{(g)}) - \sum_{t=1}^{3} Y_{+t}^{(+)} \log(\hat{p}_t^{(0)}) \right] > \chi^2_{\alpha}(10), \tag{3}$$

where $\hat{p}_t^{(g)} = Y_{+t}^{(g)}/y_{++}^{(g)}$ and $\hat{p}_t^{(0)} = \sum_{g=1}^{6} Y_{+t}^{(g)}/y_{++}^{(+)}$.
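As a rough illustration of procedures (2) and (3), the sketch below applies both statistics to the Table 1 counts. It is an assumption-laden example rather than the authors' implementation: the function names are hypothetical, the likelihood ratio statistic is written in the algebraically equivalent form 2 Σ O log(O/E), and the p-value uses the closed-form chi-squared tail that holds for even degrees of freedom.

```python
import math

# Subtotal counts (Y_{+1}, Y_{+2}, Y_{+3}) per group, taken from Table 1.
TABLE1 = {
    1: (5, 4, 4), 2: (26, 12, 34), 3: (23, 24, 3),
    4: (12, 1, 4), 5: (9, 14, 4), 6: (3, 13, 19),
}


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-squared variable; exact for even df."""
    half = x / 2.0
    return math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(df // 2))


def pearson_and_lrt(counts):
    """Return (Pearson X^2, likelihood-ratio G^2, df) for a groups-by-periods table."""
    rows = list(counts.values())
    n_periods = len(rows[0])
    row_totals = [sum(r) for r in rows]
    col_totals = [sum(r[t] for r in rows) for t in range(n_periods)]
    grand = sum(row_totals)
    x2 = g2 = 0.0
    for r, row_total in zip(rows, row_totals):
        for t in range(n_periods):
            expected = col_totals[t] * row_total / grand
            x2 += (r[t] - expected) ** 2 / expected
            # The log term requires strictly positive counts.
            g2 += 2.0 * r[t] * math.log(r[t] / expected)
    df = (len(rows) - 1) * (n_periods - 1)
    return x2, g2, df


x2, g2, df = pearson_and_lrt(TABLE1)
print(df, x2, chi2_sf_even_df(x2, df))
print(df, g2, chi2_sf_even_df(g2, df))
```

Both statistics are referred to the chi-squared distribution with (6 − 1)(3 − 1) = 10 degrees of freedom, matching the critical value in (2) and (3).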


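To estimate the parameter $\eta_1$ in model (1), we can apply the following weighted-least-squares (WLS) estimator28 (Appendix 1) as given by

$$\hat{\eta}_1^{(WLS)} = \left[ \sum_{k=1}^{3} W_{1k} \log(\widehat{OR}_{1k}) \Big/ \sum_{k=1}^{3} W_{1k} \right] \Big/ 2, \tag{4}$$

where $\widehat{OR}_{11} = (\hat{p}_2^{(1)} \hat{p}_1^{(3)})/(\hat{p}_1^{(1)} \hat{p}_2^{(3)})$, $\widehat{OR}_{12} = (\hat{p}_3^{(2)} \hat{p}_1^{(4)})/(\hat{p}_1^{(2)} \hat{p}_3^{(4)})$, $\widehat{OR}_{13} = (\hat{p}_3^{(5)} \hat{p}_2^{(6)})/(\hat{p}_2^{(5)} \hat{p}_3^{(6)})$, $W_{11} = 1/(1/Y_{+2}^{(1)} + 1/Y_{+1}^{(3)} + 1/Y_{+1}^{(1)} + 1/Y_{+2}^{(3)})$, $W_{12} = 1/(1/Y_{+3}^{(2)} + 1/Y_{+1}^{(4)} + 1/Y_{+1}^{(2)} + 1/Y_{+3}^{(4)})$, and $W_{13} = 1/(1/Y_{+3}^{(5)} + 1/Y_{+2}^{(6)} + 1/Y_{+2}^{(5)} + 1/Y_{+3}^{(6)})$. We can easily show that the estimated asymptotic variance for the WLS estimator $\hat{\eta}_1^{(WLS)}$ (4) is

$$\widehat{\mathrm{Var}}(\hat{\eta}_1^{(WLS)}) = 1 \Big/ \left[ 4 \sum_{k=1}^{3} W_{1k} \right]. \tag{5}$$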

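Similarly, to estimate the parameter $\eta_2$, we may use the WLS estimator (Appendix 1) as given by

$$\hat{\eta}_2^{(WLS)} = \left[ \sum_{k=1}^{3} W_{2k} \log(\widehat{OR}_{2k}) \Big/ \sum_{k=1}^{3} W_{2k} \right] \Big/ 2, \tag{6}$$

where $\widehat{OR}_{21} = (\hat{p}_3^{(1)} \hat{p}_1^{(6)})/(\hat{p}_1^{(1)} \hat{p}_3^{(6)})$, $\widehat{OR}_{22} = (\hat{p}_2^{(2)} \hat{p}_1^{(5)})/(\hat{p}_1^{(2)} \hat{p}_2^{(5)})$, $\widehat{OR}_{23} = (\hat{p}_3^{(3)} \hat{p}_2^{(4)})/(\hat{p}_2^{(3)} \hat{p}_3^{(4)})$, $W_{21} = 1/(1/Y_{+3}^{(1)} + 1/Y_{+1}^{(6)} + 1/Y_{+1}^{(1)} + 1/Y_{+3}^{(6)})$, $W_{22} = 1/(1/Y_{+2}^{(2)} + 1/Y_{+1}^{(5)} + 1/Y_{+1}^{(2)} + 1/Y_{+2}^{(5)})$, and $W_{23} = 1/(1/Y_{+3}^{(3)} + 1/Y_{+2}^{(4)} + 1/Y_{+2}^{(3)} + 1/Y_{+3}^{(4)})$. Furthermore, we may obtain an asymptotic variance estimator for $\hat{\eta}_2^{(WLS)}$ (6) as

$$\widehat{\mathrm{Var}}(\hat{\eta}_2^{(WLS)}) = 1 \Big/ \left[ 4 \sum_{k=1}^{3} W_{2k} \right]. \tag{7}$$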
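The WLS estimators (4)–(7) can be sketched in a few lines. The example below is illustrative only: the function name and the `PAIRS_A`/`PAIRS_B` encoding of which groups and periods enter each odds ratio are hypothetical conveniences, and the data are the Table 1 counts. Because the group totals cancel within each odds ratio, the counts $Y_{+t}^{(g)}$ are used directly in place of the estimated cell probabilities.

```python
import math

# Subtotal counts (Y_{+1}, Y_{+2}, Y_{+3}) per group, taken from Table 1.
TABLE1 = {
    1: (5, 4, 4), 2: (26, 12, 34), 3: (23, 24, 3),
    4: (12, 1, 4), 5: (9, 14, 4), 6: (3, 13, 19),
}

# Each entry pairs two groups as (group, period on treatment, period on placebo);
# periods are 1-based as in the text.
PAIRS_A = [((1, 2, 1), (3, 1, 2)), ((2, 3, 1), (4, 1, 3)), ((5, 3, 2), (6, 2, 3))]
PAIRS_B = [((1, 3, 1), (6, 1, 3)), ((2, 2, 1), (5, 1, 2)), ((3, 3, 2), (4, 2, 3))]


def wls_estimate(counts, pairs):
    """Return (estimate, variance) following (4)-(5) or (6)-(7)."""
    weighted_sum = weight_total = 0.0
    for (g1, trt1, plc1), (g2, trt2, plc2) in pairs:
        cells = [counts[g1][trt1 - 1], counts[g2][trt2 - 1],
                 counts[g1][plc1 - 1], counts[g2][plc2 - 1]]
        # Odds ratio of treatment-period to placebo-period counts across the pair.
        log_or = math.log(cells[0] * cells[1] / (cells[2] * cells[3]))
        weight = 1.0 / sum(1.0 / c for c in cells)
        weighted_sum += weight * log_or
        weight_total += weight
    return weighted_sum / weight_total / 2.0, 1.0 / (4.0 * weight_total)


eta1_hat, var1_hat = wls_estimate(TABLE1, PAIRS_A)
eta2_hat, var2_hat = wls_estimate(TABLE1, PAIRS_B)
print(math.exp(eta1_hat), var1_hat)
print(math.exp(eta2_hat), var2_hat)
```

For salbutamol versus placebo, the exponentiated estimate agrees to three decimals with the value 0.948 reported in Section 5.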

Note that the two WLS estimators $\hat{\eta}_1^{(WLS)}$ (4) and $\hat{\eta}_2^{(WLS)}$ (6) are correlated. Using the delta method,27 we can show that the estimated covariance $\widehat{\mathrm{Cov}}(\hat{\eta}_1^{(WLS)}, \hat{\eta}_2^{(WLS)})$ between $\hat{\eta}_1^{(WLS)}$ and $\hat{\eta}_2^{(WLS)}$ is given by (27). When $n_g$ is large, we may employ the normal approximation to test $H_0: \eta_1 = \eta_2 = 0$ based on (4)–(7), with Bonferroni's inequality to adjust for the inflation in Type I error due to multiple tests. We will reject $H_0: \eta_1 = \eta_2 = 0$ at the $\alpha$-level if either of the following two inequalities holds:

$$|\hat{\eta}_1^{(WLS)}| \Big/ \sqrt{\widehat{\mathrm{Var}}(\hat{\eta}_1^{(WLS)})} > Z_{\alpha/4} \quad \text{or} \quad |\hat{\eta}_2^{(WLS)}| \Big/ \sqrt{\widehat{\mathrm{Var}}(\hat{\eta}_2^{(WLS)})} > Z_{\alpha/4}, \tag{8}$$

where $Z_{\alpha}$ is the upper 100($\alpha$)th percentile of the standard normal distribution. Note that because the test procedure (8) does not account for the dependence structure between $\hat{\eta}_1^{(WLS)}$ (4) and $\hat{\eta}_2^{(WLS)}$ (6),


we may lose power. Thus, we consider the following bivariate test procedure accounting for $\widehat{\mathrm{Cov}}(\hat{\eta}_1^{(WLS)}, \hat{\eta}_2^{(WLS)})$. We will reject $H_0: \eta_1 = \eta_2 = 0$ at the $\alpha$-level if

$$(\hat{\eta}_1^{(WLS)}, \hat{\eta}_2^{(WLS)}) \, \hat{\Sigma}^{-1} \, (\hat{\eta}_1^{(WLS)}, \hat{\eta}_2^{(WLS)})' > \chi^2_{\alpha}(2), \tag{9}$$

where $\hat{\Sigma}$ is the estimated covariance matrix with diagonal elements equal to $\widehat{\mathrm{Var}}(\hat{\eta}_1^{(WLS)})$ (5) and $\widehat{\mathrm{Var}}(\hat{\eta}_2^{(WLS)})$ (7), and the off-diagonal element equal to $\widehat{\mathrm{Cov}}(\hat{\eta}_1^{(WLS)}, \hat{\eta}_2^{(WLS)})$ (27).

When treatments A and B are known to fall in the same relative direction as compared with the placebo, we may wish to account for this information to improve power and consider the following summary test procedure based on a weighted average of treatment effects $w\hat{\eta}_1^{(WLS)} + (1 - w)\hat{\eta}_2^{(WLS)}$, where $0 < w < 1$ is the weight reflecting the relative importance of treatment A to that of treatment B, and can be assigned by clinicians based on their subjective knowledge. This leads us to reject $H_0: \eta_1 = \eta_2 = 0$ at the $\alpha$-level if

$$\left| w\hat{\eta}_1^{(WLS)} + (1 - w)\hat{\eta}_2^{(WLS)} \right| \Big/ \sqrt{\widehat{\mathrm{Var}}(w\hat{\eta}_1^{(WLS)} + (1 - w)\hat{\eta}_2^{(WLS)})} > Z_{\alpha/2}, \tag{10}$$

where $\widehat{\mathrm{Var}}(w\hat{\eta}_1^{(WLS)} + (1 - w)\hat{\eta}_2^{(WLS)}) = w^2 \widehat{\mathrm{Var}}(\hat{\eta}_1^{(WLS)}) + (1 - w)^2 \widehat{\mathrm{Var}}(\hat{\eta}_2^{(WLS)}) + 2w(1 - w)\widehat{\mathrm{Cov}}(\hat{\eta}_1^{(WLS)}, \hat{\eta}_2^{(WLS)})$. If we have no prior preference for the weight w, or regard the two treatments as equally important, we may set w equal to 0.50. For the purpose of illustration, we will focus our attention on w = 0.50 in the following discussion.
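To make the three WLS-based decision rules concrete, the sketch below evaluates (8), (9) and (10) for a given pair of estimates. The function name and all input values are assumptions for illustration; in particular, the covariance is a placeholder and is not computed from (27), which is given in the appendix.

```python
import math
from statistics import NormalDist


def wls_tests(eta1, eta2, var1, var2, cov, alpha=0.05, w=0.5):
    """Return reject (True) or not (False) for procedures (8), (9) and (10)."""
    z = NormalDist().inv_cdf
    # (8): two univariate two-sided tests, each at level alpha / 2 (Bonferroni).
    z_bonf = z(1.0 - alpha / 4.0)
    bonferroni = (abs(eta1) / math.sqrt(var1) > z_bonf
                  or abs(eta2) / math.sqrt(var2) > z_bonf)
    # (9): quadratic form with the inverse of the 2x2 covariance matrix.
    det = var1 * var2 - cov ** 2
    quad = (eta1 ** 2 * var2 - 2.0 * eta1 * eta2 * cov + eta2 ** 2 * var1) / det
    # With 2 degrees of freedom the chi-squared upper tail is exp(-x / 2),
    # so the critical value is -2 * log(alpha).
    bivariate = quad > -2.0 * math.log(alpha)
    # (10): weighted average of the two estimates.
    combined = w * eta1 + (1.0 - w) * eta2
    combined_var = w ** 2 * var1 + (1.0 - w) ** 2 * var2 + 2.0 * w * (1.0 - w) * cov
    summary = abs(combined) / math.sqrt(combined_var) > z(1.0 - alpha / 2.0)
    return {"bonferroni": bonferroni, "bivariate": bivariate, "summary": summary}


# Illustrative inputs only; cov is a placeholder, not a value derived from (27).
decisions = wls_tests(eta1=-0.054, eta2=-0.844, var1=0.038, var2=0.049, cov=0.014)
print(decisions)
```

Keeping the three rules in one function makes it easy to see that they share the same inputs and differ only in how the two estimates are combined.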

2.2 Interval estimation

It is well known that hypothesis testing can find significance even for a tiny difference of no clinical importance between treatments as long as the number of patients in a trial is large. Thus, we may often want to produce an interval estimator to assess the magnitude of the relative treatment effect, especially after obtaining significant results.

On the basis of $\hat{\eta}_1^{(WLS)}$ (4) and $\widehat{\mathrm{Var}}(\hat{\eta}_1^{(WLS)})$ (5), we obtain an asymptotic 100(1 − $\alpha$)% confidence interval for $RM_{AP}$ ($= \exp(\eta_1)$) based on the WLS method as

$$\left[ \exp\left( \hat{\eta}_1^{(WLS)} - Z_{\alpha/2} \sqrt{\widehat{\mathrm{Var}}(\hat{\eta}_1^{(WLS)})} \right), \; \exp\left( \hat{\eta}_1^{(WLS)} + Z_{\alpha/2} \sqrt{\widehat{\mathrm{Var}}(\hat{\eta}_1^{(WLS)})} \right) \right]. \tag{11}$$
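Interval (11) is a direct back-transformation of a Wald interval on the log scale, as the following sketch shows. The function name is hypothetical, and the two inputs are rounded values obtained by applying (4) and (5) to the Table 1 counts for salbutamol versus placebo.

```python
import math
from statistics import NormalDist


def wls_ratio_interval(eta_hat, var_hat, alpha=0.05):
    """Confidence interval for exp(eta) following the form of (11)."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half_width = z * math.sqrt(var_hat)
    return math.exp(eta_hat - half_width), math.exp(eta_hat + half_width)


# Rounded WLS estimate and variance for treatment A computed from Table 1.
lower, upper = wls_ratio_interval(eta_hat=-0.053881, var_hat=0.0380025)
print(lower, upper)
```

The resulting limits agree to three decimals with the interval (0.647, 1.388) reported for salbutamol in Section 5. Because the interval is symmetric on the log scale, it is asymmetric around the point estimate of the ratio.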

When some of the observed frequencies $Y_{+t}^{(g)}$ are not large, the weights used in the WLS estimator (4) can be subject to large variation, and interval estimator (11) can thereby lose accuracy. Thus, we may also consider use of the Mantel-Haenszel (MH) estimator.27 The MH point estimator of $RM_{AP}$ is simply given by

$$\widehat{RM}_{AP}^{(MH)} = \left( \sum_{k=1}^{3} f_{11k} f_{22k}/f_{++k} \Big/ \sum_{k=1}^{3} f_{12k} f_{21k}/f_{++k} \right)^{1/2}, \tag{12}$$


where $f_{++k} = f_{11k} + f_{12k} + f_{21k} + f_{22k}$ and $(f_{11k}, f_{12k}, f_{21k}, f_{22k})$ for k = 1, 2, 3 are defined by (29) in Appendix 2. As shown elsewhere,27,29 we may obtain an estimated asymptotic variance for $\log(\widehat{RM}_{AP}^{(MH)})$ as

$$\widehat{\mathrm{Var}}(\log(\widehat{RM}_{AP}^{(MH)})) = \left\{ \frac{\sum_k (f_{11k} + f_{22k})(f_{11k} f_{22k})/f_{++k}^2}{2 \left( \sum_k f_{11k} f_{22k}/f_{++k} \right)^2} + \frac{\sum_k \left[ (f_{11k} + f_{22k})(f_{12k} f_{21k}) + (f_{12k} + f_{21k})(f_{11k} f_{22k}) \right]/f_{++k}^2}{2 \left( \sum_k f_{11k} f_{22k}/f_{++k} \right) \left( \sum_k f_{12k} f_{21k}/f_{++k} \right)} + \frac{\sum_k (f_{12k} + f_{21k})(f_{12k} f_{21k})/f_{++k}^2}{2 \left( \sum_k f_{12k} f_{21k}/f_{++k} \right)^2} \right\} \Big/ 4. \tag{13}$$

On the basis of (12) and (13), we obtain an asymptotic 100(1 − $\alpha$)% confidence interval for $RM_{AP}$ based on the MH estimator as given by

$$\left[ \widehat{RM}_{AP}^{(MH)} \exp\left( -Z_{\alpha/2} \sqrt{\widehat{\mathrm{Var}}(\log(\widehat{RM}_{AP}^{(MH)}))} \right), \; \widehat{RM}_{AP}^{(MH)} \exp\left( Z_{\alpha/2} \sqrt{\widehat{\mathrm{Var}}(\log(\widehat{RM}_{AP}^{(MH)}))} \right) \right]. \tag{14}$$

On the basis of $\hat{\eta}_2^{(WLS)}$ (6) and $\widehat{\mathrm{Var}}(\hat{\eta}_2^{(WLS)})$ (7), we obtain an asymptotic 100(1 − $\alpha$)% confidence interval for $RM_{BP}$ ($= \exp(\eta_2)$) based on the WLS method as

$$\left[ \exp\left( \hat{\eta}_2^{(WLS)} - Z_{\alpha/2} \sqrt{\widehat{\mathrm{Var}}(\hat{\eta}_2^{(WLS)})} \right), \; \exp\left( \hat{\eta}_2^{(WLS)} + Z_{\alpha/2} \sqrt{\widehat{\mathrm{Var}}(\hat{\eta}_2^{(WLS)})} \right) \right]. \tag{15}$$

Also, when $Y_{+t}^{(g)}$ are not large for some t and g, we may consider the MH point estimator of $RM_{BP}$ as given by

$$\widehat{RM}_{BP}^{(MH)} = \left( \sum_{k=1}^{3} f_{11k}^{*} f_{22k}^{*}/f_{++k}^{*} \Big/ \sum_{k=1}^{3} f_{12k}^{*} f_{21k}^{*}/f_{++k}^{*} \right)^{1/2}, \tag{16}$$

where $f_{++k}^{*} = f_{11k}^{*} + f_{12k}^{*} + f_{21k}^{*} + f_{22k}^{*}$ and $(f_{11k}^{*}, f_{12k}^{*}, f_{21k}^{*}, f_{22k}^{*})$ for k = 1, 2, 3 are defined by (30) in Appendix 2. Thus, we obtain the corresponding asymptotic 100(1 − $\alpha$)% confidence interval for $RM_{BP}$ based on the MH estimator $\widehat{RM}_{BP}^{(MH)}$ (16) as

$$\left[ \widehat{RM}_{BP}^{(MH)} \exp\left( -Z_{\alpha/2} \sqrt{\widehat{\mathrm{Var}}(\log(\widehat{RM}_{BP}^{(MH)}))} \right), \; \widehat{RM}_{BP}^{(MH)} \exp\left( Z_{\alpha/2} \sqrt{\widehat{\mathrm{Var}}(\log(\widehat{RM}_{BP}^{(MH)}))} \right) \right], \tag{17}$$

where $\widehat{\mathrm{Var}}(\log(\widehat{RM}_{BP}^{(MH)}))$ is obtained by substituting $f_{ijk}^{*}$ for $f_{ijk}$ in (13).
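The MH point estimator (12), variance (13) and interval (14) can be sketched as follows for treatment A. Since definition (29) sits in Appendix 2, the cell layout used here is an assumption: the cells are chosen so that $f_{11k} f_{22k}/(f_{12k} f_{21k})$ equals the kth odds ratio in the WLS estimator (4). Function names are hypothetical and the data are the Table 1 counts.

```python
import math
from statistics import NormalDist

# Subtotal counts (Y_{+1}, Y_{+2}, Y_{+3}) per group, taken from Table 1.
TABLE1 = {
    1: (5, 4, 4), 2: (26, 12, 34), 3: (23, 24, 3),
    4: (12, 1, 4), 5: (9, 14, 4), 6: (3, 13, 19),
}


def strata_for_a(c):
    """ASSUMED cells (f11, f12, f21, f22) per stratum, mirroring OR_11..OR_13."""
    return [
        (c[1][1], c[1][0], c[3][1], c[3][0]),  # groups 1 and 3, periods 2 and 1
        (c[2][2], c[2][0], c[4][2], c[4][0]),  # groups 2 and 4, periods 3 and 1
        (c[5][2], c[5][1], c[6][2], c[6][1]),  # groups 5 and 6, periods 3 and 2
    ]


def mh_ratio_interval(strata, alpha=0.05):
    """Return (point estimate, lower, upper) following (12), (13) and (14)."""
    r = s = a = b = c = 0.0
    for f11, f12, f21, f22 in strata:
        n = f11 + f12 + f21 + f22
        r += f11 * f22 / n
        s += f12 * f21 / n
        a += (f11 + f22) * f11 * f22 / n ** 2
        b += ((f11 + f22) * f12 * f21 + (f12 + f21) * f11 * f22) / n ** 2
        c += (f12 + f21) * f12 * f21 / n ** 2
    rm = math.sqrt(r / s)
    # Variance of the log odds ratio divided by 4, because RM is its square root.
    var_log_rm = (a / (2 * r ** 2) + b / (2 * r * s) + c / (2 * s ** 2)) / 4.0
    half = NormalDist().inv_cdf(1.0 - alpha / 2.0) * math.sqrt(var_log_rm)
    return rm, rm * math.exp(-half), rm * math.exp(half)


rm_ap, mh_lower, mh_upper = mh_ratio_interval(strata_for_a(TABLE1))
print(rm_ap, mh_lower, mh_upper)
```

Under this assumed layout the output agrees to three decimals with the MH estimate 0.955 and interval (0.681, 1.338) reported for salbutamol in Section 5, which supports the reading of (29) used here.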

3 Monte Carlo simulations

To evaluate the finite-sample performance of the test procedures and interval estimators developed here, we employ Monte Carlo simulation. To account for the variation of $\mu_i^{(g)}$ between patients, we assume that the random effect $\mu_i^{(g)}$ due to patient i in group g follows the normal distribution with mean $\mu$ = log(1), log(3), and standard deviation $\sigma$ = 0.50, 1.0, and 2.0; these correspond to the underlying mean frequency ($E(\exp(\mu_i^{(g)})) = \exp(\mu + \sigma^2/2)$) of occurrences ranging from approximately 1.0 (when $\mu$ = 0 and $\sigma$ = 0.50) to more than 20 (when $\exp(\mu)$ = 3 and $\sigma$ = 2.0). To further account for the variation of the number $n_g$ of patients assigned to various groups, we assume that the number $n_g$ of patients assigned to group g also independently follows a Poisson distribution with mean size $E(n_g)$ = 20, 30, and 50. In our simulations, we arbitrarily set the nuisance parameters for


period effects: $\gamma_1$ = 0.10 and $\gamma_2$ = 0.15. To evaluate the performance of asymptotic test procedures (2), (3), (8), (9), and (10), we first calculate the estimated Type I error for testing $H_0: \eta_1 = \eta_2 = 0$ at the 0.05 level. To determine which test procedure is preferable to the others subject to Type I error less than or approximately equal to the nominal 0.05 level, we calculate the simulated power of these test procedures at the 0.05 level in the situations in which the ratio of mean frequencies between treatment A and the placebo, $RM_{AP}$ ($= \exp(\eta_1)$) = 1, 1.2, and the ratio of mean frequencies between treatment B and the placebo, $RM_{BP}$ ($= \exp(\eta_2)$) = 1.2, 1.5. Finally, to evaluate and compare the performance of interval estimators (11), (14), (15) and (17), we calculate the estimated coverage probability (for measuring the accuracy) and the average length (for measuring the precision) in the situations in which $\mu_i^{(g)}$ independently follows the normal distribution with mean $\mu$ = log(1), log(3) and standard deviation $\sigma$ = 1.0; the ratio of mean frequencies between treatments and placebo: $RM_{AP}$ = 1.0, 1.2; $RM_{BP}$ = 0.80, 1.0, 1.5; and the number $n_g$ of patients independently following the Poisson distribution with mean size $E(n_g)$ = 20, 30, 50.

For each configuration determined by a combination of the above parameters, we write programs in SAS30 and generate 10,000 repeated samples, each consisting of $n_g$ trivariate responses $(Y_{i1}^{(g)}, Y_{i2}^{(g)}, Y_{i3}^{(g)})$ according to model (1) per group g (= 1, 2, 3, 4, 5, 6), to calculate the estimated Type I error and the simulated power for test procedures, as well as the estimated coverage probability and average length for interval estimators. Note that if $Y_{+t}^{(g)} = 0$ for some g and t, the asymptotic test procedure (2) and likelihood ratio test (3) are inapplicable or the WLS estimators $\hat{\eta}_1^{(WLS)}$ and $\hat{\eta}_2^{(WLS)}$ are undefined. In our simulations, we exclude those simulated samples for which $Y_{+t}^{(g)} = 0$ for some g and t. We calculate the estimated Type I error, the simulated power, the estimated coverage probability, and average length over those samples for which $Y_{+t}^{(g)} > 0$ for all g and t. For completeness, we also calculate the proportion of the 10,000 simulated samples for which $Y_{+t}^{(g)} = 0$ for some g and t, and hence for which the asymptotic test procedures or WLS estimators developed here are inapplicable.
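The data-generating step of one simulated trial can be sketched as below. This is not the authors' SAS program: it is a Python illustration with hypothetical function names, and the Poisson sampler (counting unit-rate exponential arrivals) is simply one stdlib-only choice. The final flag mirrors the exclusion rule for samples with a zero subtotal.

```python
import math
import random

# Treatment received at periods 1, 2, 3 in each group g (sequences from Section 2).
SEQUENCES = {1: "PAB", 2: "PBA", 3: "APB", 4: "ABP", 5: "BPA", 6: "BAP"}


def poisson(rng, mean):
    """Draw a Poisson variate by counting unit-rate arrivals in [0, mean]."""
    count, elapsed = 0, rng.expovariate(1.0)
    while elapsed <= mean:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count


def simulate_subtotals(rng, mean_n, mu, sigma, eta1, eta2, gamma1, gamma2):
    """One simulated trial: returns {g: [Y_+1, Y_+2, Y_+3]} under model (1)."""
    treatment_effect = {"P": 0.0, "A": eta1, "B": eta2}
    period_effect = [0.0, gamma1, gamma2]
    subtotals = {}
    for g, seq in SEQUENCES.items():
        n_g = poisson(rng, mean_n)           # random group size
        totals = [0, 0, 0]
        for _ in range(n_g):
            mu_i = rng.gauss(mu, sigma)      # patient random effect
            for t, trt in enumerate(seq):
                mean = math.exp(mu_i + treatment_effect[trt] + period_effect[t])
                totals[t] += poisson(rng, mean)
        subtotals[g] = totals
    return subtotals


rng = random.Random(2014)  # arbitrary seed for reproducibility
sample = simulate_subtotals(rng, mean_n=20, mu=math.log(3), sigma=1.0,
                            eta1=0.0, eta2=0.0, gamma1=0.10, gamma2=0.15)
# A sample is usable only if every subtotal is positive.
usable = all(y > 0 for totals in sample.values() for y in totals)
print(sample, usable)
```

Repeating this step 10,000 times per configuration and applying the test procedures or interval estimators to each usable sample would give the kind of estimates summarized in the tables of Section 4.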

4 Results

We first note that the proportion of samples for which the test procedures or WLS estimators developed here are inapplicable, due to obtaining $Y_{+t}^{(g)} = 0$ for some g and t, is negligible (≈ 0.000) in all the situations considered here. We summarize in Table 2 the estimated Type I error for test procedures (2), (3), (8), (9), and (10). All these asymptotic test procedures perform well even when the mean size $E(n_g)$ of patients per group is as small as 20 in all situations considered in Table 2; all the estimated Type I errors agree well with the nominal 0.05 level. We summarize in Table 3 the simulated power for test procedures (2), (3), (8), (9), and (10). We may see that the powers of the two most commonly used test procedures (2) and (3) for the contingency table are generally less than those of the WLS test procedures (8) and (9). When only one of the two experimental treatments has a non-zero effect as compared with the placebo, the bivariate test procedure (9) seems to be preferable to the others with respect to power. For example, when $\sigma$ = 0.50, $\exp(\mu)$ = 1, $RM_{AP}$ = 1, $RM_{BP}$ = 1.2 and $E(n_g)$ = 50, the powers for procedures (2), (3), (8), (9), and (10) are 0.503, 0.500, 0.642, 0.761, and 0.298, respectively. In this case, procedure (9) gains about 50% more power than Pearson's chi-squared test (2) and the likelihood ratio test (3) (0.761 versus 0.503 and 0.500). When the two experimental treatments have the same magnitude of effects in the same direction, we note that the summary test procedure (10) is probably the best with respect to power (Table 3). When one of the two experimental treatments has a relatively large effect compared with placebo, the univariate test procedure (8) with Bonferroni's inequality can still be of use to improve power (Table 3).


Table 2. The estimated Type I error for asymptotic test procedures (2), (3), (8), (9), and (10) at the 0.05 level in the situations in which the random effect $\mu_i^{(g)}$ follows the normal distribution with mean $\mu$ = log(1), log(3) and standard deviation $\sigma$ = 0.50, 1.0, 2.0; and the number $n_g$ of patients independently follows a Poisson distribution with mean size $E(n_g)$ = 20, 30, 50.

σ     exp(μ)   E(n_g)   (2)     (3)     (8)     (9)     (10)
0.5   1        20       0.047   0.051   0.041   0.043   0.048
               30       0.050   0.052   0.049   0.051   0.054
               50       0.054   0.056   0.046   0.050   0.051
      3        20       0.047   0.049   0.045   0.047   0.048
               30       0.048   0.048   0.045   0.049   0.048
               50       0.050   0.052   0.049   0.052   0.050
1.0   1        20       0.050   0.053   0.045   0.050   0.048
               30       0.050   0.052   0.043   0.045   0.051
               50       0.054   0.054   0.051   0.053   0.051
      3        20       0.052   0.053   0.046   0.049   0.048
               30       0.050   0.051   0.050   0.050   0.051
               50       0.049   0.049   0.044   0.047   0.049
2.0   1        20       0.049   0.050   0.045   0.048   0.049
               30       0.056   0.057   0.049   0.053   0.054
               50       0.047   0.047   0.048   0.051   0.051
      3        20       0.048   0.049   0.046   0.049   0.045
               30       0.050   0.050   0.048   0.051   0.050
               50       0.048   0.048   0.049   0.052   0.051

Note: Each entry is calculated on the basis of 10,000 repeated samples.

When evaluating the performance of interval estimators (11), (14), (15), and (17), we summarize in Table 4 the estimated coverage probability and average length (in parentheses) of the 95% confidence intervals for $RM_{AP}$ and $RM_{BP}$ in a variety of situations. We find that these interval estimators all perform well; the estimated coverage probability is larger than or approximately equal to the desired 95% confidence level. Furthermore, we note that interval estimators based on the WLS and MH estimators are essentially of equal precision with respect to the estimated average length in all the situations considered in Table 4.

5 An example

Consider the data in Table 1 regarding a double-blind three-treatment three-period crossover trial comparing salbutamol (400 μg four times daily), salmeterol (50 μg twice daily), and placebo in asthma patients. When applying test procedures (2), (3) and (8)–(10) to test $H_0: \eta_1 = \eta_2 = 0$, we obtain the corresponding p-values to be 0.000, 0.000, 0.000, 0.000, and 0.008, respectively. These small p-values provide strong evidence that the mean numbers of exacerbations differ between treatments. When assessing the magnitude of the relative treatment effect between salbutamol and placebo, we obtain the exponential transformation of the WLS point estimator and the MH point estimator as $\widehat{RM}_{AP}^{(EWLS)}$ ($= \exp(\hat{\eta}_1^{(WLS)})$) = 0.948 and $\widehat{RM}_{AP}^{(MH)}$ = 0.955, with corresponding 95% confidence intervals of (0.647, 1.388) and (0.681, 1.338). Although the point estimates suggest that salbutamol slightly reduces the mean number of exacerbations as compared with the placebo, there is no evidence that this decrease is significant at the 0.05 level, because both confidence intervals cover 1. When comparing salmeterol with placebo, we obtain


Table 3. The estimated power for asymptotic test procedures (2), (3), (8), (9), and (10) at the 0.05 level in the situations in which the random effect $\mu_i^{(g)}$ follows the normal distribution with mean $\mu$ = log(1) and log(3); standard deviation $\sigma$ = 0.50 and 1.0; the ratio of mean frequencies between treatments and placebo: $RM_{AP}$ = 1.0 and 1.2; $RM_{BP}$ = 1.2 and 1.5; and the number $n_g$ of patients independently follows a Poisson distribution with mean size $E(n_g)$ = 20, 30, and 50. Entries carrying the table's footnote mark in the source are shown with an asterisk.

σ     exp(μ)   RM_AP   RM_BP   E(n_g)   (2)      (3)      (8)      (9)      (10)
0.5   1        1.0     1.2     20       0.193    0.198    0.271    0.351*   0.134
                               30       0.296    0.297    0.402    0.519*   0.195
                               50       0.503    0.500    0.642    0.761*   0.298
                       1.5     20       0.885    0.881    0.932    0.975*   0.519
                               30       0.979    0.978    0.992    0.998*   0.705
                               50       1.000    1.000    1.000    1.000*   0.910
               1.2     1.2     20       0.188    0.199    0.394    0.354    0.441*
                               30       0.275    0.287    0.551    0.508    0.614*
                               50       0.481    0.491    0.785    0.752    0.829*
                       1.5     20       0.753    0.758    0.935*   0.932    0.854
                               30       0.925    0.926    0.992*   0.991    0.966
                               50       0.996    0.996    1.000*   1.000*   0.998
      3        1.0     1.2     20       0.578    0.574    0.718    0.825*   0.346
                               30       0.796    0.793    0.891    0.949*   0.479
                               50       0.969    0.968    0.988    0.998*   0.702
                       1.5     20       1.000*   1.000*   1.000*   1.000*   0.941
                               30       1.000*   1.000*   1.000*   1.000*   0.990
                               50       1.000*   1.000*   1.000*   1.000*   1.000*
               1.2     1.2     20       0.564    0.575    0.841    0.818    0.886*
                               30       0.777    0.785    0.959    0.950    0.977*
                               50       0.966    0.968    0.998    0.998    0.999*
                       1.5     20       0.999    0.999    1.000*   1.000*   0.999
                               30       1.000*   1.000*   1.000*   1.000*   1.000*
                               50       1.000*   1.000*   1.000*   1.000*   1.000*
1.0   1        1.0     1.2     20       0.278    0.279    0.368    0.473*   0.173
                               30       0.428    0.427    0.558    0.672*   0.264
                               50       0.684    0.679    0.804    0.895*   0.401
                       1.5     20       0.969    0.967    0.983    0.995*   0.669
                               30       0.998    0.998    0.999    1.000*   0.853
                               50       1.000*   1.000*   1.000*   1.000*   0.977
               1.2     1.2     20       0.262    0.272    0.516    0.475    0.576*
                               30       0.417    0.430    0.714    0.682    0.774*
                               50       0.664    0.676    0.909    0.896    0.940*
                       1.5     20       0.903    0.904    0.982*   0.979    0.944
                               30       0.987    0.987    0.999*   0.999*   0.993
                               50       1.000*   1.000*   1.000*   1.000*   1.000*
      3        1.0     1.2     20       0.773    0.768    0.856    0.926*   0.468
                               30       0.935    0.933    0.969    0.990*   0.636
                               50       0.997    0.997    0.999    1.000*   0.848
                       1.5     20       1.000    1.000*   1.000*   1.000*   0.985
                               30       1.000    1.000*   1.000*   1.000*   0.999
                               50       1.000    1.000*   1.000*   1.000*   1.000*
               1.2     1.2     20       0.753    0.761    0.945    0.937    0.969*
                               30       0.925    0.929    0.991    0.990    0.996*
                               50       0.997    0.997    1.000*   1.000*   1.000*
                       1.5     20       1.000*   1.000*   1.000*   1.000*   1.000*
                               30       1.000*   1.000*   1.000*   1.000*   1.000*
                               50       1.000*   1.000*   1.000*   1.000*   1.000*

Note: Each entry is calculated on the basis of 10,000 repeated samples.

Table 4. The estimated coverage probability and average length (in parentheses) of asymptotic 95% confidence intervals using (11), (14), (15), and (17) in the situations in which the random effect $\mu_i^{(g)}$ follows the normal distribution with mean $\mu$ = log(1), log(3) and standard deviation $\sigma$ = 1.0; the ratio of mean frequencies between treatments and placebo: $RM_{AP}$ = 1.0 and 1.2; $RM_{BP}$ = 0.80, 1.0, and 1.5; and the number $n_g$ of patients independently follows a Poisson distribution with mean size $E(n_g)$ = 20, 30, and 50.

| $\exp(\mu)$ | $RM_{AP}$ | $RM_{BP}$ | $E(n_g)$ | (11) | (14) | (15) | (17) |
|---|---|---|---|---|---|---|---|
| 1 | 1.0 | 0.8 | 20 | 0.955 (0.405) | 0.954 (0.402) | 0.955 (0.344) | 0.954 (0.342) |
| 1 | 1.0 | 0.8 | 30 | 0.951 (0.322) | 0.951 (0.321) | 0.953 (0.274) | 0.952 (0.272) |
| 1 | 1.0 | 0.8 | 50 | 0.952 (0.246) | 0.951 (0.245) | 0.950 (0.208) | 0.949 (0.208) |
| 1 | 1.0 | 1.0 | 20 | 0.952 (0.405) | 0.949 (0.403) | 0.952 (0.405) | 0.950 (0.402) |
| 1 | 1.0 | 1.0 | 30 | 0.948 (0.322) | 0.947 (0.321) | 0.951 (0.322) | 0.950 (0.321) |
| 1 | 1.0 | 1.0 | 50 | 0.950 (0.246) | 0.949 (0.245) | 0.948 (0.246) | 0.948 (0.245) |
| 1 | 1.0 | 1.5 | 20 | 0.956 (0.405) | 0.954 (0.402) | 0.952 (0.553) | 0.951 (0.551) |
| 1 | 1.0 | 1.5 | 30 | 0.954 (0.323) | 0.953 (0.322) | 0.950 (0.442) | 0.948 (0.441) |
| 1 | 1.0 | 1.5 | 50 | 0.949 (0.246) | 0.949 (0.245) | 0.949 (0.337) | 0.949 (0.336) |
| 1 | 1.2 | 0.8 | 20 | 0.952 (0.465) | 0.950 (0.463) | 0.952 (0.344) | 0.949 (0.342) |
| 1 | 1.2 | 0.8 | 30 | 0.952 (0.370) | 0.952 (0.369) | 0.949 (0.274) | 0.949 (0.273) |
| 1 | 1.2 | 0.8 | 50 | 0.947 (0.282) | 0.947 (0.282) | 0.954 (0.209) | 0.954 (0.208) |
| 1 | 1.2 | 1.0 | 20 | 0.952 (0.464) | 0.951 (0.462) | 0.952 (0.404) | 0.950 (0.402) |
| 1 | 1.2 | 1.0 | 30 | 0.952 (0.371) | 0.951 (0.369) | 0.951 (0.323) | 0.950 (0.321) |
| 1 | 1.2 | 1.0 | 50 | 0.951 (0.282) | 0.951 (0.281) | 0.952 (0.246) | 0.952 (0.245) |
| 1 | 1.2 | 1.5 | 20 | 0.951 (0.464) | 0.950 (0.462) | 0.954 (0.553) | 0.953 (0.551) |
| 1 | 1.2 | 1.5 | 30 | 0.949 (0.371) | 0.949 (0.370) | 0.951 (0.442) | 0.950 (0.441) |
| 1 | 1.2 | 1.5 | 50 | 0.951 (0.283) | 0.950 (0.282) | 0.954 (0.337) | 0.954 (0.336) |
| 3 | 1.0 | 0.8 | 20 | 0.952 (0.229) | 0.952 (0.229) | 0.952 (0.195) | 0.951 (0.194) |
| 3 | 1.0 | 0.8 | 30 | 0.951 (0.184) | 0.951 (0.184) | 0.952 (0.156) | 0.951 (0.156) |
| 3 | 1.0 | 0.8 | 50 | 0.950 (0.141) | 0.950 (0.141) | 0.951 (0.120) | 0.951 (0.120) |
| 3 | 1.0 | 1.0 | 20 | 0.954 (0.230) | 0.954 (0.229) | 0.952 (0.230) | 0.951 (0.229) |
| 3 | 1.0 | 1.0 | 30 | 0.952 (0.184) | 0.952 (0.184) | 0.953 (0.184) | 0.952 (0.184) |
| 3 | 1.0 | 1.0 | 50 | 0.949 (0.141) | 0.949 (0.141) | 0.948 (0.141) | 0.947 (0.141) |
| 3 | 1.0 | 1.5 | 20 | 0.950 (0.229) | 0.949 (0.229) | 0.952 (0.314) | 0.952 (0.313) |
| 3 | 1.0 | 1.5 | 30 | 0.951 (0.184) | 0.951 (0.184) | 0.953 (0.252) | 0.953 (0.252) |
| 3 | 1.0 | 1.5 | 50 | 0.952 (0.141) | 0.952 (0.141) | 0.951 (0.193) | 0.951 (0.193) |
| 3 | 1.2 | 0.8 | 20 | 0.950 (0.264) | 0.950 (0.263) | 0.948 (0.195) | 0.947 (0.195) |
| 3 | 1.2 | 0.8 | 30 | 0.949 (0.212) | 0.948 (0.211) | 0.949 (0.156) | 0.948 (0.156) |
| 3 | 1.2 | 0.8 | 50 | 0.944 (0.162) | 0.945 (0.162) | 0.953 (0.120) | 0.953 (0.120) |
| 3 | 1.2 | 1.0 | 20 | 0.954 (0.263) | 0.953 (0.263) | 0.951 (0.229) | 0.950 (0.229) |
| 3 | 1.2 | 1.0 | 30 | 0.951 (0.212) | 0.950 (0.212) | 0.951 (0.184) | 0.951 (0.184) |
| 3 | 1.2 | 1.0 | 50 | 0.948 (0.162) | 0.948 (0.162) | 0.950 (0.141) | 0.950 (0.141) |
| 3 | 1.2 | 1.5 | 20 | 0.949 (0.264) | 0.949 (0.263) | 0.948 (0.314) | 0.948 (0.314) |
| 3 | 1.2 | 1.5 | 30 | 0.950 (0.212) | 0.950 (0.211) | 0.951 (0.252) | 0.951 (0.252) |
| 3 | 1.2 | 1.5 | 50 | 0.951 (0.162) | 0.951 (0.162) | 0.950 (0.193) | 0.950 (0.193) |

Note: Each entry is calculated on the basis of 10,000 repeated samples.


$\widehat{RM}_{BP}^{(EWLS)} (= \exp(\hat{\eta}_2^{(WLS)})) = 0.430$ and $\widehat{RM}_{BP}^{(MH)} = 0.433$, as well as the corresponding 95% confidence intervals (0.279, 0.664) and (0.283, 0.661). Both point estimates suggest that taking salmeterol reduces the mean number of exacerbations by more than 50% as compared with the placebo. Furthermore, because both upper confidence limits fall below 1, there is significant evidence at the 5% level that taking salmeterol reduces the mean number of exacerbations as compared with the placebo.
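The intervals quoted in this example are built on the log scale, so each point estimate should sit at the geometric mean of its two limits. The minimal sketch below checks that property for the four reported intervals and backs out the implied standard error of each log estimate. The function names `log_scale_ci` and `implied_log_se` are illustrative and do not come from the paper, and a normal quantile of 1.96 is assumed for the 95% level.

```python
import math

Z_975 = 1.96  # assumed normal quantile for a two-sided 95% interval


def log_scale_ci(estimate, log_se, z=Z_975):
    """Wald interval exp(log(estimate) -/+ z * log_se) for a ratio."""
    centre = math.log(estimate)
    return math.exp(centre - z * log_se), math.exp(centre + z * log_se)


def implied_log_se(lower, upper, z=Z_975):
    """Back out the log-scale standard error from reported limits."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)


# Point estimates and 95% limits quoted in the example of Section 5.
reported = {
    "RM_AP (EWLS)": (0.948, 0.647, 1.388),
    "RM_AP (MH)": (0.955, 0.681, 1.338),
    "RM_BP (EWLS)": (0.430, 0.279, 0.664),
    "RM_BP (MH)": (0.433, 0.283, 0.661),
}

for name, (est, lo, hi) in reported.items():
    geometric_mean = math.sqrt(lo * hi)
    se = implied_log_se(lo, hi)
    rebuilt = log_scale_ci(est, se)
    print(f"{name}: estimate={est:.3f}, sqrt(lo*hi)={geometric_mean:.3f}, "
          f"implied SE={se:.3f}, rebuilt CI=({rebuilt[0]:.3f}, {rebuilt[1]:.3f})")
```

The back-calculated standard errors are only as precise as the three-decimal limits allow; they are a reading aid, not values reported by the authors.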

6 Discussion

Note that if we wish to compare salmeterol and salbutamol in the above example, we may obtain the 95% confidence interval for the ratio $RM_{BA} (= \exp(\eta_2 - \eta_1))$ between these two treatments by use of

$$\exp\left(\hat{\eta}_2^{(WLS)} - \hat{\eta}_1^{(WLS)} \pm Z_{\alpha/2}\sqrt{\widehat{Var}\left(\hat{\eta}_2^{(WLS)} - \hat{\eta}_1^{(WLS)}\right)}\right),$$

where $\widehat{Var}(\hat{\eta}_2^{(WLS)} - \hat{\eta}_1^{(WLS)}) = \widehat{Var}(\hat{\eta}_2^{(WLS)}) + \widehat{Var}(\hat{\eta}_1^{(WLS)}) - 2\widehat{Cov}(\hat{\eta}_2^{(WLS)}, \hat{\eta}_1^{(WLS)})$. Using the data considered in Table 1, we obtain the point estimate $\widehat{RM}_{BA} = 0.454$ $(= \exp(\hat{\eta}_2^{(WLS)} - \hat{\eta}_1^{(WLS)}))$ with the 95% confidence interval (0.282, 0.732). We can also apply the same idea as for deriving $\exp(\hat{\eta}_i^{(EWLS)})$ (for $i = 1, 2$) and obtain the WLS estimator based on (28) (Appendix 1) and its corresponding 95% confidence interval for $RM_{BA}$. Using the data in Table 1, we obtain $\widehat{RM}_{BA}^{(WLS)} = 0.440$ with the 95% confidence interval (0.263, 0.734). Because the upper limits of both resulting confidence intervals are below 1, we may conclude that there is also significant evidence at the 5% level that taking salmeterol reduces the mean number of exacerbations as compared with taking salbutamol.

Using the random effects exponential multiplicative risk model (1), we note that the two most commonly used test procedures (2) and (3) for contingency tables can be applied to study whether there is a difference in effects between treatments for a three-period crossover trial in Poisson frequency data. We further note that one can generally improve the power of these two procedures by employing the WLS procedures (8) and (9). We find that when one of the two relative treatment effects $RM_{AP}$ and $RM_{BP}$ is close to 1, the bivariate test procedure (9) is likely to be the best among all procedures considered here. In these cases, because there is a substantial probability of obtaining a negative value of $\hat{\eta}_1^{(WLS)}$ (4) when $\eta_1 = 0$ (or equivalently, $RM_{AP} = 1$), the numerator in test procedure (10) can be small due to cancellation between possibly negative and positive values of $\hat{\eta}_1^{(WLS)}$ and $\hat{\eta}_2^{(WLS)}$, and hence the summary test procedure (10) lacks power. When both $RM_{AP}$ and $RM_{BP}$ (or equivalently, $\eta_1$ and $\eta_2$) are of equal magnitude in the same relative direction, we note that the summary test procedure (10) can be preferable to the others. When at least one of the two relative treatment effects $RM_{AP}$ and $RM_{BP}$ is far away from 1 (or equivalently, $\eta_1$ or $\eta_2$ is far away from 0), the WLS test procedure (8) with use of Bonferroni's inequality can be of use.

We note that the power for all test procedures increases as the standard deviation $\sigma$ of the random effect $\mu_i^{(g)}$ increases. This is because the underlying mean frequency is proportional to $E(\exp(\mu_i^{(g)})) = \exp(\mu + \sigma^2/2)$, which is an increasing function of $\sigma$. Furthermore, we may easily see that both the estimated asymptotic variances $\widehat{Var}(\hat{\eta}_1^{(WLS)})$ and $\widehat{Var}(\hat{\eta}_2^{(WLS)})$ tend to decrease as the underlying frequencies of event occurrences increase. Therefore, the larger the standard deviation $\sigma$, the larger is the power of the test procedure (Table 3) or the shorter is the average length of an interval estimator (not shown here for brevity).
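The monotonicity argument above rests on the mean of a lognormal variable: if the random effect is normal with mean $\mu$ and standard deviation $\sigma$, its exponential has mean $\exp(\mu + \sigma^2/2)$. The sketch below evaluates this for the four $(\exp(\mu), \sigma)$ settings used in Table 3 and compares each value with a seeded Monte Carlo average. The helper names, the seed, and the number of draws are arbitrary choices made for illustration.

```python
import math
import random


def lognormal_mean(mu, sigma):
    """Closed form E[exp(X)] for X ~ Normal(mu, sigma^2)."""
    return math.exp(mu + sigma ** 2 / 2.0)


def simulated_mean(mu, sigma, draws=200_000, seed=2014):
    """Monte Carlo average of exp(X); the seed is fixed only for repeatability."""
    rng = random.Random(seed)
    return sum(math.exp(rng.gauss(mu, sigma)) for _ in range(draws)) / draws


# Settings used in Table 3: exp(mu) in {1, 3} and sigma in {0.5, 1.0}.
for exp_mu in (1.0, 3.0):
    for sigma in (0.5, 1.0):
        mu = math.log(exp_mu)
        exact = lognormal_mean(mu, sigma)
        approx = simulated_mean(mu, sigma)
        print(f"exp(mu)={exp_mu:.0f}, sigma={sigma}: "
              f"exact={exact:.3f}, simulated={approx:.3f}")
```

For $\exp(\mu) = 1$ the closed form gives about 1.13 at $\sigma = 0.5$ and about 1.65 at $\sigma = 1.0$, so doubling $\sigma$ raises the expected event frequency by roughly 45% with all other parameters held fixed.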


When investigating the relative period effect measured by the ratio of mean frequencies, for example, $RM_{21} (= \exp(\gamma_1))$ between periods 2 and 1, we can easily show that $RM_{21}$ is equal to

$$\exp(\gamma_1) = \left[\frac{p_2^{(1)} p_2^{(3)}}{p_1^{(1)} p_1^{(3)}}\right]^{1/2} = \left[\frac{p_2^{(2)} p_2^{(5)}}{p_1^{(2)} p_1^{(5)}}\right]^{1/2} = \left[\frac{p_2^{(4)} p_2^{(6)}}{p_1^{(4)} p_1^{(6)}}\right]^{1/2}. \tag{18}$$
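As a worked check of the first equality in (18), recall from (19) and (21) in Appendix 1 that the cell probabilities of group 1 are proportional to $(1, \exp(\eta_1 + \gamma_1), \exp(\eta_2 + \gamma_2))$ and those of group 3 to $(\exp(\eta_1), \exp(\gamma_1), \exp(\eta_2 + \gamma_2))$. The treatment effect therefore enters the two period-2-to-period-1 ratios with opposite signs and cancels in their product:

```latex
\frac{p_2^{(1)}}{p_1^{(1)}} = \exp(\eta_1 + \gamma_1), \qquad
\frac{p_2^{(3)}}{p_1^{(3)}} = \exp(\gamma_1 - \eta_1)
\quad\Longrightarrow\quad
\left[\frac{p_2^{(1)}\, p_2^{(3)}}{p_1^{(1)}\, p_1^{(3)}}\right]^{1/2}
= \left[\exp(2\gamma_1)\right]^{1/2} = \exp(\gamma_1).
```

The other two equalities follow in the same way from the pairs of groups (2, 5) and (4, 6).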

Using the same ideas as before, we can derive the WLS and MH estimators for $\gamma_1$ based on (18). Similarly, we can derive the WLS and MH estimators for $\gamma_2$, as well as test procedures and interval estimators analogous to those considered previously, to study period effects.

Finally, we note that it is straightforward to extend the above results to accommodate a four-treatment four-period crossover trial. Because these results are quite tedious, however, we outline only the ideas in Appendix 3.

In summary, we have proposed a random effects exponential multiplicative risk model for Poisson frequency data under a three-treatment three-period crossover design. We have developed test procedures for testing equality of treatments as well as interval estimators for the ratio of mean frequencies between treatments. We have employed Monte Carlo simulations to evaluate the performance of these test procedures and interval estimators in a variety of situations. We have found that the test procedures and interval estimators developed here can perform well even when the number of patients is moderate. The results, findings, and discussion should be of use to biostatisticians and clinicians when they encounter count data and wish to compare three treatments under a three-period crossover design.

Funding

This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.

Acknowledgements

The authors want to thank Prof. Stephen Senn and two anonymous reviewers for their valuable comments and suggestions to improve the contents and clarity of this paper. The authors also wish to express their greatest appreciation to Profs G Peter Herbison and D Robin Taylor at the University of Otago for providing the data considered in the Example.

References

1. Taylor DR, Town GI, Herbison GP, et al. Asthma control during long term treatment with regular inhaled salbutamol and salmeterol. Thorax 1998; 53: 744–752.
2. Nicholson KG, Nguyen-Van-Tam JS, Ahmed AH, et al. Randomised placebo-controlled crossover trial on effect of inactivated influenza vaccine on pulmonary function in asthma. Lancet 1998; 351: 326–331.
3. Hills M and Armitage P. The two-period cross-over clinical trial. British Journal of Clinical Pharmacology 1979; 8: 7–20.
4. Fleiss JL. The design and analysis of clinical experiments. New York: Wiley, 1986.
5. Senn S. Cross-over trials in clinical research, 2nd ed. Chichester: Wiley, 2002.
6. Senn S. Carry-over in cross-over trials in bioequivalence: theoretical concerns and empirical evidence. Pharm Stat 2004; 3: 133–142.
7. Senn S. Cross-over trials in statistics in medicine: the first '25' years. Stat Med 2006; 25: 3430–3442.
8. Jones B and Kenward MG. Design and analysis of crossover trials. London: Chapman & Hall, 1989.
9. Ratkowsky DA, Evans MA and Alldredge JR. Cross-over experiments: Design, analysis and application. New York, NY: Marcel Dekker, 1993.
10. Grizzle JE. The two-period change-over design and its use in clinical trials. Biometrics 1965; 21: 467–480.
11. Schouten H and Kester A. A simple analysis of a simple crossover trial with a dichotomous outcome measure. Stat Med 2010; 29: 193–198.
12. Lui K-J and Chang K-C. Test non-inferiority (and equivalence) based on the odds ratio under a simple crossover trial. Stat Med 2011; 30: 1230–1242.
13. Ezzet F and Whitehead J. A random effects model for binary data from crossover clinical trials. Appl Stat 1992; 41: 117–126.
14. Gart JJ. An exact test for comparing matched proportions in crossover designs. Biometrika 1969; 56: 75–80.
15. Zimmermann H and Rahlfs V. Test hypotheses in the two-period change-over with binary data. Biom J 1978; 20: 133–141.
16. Lui K-J and Chang K-C. Estimation of the proportion ratio under a simple crossover trial. Comput Stat Data Analysis 2012; 56: 522–530.
17. Lemanske RF Jr, Mauger DT, Sorkness CA, et al. Step-up therapy for children with uncontrolled asthma receiving inhaled corticosteroids. New Engl J Med 2010; 362: 975–985.
18. Taylor DR, Drazen JM, Herbison GP, et al. Asthma exacerbations during long term β agonist use: influence of β2 adrenoceptor polymorphism. Thorax 2000; 55: 762–767.
19. Patefield M. Conditional and exact tests in crossover trials. J Biopharm Stat 2000; 10: 109–129.
20. Layard MW and Arvesen JN. Analysis of Poisson data in crossover experimental designs. Biometrics 1978; 34: 421–428.
21. Lui K-J and Chang K-C. A semi-parametric approach to the frequency of occurrence under a simple crossover trial. Stat Meth Med Res (in press).
22. Lui K-J and Chang K-C. Analysis of Poisson frequency data under a simple crossover trial. Stat Meth Med Res (in press).
23. Lui K-J. Sample size determination for testing equality in Poisson frequency data under an AB/BA crossover trial. Pharm Stat 2013; 12: 74–81.
24. Fleiss JL. A critique of recent research on the two-treatment crossover design. Control Clin Trials 1989; 10: 237–243.
25. Senn SJ. Is the simple carry-over model useful? Stat Med 1992; 11: 715–726.
26. Schouten H and Kester A. A simple analysis of a simple crossover trial with a dichotomous outcome measure. Stat Med 2010; 29: 193–198.
27. Agresti A. Categorical data analysis. New York: Wiley, 1990.
28. Fleiss JL. Statistical methods for rates and proportions, 2nd ed. New York: Wiley, 1981.
29. Robins J, Breslow N and Greenland S. Estimators of the Mantel-Haenszel variance consistent in both sparse data and large-strata limiting models. Biometrics 1986; 42: 311–323.
30. SAS Institute Inc. SAS language: reference, version 6, 1st ed. Cary, NC: SAS Institute, 1990.

Appendix 1

Conditional upon patient $i$ in group $g$ (= 1, 2, ..., 6), we can show that the random vector of frequencies $(Y_{i1}^{(g)}, Y_{i2}^{(g)}, Y_{i3}^{(g)})'$, given $Y_{i+}^{(g)} = Y_{i1}^{(g)} + Y_{i2}^{(g)} + Y_{i3}^{(g)} = y_{i+}^{(g)}$ fixed, follows the multinomial distribution with parameters $y_{i+}^{(g)}$ and $(p_1^{(g)}, p_2^{(g)}, p_3^{(g)})'$, where for $g = 1$

$$\begin{aligned}
p_1^{(1)} &= 1/(1 + \exp(\eta_1 + \gamma_1) + \exp(\eta_2 + \gamma_2)),\\
p_2^{(1)} &= \exp(\eta_1 + \gamma_1)/(1 + \exp(\eta_1 + \gamma_1) + \exp(\eta_2 + \gamma_2)),\\
p_3^{(1)} &= \exp(\eta_2 + \gamma_2)/(1 + \exp(\eta_1 + \gamma_1) + \exp(\eta_2 + \gamma_2));
\end{aligned} \tag{19}$$

for $g = 2$

$$\begin{aligned}
p_1^{(2)} &= 1/(1 + \exp(\eta_2 + \gamma_1) + \exp(\eta_1 + \gamma_2)),\\
p_2^{(2)} &= \exp(\eta_2 + \gamma_1)/(1 + \exp(\eta_2 + \gamma_1) + \exp(\eta_1 + \gamma_2)),\\
p_3^{(2)} &= \exp(\eta_1 + \gamma_2)/(1 + \exp(\eta_2 + \gamma_1) + \exp(\eta_1 + \gamma_2));
\end{aligned} \tag{20}$$

for $g = 3$

$$\begin{aligned}
p_1^{(3)} &= \exp(\eta_1)/(\exp(\eta_1) + \exp(\gamma_1) + \exp(\eta_2 + \gamma_2)),\\
p_2^{(3)} &= \exp(\gamma_1)/(\exp(\eta_1) + \exp(\gamma_1) + \exp(\eta_2 + \gamma_2)),\\
p_3^{(3)} &= \exp(\eta_2 + \gamma_2)/(\exp(\eta_1) + \exp(\gamma_1) + \exp(\eta_2 + \gamma_2));
\end{aligned} \tag{21}$$

for $g = 4$

$$\begin{aligned}
p_1^{(4)} &= \exp(\eta_1)/(\exp(\eta_1) + \exp(\eta_2 + \gamma_1) + \exp(\gamma_2)),\\
p_2^{(4)} &= \exp(\eta_2 + \gamma_1)/(\exp(\eta_1) + \exp(\eta_2 + \gamma_1) + \exp(\gamma_2)),\\
p_3^{(4)} &= \exp(\gamma_2)/(\exp(\eta_1) + \exp(\eta_2 + \gamma_1) + \exp(\gamma_2));
\end{aligned} \tag{22}$$

for $g = 5$

$$\begin{aligned}
p_1^{(5)} &= \exp(\eta_2)/(\exp(\eta_2) + \exp(\gamma_1) + \exp(\eta_1 + \gamma_2)),\\
p_2^{(5)} &= \exp(\gamma_1)/(\exp(\eta_2) + \exp(\gamma_1) + \exp(\eta_1 + \gamma_2)),\\
p_3^{(5)} &= \exp(\eta_1 + \gamma_2)/(\exp(\eta_2) + \exp(\gamma_1) + \exp(\eta_1 + \gamma_2));
\end{aligned} \tag{23}$$

for $g = 6$

$$\begin{aligned}
p_1^{(6)} &= \exp(\eta_2)/(\exp(\eta_2) + \exp(\eta_1 + \gamma_1) + \exp(\gamma_2)),\\
p_2^{(6)} &= \exp(\eta_1 + \gamma_1)/(\exp(\eta_2) + \exp(\eta_1 + \gamma_1) + \exp(\gamma_2)),\\
p_3^{(6)} &= \exp(\gamma_2)/(\exp(\eta_2) + \exp(\eta_1 + \gamma_1) + \exp(\gamma_2)).
\end{aligned} \tag{24}$$

Note that the vector of cell probabilities $(p_1^{(g)}, p_2^{(g)}, p_3^{(g)})'$ defined in (19)–(24) does not depend on the random effects $\mu_i^{(g)}$. Thus, the sum of independent random vectors $\sum_{i=1}^{n_g} (Y_{i1}^{(g)}, Y_{i2}^{(g)}, Y_{i3}^{(g)})'$, each following a conditional multinomial distribution with parameters $y_{i+}^{(g)}$ and the vector of constant cell probabilities $(p_1^{(g)}, p_2^{(g)}, p_3^{(g)})'$, follows the conditional multinomial distribution with parameters $\sum_{i=1}^{n_g} y_{i+}^{(g)} (= y_{++}^{(g)})$ and $(p_1^{(g)}, p_2^{(g)}, p_3^{(g)})'$. Note also that the equality $\eta_1 = \eta_2 = 0$ implies that $(p_1^{(1)}, p_2^{(1)}, p_3^{(1)})' = (p_1^{(2)}, p_2^{(2)}, p_3^{(2)})' = \cdots = (p_1^{(6)}, p_2^{(6)}, p_3^{(6)})'$. When testing the null hypothesis $H_0: \eta_1 = \eta_2 = 0$, we can apply the commonly used test procedures, such as Pearson's chi-squared test and the asymptotic likelihood ratio test, for testing whether the distributions of cell frequencies are identical between groups $g$.

On the basis of (19)–(24), we can easily show that the ratio of mean frequencies between treatment A and placebo is equal to

$$RM_{AP} = \exp(\eta_1) = \left[\frac{p_2^{(1)} p_1^{(3)}}{p_1^{(1)} p_2^{(3)}}\right]^{1/2} = \left[\frac{p_3^{(2)} p_1^{(4)}}{p_1^{(2)} p_3^{(4)}}\right]^{1/2} = \left[\frac{p_3^{(5)} p_2^{(6)}}{p_2^{(5)} p_3^{(6)}}\right]^{1/2}. \tag{25}$$
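The homogeneity tests mentioned in this appendix (Pearson's chi-squared and the likelihood ratio test across the six groups) can be sketched for a 6 x 3 table of period totals as follows. This is a generic contingency-table computation, presumably the form behind procedures (2) and (3), which are not restated in this part of the paper. The counts are hypothetical and are not the Table 1 data, and the p-value helper uses a closed form that holds only for even degrees of freedom, which covers the $(6-1)(3-1) = 10$ case here.

```python
import math


def homogeneity_tests(table):
    """Pearson chi-squared and likelihood-ratio statistics for an r x c table.

    `table` holds one row per group g with that group's period totals
    (Y_+1, Y_+2, Y_+3).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    pearson = 0.0
    lr = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand
            pearson += (observed - expected) ** 2 / expected
            if observed > 0:  # a zero cell contributes 0 to the LR statistic
                lr += 2.0 * observed * math.log(observed / expected)
    df = (len(table) - 1) * (len(col_totals) - 1)
    return pearson, lr, df


def chi2_sf_even_df(x, df):
    """Upper tail probability of the chi-squared distribution, even df only."""
    if df % 2 != 0:
        raise ValueError("the closed form used here needs an even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


# Hypothetical period totals for the six groups (NOT the Table 1 data).
counts = [
    [30, 28, 12],
    [29, 14, 27],
    [27, 31, 13],
    [28, 12, 30],
    [13, 29, 26],
    [12, 27, 31],
]
x2, g2, df = homogeneity_tests(counts)
print(f"Pearson X2={x2:.2f}, LR G2={g2:.2f}, df={df}, "
      f"p(X2)={chi2_sf_even_df(x2, df):.4g}, p(G2)={chi2_sf_even_df(g2, df):.4g}")
```

Both statistics compare each group's split of events across the three periods with the pooled split, which is exactly what equality of the six probability vectors under $H_0$ implies.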

Similarly, we can show from (19) to (24) that the ratio of mean frequencies between treatment B and placebo is

$$RM_{BP} = \exp(\eta_2) = \left[\frac{p_3^{(1)} p_1^{(6)}}{p_1^{(1)} p_3^{(6)}}\right]^{1/2} = \left[\frac{p_2^{(2)} p_1^{(5)}}{p_1^{(2)} p_2^{(5)}}\right]^{1/2} = \left[\frac{p_3^{(3)} p_2^{(4)}}{p_2^{(3)} p_3^{(4)}}\right]^{1/2}. \tag{26}$$

On the basis of (25) and (26), the WLS estimators27 for $\eta_1$ and $\eta_2$ are given by (4) and (6), respectively.
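Identities (25) and (26) can be checked numerically by building the six probability vectors from (19)–(24) and evaluating each stratum. In the sketch below, the treatment sequences per group are read off the form of (19)–(24), the parameter values are arbitrary illustrations rather than estimates, and the names `cell_probs` and `paired_ratio` are hypothetical helpers.

```python
import math

# Treatment received in periods 1-3 for groups g = 1..6, as read off (19)-(24).
SEQUENCES = {1: "PAB", 2: "PBA", 3: "APB", 4: "ABP", 5: "BPA", 6: "BAP"}


def cell_probs(eta1, eta2, gamma1, gamma2):
    """Multinomial cell probabilities (p_1, p_2, p_3) for each group."""
    treatment = {"P": 0.0, "A": eta1, "B": eta2}
    period = (0.0, gamma1, gamma2)
    probs = {}
    for g, seq in SEQUENCES.items():
        weights = [math.exp(treatment[trt] + period[t]) for t, trt in enumerate(seq)]
        total = sum(weights)
        probs[g] = tuple(w / total for w in weights)
    return probs


def paired_ratio(p, g, h, a, b):
    """[(p_a^(g) p_b^(h)) / (p_b^(g) p_a^(h))]^(1/2), with periods 1-indexed."""
    return math.sqrt((p[g][a - 1] * p[h][b - 1]) / (p[g][b - 1] * p[h][a - 1]))


# Arbitrary illustrative parameter values (assumptions, not estimates).
eta1, eta2, gamma1, gamma2 = math.log(0.95), math.log(0.43), 0.2, -0.1
p = cell_probs(eta1, eta2, gamma1, gamma2)

rm_ap = [paired_ratio(p, 1, 3, 2, 1), paired_ratio(p, 2, 4, 3, 1), paired_ratio(p, 5, 6, 3, 2)]
rm_bp = [paired_ratio(p, 1, 6, 3, 1), paired_ratio(p, 2, 5, 2, 1), paired_ratio(p, 3, 4, 3, 2)]
print("RM_AP from the three strata of (25):", [round(v, 6) for v in rm_ap])
print("RM_BP from the three strata of (26):", [round(v, 6) for v in rm_bp])
```

Each stratum pairs two groups that differ only by swapping the two treatments being compared, so period effects and the third treatment cancel and all three strata return the same ratio.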


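Note that the two WLS estimators $\hat{\eta}_1^{(WLS)}$ (4) and $\hat{\eta}_2^{(WLS)}$ (6) are correlated. Using the delta method,26 we can show that the estimated covariance between $\hat{\eta}_1^{(WLS)}$ and $\hat{\eta}_2^{(WLS)}$ is

$$\begin{aligned}
\widehat{Cov}\left(\hat{\eta}_1^{(WLS)}, \hat{\eta}_2^{(WLS)}\right) = \frac{1}{4\left[\sum_{k=1}^{3} W_{1k}\right]\left[\sum_{k=1}^{3} W_{2k}\right]} \Bigg\{
& W_{11} W_{21}\, \widehat{Cov}\left(\log\frac{\hat{p}_2^{(1)}}{\hat{p}_1^{(1)}}, \log\frac{\hat{p}_3^{(1)}}{\hat{p}_1^{(1)}}\right)
+ W_{12} W_{22}\, \widehat{Cov}\left(\log\frac{\hat{p}_3^{(2)}}{\hat{p}_1^{(2)}}, \log\frac{\hat{p}_2^{(2)}}{\hat{p}_1^{(2)}}\right) \\
& - W_{11} W_{23}\, \widehat{Cov}\left(\log\frac{\hat{p}_2^{(3)}}{\hat{p}_1^{(3)}}, \log\frac{\hat{p}_3^{(3)}}{\hat{p}_2^{(3)}}\right)
+ W_{12} W_{23}\, \widehat{Cov}\left(\log\frac{\hat{p}_3^{(4)}}{\hat{p}_1^{(4)}}, \log\frac{\hat{p}_3^{(4)}}{\hat{p}_2^{(4)}}\right) \\
& - W_{13} W_{22}\, \widehat{Cov}\left(\log\frac{\hat{p}_3^{(5)}}{\hat{p}_2^{(5)}}, \log\frac{\hat{p}_2^{(5)}}{\hat{p}_1^{(5)}}\right)
+ W_{13} W_{21}\, \widehat{Cov}\left(\log\frac{\hat{p}_3^{(6)}}{\hat{p}_2^{(6)}}, \log\frac{\hat{p}_3^{(6)}}{\hat{p}_1^{(6)}}\right)
\Bigg\},
\end{aligned} \tag{27}$$

where

$$\begin{aligned}
\widehat{Cov}\left(\log\frac{\hat{p}_2^{(1)}}{\hat{p}_1^{(1)}}, \log\frac{\hat{p}_3^{(1)}}{\hat{p}_1^{(1)}}\right) &= 1/Y_{+1}^{(1)}, &
\widehat{Cov}\left(\log\frac{\hat{p}_3^{(2)}}{\hat{p}_1^{(2)}}, \log\frac{\hat{p}_2^{(2)}}{\hat{p}_1^{(2)}}\right) &= 1/Y_{+1}^{(2)}, \\
\widehat{Cov}\left(\log\frac{\hat{p}_2^{(3)}}{\hat{p}_1^{(3)}}, \log\frac{\hat{p}_3^{(3)}}{\hat{p}_2^{(3)}}\right) &= -(1/Y_{+2}^{(3)}), &
\widehat{Cov}\left(\log\frac{\hat{p}_3^{(4)}}{\hat{p}_1^{(4)}}, \log\frac{\hat{p}_3^{(4)}}{\hat{p}_2^{(4)}}\right) &= 1/Y_{+3}^{(4)}, \\
\widehat{Cov}\left(\log\frac{\hat{p}_3^{(5)}}{\hat{p}_2^{(5)}}, \log\frac{\hat{p}_2^{(5)}}{\hat{p}_1^{(5)}}\right) &= -(1/Y_{+2}^{(5)}), \quad \text{and} &
\widehat{Cov}\left(\log\frac{\hat{p}_3^{(6)}}{\hat{p}_2^{(6)}}, \log\frac{\hat{p}_3^{(6)}}{\hat{p}_1^{(6)}}\right) &= 1/Y_{+3}^{(6)}.
\end{aligned}$$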
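The covariance (27) feeds directly into the variance of $\hat{\eta}_2^{(WLS)} - \hat{\eta}_1^{(WLS)}$ used in the Discussion for $RM_{BA}$. The sketch below is one reading of (27), with minus signs on the group 3 and group 5 terms because those log-ratios enter the two estimators with opposite orientation. The weights $W_{1k}$ and $W_{2k}$ are taken as inputs because their definitions belong to (4) and (6), and every number in the example is hypothetical. The function names are illustrative only.

```python
import math


def cov_eta_wls(w1, w2, y):
    """Estimated covariance (27) between the two WLS estimators.

    w1, w2 : stratum weights (W_11, W_12, W_13) and (W_21, W_22, W_23).
    y      : dict mapping group g to its period totals (Y_+1, Y_+2, Y_+3).
    """
    # Delta-method covariances of the two log-ratios that share a group.
    c1 = 1.0 / y[1][0]    # group 1: period 1 is the common denominator
    c2 = 1.0 / y[2][0]    # group 2: period 1 is the common denominator
    c3 = -1.0 / y[3][1]   # group 3: period 2 is numerator once, denominator once
    c4 = 1.0 / y[4][2]    # group 4: period 3 is the common numerator
    c5 = -1.0 / y[5][1]   # group 5: period 2 is denominator once, numerator once
    c6 = 1.0 / y[6][2]    # group 6: period 3 is the common numerator
    bracket = (
        w1[0] * w2[0] * c1 + w1[1] * w2[1] * c2
        - w1[0] * w2[2] * c3 + w1[1] * w2[2] * c4
        - w1[2] * w2[1] * c5 + w1[2] * w2[0] * c6
    )
    return bracket / (4.0 * sum(w1) * sum(w2))


def rm_ba_interval(eta1_hat, eta2_hat, var1, var2, cov12, z=1.96):
    """Point estimate and Wald interval for RM_BA = exp(eta2 - eta1)."""
    diff = eta2_hat - eta1_hat
    half_width = z * math.sqrt(var1 + var2 - 2.0 * cov12)
    return math.exp(diff), math.exp(diff - half_width), math.exp(diff + half_width)


# Hypothetical inputs for illustration only (NOT the Table 1 data).
y_totals = {1: (40, 38, 17), 2: (41, 18, 39), 3: (37, 42, 16),
            4: (39, 17, 43), 5: (18, 40, 36), 6: (16, 38, 41)}
w1 = (9.5, 9.1, 6.0)   # assumed values for W_11, W_12, W_13
w2 = (6.1, 6.3, 5.9)   # assumed values for W_21, W_22, W_23
cov12 = cov_eta_wls(w1, w2, y_totals)
estimate, lower, upper = rm_ba_interval(math.log(0.95), math.log(0.43), 0.038, 0.049, cov12)
print(f"cov={cov12:.5f}, RM_BA={estimate:.3f}, 95% CI=({lower:.3f}, {upper:.3f})")
```

Because every term in the bracket is positive once the signs are resolved, the covariance is positive and the interval for $RM_{BA}$ is narrower than it would be if the two estimators were treated as independent.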

Note that we can show that the ratio of mean frequencies between treatments B and A is equal to

$$RM_{BA} = \exp(\eta_2 - \eta_1) = \left[\frac{p_3^{(1)} p_2^{(2)}}{p_2^{(1)} p_3^{(2)}}\right]^{1/2} = \left[\frac{p_3^{(3)} p_1^{(5)}}{p_1^{(3)} p_3^{(5)}}\right]^{1/2} = \left[\frac{p_2^{(4)} p_1^{(6)}}{p_1^{(4)} p_2^{(6)}}\right]^{1/2}. \tag{28}$$

On the basis of (28), we can derive the WLS estimator for $\log(RM_{BA}) (= \eta_2 - \eta_1)$ as well.
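A direct WLS estimate of $\log(RM_{BA})$ from the three strata in (28) might look like the sketch below. It assumes the usual inverse-variance weighting of the stratum log odds ratios, with the variance of each log odds ratio taken as the sum of reciprocal cell counts; the paper's exact weights are those defined alongside (4) and (6) and may differ in detail. The period totals are hypothetical, and `tables_for_rm_ba` and `wls_log_ratio` are illustrative names.

```python
import math


def tables_for_rm_ba(y):
    """2x2 tables (f11, f12, f21, f22) for the three strata of (28).

    y[g] = (Y_+1, Y_+2, Y_+3) are the period totals of group g.
    """
    return [
        (y[1][2], y[2][2], y[1][1], y[2][1]),  # groups 1 and 2, periods 3 and 2
        (y[3][2], y[5][2], y[3][0], y[5][0]),  # groups 3 and 5, periods 3 and 1
        (y[4][1], y[6][1], y[4][0], y[6][0]),  # groups 4 and 6, periods 2 and 1
    ]


def wls_log_ratio(tables):
    """Pooled estimate of log(RM) = (1/2) log(odds ratio) and its standard error."""
    weighted_sum = 0.0
    weight_total = 0.0
    for f11, f12, f21, f22 in tables:
        log_or = math.log((f11 * f22) / (f12 * f21))
        weight = 1.0 / (1.0 / f11 + 1.0 / f12 + 1.0 / f21 + 1.0 / f22)
        weighted_sum += weight * log_or
        weight_total += weight
    estimate = weighted_sum / (2.0 * weight_total)
    std_error = 1.0 / (2.0 * math.sqrt(weight_total))
    return estimate, std_error


# Hypothetical period totals for the six groups (NOT the Table 1 data).
y_totals = {1: (40, 38, 17), 2: (41, 18, 39), 3: (37, 42, 16),
            4: (39, 17, 43), 5: (18, 40, 36), 6: (16, 38, 41)}
log_rm, se = wls_log_ratio(tables_for_rm_ba(y_totals))
print(f"RM_BA={math.exp(log_rm):.3f}, "
      f"95% CI=({math.exp(log_rm - 1.96 * se):.3f}, {math.exp(log_rm + 1.96 * se):.3f})")
```

The three strata of (28) use disjoint pairs of groups, so their log odds ratios are independent and no covariance term is needed, unlike the interval built from $\hat{\eta}_2^{(WLS)} - \hat{\eta}_1^{(WLS)}$.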

Appendix 2

For convenience, we define the vector of frequencies $(f_{11k}, f_{12k}, f_{21k}, f_{22k})$ (for $k = 1, 2, 3$) as

$$\begin{aligned}
& f_{111} = Y_{+2}^{(1)},\ f_{121} = Y_{+2}^{(3)},\ f_{211} = Y_{+1}^{(1)},\ f_{221} = Y_{+1}^{(3)}, \\
& f_{112} = Y_{+3}^{(2)},\ f_{122} = Y_{+3}^{(4)},\ f_{212} = Y_{+1}^{(2)},\ f_{222} = Y_{+1}^{(4)}, \quad \text{and} \\
& f_{113} = Y_{+3}^{(5)},\ f_{123} = Y_{+3}^{(6)},\ f_{213} = Y_{+2}^{(5)},\ f_{223} = Y_{+2}^{(6)}.
\end{aligned} \tag{29}$$

Also, we define the vector of frequencies $(f_{11k}, f_{12k}, f_{21k}, f_{22k})$ (for $k = 1, 2, 3$) as

$$\begin{aligned}
& f_{111} = Y_{+3}^{(1)},\ f_{121} = Y_{+3}^{(6)},\ f_{211} = Y_{+1}^{(1)},\ f_{221} = Y_{+1}^{(6)}, \\
& f_{112} = Y_{+2}^{(2)},\ f_{122} = Y_{+2}^{(5)},\ f_{212} = Y_{+1}^{(2)},\ f_{222} = Y_{+1}^{(5)}, \\
& f_{113} = Y_{+3}^{(3)},\ f_{123} = Y_{+3}^{(4)},\ f_{213} = Y_{+2}^{(3)},\ f_{223} = Y_{+2}^{(4)}.
\end{aligned} \tag{30}$$
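The frequency layouts (29) and (30) arrange the period totals into three 2 x 2 tables whose odds ratios estimate $RM_{AP}^2$ and $RM_{BP}^2$, respectively. A minimal sketch of a Mantel-Haenszel point estimate built on these tables follows. It assumes the standard MH odds-ratio form followed by a square root, which is a guess at the shape of the paper's MH estimators since they are defined outside this appendix. The helper names are illustrative and the counts are hypothetical.

```python
import math


def tables_from_29(y):
    """2x2 tables (f11k, f12k, f21k, f22k), k = 1, 2, 3, laid out as in (29)."""
    return [
        (y[1][1], y[3][1], y[1][0], y[3][0]),
        (y[2][2], y[4][2], y[2][0], y[4][0]),
        (y[5][2], y[6][2], y[5][1], y[6][1]),
    ]


def tables_from_30(y):
    """2x2 tables (f11k, f12k, f21k, f22k), k = 1, 2, 3, laid out as in (30)."""
    return [
        (y[1][2], y[6][2], y[1][0], y[6][0]),
        (y[2][1], y[5][1], y[2][0], y[5][0]),
        (y[3][2], y[4][2], y[3][1], y[4][1]),
    ]


def mh_ratio_of_means(tables):
    """Square root of the Mantel-Haenszel pooled odds ratio over the strata."""
    numerator = 0.0
    denominator = 0.0
    for f11, f12, f21, f22 in tables:
        stratum_total = f11 + f12 + f21 + f22
        numerator += f11 * f22 / stratum_total
        denominator += f12 * f21 / stratum_total
    return math.sqrt(numerator / denominator)


# Hypothetical period totals y[g] = (Y_+1, Y_+2, Y_+3), NOT the Table 1 data.
y_totals = {1: (40, 38, 17), 2: (41, 18, 39), 3: (37, 42, 16),
            4: (39, 17, 43), 5: (18, 40, 36), 6: (16, 38, 41)}
print(f"MH RM_AP={mh_ratio_of_means(tables_from_29(y_totals)):.3f}")
print(f"MH RM_BP={mh_ratio_of_means(tables_from_30(y_totals)):.3f}")
```

Unlike the WLS approach, this form needs no logarithm of an individual stratum odds ratio, so it remains defined when some strata are sparse, provided the pooled denominator is nonzero.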

Appendix 3

Suppose that we compare three experimental treatments, A, B and C, with the placebo (P) in a four-period crossover trial. There are then 24 groups with different treatment-receipt sequences W-X-Y-Z at the four periods. We define group $g$ (= 1 for P-A-B-C; = 2 for P-A-C-B; = 3 for P-B-A-C; = 4 for P-B-C-A; = 5 for P-C-A-B; = 6 for P-C-B-A; = 7 for A-P-B-C; = 8 for A-P-C-B; = 9 for A-B-P-C; = 10 for A-B-C-P; = 11 for A-C-P-B; = 12 for A-C-B-P; = 13 for B-P-A-C; = 14 for B-P-C-A; = 15 for B-A-P-C; = 16 for B-A-C-P; = 17 for B-C-P-A; = 18 for B-C-A-P; = 19 for C-P-A-B; = 20 for C-P-B-A; = 21 for C-A-P-B; = 22 for C-A-B-P; = 23 for C-B-P-A; = 24 for C-B-A-P). We assume that the random frequency $Y_{it}^{(g)}$ of event occurrences on patient $i$ ($i = 1, 2, \ldots, n_g$) assigned to group $g$ ($g = 1, 2, \ldots, 24$) at period $t$ ($t = 1, 2, 3, 4$) follows the Poisson distribution with mean that can be modeled as

$$E(Y_{it}^{(g)}) = \exp\left(\mu_i^{(g)} + \eta_1 X_{it1}^{(g)} + \eta_2 X_{it2}^{(g)} + \eta_3 X_{it3}^{(g)} + \gamma_1 1_{i1}^{(g)}(t = 2) + \gamma_2 1_{i2}^{(g)}(t = 3) + \gamma_3 1_{i3}^{(g)}(t = 4)\right), \tag{31}$$

where $\mu_i^{(g)}$, $\eta_1$, $\eta_2$, $X_{it1}^{(g)}$, $X_{it2}^{(g)}$, $\gamma_1$, $\gamma_2$, $1_{i1}^{(g)}(t = 2)$ and $1_{i2}^{(g)}(t = 3)$ are defined as those in model (1); $X_{it3}^{(g)} = 1$ if the $i$th patient at period $t$ receives treatment C, and = 0 otherwise; $1_{i3}^{(g)}(t = 4) = 1$ for period $t = 4$, and = 0 otherwise; $\eta_3$ denotes the effect of treatment C relative to the placebo, and $\gamma_3$ denotes the effect of period 4 versus period 1.

We form the strata by matching groups with respect to the same two periods at which the two treatments under consideration are received, while keeping the other two treatments fixed at the other two periods for uniqueness. To illustrate this point, suppose that we are interested in estimation of $RM_{AP} (= \exp(\eta_1))$ between treatment A and the placebo (P). We may form the first stratum by group $g = 1$ (with P-A-B-C) and group $g = 7$ (with A-P-B-C). We can show that

$$RM_{AP} = \left[\frac{p_2^{(1)} p_1^{(7)}}{p_1^{(1)} p_2^{(7)}}\right]^{1/2}. \tag{32}$$

Following similar arguments, we can further show that

$$\begin{aligned}
RM_{AP} &= \left[\frac{p_2^{(2)} p_1^{(8)}}{p_1^{(2)} p_2^{(8)}}\right]^{1/2} = \left[\frac{p_3^{(3)} p_1^{(9)}}{p_1^{(3)} p_3^{(9)}}\right]^{1/2} = \left[\frac{p_4^{(4)} p_1^{(10)}}{p_1^{(4)} p_4^{(10)}}\right]^{1/2} \\
&= \left[\frac{p_3^{(5)} p_1^{(11)}}{p_1^{(5)} p_3^{(11)}}\right]^{1/2} = \left[\frac{p_4^{(6)} p_1^{(12)}}{p_1^{(6)} p_4^{(12)}}\right]^{1/2} = \left[\frac{p_3^{(13)} p_2^{(15)}}{p_2^{(13)} p_3^{(15)}}\right]^{1/2} \\
&= \left[\frac{p_4^{(14)} p_2^{(16)}}{p_2^{(14)} p_4^{(16)}}\right]^{1/2} = \left[\frac{p_4^{(17)} p_3^{(18)}}{p_3^{(17)} p_4^{(18)}}\right]^{1/2} = \left[\frac{p_3^{(19)} p_2^{(21)}}{p_2^{(19)} p_3^{(21)}}\right]^{1/2} \\
&= \left[\frac{p_4^{(20)} p_2^{(22)}}{p_2^{(20)} p_4^{(22)}}\right]^{1/2} = \left[\frac{p_4^{(23)} p_3^{(24)}}{p_3^{(23)} p_4^{(24)}}\right]^{1/2}.
\end{aligned} \tag{33}$$

On the basis of (32) and (33), we can then apply the WLS or MH method for the odds ratio to derive the point estimator and interval estimator for $RM_{AP} (= \exp(\eta_1))$. Note that if the number of patients in each stratum is not large, we recommend use of the MH method, which can perform well when the number of patients per stratum is small but the number of strata is large.27 Following similar ideas as presented earlier, we can derive the WLS and MH estimators for $RM_{BP} (= \exp(\eta_2))$, $RM_{CP} (= \exp(\eta_3))$, and the ratio of mean frequencies for event occurrences between these treatments as well.
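The stratum construction behind (32) and (33) can be sketched programmatically: enumerate the 24 sequences in the order listed above, then pair each sequence with the one obtained by swapping the two treatments being compared. The group numbering below relies on the observation that the listed order coincides with the order in which `itertools.permutations("PABC")` generates sequences. The function name `strata` is illustrative.

```python
from itertools import permutations

# Groups g = 1..24 in the order listed in Appendix 3 (P-A-B-C, P-A-C-B, ...).
GROUPS = {g: "".join(seq) for g, seq in enumerate(permutations("PABC"), start=1)}


def strata(first, second, groups=GROUPS):
    """Pair groups that differ only by swapping `first` and `second`.

    Returns tuples (g, h, a, b): in group g, `first` sits in period b and
    `second` in period a with b < a; group h has the two treatments swapped.
    """
    index = {seq: g for g, seq in groups.items()}
    swap = str.maketrans(first + second, second + first)
    pairs = []
    for g, seq in groups.items():
        b, a = seq.index(first) + 1, seq.index(second) + 1
        if b < a:
            pairs.append((g, index[seq.translate(swap)], a, b))
    return pairs


# Strata for RM_AP: placebo (P) versus treatment A.
for g, h, a, b in strata("P", "A"):
    print(f"groups {g:2d} and {h:2d}: "
          f"[(p_{a}^({g}) p_{b}^({h})) / (p_{b}^({g}) p_{a}^({h}))]^(1/2)")
```

Calling `strata("P", "B")` or `strata("P", "C")` lists the corresponding 12 strata for $RM_{BP}$ and $RM_{CP}$, and `strata("A", "B")` does the same for the ratio between two active treatments.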
