STATISTICS IN MEDICINE, VOL. 11, 1477-1488 (1992)

STUDY DURATION FOR GROUP SEQUENTIAL CLINICAL TRIALS WITH CENSORED SURVIVAL DATA ADJUSTING FOR STRATIFICATION

KYUNGMANN KIM
Department of Biostatistics, Harvard School of Public Health and Dana-Farber Cancer Institute, 44 Binney Street, Mayer 4, Boston, Massachusetts 02115, U.S.A.

SUMMARY

The study duration in a clinical trial with censored survival data is the sum of the accrual duration, which determines the sample size in a traditional sense, and the follow-up duration, which more or less controls the number of events to be observed. We propose a design procedure for determining the study duration or for calculating the power in a group sequential clinical trial with censored survival data and possibly unequal patient allocation between treatments, adjusting for stratified randomization. The group sequential method is based on the use function approach. We describe a clinical trial recently activated by the Eastern Cooperative Oncology Group as an illustration of the proposed procedure.

1. INTRODUCTION

Because of ethical as well as practical considerations there has been an increasing interest in designing clinical trials using group sequential methods. The main issue with this type of design is how to maintain a prescribed type I error over the interim analyses. The sequential theory was first introduced in situations where outcome data are immediately available, but has subsequently been extended to survival analysis. The logrank test has received particular attention,¹⁻⁵ and practical aspects of its sequential application have been investigated.⁶⁻⁹

In most phase III clinical trials, there is a set of well-known prognostic factors that are used for stratification in the randomization scheme to avoid confounding the treatment effect with an imbalance in those factors. It is quite common in such clinical trials to assess treatments in a patient population composed of subgroups with quite different prognoses. Although it can be used conservatively, the ordinary logrank test is known to be generally inefficient in a situation in which stratified randomization has been used.¹⁰ In a fixed duration clinical trial a stratified logrank test would be appropriate. In practice, covariate information is most commonly dealt with using the proportional hazards model,¹¹ not the stratified logrank test. However, for determination of the sample size or study duration for clinical trials, the stratification should explicitly be taken into account.

Under the assumption of exponential survival time and uniform patient entry during the accrual period, a flexible design procedure was proposed based on the use function approach¹²,¹³ for determining the study duration or calculating the power of group sequential clinical trials with survival data.⁹ The advantage of the procedure based on the use function is that the interim analyses can be performed at arbitrary calendar times or at unequal increments of statistical information.
0277-6715/92/111477-12$11.00 © 1992 by John Wiley & Sons, Ltd.

Received October 1991
Revised February 1992


The main objective of this paper is to propose a design procedure for determining the study duration or for calculating the power of a group sequential clinical trial with censored survival data, adjusting for stratification in a randomization with possibly unequal treatment allocation. To do so we rely on the asymptotic properties of the sequentially computed stratified logrank statistics (see the Appendix). Generalizing the design procedure developed by Kim and Tsiatis,⁹ the proposed design allows for unequal patient allocation between treatments, censoring due to random loss to follow-up, and adjustment for stratified randomization. However, censoring due to random loss to follow-up is often difficult to estimate at the design stage. One may consider early termination either on discovering a treatment difference or on not discovering one.¹⁴⁻¹⁷ For the sake of presentation, one-sided tests of hypotheses are discussed for early termination on discovering a treatment difference.

The sequentially computed stratified logrank statistic behaves asymptotically like a Gaussian process. This allows a direct extension of the design procedure by Kim and Tsiatis⁹ for group sequential clinical trials with survival data to trials with stratified randomization. We show how to determine the study duration, which is the sum of the accrual and the follow-up durations, and give an example from the Eastern Cooperative Oncology Group. The derivation of the asymptotic properties of the stratified logrank tests is outlined in the Appendix.

2. DESIGN PROCEDURE

The study duration in a clinical trial of a chronic disease consists of two periods: the accrual period, $(0, s_a)$, during which patients are entered serially (known as staggered entry), and the follow-up period, $(s_a, s_a + s_f)$, during which no new patients are entered but those already entered are followed until events of interest occur or until the time of study termination, subject to random loss to follow-up.

Assume that patient accrual is uniform during the accrual period $(0, s_a)$ with a constant accrual rate $\Lambda$, the average number of patients accrued per unit time; that allocation to treatment is by stratified randomization, with possibly unequal allocation between the two treatments, control ($c$) and experimental ($e$); that there are $J$ strata with probabilities $\pi_j$, $j = 1, \ldots, J$, of a patient being in each stratum; that events occur with constant hazard rates $\lambda_{uj}$, $u = c$ or $e$, and random censoring occurs with common constant hazard rate $\nu_j$ for stratum $j$; and, finally, that the accumulating data are analysed at chronological times $0 < s_1 < \cdots < s_K = s_a + s_f$ for a maximum of $K$ times.

When computed at time $s$, the stratified logrank statistic $U_n(s)$ has an asymptotic normal distribution

$$U_n(s) \sim N(\theta V(s), V(s))$$

under the null hypothesis $H_0: \theta = 0$ and local alternatives. Here $\theta$ is the common log hazard ratio of the control ($\lambda_{cj}$) to the experimental ($\lambda_{ej}$) treatments, that is, $\theta = \log(\lambda_{cj}/\lambda_{ej})$ for all $j = 1, \ldots, J$, and $V(s) = \sum_j V_j(s)$, where $V_j(s)$ is the asymptotic variance of the ordinary logrank statistic for stratum $j$. In other words, $U_n(s)$ behaves like a time-scaled Brownian motion process with a deterministic drift. This approximation is similar to that for the ordinary logrank statistic used in Kim and Tsiatis.⁹ The asymptotic variance $V_j(s)$ is closely related to the expected number of events $E_j(s)$ since $V_j(s) \approx \sigma_Z^2 E_j(s)$, where $\sigma_Z^2$ is the variance of the treatment indicator $Z_{ij}$ (see the Appendix). The expression for $E_j(s)$ will be given later. Apparently this approximation gets worse as $\theta$ becomes larger, because the effective sample sizes in the two treatment groups become quite disparate.

Assume that the use function approach will be employed for group sequential tests at times $0 < s_1 < \cdots < s_K$ for testing $H_0: \theta = 0$ at a significance level $\alpha$ against $H_1: \theta = \theta_1 > 0$ with power $1 - \beta$. Then, by analogy between the asymptotic Gaussian process, $U_n(s)$, and the Brownian motion process, $W(t)$, the maximum amount of statistical information $V_{\max}$ is equal to $V(s_K)$, the asymptotic variance of the test statistic $U_n(s_K)$ at the last analysis. To detect the common log hazard ratio $\theta_1 > 0$ with power $1 - \beta$ this variance must equal

$$V_{\max} = (\eta/\theta_1)^2, \tag{1}$$

where $\eta$ is the drift parameter of the corresponding Brownian motion process as in equation (3.2) of Kim and Tsiatis.⁹ Values of the drift parameter for different design specifications can be determined by recursive numerical integration similar to Armitage et al.¹⁸ and McPherson and Armitage¹⁹ or can be found in Kim and DeMets.¹³,²⁰ Since $V_{\max} \approx \sigma_Z^2 E_{\max}$, approximately $E_{\max} \approx V_{\max}/\sigma_Z^2$ events are needed ultimately during the study period to achieve the desired power $1 - \beta$ for detecting a clinically important treatment difference $\theta_1$.

To estimate the variance of the stratified logrank statistics, we need first to estimate the number of events. The expected number of events in each stratum by time $s$ can be found by double integration with respect to the density function of the survival time distribution and the uniform density function for patient entry. Under exponential survival times with hazard rates $\lambda_{uj}$, $u = c, e$, and exponential random censoring with common constant hazard rate $\nu_j$ for stratum $j$, the expected number of events for stratum $j$ by time $s$, if all the patients on the study are given treatment $u$, is

$$E_{uj}(s) = \Lambda \pi_j \frac{\lambda_{uj}}{\lambda^*_{uj}} \left[ (s \wedge s_a) - \frac{\exp\{-\lambda^*_{uj}(s - s_a)^+\} - \exp(-\lambda^*_{uj} s)}{\lambda^*_{uj}} \right], \tag{2}$$

where $\lambda^*_{uj} = \lambda_{uj} + \nu_j$, $s \wedge s_a$ is the smaller of $s$ and $s_a$, and $x^+ = x$ if $x$ is positive and 0 otherwise. A formula essentially identical to (2) is given in Rubinstein et al.²¹ for $J = 1$ under Poisson patient entry. When $J = 1$ and $\nu_j = 0$, the formula (2) reduces to the corresponding formula in Section 2 of Kim and Tsiatis.⁹ The role of the exponential distribution is primarily to provide a simple calculation of the expected number of events. The formula given in (2) can be extended to other survival time distributions for which the expected number of events can also be calculated, and in general can be extended to the proportional hazards model after a suitable transformation on the time scale.

Since the maximum expected number of events under the alternative hypothesis during the study period is

$$E^{(1)}(s_K) = \mu_Z \sum_j E_{ej}(s_K) + (1 - \mu_Z) \sum_j E_{cj}(s_K), \tag{3}$$

where $\mu_Z$ is the mean of the treatment indicator $Z_{ij}$, the duration $s_a$ of the accrual period and the duration $s_f$ of the follow-up period should jointly satisfy

$$E^{(1)}(s_a + s_f) = E_{\max}.$$

Hence, given $s_a$ satisfying $E_{\max}/\Lambda \le s_a \le \{E^{(1)}\}^{-1}(E_{\max})$, the necessary duration of the follow-up period is obtained by

$$s_f = \{E^{(1)}\}^{-1}(E_{\max}) - s_a. \tag{4}$$


The constraints in the inequalities given in (4) mean that the accrual duration should be long enough, but no longer than necessary to have the adequate number of events.
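As an illustration of formulas (2) to (4), the following Python sketch computes the expected number of events under uniform accrual with exponential failure and censoring times, and then solves for the follow-up duration by bisection. The function names, the bisection search and its search bracket are assumptions made for this sketch and are not part of the original procedure. The inputs are the design values of the example in Section 3, with each hazard rate taken as log 2 divided by the median survival.

```python
import math

def expected_events(s, s_a, accrual_rate, pi_j, lam, nu=0.0):
    """Formula (2): expected events by time s in one stratum on one treatment."""
    lam_star = lam + nu                  # total hazard: failure plus censoring
    accrued_time = min(s, s_a)           # s ^ s_a
    follow_up = max(s - s_a, 0.0)        # (s - s_a)^+
    tail = math.exp(-lam_star * follow_up) - math.exp(-lam_star * s)
    return accrual_rate * pi_j * (lam / lam_star) * (accrued_time - tail / lam_star)

def expected_events_alt(s, s_a, accrual_rate, strata, theta, mu_z=0.5):
    """Formula (3): expected events by time s under the alternative.

    strata is a list of (pi_j, lambda_cj, nu_j) tuples (layout assumed here);
    theta is the log hazard ratio log(lambda_cj / lambda_ej).
    """
    total = 0.0
    for pi_j, lam_c, nu in strata:
        lam_e = lam_c * math.exp(-theta)
        total += mu_z * expected_events(s, s_a, accrual_rate, pi_j, lam_e, nu)
        total += (1.0 - mu_z) * expected_events(s, s_a, accrual_rate, pi_j, lam_c, nu)
    return total

def follow_up_duration(e_max, s_a, accrual_rate, strata, theta, mu_z=0.5,
                       max_follow_up=600.0, tol=1e-9):
    """Formula (4): follow-up s_f with E(s_a + s_f) = e_max, found by bisection."""
    def gap(s_f):
        return expected_events_alt(s_a + s_f, s_a, accrual_rate,
                                   strata, theta, mu_z) - e_max
    if gap(0.0) > 0.0:
        raise ValueError("accrual is longer than necessary for e_max events")
    if gap(max_follow_up) < 0.0:
        raise ValueError("accrual is too short to observe e_max events")
    lo, hi = 0.0, max_follow_up
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Example of Section 3: 10 patients per month, 48 months of accrual,
# 20 per cent N1 (median 30 months) and 80 per cent N2 (median 20 months),
# no loss to follow-up, 40 per cent prolongation of median survival.
strata = [(0.2, math.log(2) / 30, 0.0), (0.8, math.log(2) / 20, 0.0)]
theta_1 = math.log(1.4)
events_h1 = expected_events_alt(66, 48, 10, strata, theta_1)   # about 314
events_h0 = expected_events_alt(66, 48, 10, strata, 0.0)       # about 341
s_f = follow_up_duration(events_h1, 48, 10, strata, theta_1)   # about 18 months
```

Because the expected number of events increases with the analysis time for a fixed accrual duration, a simple bracketing search is enough here; the two error branches correspond to the two constraints on the accrual duration in (4).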

3. AN EXAMPLE

Consider the following recently activated lung cancer clinical trial. A group of investigators from the Eastern Cooperative Oncology Group and the Radiation Therapy Oncology Group wished to compare survival following a new treatment, post-operative chemotherapy plus thoracic radiotherapy, with the current standard treatment, post-operative thoracic radiotherapy alone, in patients with completely resected stage II and stage IIIA non-small cell lung cancer. The null hypothesis of no treatment difference was to be tested at the 5 per cent significance level ($\alpha = 0.05$) against a one-sided alternative hypothesis that the new treatment prolongs the median survival time by 40 per cent (a log hazard ratio of 0.336).

The stratification factors used for randomization were status of nodal involvement, histology, weight loss in the previous 6 months, and type of lymph node dissection. However, in the power calculation, only the nodal status, $N_1$ versus $N_2$, was considered because it was believed to be the most differential prognostic indicator among the four factors. The nodal status $N_1$ is defined as the presence of metastasis to lymph nodes in the peribronchial or the ipsilateral hilar region, or both, including direct extension, while $N_2$ is defined as the presence of metastasis to ipsilateral mediastinal lymph nodes and subcarinal lymph nodes.

There was equal patient allocation between the two treatments. The median survival times of patients receiving the standard treatment were estimated to be about 30 and 20 months (hazard rates of 0.0231 and 0.0347), respectively, for $N_1$ and $N_2$ disease. It was expected that only 20 per cent of patients undergoing thoracic surgery prior to the adjuvant therapy would have $N_1$ disease. Approximately 10 patients were expected to be accrued per month. No random loss to follow-up was considered.
To investigate the feasibility of different designs, the power can be calculated for various combinations of accrual and follow-up durations. In the example, we will consider the following use functions:

$$\alpha_1^*(t) = 2\{1 - \Phi(z_{\alpha/2}/\sqrt{t})\} \quad \text{and} \quad \alpha_2^*(t) = \alpha \log\{1 + (e - 1)t\},$$

where $\Phi$ is the standard normal distribution function and $z_{\alpha/2}$ is its upper $\alpha/2$ quantile. The use functions, $\alpha_1^*$ and $\alpha_2^*$, have been proposed by Lan and DeMets¹² and are known to generate group sequential boundaries close to those of O'Brien and Fleming²² and Pocock,²³ respectively.

The power of the group sequential procedures with $K = 1, 2, 3, 4, 5$ scheduled analyses after equal increments of statistical information is given in Tables I to V for the use functions $\alpha_1^*$ and $\alpha_2^*$. Note that the decrease in power from the fixed duration design ($K = 1$) to the group sequential designs with $K = 2, 3, 4, 5$ scheduled analyses is very small for $\alpha_1^*$, whereas it can be substantial for $\alpha_2^*$.

With 48 months of accrual and 18 months of follow-up, the group sequential design with $K = 4$ equally spaced analyses based on the use function $\alpha_1^*$ attains a power of 0.901. In this case, $E_{\max} = 314$ by the formula (3) and $\sigma_Z^2 = 0.248$, so that $V_{\max} = 77.9$. Note here that $\sigma_Z^2$ was estimated to account for the differential rate of failure under the two treatments. Then, based on the formula (1), we obtain $\eta = 2.97$. Finally, the power of 0.901 is obtained as the probability of a Brownian motion process crossing the group sequential boundaries when the drift parameter is $\eta = 2.97$, using recursive numerical integration similar to McPherson and Armitage.¹⁹
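The two use functions and the arithmetic linking $E_{\max}$, $V_{\max}$ and $\eta$ can be checked with a short Python sketch. The function names are assumptions of this sketch, and the drift is computed on the assumption that formula (1) has the form $V_{\max} = (\eta/\theta_1)^2$, which reproduces the quoted value of 2.97. The numerical inputs (0.248, 314 and the 40 per cent prolongation of median survival) are those given in the text.

```python
import math
from statistics import NormalDist

def use_function_1(t, alpha=0.05):
    """alpha_1*(t) = 2{1 - Phi(z_{alpha/2} / sqrt(t))} for 0 < t <= 1."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)   # upper alpha/2 quantile
    # 2{1 - Phi(x)} equals erfc(x / sqrt(2)); erfc keeps precision in the far tail
    return math.erfc(z / math.sqrt(t) / math.sqrt(2.0))

def use_function_2(t, alpha=0.05):
    """alpha_2*(t) = alpha * log{1 + (e - 1) t} for 0 < t <= 1."""
    return alpha * math.log(1.0 + (math.e - 1.0) * t)

# Both functions spend the full type I error at information time 1.
spent_1_final = use_function_1(1.0)
spent_2_final = use_function_2(1.0)

# Type I error spent by alpha_1* at an early look, e.g. information time 0.213.
spent_1_early = use_function_1(0.213)

# Drift arithmetic with the quoted design values (assumed form of formula (1)).
sigma2_z = 0.248
e_max = 314
theta_1 = math.log(1.4)              # log hazard ratio, about 0.336
v_max = sigma2_z * e_max             # about 77.9
eta = theta_1 * math.sqrt(v_max)     # about 2.97
```

At information time 0.213, $\alpha_1^*$ spends only about 0.00002 of the type I error, which illustrates why its early boundaries are so conservative compared with those generated by $\alpha_2^*$.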


Table I. Power of the stratified fixed duration designs with $\alpha = 0.05$

Follow-up            Accrual durations (months)
durations (months)   36      42      48      54
12                   0.754   0.827   0.882   0.922
18                   0.797   0.861   0.908   0.940
24                   0.828   0.885   0.925   0.952

Table II. Power of the stratified group sequential designs with $\alpha = 0.05$ and information times {0.5, 1.0}

Follow-up            Use            Accrual durations (months)
durations (months)   function       36      42      48      54
12                   $\alpha_1^*$   0.751   0.824   0.880   0.920
                     $\alpha_2^*$   0.706   0.785   0.848   0.896
18                   $\alpha_1^*$   0.794   0.859   0.906   0.939
                     $\alpha_2^*$   0.753   0.824   0.878   0.918
24                   $\alpha_1^*$   0.826   0.883   0.923   0.951
                     $\alpha_2^*$   0.787   0.851   0.899   0.933

Table III. Power of the stratified group sequential designs with $\alpha = 0.05$ and information times {0.333, 0.667, 1.0}

Follow-up            Use            Accrual durations (months)
durations (months)   function       36      42      48      54
12                   $\alpha_1^*$   0.746   0.820   0.877   0.918
                     $\alpha_2^*$   0.687   0.767   0.834   0.884
18                   $\alpha_1^*$   0.790   0.855   0.903   0.936
                     $\alpha_2^*$   0.735   0.808   0.866   0.908
24                   $\alpha_1^*$   0.821   0.879   0.921   0.949
                     $\alpha_2^*$   0.770   0.837   0.888   0.925

If the actual analyses were performed at 24, 42, and 54 months, and at 66 months (the end of the study), the power would be 0.899 with an expected study termination under the alternative hypothesis occurring at 51.5 months, resulting in an expected saving of 14.5 months. A slight decrease in power from 0.901 to 0.899 is a consequence of a change in the information times, from the prespecified {0.25, 0.5, 0.75, 1.0} to the actual {0.213, 0.554, 0.813, 1.0}. Despite the change in times of interim analyses, the overall type I error probability is always maintained at the prescribed significance level if the information times are adequately estimated. The details of the


Table IV. Power of the stratified group sequential designs with $\alpha = 0.05$ and information times {0.25, 0.5, 0.75, 1.0}

Follow-up            Use            Accrual durations (months)
durations (months)   function       36      42      48      54
12                   $\alpha_1^*$   0.743   0.819   0.874   0.916
                     $\alpha_2^*$   0.677   0.760   0.826   0.878
18                   $\alpha_1^*$   0.787   0.852   0.901   0.935
                     $\alpha_2^*$   0.725   0.800   0.859   0.903
24                   $\alpha_1^*$   0.819   0.877   0.919   0.948
                     $\alpha_2^*$   0.761   0.830   0.882   0.920

Table V. Power of the stratified group sequential designs with $\alpha = 0.05$ and information times {0.2, 0.4, 0.6, 0.8, 1.0}

Follow-up            Use            Accrual durations (months)
durations (months)   function       36      42      48      54
12                   $\alpha_1^*$   0.740   0.815   0.873   0.915
                     $\alpha_2^*$   0.671   0.754   0.821   0.874
18                   $\alpha_1^*$   0.785   0.851   0.899   0.934
                     $\alpha_2^*$   0.719   0.795   0.854   0.899
24                   $\alpha_1^*$   0.817   0.876   0.918   0.947
                     $\alpha_2^*$   0.755   0.825   0.878   0.917

Table VI. Operating characteristics of the stratified group sequential analyses at {24, 42, 54, 66} months with $\alpha = 0.05$, $\alpha_1^*$, and 48 months of accrual

Real time  Information  Number of failures  Number of  Upper     Probability of rejection
(months)   time         H0        H1        patients   boundary  H0          H1
24         0.213         73        67       240        4.087     0.0000219   0.00331
42         0.554        189       174       420        2.392     0.00845     0.425
54         0.813        278       255       480        1.927     0.0212      0.351
66         1.000        341       314       480        1.744     0.0203      0.120
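The rejection probabilities in Table VI were obtained by recursive numerical integration. As a rough cross-check only, the Python sketch below swaps in a Monte Carlo simulation of a Brownian motion with drift $\eta$, observed at the actual information times and compared on the standardized scale with the upper boundaries of Table VI. The simulation approach, the function name, the number of replicates and the seed are assumptions of this sketch, not the method used for the tables.

```python
import math
import random

def crossing_probability(eta, info_times, bounds, n_sim=100_000, seed=12345):
    """Estimate the probability of crossing an upper boundary at any analysis."""
    rng = random.Random(seed)
    crossings = 0
    for _ in range(n_sim):
        b, t_prev = 0.0, 0.0
        for t, bound in zip(info_times, bounds):
            dt = t - t_prev
            # independent normal increment with mean eta * dt and variance dt
            b += eta * dt + math.sqrt(dt) * rng.gauss(0.0, 1.0)
            t_prev = t
            if b / math.sqrt(t) >= bound:   # standardized statistic B(t) / sqrt(t)
                crossings += 1
                break                       # the trial stops at the first crossing
    return crossings / n_sim

info_times = [0.213, 0.554, 0.813, 1.0]
bounds = [4.087, 2.392, 1.927, 1.744]
power = crossing_probability(2.97, info_times, bounds)      # close to 0.90
type_one = crossing_probability(0.0, info_times, bounds)    # close to 0.05
```

With 100,000 replicates the Monte Carlo standard error is about 0.001 for the power and somewhat less for the type I error, so agreement with the tabulated values can only be expected to the second decimal place.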

operating characteristics and the expected stopping times of the group sequential analyses at {24, 42, 54, 66} months are given in Tables VI and VII. The expected stopping times can be evaluated according to the results in the Appendix of DeMets and Ware¹⁴ and in Section 4 of Kim and Tsiatis.⁹ For comparison, the power of the unstratified group sequential procedures with $K = 4$ repeated analyses is given in Tables VIII to XI. In the power calculation, 100 per cent of patients


Table VII. Expected stopping times of the stratified group sequential analyses at {24, 42, 54, 66} months with $\alpha = 0.05$, $\alpha_1^*$, and 48 months of accrual

       Expected stopping times
       Real time   Information   Number of   Number of
       (months)    time          failures    patients
H0     65.5        0.992         339         479
H1     51.5        0.742         233         454
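The entries of Table VII can be reproduced approximately from Table VI by weighting the value of each quantity at each analysis by the probability of stopping there. The Python sketch below assumes, as in the one-sided design discussed here, that the trial stops early only on rejection and otherwise continues to the final analysis. The function name is hypothetical, and small discrepancies from Table VII reflect rounding of the tabulated inputs.

```python
def expected_at_stopping(values, rejection_probs):
    """Expected value of a quantity at the analysis where the trial stops.

    values[k] is the quantity at analysis k; rejection_probs[k] is the
    probability of rejecting at analysis k. The trial always stops at the
    last analysis if it has not stopped earlier.
    """
    early = list(rejection_probs[:-1])
    stop_probs = early + [1.0 - sum(early)]
    return sum(v * p for v, p in zip(values, stop_probs))

# Inputs copied from Table VI.
months = [24, 42, 54, 66]
info_times = [0.213, 0.554, 0.813, 1.0]
failures_h1 = [67, 174, 255, 314]
reject_h0 = [0.0000219, 0.00845, 0.0212, 0.0203]
reject_h1 = [0.00331, 0.425, 0.351, 0.120]

months_h0 = expected_at_stopping(months, reject_h0)              # close to 65.5
months_h1 = expected_at_stopping(months, reject_h1)              # close to 51.5
info_h1 = expected_at_stopping(info_times, reject_h1)            # close to 0.742
exp_failures_h1 = expected_at_stopping(failures_h1, reject_h1)   # close to 233
```

The expected saving of 14.5 months quoted in the text is the difference between the maximum duration of 66 months and the expected stopping time under the alternative hypothesis.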

Table VIII. Power of the unstratified group sequential designs with $\alpha = 0.05$, information times {0.25, 0.5, 0.75, 1.0}, and 100 per cent $N_1$ and 0 per cent $N_2$ patients

Follow-up            Use            Accrual durations (months)
durations (months)   function       36      42      48      54
12                   $\alpha_1^*$   0.659   0.741   0.810   0.864
                     $\alpha_2^*$   0.590   0.676   0.751   0.814
18                   $\alpha_1^*$   0.711   0.786   0.846   0.893
                     $\alpha_2^*$   0.643   0.725   0.793   0.848
24                   $\alpha_1^*$   0.751   0.820   0.874   0.913
                     $\alpha_2^*$   0.686   0.763   0.825   0.875

Table IX. Power of the unstratified group sequential designs with $\alpha = 0.05$, information times {0.25, 0.5, 0.75, 1.0}, and 80 per cent $N_1$ and 20 per cent $N_2$ patients

Follow-up            Use            Accrual durations (months)
durations (months)   function       36      42      48      54
12                   $\alpha_1^*$   0.677   0.759   0.825   0.877
                     $\alpha_2^*$   0.609   0.695   0.768   0.829
18                   $\alpha_1^*$   0.728   0.802   0.860   0.903
                     $\alpha_2^*$   0.662   0.742   0.809   0.862
24                   $\alpha_1^*$   0.767   0.834   0.885   0.922
                     $\alpha_2^*$   0.703   0.779   0.839   0.886

were assumed to have $N_1$ disease so that the median survival following the control treatment is 30 months for Table VIII; 80 per cent of patients were assumed to have $N_1$ disease so that the weighted average of the median survival is 28 months for Table IX; 20 per cent of patients were assumed to have $N_1$ disease so that the weighted average of the median survival is 22 months for Table X; and, finally, 0 per cent of patients were assumed to have $N_1$ disease so that the median survival is 20 months for Table XI. With 48 months of accrual and 18 months of follow-up, the stratified group sequential design based on the use function $\alpha_1^*$ attains a power of 0.901, whereas the unstratified group sequential designs using the weighted average of the median survival


Table X. Power of the unstratified group sequential designs with $\alpha = 0.05$, information times {0.25, 0.5, 0.75, 1.0}, and 20 per cent $N_1$ and 80 per cent $N_2$ patients

Follow-up            Use            Accrual durations (months)
durations (months)   function       36      42      48      54
12                   $\alpha_1^*$   0.739   0.814   0.872   0.914
                     $\alpha_2^*$   0.673   0.756   0.823   0.875
18                   $\alpha_1^*$   0.783   0.850   0.899   0.934
                     $\alpha_2^*$   0.722   0.797   0.856   0.901
24                   $\alpha_1^*$   0.816   0.875   0.918   0.947
                     $\alpha_2^*$   0.758   0.827   0.880   0.919

Table XI. Power of the unstratified group sequential designs with $\alpha = 0.05$, information times {0.25, 0.5, 0.75, 1.0}, and 0 per cent $N_1$ and 100 per cent $N_2$ patients

Follow-up            Use            Accrual durations (months)
durations (months)   function       36      42      48      54
12                   $\alpha_1^*$   0.761   0.833   0.887   0.926
                     $\alpha_2^*$   0.697   0.777   0.841   0.890
18                   $\alpha_1^*$   0.803   0.866   0.911   0.943
                     $\alpha_2^*$   0.743   0.816   0.872   0.913
24                   $\alpha_1^*$   0.833   0.889   0.928   0.954
                     $\alpha_2^*$   0.777   0.844   0.893   0.929

following the control treatment attain powers of 0.846, 0.860, 0.899 and 0.911 in Tables VIII to XI, respectively. The stratified group sequential design based on the use function $\alpha_2^*$ attains a power of 0.859, while the unstratified group sequential designs attain powers of 0.793, 0.809, 0.856 and 0.872. In both cases, ignoring the stratification resulted in a slight decrease in power, from 0.901 to 0.899 for the design with $\alpha_1^*$ and from 0.859 to 0.856 with $\alpha_2^*$.

4. DISCUSSION

The proposed procedure includes other designs as its special cases. For example, when $K = 1$ (the fixed duration study) and $\nu_j = 0$ for all $j$ (no random loss to follow-up), it reduces to the design procedure developed by Bernstein and Lagakos.²⁴ When $J = 1$ (no stratification), $K = 1$, and $\mu_Z = 1/2$ (equal patient allocation between treatments), it becomes the design procedure developed by Rubinstein et al.²¹ Finally, when $J = 1$, $\nu_j = 0$, and $\mu_Z = 1/2$, it becomes identical to the design procedure developed by Kim and Tsiatis.⁹

The asymptotic results for the sequentially computed logrank statistic apply to a more general class of test statistics such as the $G^\rho$ statistics.²⁵ With such statistics, the relationship between the asymptotic variance and the expected number of events is different from that for the logrank statistic.

The power of the trial will be changed if the actual schedule of interim analyses varies from that prespecified during the design stage. From the relationship between the number of events


and the chronological time, the actual information time can be estimated from the chronological times of the scheduled interim analyses,⁸ and thus the expected operating characteristics of a study can be investigated. These two procedures, one for determining the study duration and the other for investigating the operating characteristics, can be alternated until the design specifications are fine-tuned.

Although we propose the design procedure based on the use function approach, it can be applied to other group sequential designs with predetermined boundaries.¹⁴,¹⁵,¹⁷,²²,²³,²⁶ These procedures require that the frequency and times of repeated analyses be fixed in advance and adhered to, which is neither feasible nor realistic. The proposed procedure and others based on the use function approach⁹ also require that the expected frequency and times of interim analyses be specified in advance during the design stage. However, only those methods based on the use function approach allow deviations in the frequency and times of repeated analyses from those prespecified, while still preserving the desired significance level. As the example above and other similar exercises indicate, the proposed procedure is also robust in preserving the desired power of the design, in agreement with the observation made in Kim and Tsiatis.⁹

Usually, even when there are strata, it suffices to use the weighted average of the baseline median survival times for strata and assume an unstratified design, since loss of power will be negligible. However, there are occasions when it is necessary to include stratification by covariates in the design. This will depend on the prognostic value of the stratification variable under consideration and, to a larger extent, on the composition of the population with respect to such a prognostic factor.¹³,²⁰

APPENDIX: STRATIFIED LOGRANK TEST

For discussion of the asymptotic properties of the sequentially computed stratified logrank statistic, we use the same notation as in Tsiatis.¹ With staggered patient entry and random loss to follow-up, each patient can be represented by $(X_{ij}(s), \Delta_{ij}(s), Z_{ij})$ for $i = 1, \ldots, n_j$ and $j = 1, \ldots, J$ at chronological time $s$ of interim analysis, where $n_j$ is the maximum number of patients in stratum $j$ with a maximum overall sample size $n = \sum_j n_j$; $X_{ij}(s)$ denotes time to events of interest or censoring due to random loss to follow-up or censoring due to analysis for patient $i$ in stratum $j$; $\Delta_{ij}(s)$ is an indicator of event; and $Z_{ij}$ is a treatment indicator with finite mean $\mu_Z$ and variance $\sigma_Z^2 = \mu_Z(1 - \mu_Z)$. For detecting treatment difference in survival time under the proportional hazards model,¹¹

$$\lambda_j(s \mid z) = \lambda_{0j}(s) \exp(-\theta z),$$

for $j = 1, \ldots, J$, where $\lambda_{0j}(s)$ is a baseline hazard function for stratum $j$ and $z$ is an indicator variable for treatment, the stratified logrank test is known to be efficient unless the number of strata is too large,¹⁰ because the efficient scores test statistic for the model given above is equivalent to the stratified logrank statistic. Under the model specified above, $\theta$ is the log hazard ratio; $\theta = 0$ corresponds to treatment equivalence and $\theta > 0$ indicates longer survival on the experimental treatment.

A stratified logrank statistic computed at time $s$ can be defined as

$$U_n(s) = \sum_{j=1}^{J} \sum_{i=1}^{n_j} \Delta_{ij}(s) \{Z_{ij} - \bar{Z}_j(X_{ij}(s))\},$$

where $\bar{Z}_j(x) = \sum_i Z_{ij} I_{\{X_{ij}(s) \ge x\}} / \sum_i I_{\{X_{ij}(s) \ge x\}}$, with $I_A$ an indicator variable for a set $A$. Note that the stratified logrank statistic $U_n(s)$ is simply the sum of the ordinary logrank statistics, $U_{nj}(s)$, for $j = 1, \ldots, J$, where

$$U_{nj}(s) = \sum_{i=1}^{n_j} \Delta_{ij}(s) \{Z_{ij} - \bar{Z}_j(X_{ij}(s))\}.$$
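To make the definition concrete, the following Python sketch evaluates the stratified logrank statistic as written above, summing $Z_{ij} - \bar{Z}_j(X_{ij}(s))$ over the observed events within each stratum and then adding the strata. The record layout, the function name and the toy data are assumptions of this sketch; observed times are taken as already censored at the analysis time $s$.

```python
from collections import defaultdict

def stratified_logrank(records):
    """Stratified logrank statistic U_n(s).

    records is an iterable of (time, event, z, stratum) tuples, where event
    and z are 0 or 1 (layout assumed for this sketch).
    """
    by_stratum = defaultdict(list)
    for time, event, z, stratum in records:
        by_stratum[stratum].append((time, event, z))

    u_total = 0.0
    for patients in by_stratum.values():
        for time, event, z in patients:
            if not event:
                continue  # censored observations contribute nothing directly
            # risk set: same stratum, observed time at or beyond this event time
            at_risk = [z_k for t_k, _, z_k in patients if t_k >= time]
            z_bar = sum(at_risk) / len(at_risk)
            u_total += z - z_bar
    return u_total

toy = [
    (1.0, 1, 1, "A"), (2.0, 1, 0, "A"), (3.0, 0, 1, "A"),
    (1.0, 1, 0, "B"), (1.5, 1, 1, "B"),
]
u = stratified_logrank(toy)   # (1/3 - 1/2) + (-1/2 + 0) = -2/3
```

Because the risk sets never mix patients from different strata, the result is the sum of the ordinary logrank statistics computed stratum by stratum, which is the observation used in the derivation that follows.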

We will use this observation to derive the asymptotic joint distribution of the stratified logrank statistics $U_n(s)$ computed at $s_1 < \cdots < s_K$. We will show heuristically that, under the hypotheses of local alternatives with log hazard ratios, $\theta_{nj}$, for which $\theta_{nj}\sqrt{n\pi_j} \to \gamma_j$, a constant, as $n \to \infty$, the asymptotic distribution of

$$D = \frac{1}{\sqrt{n}} (U_n(s_1), \ldots, U_n(s_K))$$

for $s_1 < \cdots < s_K$ is a multivariate normal distribution with means

$$\mu(s_k) = \sum_j \gamma_j \sqrt{\pi_j}\, \sigma_Z^2 \Pr(\Delta_{ij}(s_k) = 1)$$

and covariances

$$\sigma(s_k, s_{k'}) = \sum_j \pi_j \sigma_Z^2 \Pr(\Delta_{ij}(s_k) = 1)$$

for any $k \le k'$, that is, with independent increments.

The argument of the heuristic proof is based on two results for the asymptotic distributions of $D_j$, $j = 1, \ldots, J$, conditional on $\hat{\pi}_{nj} = n_j/n$, where

$$D_j = \frac{1}{\sqrt{n_j}} (U_{nj}(s_1), \ldots, U_{nj}(s_K))$$

for $s_1 < \cdots < s_K$. The first result, derived by Tsiatis,¹ gives the asymptotic distribution of $D_j$ for each stratum, which is the sequentially computed ordinary logrank statistic.

Result 1

Under the hypotheses of local alternatives with log hazard ratios, $\theta_{nj}$, for which $\theta_{nj}\sqrt{n\pi_j} \to \gamma_j$, a constant, as $n \to \infty$, the asymptotic distribution of $D_j$, conditional on $\hat{\pi}_{nj}$ such that $\hat{\pi}_{nj} \to \pi_j$ as $n \to \infty$, is a multivariate normal distribution with means

$$\mu_j(s_k) = \gamma_j \sigma_Z^2 \Pr(\Delta_{ij}(s_k) = 1)$$

and covariances

$$\sigma_j(s_k, s_{k'}) = \sigma_Z^2 \Pr(\Delta_{ij}(s_k) = 1)$$

for any $k \le k'$. This result indicates that the asymptotic joint distribution of the sequentially computed ordinary logrank statistics is a multivariate normal distribution with independent increments. The second result shows the conditional independence among the $D_j$'s.

Result 2

Conditional on $\hat{\pi}_{nj}$, $j = 1, \ldots, J$, $D_j$ and $D_{j'}$ are independent for any $j \ne j'$.

Since $\hat{\pi}_{nj}$ converges in probability to $\pi_j$, we can now assert that the two previous results hold for the unconditional distribution of the $D_j$ as well. Therefore, the asymptotic distribution of

$$D' = \sum_j \sqrt{\pi_j}\, D_j$$

is a multivariate normal distribution with means $\mu(s_k) = \sum_j \gamma_j \sqrt{\pi_j}\, \sigma_Z^2 \Pr(\Delta_{ij}(s_k) = 1)$ and covariances $\sigma(s_k, s_{k'}) = \sum_j \pi_j \sigma_Z^2 \Pr(\Delta_{ij}(s_k) = 1)$ for any $k \le k'$. Finally, noting that the statistic $D$ becomes identical to $D'$ when $\pi_j$ is substituted for $n_j/n$, the asymptotic distribution of $D$ follows directly from the results of Sellke and Siegmund⁴ and Slud.⁵ That is, the asymptotic joint distribution of the sequentially computed stratified logrank statistics is a multivariate normal distribution with independent increments, hence enabling us to use the standard results for repeated significance testing of independently and identically distributed normal random variables.

ACKNOWLEDGEMENTS

The author wishes to thank Drs. Brigitte Cheuvart and Robert Gray for critical review of the original manuscript and the editor for numerous suggestions that led to an improvement in presentation and clarity of the manuscript. This research was supported in part by Grant CA-52733 from the National Cancer Institute.

REFERENCES

1. Tsiatis, A. A. 'The asymptotic joint distribution of the efficient scores test for the proportional hazards model calculated over time', Biometrika, 68, 311-315 (1981).
2. Tsiatis, A. A. 'Repeated significance testing for a general class of statistics used in censored survival analysis', Journal of the American Statistical Association, 77, 855-861 (1982).
3. Gail, M. H., DeMets, D. L. and Slud, E. V. 'Simulation studies on increments of the two-sample logrank score test for survival time data, with application to group sequential boundaries', in Johnson, R. A. and Crowley, J. (eds), Survival Analysis, IMS Lecture Notes-Monograph Series 2, Institute of Mathematical Statistics, Hayward, California, 1982, pp. 287-301.
4. Sellke, T. and Siegmund, D. 'Sequential analysis of the proportional hazards model', Biometrika, 70, 315-326 (1983).
5. Slud, E. V. 'Sequential linear rank tests for two-sample censored survival data', Annals of Statistics, 12, 551-571 (1984).
6. DeMets, D. L. and Gail, M. H. 'Use of log rank tests and group sequential methods at fixed calendar times', Biometrics, 41, 1039-1044 (1985).
7. Tsiatis, A. A., Rosner, G. L. and Tritchler, D. L. 'Group sequential tests with censored survival data adjusting for covariates', Biometrika, 72, 365-373 (1985).
8. Lan, K. K. G. and Lachin, J. M. 'Implementation of group sequential logrank tests in a maximum duration trial', Biometrics, 46, 759-770 (1990).
9. Kim, K. and Tsiatis, A. A. 'Study duration for clinical trials with survival response and early stopping rule', Biometrics, 46, 81-92 (1990).
10. Schoenfeld, D. A. and Tsiatis, A. A. 'A modified log rank test for highly stratified data', Biometrika, 74, 167-175 (1987).
11. Cox, D. R. 'Regression models and life tables (with discussion)', Journal of the Royal Statistical Society, Series B, 34, 187-220 (1972).
12. Lan, K. K. G. and DeMets, D. L. 'Discrete sequential boundaries for clinical trials', Biometrika, 70, 659-663 (1983).
13. Kim, K. and DeMets, D. L. 'Design and analysis of group sequential tests based on the type I error spending rate function', Biometrika, 74, 149-154 (1987).
14. DeMets, D. L. and Ware, J. H. 'Group sequential methods for clinical trials with a one-sided hypothesis', Biometrika, 67, 651-660 (1980).
15. DeMets, D. L. and Ware, J. H. 'Asymmetric group sequential boundaries for monitoring clinical trials', Biometrika, 69, 661-663 (1982).
16. Whitehead, J. The Design and Analysis of Sequential Clinical Trials, Ellis Horwood, Chichester, 1983.
17. Emerson, S. S. and Fleming, T. R. 'Symmetric group sequential test designs', Biometrics, 45, 905-923 (1989).
18. Armitage, P., McPherson, C. K. and Rowe, B. C. 'Repeated significance tests on accumulating data', Journal of the Royal Statistical Society, Series A, 132, 235-244 (1969).
19. McPherson, C. K. and Armitage, P. 'Repeated significance tests on accumulating data when the null hypothesis is not true', Journal of the Royal Statistical Society, Series A, 134, 15-25 (1971).
20. Kim, K. and DeMets, D. L. 'Sample size determination for group sequential clinical trials with immediate response', Statistics in Medicine, 11(10), 1391-1399 (1992).
21. Rubinstein, L. V., Gail, M. H. and Santner, T. J. 'Planning the duration of a comparative clinical trial with loss to follow-up and a period of continued observation', Journal of Chronic Diseases, 34, 469-479 (1981).
22. O'Brien, P. C. and Fleming, T. R. 'A multiple testing procedure for clinical trials', Biometrics, 35, 549-556 (1979).
23. Pocock, S. J. 'Group sequential methods in the design and analysis of clinical trials', Biometrika, 64, 191-199 (1977).
24. Bernstein, D. and Lagakos, S. W. 'Sample size and power determination for stratified clinical trials', Journal of Statistical Computation and Simulation, 8, 65-73 (1978).
25. Harrington, D. P. and Fleming, T. R. 'A class of rank test procedures for censored survival data', Biometrika, 69, 553-566 (1982).
26. Wang, S. K. and Tsiatis, A. A. 'Approximately optimal one-parameter boundaries for group sequential tests', Biometrics, 43, 193-199 (1987).
