Original Article J Med Screen 2015, Vol. 22(2) 65–68 © The Author(s) 2015 Reprints and permissions: sagepub.co.uk/journalsPermissions.nav DOI: 10.1177/0969141315577847 msc.sagepub.com

A note on the design of cancer screening trials

Stephen W Duffy1 and Robert A Smith2

Abstract

Objectives: To investigate the consequences of different cancer screening trial designs and follow-up options for the accuracy of the estimated effect of screening on disease-specific mortality.

Methods: We consider a randomized trial of breast cancer screening with a screening phase in which the intervention group is offered screening and the control group is not, and optional further follow-up after this screening phase. Postulating a lead time effect similar to that observed in breast cancer screening trials, we calculate the observed relative risk of disease-specific mortality and compare this with the true relative risk, for four design options: (1) no follow-up beyond the screening phase, i.e. the screening phase and the observation period are identical; (2) follow-up continuing beyond the screening phase, with all cancer-specific deaths counted, including those from cancers diagnosed after the screening phase; (3) follow-up continuing beyond the screening phase, but with only deaths from cancers diagnosed during the screening phase included; and (4) follow-up continuing beyond the screening phase, with a single screen of the control group conducted at the end of the screening phase, and only deaths from cancers diagnosed in both arms up to completion of the single control screen included.

Results: All designs in which follow-up for mortality continues beyond the screening phase incurred a bias against screening. The design in which the control group undergoes a single screen at the end of the screening phase was the least biased in the example used.

Conclusions: The expedient of a single screen of the control group at the end of the screening phase has acceptable accuracy, but is still slightly conservatively biased.

Keywords: Cancer screening, trial design, follow-up

Date received: 29 October 2014; accepted: 25 February 2015

1 Professor of Cancer Screening, Wolfson Institute of Preventive Medicine, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, Charterhouse Square, London EC1M 6BQ, UK
2 Senior Director, Cancer Screening, American Cancer Society Inc, 250 Williams St, Atlanta GA 30303, USA
Corresponding author: Stephen W Duffy, Wolfson Institute of Preventive Medicine, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, Charterhouse Square, London EC1M 6BQ, UK. Email: [email protected]

Introduction

Randomized trials of cancer screening can involve challenging design choices.1,2 Ideally, a population would be randomized to the offer of regular screening (intervention group) or to usual care (control group) for a long period of time. Deaths during that period from cancers diagnosed during that period would constitute the trial endpoint. We are here considering screening tests such as mammography or faecal occult blood testing, which are primarily aimed at detection of cancer at an earlier and more treatable stage, and not tests such as flexible sigmoidoscopy or cervical smear testing, which have a major effect of detection and removal of pre-malignancies, and therefore prevention of cancer, albeit while also detecting some cancers early.3,4

Often, research resources are not available for the screening to continue for long enough to yield sufficient endpoint events for adequate statistical power. In this case, a number of options are available. One is to make screening available to the intervention group for a limited period of time, but to extend the period of observation beyond the screening phase. In this design, the intervention group is invited to screening during the earlier period, but not the later, and the control group is never invited to screening. This will entail a conservative bias because in the later period, after the screening phase has ended, neither the intervention nor the control group is receiving screening, so there will be no screening effect on deaths from the cancers diagnosed during this period.

A second possibility is to screen the intervention group for a limited period and, as in the previous example, never screen the control group. Mortality is measured in the follow-up period, but only deaths from cancers diagnosed during the screening phase are included. This too will be conservatively biased, as the intervention group will include deaths from cancers that would have been diagnosed after the end of the screening phase if no screening had taken place, but were diagnosed during the screening phase as a result of lead time.5 Their counterparts in the control group are diagnosed after the screening phase, due to the absence of lead time in this unscreened population, and are therefore excluded from the analysis.

A third option is to screen the intervention group for a limited period and, at the end of that period, offer one round of screening to the control group. In subsequent follow-up for cancer mortality, deaths are included from all cancers diagnosed in both groups between randomization and the end of the single round of screening in the control group. This design too will dilute the effect of screening, as it includes cancers diagnosed in a period when both groups are receiving the same screening intervention.

In the material following, we demonstrate these conservative biases more formally, with a numerical example.

Methods

Screening designs and endpoints

Suppose we propose a trial to screen for a cancer which, in the absence of screening, has annual incidence rate r, and for which the annual case fatality rate from the time of symptomatic diagnosis is p. Suppose further that, on average, the offer of screening changes this case fatality rate (only of those cancers that would occur symptomatically, and from the date when they would have been diagnosed with symptoms) to θp, where θ < 1. Thus the true relative risk of mortality from the specific cancer conferred by the offer of screening is θ. This will be an average effect, as those who do not take up the offer of screening will receive no benefit, whereas those who do will receive a greater benefit, i.e. a relative risk smaller than θ.

If the screening works by diagnosing the cancer at an earlier stage, it must confer a lead time, i.e. the screen-detected cancers are diagnosed some time before they would have been diagnosed in the absence of screening. Suppose that at the end of the screening phase, the intervention arm has an additional rate of 2r cancers: 0.8r from those that would have arisen without screening during the year following the end of screening, 0.6r from the next year, 0.4r from the third year after screening ceases, and 0.2r from the fourth year. To demonstrate the effect of lead time, we posit this rather specific effect, but note that its magnitude is similar to that observed in the breast screening trials.6 These are cancers that would have arisen in any case. The screening may also detect some cancers that are overdiagnosed, but these will not contribute to mortality from the disease.
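The lead-time assumption can be written down compactly. The following is a minimal sketch in Python, not part of the original analysis; the names `LEAD_TIME_FRACTIONS`, `advanced_cases` and `remaining_symptomatic` are hypothetical, and only the fractions 0.8, 0.6, 0.4 and 0.2 come from the text.

```python
# Illustrative names only; the fractions are those posited in the text.
# Key: number of years after the end of the screening phase in which the
# cancer would otherwise have surfaced symptomatically.
# Value: fraction of that year's incidence r already detected by screening.
LEAD_TIME_FRACTIONS = {1: 0.8, 2: 0.6, 3: 0.4, 4: 0.2}


def advanced_cases(r):
    """Additional cancers in the intervention arm at the end of screening."""
    return r * sum(LEAD_TIME_FRACTIONS.values())


def remaining_symptomatic(r):
    """Cancers still surfacing symptomatically in each post-screening year."""
    return {year: r * (1 - frac) for year, frac in LEAD_TIME_FRACTIONS.items()}


# In units of r, the advanced cases add up to the 2r stated in the text.
print(round(advanced_cases(1.0), 6))
```

The complement returned by `remaining_symptomatic` is the part of each post-screening year's incidence that is not brought forward, which is what carries the unscreened case fatality in the extended follow-up designs.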


Design 1: Identical screening and observation periods

If our screening and observation periods are the same, say 15 years following randomization, then in the control group, approximating the time of diagnosis in any given year as the mid-point of that year, the observed overall rate of deaths from the relevant cancer will be

$$D_c = \sum_{i=1}^{15} r(15 - i + 0.5)\,p = 112.5\,rp$$

In the intervention group, it will be

$$D_i = \sum_{i=1}^{15} r(15 - i + 0.5)\,\theta p = 112.5\,\theta rp = \theta D_c$$

Thus, this design will yield an unbiased estimated relative risk of RR1 = θ. The first mortality report of the Nottingham trial of faecal occult blood testing (FOBT) for colorectal cancer used this design and analysis,7 and the Swedish Two-County Trial's first mortality results pertained mainly to the period before the control group was screened, and therefore correspond to this scenario.8
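The coefficient 112.5 is simply the sum of the follow-up years available to cancers diagnosed at the mid-point of each of the 15 years. A short Python check, given as a sketch with hypothetical function names, is:

```python
def follow_up_years(i, horizon=15):
    """Years at risk of death for a cancer diagnosed mid-way through year i."""
    return horizon - i + 0.5


def design1_coefficient(horizon=15):
    """Total follow-up years over diagnosis years 1..horizon.

    Multiplying by r*p gives the control-arm death rate; multiplying by
    theta*r*p gives the intervention-arm death rate.
    """
    return sum(follow_up_years(i, horizon) for i in range(1, horizon + 1))


print(design1_coefficient())  # 112.5
```

Because the same coefficient multiplies both arms, it cancels in the ratio, which is why this design recovers the true relative risk exactly.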

Design 2: Extended observation period, all cancers and deaths included

If, on the other hand, screening is offered to the intervention group for the first five years, but cancers and deaths are observed in both groups up to fifteen years, we will have in the control group D_c as before, but in the intervention group, the overall rate of deaths from the cancer will be

$$
\begin{aligned}
D_i &= \sum_{i=1}^{5} r(15 - i + 0.5)\,\theta p \\
&\quad + (0.8\,r\theta p + 0.2\,rp) \times 9.5 + (0.6\,r\theta p + 0.4\,rp) \times 8.5 \\
&\quad + (0.4\,r\theta p + 0.6\,rp) \times 7.5 + (0.2\,r\theta p + 0.8\,rp) \times 6.5 \\
&\quad + \sum_{i=10}^{15} r(15 - i + 0.5)\,p \\
&= 33\,rp + 79.5\,\theta rp = rp\,(33 + 79.5\,\theta)
\end{aligned}
$$

This will give a relative risk of

$$RR_2 = \frac{79.5\,\theta + 33}{112.5}$$

This will exceed θ, and will therefore be biased against the screening effect. The primary analysis of extended follow-up of the Nottingham FOBT trial used this design and analysis.9
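The two coefficients and the direction of the bias can be verified by hand. The following worked arithmetic is a sketch that writes θ for the true relative risk and uses only the quantities defined above.

```latex
% Terms carrying the screened case fatality (theta p):
\[
62.5 + 0.8(9.5) + 0.6(8.5) + 0.4(7.5) + 0.2(6.5) = 62.5 + 17 = 79.5
\]
% Terms carrying the unscreened case fatality (p):
\[
0.2(9.5) + 0.4(8.5) + 0.6(7.5) + 0.8(6.5) + \sum_{i=10}^{15} (15 - i + 0.5) = 15 + 18 = 33
\]
% Size of the bias:
\[
RR_2 - \theta = \frac{79.5\,\theta + 33 - 112.5\,\theta}{112.5}
             = \frac{33\,(1 - \theta)}{112.5} > 0 \quad \text{for } \theta < 1
\]
```

The bias is therefore proportional to 1 − θ: the larger the true benefit, the more it is understated.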


Design 3: Extended observation period, including only deaths from cancers diagnosed during the screening phase

Now suppose that screening is offered to the intervention group for five years, but never to the control group, and both groups are followed up for 15 years, but only deaths from cancers diagnosed in each group during the first five years are included. In the control group, the overall rate of deaths from the relevant cancer will be

$$D_c = \sum_{i=1}^{5} r(15 - i + 0.5)\,p = 62.5\,rp$$

In the intervention group, the overall rate will be

$$D_i = \left[\sum_{i=1}^{5} r(15 - i + 0.5) + r\,(0.8 \times 9.5 + 0.6 \times 8.5 + 0.4 \times 7.5 + 0.2 \times 6.5)\right]\theta p = 79.5\,\theta rp$$

and the observed relative risk will be

$$RR_3 = \frac{79.5\,\theta}{62.5} = 1.27\,\theta$$

Again, this is greater than θ, and hence is conservatively biased. The analysis of extended follow-up of the Health Insurance Plan of Greater New York Breast Screening Trial used this strategy.10
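The source of the factor 1.27 is that the intervention arm's numerator includes the lead-time cancers while the control arm's denominator does not. A minimal Python sketch of that bookkeeping follows; the constant and variable names are hypothetical, and θ denotes the true relative risk.

```python
HORIZON = 15          # years of observation after randomization
SCREENING_YEARS = 5   # length of the screening phase

# Year in which the cancer would have surfaced without screening, mapped to
# the fraction of that year's incidence pulled into the screening phase.
ADVANCED = {6: 0.8, 7: 0.6, 8: 0.4, 9: 0.2}


def follow_up_years(i):
    """Years at risk for a cancer that would surface mid-way through year i."""
    return HORIZON - i + 0.5


in_phase = sum(follow_up_years(i) for i in range(1, SCREENING_YEARS + 1))
lead_time = sum(frac * follow_up_years(year) for year, frac in ADVANCED.items())

control_coeff = in_phase                   # multiplies r * p
intervention_coeff = in_phase + lead_time  # multiplies theta * r * p
ratio = intervention_coeff / control_coeff

print(round(control_coeff, 1), round(lead_time, 1), round(ratio, 3))
# 62.5 17.0 1.272
```

The observed relative risk is `ratio` times θ, so under these assumptions the design would show no benefit at all once θ exceeds roughly 0.79.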

Design 4: Extended observation period with an exit screen of the control group

In this design, screening is offered to the intervention group for five years and to the control group once, at the closure of the screening period, so that the single screening round of the control group concludes at the end of the fifth year. Only deaths in both groups from cancers diagnosed up to the end of the control group screen are included. The purpose of this design is to confer approximately the same number of additional lead time cases in the control group as in the intervention group. The overall cancer mortality in the control group will be

$$D_c = \sum_{i=1}^{5} r(15 - i + 0.5)\,p + r\,(0.8 \times 9.5 + 0.6 \times 8.5 + 0.4 \times 7.5 + 0.2 \times 6.5)\,\theta p = 62.5\,rp + 17\,\theta rp$$

In the intervention arm, the overall mortality will be

$$D_i = \left[\sum_{i=1}^{5} r(15 - i + 0.5) + r\,(0.8 \times 9.5 + 0.6 \times 8.5 + 0.4 \times 7.5 + 0.2 \times 6.5)\right]\theta p = 79.5\,\theta rp$$

as in design 3 above. Thus the observed relative risk will be

$$RR_4 = \frac{79.5\,\theta rp}{62.5\,rp + 17\,\theta rp} = \frac{(62.5 + 17)\,\theta}{62.5 + 17\,\theta}$$

Again, this is larger than θ. The extended follow-up of the Swedish Two-County, Gothenburg and Stockholm screening trials used this estimation strategy.11,12
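That the exit-screen design remains conservative, though less so than design 3, follows in two lines. The derivation below is a sketch using only the expressions above, with θ denoting the true relative risk.

```latex
% Conservative bias: RR_4 exceeds theta whenever theta < 1.
\[
RR_4 - \theta
  = \frac{79.5\,\theta - \theta\,(62.5 + 17\,\theta)}{62.5 + 17\,\theta}
  = \frac{17\,\theta\,(1 - \theta)}{62.5 + 17\,\theta} > 0
  \quad \text{for } 0 < \theta < 1
\]
% Comparison with design 3: same numerator, larger denominator.
\[
RR_4 = \frac{79.5\,\theta}{62.5 + 17\,\theta} < \frac{79.5\,\theta}{62.5} = RR_3
\]
```

The exit screen adds the term 17θ to the control-arm denominator, which offsets most, but not all, of the lead-time excess in the intervention arm.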

Fictitious example

Suppose we are screening for a cancer with annual incidence of 3 per thousand and unscreened case-fatality of 2%, and that the offer of screening reduces this (in real, non-overdiagnosed tumours) to 1.5%, i.e. a true relative risk of θ = 0.75. Suppose also the same designs and lead time effects as above. These parameters are not dissimilar to those observed in some of the breast cancer screening trials.12,13 The four options above would result in relative risk estimates of RR1 = 0.75, RR2 = 0.82, RR3 = 0.95, and RR4 = 0.79. Thus, the only unbiased design is the one where the screening period and the observation period are the same. The other designs are all conservatively biased, with the smallest bias being observed where an exit screen of the control group takes place.
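These four figures can be reproduced directly from the design formulas. The Python sketch below uses a hypothetical function name, `observed_relative_risks`, and takes the true relative risk θ as its only input, since the incidence r and case fatality p cancel out of every ratio.

```python
def observed_relative_risks(theta):
    """Observed relative risks for designs 1 to 4 under the stated assumptions.

    theta is the true relative risk. The constants 112.5, 79.5, 62.5, 33
    and 17 are the follow-up coefficients derived for each design.
    """
    return {
        "RR1": theta,
        "RR2": (79.5 * theta + 33) / 112.5,
        "RR3": 79.5 * theta / 62.5,
        "RR4": 79.5 * theta / (62.5 + 17 * theta),
    }


for name, rr in observed_relative_risks(0.75).items():
    print(f"{name} = {rr:.2f}")
# RR1 = 0.75
# RR2 = 0.82
# RR3 = 0.95
# RR4 = 0.79
```

Calling the function with other values of θ shows how the ranking of the biased designs can shift. With θ = 0.5, for instance, the same formulas place design 3 (about 0.64) marginally below design 2 (about 0.65), whereas at θ = 0.75 design 3 is the most biased of the three.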

Discussion

The examples above demonstrate that design expedients in screening trials in which the observation period exceeds the screening period all tend to be conservatively biased. In our example, the least biased design was number 4, in which a single screen is offered to the control group at the end of the screening phase, and only cancers diagnosed in both groups up to the end of screening of the control group are followed up for cause-specific mortality during the observation period. This design may not always be the least conservatively biased. The order of inaccuracy will depend on the relative size of the mortality benefit and the magnitude of lead time effects. However, the fact that all designs with observation beyond the screening period are conservative is generalizable.

It should be noted that the design of cancer screening trials also has implications for estimation of lead times and overdiagnosis rates. This is not dealt with here, but is the subject of ongoing research.

Our conclusions about design number 4 also depend on the magnitude of lead time at the end of the screening period being equal in both groups, which, in turn, will depend on the amount of actual screening exposure in both groups just before the end of the screening period. There may be some mismatch of timing between the last screen of the intervention group and the single screen of the control group. A reasonable check on whether the lead time effects are equal is given by the difference between the groups with respect to cumulative incidence of cancer at the end of the control screen. If this difference is small, then the lead time effects are likely to be the same in each group, which should be the case if randomization was effective, and the conservative nature of design number 4 will hold. The Swedish Two-County Trial is a case in point. Before the screen of the control group, there was an excess incidence in the intervention group.8 Incidence in both groups equalized immediately upon conclusion of the single screen of the control group.6

Concerns have been expressed in the past about the policy of a single screen of the control group at the end of the screening period in cancer screening trials.14 The examples above indicate that with incidence, fatality, and lead time parameters typical of the breast screening trials, this methodological approach is the least conservative of several design expedients, but will still tend to underestimate the benefit of screening.

Funding

This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.

References

1. Moss S. Design issues in cancer screening trials. Stat Methods Med Res 2010;19:451–61.
2. Smith RA, Duffy SW, Gabe R, Tabar L, Yen AMF, Chen HHT. The randomized trials of breast cancer screening: what have we learned? Radiol Clin N Amer 2004;42:793–806.
3. Sasieni P, Adams J, Cuzick J. Benefit of cervical screening at different ages: evidence from the UK audit of screening histories. Br J Cancer 2003;89:88–93.
4. Atkin WS, Edwards R, Kralj-Hans I, Wooldrage K, Hart AR, Northover JM, Parkin DM, Wardle J, Duffy SW, Cuzick J. Once-only flexible sigmoidoscopy screening in prevention of colorectal cancer: a multicentre randomised controlled trial. Lancet 2010;375:1624–33.
5. Njor S, Nyström L, Moss S, Paci E, Broeders M, Segnan N, Lynge E, Euroscreen Working Group. Breast cancer mortality in mammographic screening in Europe: a review of incidence-based mortality studies. J Med Screen 2012;19(Suppl 1):33–41.
6. Duffy SW, Agbaje O, Tabar L, Vitak B, Bjurstam N, Björneld L, Myles JP, Warwick J. Estimates of overdiagnosis from two trials of mammographic screening for breast cancer. Breast Cancer Research 2005;7:258–65.
7. Hardcastle JD, Chamberlain JO, Robinson MH, Moss SM, Amar SS, Balfour TW, James PD, Mangham CM. Randomised controlled trial of faecal-occult-blood screening for colorectal cancer. Lancet 1996;348:1472–7.
8. Tabar L, Fagerberg CJ, Gad A, et al. Reduction in mortality from breast cancer after mass screening with mammography: randomised trial from the Breast Cancer Screening Working Group of the Swedish National Board of Health and Welfare. Lancet 1985;i:829–32.
9. Scholefield JH, Moss SM, Mangham CM, Whynes DK, Hardcastle JD. Nottingham trial of faecal occult blood testing for colorectal cancer: a 20-year follow-up. Gut 2012;61:1036–40.
10. Shapiro S. Periodic screening for breast cancer: the HIP randomized controlled trial. Monogr Natl Cancer Inst 1997;22:27–30.
11. Tabar L, Vitak B, Chen THH, Yen AMF, Cohen A, Tot T, Chiu SYH, Chen SLS, Fann JCY, Rosell J, Fohlin H, Smith RA, Duffy SW. Swedish Two-County Trial: impact of mammographic screening on breast cancer mortality during three decades. Radiol 2011;260:658–63.
12. Nyström L, Andersson I, Bjurstam N, Frisell J, Nordenskjöld B, Rutqvist LE. Long term effects of mammography screening: updated overview of the Swedish randomised trials. Lancet 2002;359:909–19.
13. Duffy SW, Yen A, Chen T, Chen S, Chiu S, Fan J, Smith RA, Vitak B, Tabar L. Long term benefits of breast screening. Breast Cancer Management 2012;1(1):31–38.
14. Gøtzsche PC. Relation between breast cancer mortality and screening effectiveness: systematic review of the mammography trials. Dan Med Bull 2011;58:A426.

