
AMERICAN JOURNAL OF OPHTHALMOLOGY

was for each standard therapy, there would not be this great dichotomy between informed consent for research and informed consent for practice. Somewhere between is a reasonable informed consent for both.

MAY, 1975

WHY DO WE NEED CONTROLS? WHY DO WE NEED TO RANDOMIZE?

FRED EDERER, M.A.

Bethesda, Maryland

Mr. Ederer is head, Section of Clinical Trials and Natural History Studies, National Eye Institute, Bethesda, Maryland. Presented at the National Eye Institute Workshop on Randomized Controlled Clinical Trials, Washington, D.C., Nov. 6, 1973. Reprint requests to Mr. Fred Ederer, National Eye Institute, National Institutes of Health, Bethesda, MD 20014.

A controlled clinical trial is a scientific experiment. In the ideal scientific experiment, there is an experimental group and a control group, that is, a comparison group, and the two groups are identical in every respect except for the experimental treatment. Then any difference in outcome must be attributable to the difference in treatments.

Most clinical trials in the history of medicine have been uncontrolled. The usual practice has been to treat a series of patients with a new treatment and then form an impression, or opinion, as to whether the results are excellent, good, same as before, or bad. If the results are excellent or good, they are more likely to be published than if they are indifferent or bad. Negative findings lack glamour.

When a treatment is dramatically effective, no controls may be needed. The first case treated with penicillin in this country provided strong indications of its remarkable therapeutic potential.¹ But how many penicillins have we had? Usually we are concerned with much smaller effects. While the truth may eventually emerge from a series of uncontrolled trials, this is at best an inefficient process. And the danger is that the truth may never emerge.

Photocoagulation treatment to prevent blindness from diabetic retinopathy² and screening for early diagnosis of cervical cancer are two widely used, costly procedures, the values of which remain inadequately substantiated. Of the latter, Geoffrey Rose³ recently said:

    It is now clear that a definite answer to this major question could only have been obtained by a controlled trial of early diagnosis by cervical smear versus clinical diagnosis of invasive disease. Such a trial seems no longer possible either ethically or practically. We are now committed to continuing a costly screening service and shall never know whether it is doing little or much to control the fatal forms of the disease.

CLINICAL TRIALS

VOL. 79, NO. 5

I would like to give three examples illustrating the difficulty in drawing conclusions from uncontrolled studies. The first (Table 1) is from a study of antihistamines in the treatment of the common cold.⁴ Of colds under one day's duration, 13.4% were cured and 68.2% were cured or improved. Is that good, bad, or indifferent? Without controls, the investigator would have to search his memory about past results in untreated cases. In place of objective evidence, we would have a subjective, intuitive impression, that is, one person's opinion as to whether the results are excellent, good, or no better than before. Fortunately, the study in question did use controls (randomized controls, at that) in the form of placebos. The placebo results were essentially the same as the antihistamine results.

TABLE 1
COLDS UNDER ONE DAY'S DURATION*

                  Cured, %    Cured or Improved, %
Antihistamine       13.4              68.2
Placebo             13.9              64.7

* Results on second day following treatment.

The second example comes from the National Diet-Heart Study, which attempted to assess the influence of several randomized, double-blind diets on serum cholesterol (Table 2).⁵ The results from two of the participating centers showed percent serum cholesterol changes in middle-aged men after the start of two somewhat different experimental diets, B and C, and a control diet, D. Without the control diet we would be tempted to conclude that the diets were more effective in Minneapolis-St. Paul than in Oakland. But after subtracting the control results, the net experimental effect is about the same in the two cities: 8 or 9%. The cause of the greater Minneapolis-St. Paul drops in all groups was never explained. It might have been some coincidental occurrence, such as a seasonal influence.

TABLE 2
NATIONAL DIET-HEART STUDY*

Diet    Minneapolis-St. Paul    Oakland
B             -14.7              -11.0
C             -15.5              -10.9
D              -7.3               -1.8

* Mean percent serum cholesterol change.

From results of another double-blind diet-heart study conducted over an eight-year period it appears that the experimental diet became more and more effective as the study progressed (Figure).⁶ But the people on the control diet also showed a progressive serum cholesterol decrease over the eight years. We may not know how to explain these decreases. The important point is that a true measure of the effectiveness of an experimental treatment is obtained not from the treatment alone, but by comparing it with a control.

Examples of these kinds led Professor Hugo Muench of Harvard University to formulate his Second Law: "Results can always be improved by omitting controls."⁷

The last example again illustrates how coincidental occurrences, in the absence of controls, can lead to a misinterpretation of an observed effect.⁸ During World War II rescue workers, digging in the ruins of an apartment house blown up in the London blitz, found an old man lying naked in a bathtub, fully conscious. He said to his rescuers: "You know, that was the most amazing experience I ever had. When I pulled the plug and the water started down the drain, the whole house blew up."

The next question is: Why do we need to randomize? Before answering we should explain what randomization means. It means that the choice of treatment for each unit

(for example, patient, eye) should be made by an independent act of randomization, such as the toss of a coin or the use of a table of random numbers. We randomize because it prevents any possibility of bias in assigning treatments and it is by far the best method to date to make the groups we want to compare as similar as possible. But why take blind luck? Why not match the groups on important prognostic factors, such as age, sex, and severity of disease? The answer is that we can match only on factors we know, or believe, to be important. We cannot match on factors we do not know about. Secondly, we can only match on factors we can observe or measure. We cannot match on factors we cannot observe or measure. Randomization tends to match on all factors, prognostic and nonprognostic, observable and not observable, measurable and not measurable.

[Figure: serum cholesterol plotted against years from start of diet; one curve is labeled CONTROL; mean difference = 12.7%.]
Figure (Ederer). Changes in serum cholesterol in a controlled, double-masked diet trial. (From Dayton and associates.⁶)

The Coronary Drug Project illustrates how effective randomization is in matching groups (Table 3).⁹ The percentage at baseline with ten given findings is nearly identical

in the Atromid-S and placebo groups. The publication lists 44 such findings.

Suppose we do not randomize. What kinds of trouble does that get us into? A number of alternatives to randomization have been used.

TABLE 3
CORONARY DRUG PROJECT
Percent with Given Baseline Findings

Finding                                        Atromid-S   Placebo
Age >45                                           85.2       85.2
Race, nonwhite                                     6.5        6.8
Risk group 2 (high risk)                          33.8       34.3
>2 previous myocardial infarctions                18.0       19.9
>12 months since last myocardial infarction       85.3       86.1
Electrocardiogram, Class 1 or 2 Q waves           83.6       83.3
Definite angina pectoris                          47.1       46.5
T-wave                                            50.7       49.5
Relative body weight >1.00                        87.9       87.9
Cigarette smokers                                 39.0       37.9
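The balance shown in Table 3 can be imitated with a small simulation. The following is a minimal sketch, not the Coronary Drug Project's actual allocation procedure: it assumes a fair coin toss per patient, and the prevalences in `PREVALENCE` (including the stand-in `unmeasured_factor`) are hypothetical values chosen only for illustration.

```python
import random

# Hypothetical baseline prevalences, chosen only for illustration.
# "unmeasured_factor" stands for a prognostic factor nobody recorded.
PREVALENCE = {
    "age_over_45": 0.85,
    "cigarette_smoker": 0.38,
    "unmeasured_factor": 0.30,
}


def randomize_patients(n_patients, seed=0):
    """Assign each simulated patient to an arm by an independent coin toss."""
    rng = random.Random(seed)
    arms = {"treatment": [], "placebo": []}
    for _ in range(n_patients):
        # Draw the patient's baseline findings first, then toss the coin.
        patient = {name: rng.random() < p for name, p in PREVALENCE.items()}
        arm = "treatment" if rng.random() < 0.5 else "placebo"
        arms[arm].append(patient)
    return arms


def percent_with(group, finding):
    """Percentage of patients in a group who have the given finding."""
    return 100.0 * sum(patient[finding] for patient in group) / len(group)


if __name__ == "__main__":
    arms = randomize_patients(8000, seed=1975)
    for finding in PREVALENCE:
        t = percent_with(arms["treatment"], finding)
        p = percent_with(arms["placebo"], finding)
        print(f"{finding:20s} treatment {t:5.1f}%   placebo {p:5.1f}%")
```

Because the coin toss ignores every patient characteristic, the unmeasured factor tends to come out about as evenly divided as the measured ones, which is the property that matching cannot provide.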


One of these is to use unplanned controls. Some patients are given a drug, and some are not. The drug may be given to clinic patients and private patients may serve as controls, or vice versa. Whatever the method of selection, the basic question is whether the two groups are comparable, and with unplanned controls this question cannot ever be answered affirmatively with assurance.

A sea captain was given samples of antinausea pills to test during a voyage. The need for controls was carefully explained to him. Upon return of the ship, the captain reported the results enthusiastically. "Practically every one of the controls was ill, and not one of the subjects had any trouble. Really wonderful stuff." A skeptic asked how he had chosen the controls and the subjects. "Oh, I gave the stuff to my seamen and used the passengers as controls."¹⁰

Not only are unplanned controls hazardous, but when a randomization procedure is developed, it is essential to adhere strictly to it. The investigator is sometimes tempted to depart from the randomization procedure, particularly when he doubles as therapist.

The 1930 Lanarkshire milk experiment, which involved 20,000 school children, was a test of the effect of extra milk on their height and weight. Selection of the 10,000 subjects and 10,000 controls was left to the teachers, in some cases "by ballot," in others on an alphabetical system. But there was an escape clause: the teachers were allowed to "improve" the selections if they considered them unbalanced. It was discovered afterward that sympathies had biased the selections so that at the outset the controls were heavier and taller than the experimental subjects. As a result, the experiment ". . . failed to produce a valid estimate of the advantage of giving milk to children."¹¹

Studies have been done in which patients are alternated between treatments A and B or in which selection is determined by whether the hospital number or the day of the month is even or odd. While many clinicians believe these methods to be random, in fact they are not. A post-World War II multi-clinic trial of anticoagulant therapy used the day-of-the-month method, and it was discovered that more patients than expected were admitted on odd days. The investigators reported that "as physicians observed the benefits of anticoagulant therapy, they speeded up, where feasible, the hospitalization of those patients . . . who would routinely have been hospitalized on an even day in order to bring as many as possible under the odd-day deadline."¹²

Another method is to select controls from among those who refuse the treatment. Again we must ask: Are the two groups comparable in all relevant respects? It has been shown that volunteers and nonvolunteers tend to differ in a number of ways.¹³

Still another method that has been used is to use one treatment in one center, and another treatment in a second center. The Diet-Heart Study example illustrated the kinds of difficulties we can encounter (Table 2). Hospitals differ in kinds of patients, in kinds of doctors, and in kinds of ancillary care, but these differences are often not obvious. Suppose diet B had been administered at the Oakland Center and C at the Twin Cities Center. It would have been concluded that C is more effective than B.

Sometimes differences are subtle and escape notice. I have another antinausea example. Two drugs and a placebo were tested in several boats. It was argued that all boats were alike and that they would run parallel courses. But the designers of the experiment insisted that all remedies be used on each boat, and this is how it was in fact done. The results were interesting: they showed that all the men on one boat had lower illness rates for each remedy than the men on the other boat. It turned out that this boat carried a different ballast. No one would have bothered to look for the ballast difference if the boat had contained a single remedy.

Another favorite method is the historical control. A new treatment is tried on a series of patients, and case records are drawn from the same clinic for a previous period when the old treatment was used. Again, the fundamental assumption is that the two series are alike in all relevant respects, and this is impossible to prove. Many things can change over time, and often these changes are subtle and difficult to detect.¹⁴⁻¹⁶

When coronary care units were introduced in the 1960s, attempts were made to compare mortality from coronary heart disease in these units with previous experience in the same hospitals. But referral practice had changed: more, and undoubtedly different, types of patients were admitted after the coronary care units were opened. In ophthalmology, similar changes in referral patterns can occur gradually as an ophthalmologist becomes better known or more rapidly after he acquires a new instrument, such as a laser.

When conditions permit, a patient may be used as his own control, and this is usually an advantageous experimental design. In the case of monocular treatment, one eye may be treated while the fellow eye serves as a control. Here, not the patients, but the eyes are randomized. When treatment is binocular, it may be possible to apply one treatment for several weeks or months, and then switch the patient to another treatment for an equivalent period. Here the order of treatment is randomized.
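The two own-control designs just described can be sketched in a few lines. This is an illustration under stated assumptions, not a procedure taken from the article: the function names `randomize_eyes` and `randomize_crossover`, the patient identifiers, and the treatment labels "A" and "B" are all hypothetical, and a seeded pseudorandom generator stands in for the coin or the table of random numbers.

```python
import random


def randomize_eyes(patient_ids, seed=None):
    """Monocular treatment: pick the treated eye for each patient at random.

    The fellow eye serves as the control.
    """
    rng = random.Random(seed)
    plan = {}
    for pid in patient_ids:
        treated = rng.choice(("right", "left"))
        control = "left" if treated == "right" else "right"
        plan[pid] = {"treated_eye": treated, "control_eye": control}
    return plan


def randomize_crossover(patient_ids, treatments=("A", "B"), seed=None):
    """Binocular treatment: pick at random which treatment comes first."""
    rng = random.Random(seed)
    first, second = treatments
    plan = {}
    for pid in patient_ids:
        if rng.random() < 0.5:
            plan[pid] = (first, second)
        else:
            plan[pid] = (second, first)
    return plan


if __name__ == "__main__":
    ids = ["P01", "P02", "P03", "P04"]
    print(randomize_eyes(ids, seed=11))
    print(randomize_crossover(ids, seed=11))
```

Fixing the seed makes the allocation list reproducible for audit. In either design the assignment is made independently for each patient, so neither the investigator nor the patient chooses the eye or the order.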


REFERENCES

1. Weinstein, L.: Antibiotics. 2. Penicillin. In Goodman, L. S., and Gilman, A. (eds.): The Pharmacological Basis of Therapeutics. New York, MacMillan, 1965, pp. 1193-1195.
2. Ederer, F., and Hiller, R.: Clinical trials, diabetic retinopathy, and photocoagulation. A reanalysis of five studies. Survey Ophthalmol. 19:267, 1975.
3. Rose, G.: Early diagnosis of chronic disease. Br. J. Hosp. Med. 1971, vol. 6.
4. Medical Research Council: Clinical trials of antihistaminic drugs in the prevention and treatment of the common cold. Br. Med. J. 2:425, 1950.
5. National Diet-Heart Study Research Group: The National Diet-Heart Study final report. Circulation 37(Suppl. 1):1, 1968.
6. Dayton, S., Pearce, M. L., Hashimoto, S., Dixon, W. J., and Tomiyasu, U.: A controlled clinical trial of a diet high in unsaturated fat. Circulation 39(Suppl. 2):1, 1969.
7. Bearman, J. E., Loewenson, R. B., and Gullen, W. H.: Muench's Postulates, Laws, and Corollaries. Biometrics Note No. 4, Office of Biometry and Epidemiology, National Eye Institute, April 1974.
8. Chalmers, T. C.: Science versus ethics in human drug trials. Problems and solutions. In Proger, S. (ed.): The Medicated Society. New York, MacMillan, 1968, pp. 181-203.
9. Coronary Drug Project Research Group: The Coronary Drug Project. Design, Methods, and Baseline Results. Am. Heart Assoc. Monograph No. 38. New York, Am. Heart Assoc., 1973.
10. Wilson, E. B.: An Introduction to Scientific Research. New York, McGraw-Hill, 1952, p. 42.
11. "Student": The Lanarkshire milk experiment. Biometrika 23:398, 1931.
12. Wright, I. S., Marple, C. D., and Beck, D. F.: Myocardial Infarction. Its Clinical Manifestations and Treatment with Anticoagulants. New York, Grune and Stratton, 1954, pp. 9-11.
13. Crocetti, A.: Volunteering in Medical Research. Doctoral dissertation. Baltimore, Johns Hopkins University, 1970.
14. Merrel, M.: Clinical therapeutic trial of a new drug. Bull. Johns Hopkins Hosp. 85:223, 1949.
15. Mainland, D.: Statistical ward rounds. Clin. Pharmacol. Ther. 8:876, 1967.
16. Hill, A. B.: Clinical trials. In Principles of Medical Statistics. New York, Oxford University Press, 1971, chap. 20, pp. 251-52.

DISCUSSION

DR. BERNARD SCHWARTZ: Randomization, isn't that a function of the size of the sample which you are testing?

MR. EDERER: If you have two treatments to compare, you can randomize with two patients. I am not recommending this as an adequate sample size, but there is no minimum required for randomization.

Randomized controlled clinical trial. National Eye Institute workshop for ophthalmologists. Why do we need controls? Why do we need to randomize?
