READERS’ COMMENTS

Stepwise Transgression

The documentation manual1 for one of the most highly regarded software implementations of stepwise regression analysis contains the following “Note of Caution”:

Stepwise variable selection can potentially be abused. When many variables are being examined, stepwise methods can easily find significant factors even when no real associations with the dependent variable exist. . . . In general, if there are m observations for the least frequent category of a binary response variable, you should not examine more than m/10 variables in order to derive a model that is somewhat reliable.

Two recent studies failed to heed this scrap of stepwisdom.2,3 In each, the investigators examined a large number of candidate variables relative to a small number of outcomes (Table I). This transgression materially degrades the reliability of the resultant conclusions in several ways. First, there is a high probability that variables identified as “important” by stepwise regression are not really the important ones.4-6 Second, there is a high probability that the resultant model is overfitted to the particular population from which it derives, and that it will thereby perform poorly in prospective application.

A striking illustration of such overfitting is provided in the analysis of an old industrial quality control problem.7 The engineers performing this analysis first identified 16 factors they considered to be potential determinants of quality for the production process they were studying. They expressed each of these factors as a continuous variable, along with its reciprocal, its square and the reciprocal of its square, and then performed a stepwise regression analysis using the 64 raw and derived variables as input, and the observed service life for 22 production batches as outcome. The resultant model explained 80% of the variance in service life. However, when the engineers verified their results by repeating the analysis on a set of data generated completely at random, the new model, based on 21 of the 64 fictitious variables, explained 99.9969% of the variance. Actually, this isn’t really all that surprising.

[W]e can usually devise a model that will fit the data perfectly; but in general we will find that in doing so we have invoked as many structures . . . in the model as we have data points . . . . In the process, the model-fitting has succeeded brilliantly at a technical level but has failed totally in the scientific endeavor. . . We have replaced a set of twenty data by an equation containing twenty parameters.8

George A. Diamond, MD
Los Angeles, California
10 April 1989
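The random-data check described in the stepwise regression letter can be reproduced in spirit with a few lines of Python. What follows is only an illustrative sketch under stated assumptions: it relies on NumPy, it fills the 64 candidate columns with independent Gaussian noise rather than the engineers' raw and derived factors, and it uses plain greedy forward selection on R² as the stepwise rule, because the letter does not say which selection criterion was used. The helper names `forward_stepwise_r2` and `guideline_limit` are hypothetical. The last lines apply the quoted m/10 guideline to the patient and outcome counts given in Table I.

```python
import numpy as np


def forward_stepwise_r2(X, y, n_steps):
    """Greedy forward selection: at each step add the column that most
    reduces the residual sum of squares of an ordinary least-squares fit
    (with intercept). Returns the chosen column indices and R^2 per step."""
    n, p = X.shape
    tss = float(np.sum((y - y.mean()) ** 2))  # total sum of squares
    chosen, r2_path = [], []
    for _ in range(n_steps):
        best_j, best_rss = None, np.inf
        for j in range(p):
            if j in chosen:
                continue
            # Design matrix: intercept + already chosen columns + candidate j
            A = np.column_stack([np.ones(n), X[:, chosen + [j]]])
            coef, *_ = np.linalg.lstsq(A, y, rcond=None)
            rss = float(np.sum((y - A @ coef) ** 2))
            if rss < best_rss:
                best_j, best_rss = j, rss
        chosen.append(best_j)
        r2_path.append(1.0 - best_rss / tss)
    return chosen, r2_path


def guideline_limit(patients, outcomes):
    """Quoted m/10 guideline: m is the count in the least frequent
    category of the binary response."""
    m = min(outcomes, patients - outcomes)
    return m / 10


# Pure noise: 22 "batches", 64 candidate variables, no real association.
rng = np.random.default_rng(0)
n_batches, n_candidates = 22, 64
X = rng.normal(size=(n_batches, n_candidates))
y = rng.normal(size=n_batches)

chosen, r2_path = forward_stepwise_r2(X, y, n_steps=21)
print(f"R^2 after 5 variables:  {r2_path[4]:.4f}")
print(f"R^2 after 21 variables: {r2_path[-1]:.6f}")

# m/10 guideline applied to the counts reported in Table I
# (study, patients, outcomes, univariate candidates examined;
#  the last count is a lower bound in the source).
table_1 = [("Stuckey", 68, 23, 22), ("Rogers", 218, 68, 47), ("Rogers", 135, 41, 58)]
for study, patients, outcomes, examined in table_1:
    limit = guideline_limit(patients, outcomes)
    print(f"{study}: guideline allows about {limit:.1f} variables, {examined} examined")
```

With 22 observations, an intercept plus 21 selected columns gives 22 free parameters, so the final fit is essentially exact whatever the data contain. This is the same situation as the quoted passage about replacing twenty data by an equation containing twenty parameters.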

TABLE I Patients, Outcomes and Candidate Variables in Stepwise Regression

                                     No. of Candidate Variables
Study       Patients    Outcomes     Univariate    Multivariate
Stuckey2        68          23            22             22
Rogers3        218          68            47              9
Rogers3        135          41*          >58†            15

* This value is estimated from the overall outcome frequency (68 of 218 = 31%); † the total number of angiographic candidate variables was not reported.

1. SUGI Supplemental Library User’s Guide. Fifth ed. SAS Institute: Cary, North Carolina; 1986:280.
2. Stuckey TD, Burwell LR, Nygaard TW, Gibson RS, Watson DD, Beller GA. Quantitative exercise thallium-201 scintigraphy for predicting angina recurrence after percutaneous transluminal coronary angioplasty. Am J Cardiol 1989;63:517-521.
3. Rogers WJ, Bourge RC, Papapietro SE, Wackers FJT, Zaret BL, Forman S, Dodge HT, Robertson TL, Passamani ER, Braunwald E. Variables predictive of good functional outcome following thrombolytic therapy in the Thrombolysis in Myocardial Infarction Phase II (TIMI II) pilot study. Am J Cardiol 1989;63:503-512.
4. Harrell FE Jr, Lee KL, Califf RM, Pryor DB, Rosati RA. Regression modelling strategies for improved prognostic prediction. Stat Med 1984;3:143-152.
5. Ferguson JG, Pollock BH, Work JW, Diamond GA. How does sample size affect the reproducibility of a clinical prediction rule? Clin Res 1987;35:344A.
6. Diamond GA. Penny wise. Am J Cardiol 1988;62:806-808.
7. Mayer RP, Stowe RA. Would you believe 99.9969% explained? Industr Eng Chem 1969;61:42-46.
8. Murphy EA. A Companion to Medical Statistics. Baltimore: Johns Hopkins University Press, 1985:139-140.

Doppler Measurement of Posterior Left Ventricular Wall Velocity

It has been stated that the half-life of medical knowledge is 6 years and that consequently references to publications >3 half-lives may be unfashionable. However, the statement by Isaaz et al1 that “no attempt has been made to analyze these low Doppler shift frequencies produced by the moving heart wall” reminded me of an article I published only 17 years ago.2 The reference is easily retrievable from the National Library of Medicine’s database by entering the words Doppler and wall. The fact that the velocities obtained in the 2 reports are in a similar range proves the French saying that the more things change, the more they stay the same.

John B. Kostis, MD
New Brunswick, New Jersey
7 July 1989

1. Isaaz K, Thompson A, Ethevenot G, Cloez JL, Brembilla B, Pernot C. Doppler echocardiographic measurement of low velocity motion of the left ventricular posterior wall. Am J Cardiol 1989;64:66-75.
2. Kostis JB, Mavrogeorgis E, Slater A, Bellet S. Use of a range-gated, pulsed ultrasonic Doppler technique for continuous measurement of velocity of the posterior heart wall. Chest 1972;62:597-604.

Comparison of Standard 12-Lead and Modified Exercise Electrocardiograms

We were interested in the article by Sevilla et al1 comparing the standard 12-lead with the exercise electrocardiogram, and pleased to see that their conclusions are essentially identical to ours.2 We have also reported that the widely used Mason-Likar exercise lead system3 and the standard 12-lead electrocardiogram are not “essentially identical” as was originally claimed and that movement of the limb electrodes onto the torso, as is necessary for exercise stress testing, so distorts the “inferior” leads that they no longer reflect the inferior cardiac surface in isolation.4 In fact, further work from our department, as yet unpublished, suggests that the so-called “inferior” leads of the exercise electrocardiogram are more “anterior” than “inferior.”

We agree that the exercise electrocardiogram should be identified as being recorded from torso-based limb electrode locations, either by being labeled “modified” as we suggested, or “torso-based” as suggested by Sevilla et al, so that changes in the inferior leads of such a recording are not necessarily taken to imply disease on the inferior cardiac surface.

We would like to point out that the torso locations used by Sevilla et al are not those originally described by Mason and Likar; nor did other workers such as Diamond,5 Rautaharju,6 Gamble7 and their co-workers, quoted by Sevilla et al in their article, use the prescribed Mason-Likar torso electrode locations. Kleiner et al8 are the only investigators who used the prescribed Mason-Likar locations; the other groups each used their own modifications. In a small survey of British centers we found a wide variation in the location of the torso-based limb electrodes and the same proba

Letters (from the United States) concerning a particular article in the Journal must be received within 2 months of the article’s publication, and should be limited (with rare exceptions) to 2 double-spaced typewritten pages. Two copies must be submitted.

THE AMERICAN JOURNAL OF CARDIOLOGY APRIL 15, 1990

1047
