Mild Senile Dementia of the Alzheimer Type. 4. Evaluation of Intervention

Leonard Berg, MD,* J. Philip Miller, AB,† Jack Baty, BA,† Eugene H. Rubin, MD, PhD,‡ John C. Morris, MD,*§ and Gary Figiel, MD‡

The design of trials of interventions intended to slow or arrest the progression of senile dementia of the Alzheimer type must be based on analysis of the natural history of the disease. Using a random coefficients statistical model, we analyzed the natural history of senile dementia of the Alzheimer type in carefully defined subjects with mild disease (n = 68) for periods of up to 10 years. Subject performance was assessed longitudinally on batteries of clinical and psychometric measures. The characteristics of these measures were analyzed with respect to their utility as outcome measures for long-term trials in patients with senile dementia of the Alzheimer type. Estimates were made of the sample sizes required to show arrest, or 50% or 25% slowing, in the progression of mild disease. We suggest that a clinically relevant global measure, such as the Sum of Boxes of the Clinical Dementia Rating scale, and a performance-based clinical scale or psychometric measure would be appropriate in a 12- or 24-month trial enrolling subjects with mild senile dementia of the Alzheimer type.

Berg L, Miller JP, Baty J, Rubin EH, Morris JC, Figiel G. Mild senile dementia of the Alzheimer type. 4. Evaluation of intervention. Ann Neurol 1992;31:242-249

Clinical trials are in progress using a multitude of drugs [1, 2] for temporary symptomatic improvement of the cognitive impairment characteristic of senile dementia of the Alzheimer type (SDAT). The stage is set for trials of interventions to slow the progression of the disease. The potential use of nerve growth factor has received the most attention [3]. Among many issues central to planning trials of such an intervention are the degree of diagnostic certainty, variability among patients, appropriate characteristics of outcome measures, and biostatistical considerations related to the power of the study, that is, the probability of detecting a true change in the natural history of the disease. Some of these issues were addressed in part by recent publications [1, 4, 5]. In 1988 our group (the Washington University Memory and Aging Project) wrote three articles [6-8] dealing with the longitudinal assessment, testability, and implications for therapeutic intervention in a sample of persons with SDAT whose dementia was staged as mild at entry into the study. In this article we extend those observations by incorporating data from subsequent observations on those subjects, pooling those data with data of a replication sample [9], and widening the scope by comparing data from the enlarged sample with those from subjects only minimally impaired with “questionable” or “very mild” SDAT [10-12].

Storandt and colleagues [13] recently reported on the findings of a 10-year study of the progression of mild SDAT in our first sample, based on a battery of psychometric tests. The present article differs from that study in that data on brief quantitative clinical measures of dementia are included and the subject sample has been enlarged. Furthermore, we explore the utility of recent advances in statistical techniques appropriate for the evaluation of interventions designed to alter the course of a progressive disease such as SDAT. These advances permit the utilization of data on all subjects regardless of the length of time they have measurable scores. We address these general questions regarding such intervention studies: (1) What are appropriate outcome measures? (2) What are the characteristics of the sample that should be enrolled? (3) What are the necessary characteristics of such a trial (e.g., sample size, length of trial, frequency of assessment)?

From the Alzheimer’s Disease Research Center, the *Department of Neurology and Neurological Surgery (Neurology), the †Division of Biostatistics, and the ‡Departments of Psychiatry and §Pathology (Neuropathology), Washington University School of Medicine, St Louis, MO.

Received Apr 30, 1991, and in revised form Jul 22. Accepted for publication Jul 23, 1991.


Address correspondence to Dr Berg, Campus Box 8111, ADRC, Washington University School of Medicine, 660 S Euclid, St Louis, MO 63110.

Copyright © 1992 by the American Neurological Association

Materials and Methods

Diagnostic Criteria and Staging
Our clinical diagnostic criteria [14] for SDAT are consistent with (but more stringent than) the criteria for “probable Alzheimer’s disease” recommended by the National Institute of Neurological and Communicative Disorders and Stroke/Alzheimer’s Disease and Related Disorders Association (NINCDS/ADRDA) Work Group [15] and those for “primary degenerative dementia” recommended by the American Psychiatric Association [16]. These sets of criteria have satisfactory reliability [17] and validity [18-20]. In our consecutive series of postmortem examinations on research subjects diagnosed by our criteria, the diagnosis of Alzheimer’s disease was confirmed in 85 of 88 patients (as of June 30, 1991) by finding sufficient numbers of senile plaques and neurofibrillary tangles to satisfy accepted criteria [21].

The Washington University Clinical Dementia Rating (CDR) stages the cognitive decline as none (CDR 0), questionable or very mild (CDR 0.5), mild (CDR 1), moderate (CDR 2), or severe (CDR 3). The CDR is derived according to published rules [22, 23] from ratings of impairment assigned by the clinician in each of six cognitive domains after a 90-minute interview of the subject and a knowledgeable collateral source. The six domains are (1) memory; (2) orientation; (3) judgment and problem solving; and cognitive function in (4) community affairs, (5) home and hobbies, and (6) personal care. Impairment in each of the six domains is rated as none (0), slight (0.5), mild (1), moderate (2), or severe (3).

Samples
Subjects with mild (CDR 1) SDAT were recruited [14] in two waves, 5 years apart. Published analyses [7, 9] showed no differences between the first and second samples with respect to performance on either clinical or psychometric measures. There were 31 men and 37 women in the combined CDR 1 sample. Other demographic data have been provided in detail in previous articles [6, 9] and are summarized in Table 1. The methods of recruitment, steps for obtaining informed consent for all subjects, and all investigative procedures were approved by the Institutional Review Board of Washington University School of Medicine.

Measures
The ratings in each of the six cognitive domains (“Boxes”) from which the CDR was derived were summed to provide an additional global clinical measure, the Sum of Boxes [6] (maximum impairment = 6 x 3 = 18). The Sum of Boxes provides an expansion of the CDR scale. The interrater reliability of the overall CDR and the Sum of Boxes has been good (Kendall’s τb correlation coefficient = .91 and .90, respectively) [24].

The clinical interview included several brief assessment instruments: the behavioral checklist Dementia Scale of Blessed and associates [25], derived from a collateral source; the Dementia Scale/Cognitive (that portion of the Dementia Scale that assesses cognitive function); the Information-Memory-Concentration performance test (IMC) of Blessed and colleagues [25] (modified by Fuld [26]); a six-item shorter version (IMC-6), validated by Katzman and associates [27]; and the Short Portable Mental Status Questionnaire of Pfeiffer [28]. Our Aphasia Battery [29] is based on verbal tasks in the Boston Diagnostic Aphasia Evaluation [30]. On all of these clinical measures, a higher score means more impairment.

Subjects were also tested with a psychometric battery, described in detail elsewhere [31]. We report here analyses involving only selected psychometric measures based on our experience in assessing the rate of decline on psychometric test scores [13]. We chose three subtests of the Wechsler Memory Scale [32] (digit span forward, logical memory, and recall of easy pairs of associate learning), the information and digit symbol subtests of the Wechsler Adult Intelligence Scale (WAIS) [33], two forms (C and D) of the Benton Visual Retention Test [34], and the Boston Naming Test [35]. Greater impairment is indicated by higher scores on the two forms of the Benton Visual Retention Test (both scored as number of errors) and by lower scores on all the other psychometric tests. The Blessed IMC and IMC-6 tests were used beginning with the enrollment of the second wave of subjects, and results are reported only for these subjects. All other clinical and psychometric measures were applied to all subjects.

Table 1. Demographic Characteristics and Scores at Entry (Means ± Standard Deviations) (a)

Age at entry (yr)                      72.1 ± 5.0
Education (yr)                         12.2 ± 1.7
Social position (b)                     3.3 ± 1.2
Clinical measures (c)
  Sum of Boxes (0-18)                   6.2 ± 1.4
  DS (0-28)                             5.4 ± 2.8
  DSC (0-17)                            3.3 ± 1.4
  IMC (0-31)                           15.4 ± 4.9
  IMC-6 (0-28)                         18.7 ± 5.0
  SPMSQ (0-10)                          5.7 ± 2.2
  AB (0-35)                             4.1 ± 4.0
Psychometric measures (c)
  WMS digit span forward (0-8)          5.8 ± 1.1
  WMS logical memory (0-23)             1.7 ± 1.8
  WMS associate learning (0-24)        10.8 ± 4.2
  WAIS information (0-29)               8.8 ± 5.5
  WAIS digit symbol (0-90)             20.2 ± 12.0
  Benton VRT Form C (0-65)             16.8 ± 5.2
  Benton VRT Form D (0-10)              3.5 ± 4.1
  Boston Naming Test (0-60)            28.5 ± 15.9

(a) Higher scores indicate greater impairment on all of the clinical measures and on the two forms of the Benton VRT. Lower scores indicate greater impairment on all the other psychometric measures.
(b) As classified by Hollingshead [42].
(c) Numbers in parentheses are the range of possible scores.
WMS = Wechsler Memory Scale; WAIS = Wechsler Adult Intelligence Scale; VRT = Visual Retention Test; DS = Blessed Dementia Scale; DSC = Blessed Dementia Scale/Cognitive; IMC = Blessed Information-Memory-Concentration test; IMC-6 = its 6-item shorter version; SPMSQ = Short Portable Mental Status Questionnaire; AB = Aphasia Battery.
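The Sum of Boxes described under Measures is a plain sum of the six domain ratings. The sketch below illustrates that arithmetic only; the function name, the dictionary keys, and the example ratings are hypothetical and do not come from the article, and the published rules for deriving the overall CDR [22, 23] are not reproduced here.

```python
# Hypothetical helper; names and example ratings are illustrative assumptions.
CDR_DOMAINS = (
    "memory",
    "orientation",
    "judgment_and_problem_solving",
    "community_affairs",
    "home_and_hobbies",
    "personal_care",
)
ALLOWED_RATINGS = (0, 0.5, 1, 2, 3)  # none, slight, mild, moderate, severe


def sum_of_boxes(box_ratings):
    """Sum the six domain ("box") ratings; the result ranges from 0 to 18."""
    missing = [d for d in CDR_DOMAINS if d not in box_ratings]
    if missing:
        raise ValueError(f"missing domain ratings: {missing}")
    for domain in CDR_DOMAINS:
        if box_ratings[domain] not in ALLOWED_RATINGS:
            raise ValueError(f"invalid rating for {domain}: {box_ratings[domain]}")
    return sum(box_ratings[d] for d in CDR_DOMAINS)


# Made-up subject: mild impairment in five domains, slight in personal care.
example = {
    "memory": 1,
    "orientation": 1,
    "judgment_and_problem_solving": 1,
    "community_affairs": 1,
    "home_and_hobbies": 1,
    "personal_care": 0.5,
}
print(sum_of_boxes(example))
```

Because each box can take five values, the sum spreads subjects over a finer scale than the five-level CDR, which is the sense in which it expands the CDR scale.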

Times of Testing
Clinical and psychometric testing was performed at entry and at 15, 34, 50, and 66 months after entry, and every 12 months thereafter. Survivors in the first sample had completed their 10th time of testing by January 1991. Survivors in the second sample had completed their fourth or fifth time of testing.
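The testing schedule above can be written out explicitly. This is a minimal sketch that assumes the nominal schedule was followed exactly; the function name is hypothetical, and actual visit dates for individual subjects would have varied.

```python
def visit_months(n_visits):
    """Return nominal months after entry for the first n_visits times of testing."""
    early = [0, 15, 34, 50, 66]  # entry plus the four irregularly spaced follow-ups
    months = early[:n_visits]
    while len(months) < n_visits:
        months.append(months[-1] + 12)  # annual testing after month 66
    return months


print(visit_months(10))
```

Under this nominal schedule the 10th time of testing falls 126 months after entry, which is in line with the follow-up of up to 10 years described for the first sample.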

Berg et al: Intervention in SDAT

Annals of Neurology Vol 31 No 3 March 1992

Statistical Analyses
In order to address most effectively the issues and questions cited in the introduction, we chose a growth curve model that allows one to make use of all relevant information. The data for each measure were fit to the random coefficients model of Laird and Ware [36]. Although its mathematical formulation is somewhat different, the underlying model is essentially the same as that for a traditional univariate repeated measures analysis of variance (ANOVA) [37, 38]. The ANOVA framework, however, is difficult to apply when observations are not available for all subjects at all time points, as was true here. The Laird and Ware model is appropriate for dealing with groups of regression lines. As Feldman pointed out in a useful review [39], models of this type utilize familiar designs, such as ANOVA, but the hypotheses and designs relate to regression lines rather than to single observations and can deal with random variation of entry scores and rates of progression among subjects. These methods also attend to “random effects,” that is, unmeasured, uncontrolled sources of variability. Ignoring random effects can increase the chance of declaring statistical significance in error because the pooling of intrasubject with intersubject variability falsely reduces the estimate of the error variance.

Fig 1. Schematic representation of scores on a single measure for a single subject over time. Scores indicated by open circles are those obtained during the course of declining performance. Those indicated by open diamonds are discarded perfect scores before decline. Those indicated by open squares are discarded scores obtained after maximum impairment on the measure has been reached. (Horizontal axis: time.)

In the Laird and Ware model, a unique linear regression line is fit for the scores of each individual; that is:

$$y_{i,j} = a_i + b_i t_{i,j} + e_{i,j},$$

where $y_{i,j}$ represents the score of the $i$th individual at the $j$th time point, $a_i$ represents the intercept for the $i$th individual (score at entry), $b_i$ represents the slope for the $i$th individual (the rate of progression for that individual), $t_{i,j}$ represents the time from enrollment to the $j$th time point for the $i$th individual, and $e_{i,j}$ represents a random error (noise) term having a mean of 0.0 and no correlation with other terms in the model. The model assumes that $a_i$ (intercept) and $b_i$ (slope) are random variables distributed normally. Using the program of the original authors [40], restricted maximum likelihood (REML) estimates of the mean and standard deviation of the slope ($b_i$) and intercept ($a_i$) terms, their correlation, and $\sigma_e$ (the standard deviation of the noise term, $e_{i,j}$) were calculated. These estimates are similar to those obtained with a variance components analysis from a traditional ANOVA model and require an iterative numerical solution, that is, a series of repetitively refined estimates of the parameters.

Before calculating the REML estimates, two appropriate modifications in the data set were introduced. Whenever a subject was untestable on a given measure because of cognitive impairment, a score indicating maximum impairment was assigned. With respect to the 295 visits of the 68 subjects, this was necessary an average of 15 times for each clinical measure (range, 0-32) and an average of 61 times for each psychometric measure (range, 43-91). A visit was defined as any time of testing for which a Sum of Boxes could be assigned. Many of the measurements exhibited floor or ceiling effects. In order to avoid bias from those effects in the estimated rates of progression, observations before the last perfect score and those subsequent to the time an individual reached maximal impairment on a measure were discarded (Fig 1). This resulted in discarding an average of 24 of the 295 observations for the clinical variables (range, 2-63) and an average of 43 of the 295 observations for the psychometric variables (range, 25-61). This modification allows for estimation of the rate of progression during the period of declining performance. An examination of the plots of the observed scores during the declining performance of individual subjects verified that a linear trajectory provided a satisfactory fit to the data.
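The trimming rule and the per-subject straight line can be illustrated with a short sketch. It is a simplification under stated assumptions: the function names and the example scores are hypothetical, scores are assumed to be non-empty and in visit order, and the line is fit by ordinary least squares for one subject at a time. It is not the REML random coefficients fit of the Laird and Ware program, which estimates the distribution of intercepts and slopes across all subjects jointly.

```python
def usable_window(scores, perfect, maximum):
    """Return (start, end) indices of the declining-performance window.

    Observations before the last perfect score and after the first
    maximum-impairment score fall outside the window and are discarded.
    """
    # First visit at maximum impairment closes the window (or the last visit).
    end = next((i for i, s in enumerate(scores) if s == maximum), len(scores) - 1)
    # Last perfect score at or before that visit opens the window (or visit 0).
    start = max((i for i in range(end + 1) if scores[i] == perfect), default=0)
    return start, end


def fit_line(times, scores):
    """Ordinary least-squares intercept and slope for one subject.

    Requires at least two distinct time points.
    """
    n = len(times)
    t_bar = sum(times) / n
    y_bar = sum(scores) / n
    sxx = sum((t - t_bar) ** 2 for t in times)
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(times, scores))
    slope = sxy / sxx
    return y_bar - slope * t_bar, slope


# Made-up subject on a 0-18 scale where 0 is perfect and 18 is maximum impairment.
times = [0, 1, 2, 3, 4, 5, 6]     # years since entry
scores = [0, 0, 2, 4, 6, 18, 18]  # an untestable visit would be entered as 18
start, end = usable_window(scores, perfect=0, maximum=18)
intercept, slope = fit_line(times[start:end + 1], scores[start:end + 1])
print(start, end, intercept, slope)
```

For measures on which lower scores mean greater impairment, the same function applies with the perfect and maximum-impairment values swapped accordingly.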

Results
Figure 2 provides survival plots (Kaplan-Meier) to indicate deaths, institutionalizations, and testability on certain clinical and psychometric measures chosen to exemplify the problems of attrition [6] in such a sample. These curves were generated with PROC LIFETEST [41]. Table 1 provides summary data on scores observed at entry for each measure. More details have been published previously [6, 9, 13, 31].

Fig 2. Attrition of the subjects over time. The percentages of the group alive are indicated by plus signs; testable on Sum of Boxes, open diamonds; testable on the Short Portable Mental Status Questionnaire, open squares; testable on both the Wechsler Memory Scale associate learning subtest and the Boston Naming Test, crosses; and not institutionalized, open triangles. (Vertical axis: percent; horizontal axis: months, 0 to 120.)

In Table 2 the REML estimated parameters for each measure are provided: the mean intercept (estimated score at entry) and its standard deviation (SD), the mean slope and its SD, the correlation of the slope and intercept, and $\sigma_e$ (the SD of the scores observed on each individual around the linear trajectory of that person’s scores). The mean slopes are the rates of progression expressed as points of change in a score per year. More favorable performance of an outcome measure is indicated by the combination of a higher ratio of mean slope to its SD and a smaller $\sigma_e$ (e.g., Sum of Boxes, IMC). Measures with substantial floor or ceiling effects at entry or at the end of the study are less desirable as outcome measures. From the data on each measure in Table 2, the mean and SD of the scores at baseline and at 24 months were estimated. By reference to the cumulative normal distribution, the proportion outside the range for each measure (see Table 1) was then estimated to provide data on the floor and ceiling effects (Table 3).

Utilizing the REML estimated parameters from Table 2 and applying the methods of Lefante [43], we calculated the necessary sizes of samples for several study designs and for effect sizes of arresting the progression of the disease and slowing the progression by 50% or 25%. Table 4 shows the sample sizes required in each group, treatment and control, for study durations of 6, 12, and 24 months with a significance level (α) of .05 (one-tailed t test) and power (1 - β) of .80. Sample sizes for 25% slowing are recorded only for the 24-month design because those required for shorter studies are too large to be practical. (There is less confidence in the numbers for the IMC and IMC-6 measures because they are based on a smaller sample, only those 24 subjects recruited in the second wave.) Because these calculations of sample size assume that all subjects will be measured at all time points specified, the expected attrition (see Fig 2) will require increasing the initial sample size. The numbers in the first seven columns of Table 4 were based on a study design in which all subjects are evaluated only at baseline (before initiation of treatment) and again at the end of the study. Testing every 6 months in a 24-month study adds modestly to the efficiency. However, a study design (“enhanced”) that includes interval assessments and multiple evaluations close to the beginning and end of the trial (last two

Table 2. Restricted Maximum Likelihood (REML) Parameter Estimates for the Measures

Columns: Mean Intercept; SD of Intercept; Mean Slope; SD of Slope; Correlation of Slope and Intercept; σe

Measure                        Mean Intercept
Clinical measures
  Sum of Boxes                  6.3
  DS                            6.1
  DSC                           3.6
  IMC                          15.0
  IMC-6                        19.0
  SPMSQ                         5.7
  AB                            3.8
Psychometric measures
  WMS digit span forward        5.9
  WMS logical memory            1.8
  WMS associate learning       11.0
  WAIS information              8.7
  WAIS digit symbol            19.0
  Benton VRT Form C            17.0
  Benton VRT Form D             3.0
  Boston Naming Test           28.0

Values for the remaining columns, in extraction order (row and column positions not recoverable):
.48 .26 .37 .18 .34 -.21 .38
1.9 4.6 2.2 2.5 3.0 1.3 4.5
.16
1.1
-.48
1.1 3.3 2.0 5.8 2.9 4.8 5.8
1.4 1.9 1.2 5.2 5.5 2.0 5.2
2.4 2.4 1.7 3.3 2.8 0.9 4.1
1.1 0.9 0.8 1.3 0.9 0.6 2.6
0.6 1.3 2.8
-A -.3 -2.0 -1.0 -3.0 2.6 4.6 -4.0
0.5 0.2 0.6 0.9 1.8 1.2 3.3 2.7
5.2
12.0 4.9 3.5 15.0
.17 -.44 -.70 .18 .83 -.45

SD = standard deviation; σe = SD of each subject’s scores around the linear trajectory of that subject’s scores; other abbreviations as in Table 1.
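To show how parameters of the kind reported in Table 2 feed a sample size calculation, here is a minimal sketch. It uses a standard normal-approximation formula for comparing mean slopes between two equal groups, with the variance of one subject's least-squares slope taken as the between-subject slope variance plus the noise variance divided by the spread of the assessment times. This is an assumption made for illustration and is not claimed to reproduce Lefante's method [43] or the entries of Table 4. The function names are hypothetical, and the numeric inputs are placeholder values, not a reading of Table 2.

```python
from math import ceil
from statistics import NormalDist


def slope_variance(sd_slope, sd_error, times):
    """Variance of one subject's least-squares slope under a random coefficients model."""
    t_bar = sum(times) / len(times)
    sxx = sum((t - t_bar) ** 2 for t in times)  # spread of the assessment times
    return sd_slope ** 2 + sd_error ** 2 / sxx


def n_per_group(mean_slope, slowing, sd_slope, sd_error, times, alpha=0.05, power=0.80):
    """Subjects per group to detect a fractional slowing of the mean slope (one-tailed)."""
    delta = slowing * mean_slope  # difference in mean slope between the two groups
    z = NormalDist().inv_cdf(1 - alpha) + NormalDist().inv_cdf(power)
    return ceil(2 * z ** 2 * slope_variance(sd_slope, sd_error, times) / delta ** 2)


# Placeholder inputs (points per year); assumptions for illustration only.
MEAN_SLOPE, SD_SLOPE, SD_ERROR = 2.4, 1.1, 1.0
baseline_and_end = [0.0, 2.0]              # 24-month trial, two assessments (years)
every_6_months = [0.0, 0.5, 1.0, 1.5, 2.0]  # same trial with interval assessments

for label, slowing in (("arrest", 1.0), ("50% slowing", 0.5), ("25% slowing", 0.25)):
    two_point = n_per_group(MEAN_SLOPE, slowing, SD_SLOPE, SD_ERROR, baseline_and_end)
    five_point = n_per_group(MEAN_SLOPE, slowing, SD_SLOPE, SD_ERROR, every_6_months)
    print(label, two_point, five_point)
```

Two features of the formula match the qualitative statements in the text: halving the detectable difference roughly quadruples the required sample, and adding interval assessments lowers the requirement only modestly when between-subject variation in slope dominates. Expected attrition would inflate any such figure.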


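Table 3 (floor and ceiling effects; layout not recoverable)
Column groups: Entry (Perfect Score, Maximum Impairment) and a later time point whose heading survives only as “2” (Maximum Impairment, Perfect Score)
Rows: Clinical measures: Sum of Boxes, DS, DSC, IMC, IMC-6, SPMSQ, AB. Psychometric measures: WMS digit span forward, WMS logical memory, WMS associate learning, WAIS information, WAIS digit symbol, Benton VRT Form C, Benton VRT Form D, Boston Naming Test
Cell values in extraction order (positions not recoverable): <1, <1, <1, <1, 7, 1, 4, <1, 30, 5, 1, 1, 15, <1, 1, 6, 1, <1, :c, <1, <1, < :, <1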
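The floor and ceiling estimate described in the Results (mean and SD of scores at a given time, then the proportion of a normal distribution falling outside the possible range) can be sketched as follows. The variance expression follows from the model $y = a + bt + e$ with correlated random intercept and slope; whether the authors included the noise term in the SD is not stated, so including it here is an assumption. The function names are hypothetical, and the inputs are placeholder values rather than Table 2 entries.

```python
from math import sqrt
from statistics import NormalDist


def score_distribution(mean_a, sd_a, mean_b, sd_b, corr_ab, sd_error, t):
    """Normal approximation to the distribution of scores at time t (years)."""
    mean = mean_a + mean_b * t
    variance = (
        sd_a ** 2
        + 2 * t * corr_ab * sd_a * sd_b  # covariance of intercept and slope
        + (t * sd_b) ** 2
        + sd_error ** 2                  # assumption: noise term included
    )
    return NormalDist(mean, sqrt(variance))


def proportion_outside(dist, lowest, highest):
    """Proportions expected below the lowest and above the highest possible score."""
    return dist.cdf(lowest), 1 - dist.cdf(highest)


# Placeholder parameters for a 0-18 scale on which higher scores mean more impairment.
params = dict(mean_a=6.0, sd_a=1.5, mean_b=2.5, sd_b=1.0, corr_ab=0.5, sd_error=1.0)
at_entry = score_distribution(t=0.0, **params)
at_24_months = score_distribution(t=2.0, **params)

print(proportion_outside(at_entry, 0, 18))
print(proportion_outside(at_24_months, 0, 18))
```

On a scale of this kind the first proportion corresponds to subjects at or beyond a perfect score and the second to subjects at or beyond maximum impairment; for measures scored in the opposite direction the two roles are reversed.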

A Work Group under the auspices of Department of Health and Human Services Task Force on Alzheimer’s Disease. Neurology 1984;34:939-944
16. American Psychiatric Association. Diagnostic and statistical manual of mental disorders. 3rd ed, revised. Washington, DC: American Psychiatric Association, 1987
17. Kukull WA, Larson EB, Reifler BV, et al. Interrater reliability of Alzheimer’s disease diagnosis. Neurology 1990;40:257-260
18. Morris JC, McKeel DW Jr, Fulling K, et al. Validation of clinical diagnostic criteria for Alzheimer’s disease. Ann Neurol 1988;24:17-22
19. Tierney MC, Fisher RH, Lewis AJ, et al. The NINCDS-ADRDA Work Group criteria for the clinical diagnosis of probable Alzheimer’s disease: a clinicopathologic study of 57 cases. Neurology 1988;38:359-364
20. Kukull WA, Larson EB, Reifler BV, et al. The validity of 3 clinical diagnostic criteria for Alzheimer’s disease. Neurology 1990;40:1364-1369
21. Khachaturian ZS. Diagnosis of Alzheimer’s disease. Arch Neurol 1985;42:1097-1105
22. Hughes CP, Berg L, Danziger WL, et al. A new clinical scale for the staging of dementia. Br J Psychiatry 1982;140:566-572
23. Berg L. Clinical dementia rating. Psychopharmacol Bull 1988;24:637-639
24. Burke WJ, Miller JP, Rubin EH, et al. Reliability of the Washington University Clinical Dementia Rating (CDR). Arch Neurol 1988;45:31-32
25. Blessed G, Tomlinson BE, Roth M. The association between quantitative measures of dementia and of senile change in the cerebral grey matter of elderly subjects. Br J Psychiatry 1968;114:797-811
26. Fuld PF. Psychological testing in the differential diagnosis of the dementias. In: Katzman R, Terry RD, Bick KL, eds. Alzheimer’s disease: senile dementia and related disorders (aging, vol 7). New York: Raven Press, 1978:185-193
27. Katzman R, Brown T, Fuld P, et al. Validation of a short orientation-memory-concentration test of cognitive impairment. Am J Psychiatry 1982;140:734-739
28. Pfeiffer E. A short portable mental status questionnaire for the assessment of organic brain deficit in elderly patients. J Am Geriatr Soc 1975;23:433-441
29. Faber-Langendoen K, Morris JC, Knesevich JW, et al. Aphasia in senile dementia of the Alzheimer type. Ann Neurol 1988;23:365-370
30. Goodglass H, Kaplan E. The assessment of aphasia and related disorders. 2nd ed. Philadelphia: Lea & Febiger, 1983:Appendix 1-28
31. Storandt M, Botwinick J, Danziger WL, et al. Psychometric differentiation of mild senile dementia of the Alzheimer type. Arch Neurol 1984;41:497-499 (correction, ibid: 820)
32. Wechsler D, Stone CP. Manual: Wechsler Memory Scale. New York: Psychological Corporation, 1973
33. Wechsler D. Manual: Wechsler Adult Intelligence Scale. New York: Psychological Corporation, 1955
34. Benton AL. The Revised Visual Retention Test: clinical and experimental applications. New York: Psychological Corporation, 1963
35. Goodglass H, Kaplan E. Boston Naming Test: scoring booklet. Philadelphia: Lea & Febiger, 1983
36. Laird NM, Ware JH. Random-effects models for longitudinal data. Biometrics 1982;38:963-974
37. Searle SR. Mixed models and unbalanced data: wherefrom, whereat, and whereto? Commun Stat Theory Methods 1988;17:935-968
38. McLean RA, Sanders WL, Stroup WW. A unified approach to mixed linear models. Am Stat 1991;45:54-64
39. Feldman HA. Families of lines: random effects in linear regression analysis. J Appl Physiol 1988;64:1721-1732
40. Stram DO, Laird NM, Ware JH. An algorithmic approach for the fitting of a general mixed ANOVA model appropriate in longitudinal settings. In: Allen DM, ed. Computer science and statistics: the interface. Amsterdam: Elsevier Science, 1986:149-158
41. SAS Institute. SAS/STAT user’s guide, vol 2. Version 6, 4th ed. Cary, NC: SAS Institute, 1989
42. Hollingshead AB. Two factor index of social position. New Haven: Hollingshead, 1957
43. Lefante JJ. The power to detect differences in average rates of change in longitudinal studies. Stat Med 1990;9:437-446
44. Miller JP. Statistical considerations for quantitative techniques in clinical neurology. In: Munsat TL, ed. Quantification of neurologic deficit. Stoneham, MA: Butterworth, 1989:69-84
45. Teri L, Hughes JP, Larson EB. Cognitive deterioration in Alzheimer’s disease: behavioral and health factors. J Gerontol 1990;45:P58-P63
46. Folstein MF, Folstein SE, McHugh PR. “Mini-Mental State”: a practical method for grading the cognitive state of patients for the clinician. J Psychiatr Res 1975;12:189-198
47. van Belle G, Uhlmann RF, Hughes JP, Larson EB. Reliability of estimates of changes in mental status test performance in senile dementia of the Alzheimer type. J Clin Epidemiol 1990;6:589-595
48. Madsen KS, Miller JP, Province MA, et al. The use of an extended baseline period in the evaluation of treatment in a longitudinal Duchenne muscular dystrophy trial. Stat Med 1986;5:231-241
49. Wolfinger R, Tobias R, Sall J. Mixed models: a future direction. In: SAS Institute, ed. Proceedings of the 16th annual SUGI conference. Cary, NC: SAS Institute, 1991:1380-1388
