Abstract. Twenty-three nondepressed patients with DSM-III obsessive-compulsive disorder completed the Yale-Brown Obsessive-Compulsive Scale (Y-BOCS), the Symptom Checklist-90 (SCL-90), and the National Institute of Mental Health Global Obsessive-Compulsive Scale (NIMH-GOCS) once a week for a total of three times during a 2-week medication-free period and 10 times during a 10-week double-blind drug treatment period. The pretreatment test-retest reliabilities were determined for the Y-BOCS, NIMH-GOCS, and the SCL-90-Obsessive-Compulsive Subscale (SCL-90-OCS). Comparisons of the three instruments revealed that the Y-BOCS and the NIMH-GOCS were significantly more reliable than the SCL-90-OCS. Posttreatment correlations were obtained between change scores on the Y-BOCS and NIMH-GOCS and the SCL-90-OCS. Correlations were high and statistically significant for both the Y-BOCS and the NIMH-GOCS, but the correlations of the SCL-90-OCS with the Y-BOCS, NIMH-GOCS, Physician’s Global Rating, and Patient’s Global Rating were poor. The findings suggest that the SCL-90-OCS may not be a sensitive instrument for assessing change in obsessive-compulsive symptoms.

Key Words. Obsessive-compulsive disorder, Yale-Brown Obsessive-Compulsive Scale, rating scales, symptomatology.

The Symptom Checklist-90 (SCL-90; Derogatis et al., 1973) is a self-report clinical rating scale that is oriented toward the symptomatic behavior of psychiatric outpatients. The first four symptom constructs, including the Obsessive-Compulsive Subscale (OCS), have been validated by Derogatis et al. (1974~). Table 1 lists the 10 items that are included in the OC subscale. The SCL-90-OCS is the revised version of the SCL-58-OCS (Derogatis et al., 1974b), which included only eight obsessive-compulsive items instead of 10. Steketee and Doppelt (1986) reported a test-retest intraclass correlation coefficient for the SCL-58-OCS that was below acceptable levels (0.56, p < 0.01; mean length of time from the first to second test = 23.6 days, range = 4-60 days). The test-retest reliability of the SCL-90-OCS based on 425 anxious outpatients was good (r = 0.84) (test-retest interval = 1 week, mean = 1.95, SD = 0.67) (Derogatis et al., 1974~). The items included in the SCL-90-R-OCS (Derogatis, 1977) (revised version) are essentially the same as the ones in the SCL-90-OCS.

In recent years, several authors have used the SCL-58-OCS or SCL-90-OCS to measure clinical changes that occur in patients with obsessive-compulsive disorder during drug treatment studies (Turner et al., 1985; Fontaine and Chouinard, 1986; Perse et al., 1987; Kim and Dysken, 1990).

The Yale-Brown Obsessive-Compulsive Scale (Y-BOCS) was developed in 1986 by Goodman, Rasmussen, and their colleagues (Goodman et al., 1989a, 1989b) to assess not only symptom severity, but especially response to treatment. The reliability (Goodman et al., 1989b) and validity (Goodman et al., 1989~) data of the Y-BOCS show excellent interrater reliability among four raters (n = 40, intraclass correlation coefficient for total Y-BOCS score = 0.98, p < 0.0001), and the Y-BOCS scores were convergent with two out of three other obsessive-compulsive rating scales tested. The National Institute of Mental Health Global Obsessive-Compulsive Scale (NIMH-GOCS; Murphy et al., 1982) has recently been shown to correlate well with the Y-BOCS (n = 20, r = 0.67, p < 0.001) (Goodman et al., 1989~). Significant correlations between the Y-BOCS and the Patient’s Global Rating (PGR) (n = 15, r = 0.73, p < 0.01) and the Physician’s Global Rating (PHGR) (n = 15, r = 0.54, p < 0.05) have also been reported (Kim et al., 1990).

Because of its ease of administration and simplicity of design, the SCL-90 has been used extensively in studies of anxiety as well as obsessive-compulsive disorders. We have not found, however, independent validation studies of the SCL-90-OCS. In the present study, we decided to compare baseline and posttreatment scores of the SCL-90-OCS to those of the Y-BOCS, NIMH-GOCS, PGR, and PHGR.

Suck Won Kim, M.D., is Assistant Professor, Department of Psychiatry, Hennepin County Medical Center and University of Minnesota, Minneapolis, MN. Maurice W. Dysken, M.D., is Professor, Department of Psychiatry, University of Minnesota, and Psychopharmacologist, Geriatric Research, Education, and Clinical Center, Minneapolis VA Medical Center, Minneapolis, MN. Michael Kuskowski, Ph.D., is Psychophysiologist, Geriatric Research, Education, and Clinical Center, Minneapolis VA Medical Center.

Methods

Subjects. Patients were participants in the multicenter clomipramine treatment study for obsessive-compulsive disorder (Clomipramine Collaborative Study Group, 1991). Of the 50 patients who participated in the double-blind study, 26 received clomipramine and 24 received placebo. All patients received placebo during the first 2 weeks. Three patients on active drug had to be terminated from the study because of side effects and were not included in the data analyses. We chose only clomipramine-treated patients so that we would be able to examine the performance of the rating scales during drug treatment. Twenty-three patients (8 men, 15 women) who met DSM-III criteria (American Psychiatric Association, 1980) for obsessive-compulsive disorder and who scored 16 or below on the 17-item Hamilton Rating Scale for Depression (HRSD; Hamilton, 1960) were selected for the study. The first 30 patients (16 were assigned to drug treatment) were also given the Schedule for Affective Disorders and Schizophrenia-Lifetime Version (SADS-L; Spitzer and Endicott, 1978) and met both DSM-III and SADS-L criteria for obsessive-compulsive disorder. The mean age at the time of the study was 36.7 years (SD = 12.6), the mean age at onset of major symptoms was 22.9 years (SD = 7.9), and the mean duration of illness was 13.8 years (SD = 10.7). The mean HRSD score was 8.9 (SD = 2.8), the mean total Y-BOCS score was 26.6 (SD = 4.1), the mean NIMH-GOCS score was 10.4 (SD = 1.7), and the mean SCL-90-OCS score was 2.2 (SD = 0.9) (moderately severe to severe level).

Procedures. The Y-BOCS, NIMH-GOCS, and SCL-90 were administered at baseline, twice during the 2-week single-blind placebo period, and once a week throughout the 10-week clomipramine trial. The PGR and PHGR, which measure change from baseline, were given at the end of the study. The baseline and week 1 Y-BOCS, NIMH-GOCS, and SCL-90-OCS scores were used for the test-retest reliability study. The posttreatment (clomipramine for 10 weeks) rating scale scores were used to evaluate convergent validity. The Y-BOCS, NIMH-GOCS, and PHGR were given on each occasion by one of us (S.W.K.). The SCL-90 and PGR were self-administered.

Table 1. Symptom Checklist-90: Obsessive-Compulsive Subscale items

1. Having to check and double-check what you do.
2. Having to do things very slowly to insure correctness.
3. Your mind going blank.
4. Trouble remembering things.
5. Difficulty making decisions.
6. Trouble concentrating.
7. Worried about sloppiness or carelessness.
8. Feeling blocked in getting things done.
9. Having to repeat the same actions, i.e., counting, washing.¹
10. Unwanted thoughts, etc., that won’t leave your mind.¹

¹ Not included in the earlier Symptom Checklist-58-Obsessive-Compulsive Subscale.

Instruments. The Y-BOCS is a clinician-administered rating scale that assesses intensity of illness, interference, subjective distress, resistance, and control over the past 7 days. The rating scale is divided into obsessive and compulsive sections in which the total scores for each section range from 0 to 20. The maximum possible total score is 40. Patients with mild, moderate, and severe obsessive-compulsive symptoms, respectively, score in the following ranges: 10-20, 20-30, and 30-40. The SCL-90-OCS is a 10-item subscale of the SCL-90 self-rating scale that measures intensity of obsessive-compulsive symptoms during the past 7 days, including the test day. The score for each item ranges from “not at all” (score 0) to “extremely” (score 4). The mean score of the 10 obsessive-compulsive items is then reported (maximum = 4). The NIMH-GOCS is a single-item ordinal rating of the overall severity of obsessive-compulsive symptoms and is rated from 1 (“minimal, within range of normal”) to 15 (“very severe obsessive-compulsive behavior”). Scores of 7-15 represent the usual range of clinical patients with obsessive-compulsive disorder. The PGR and PHGR range from 1 to 7 (1 = very much improved, 2 = much improved, 3 = minimally improved, 4 = unchanged, 5 = minimally worse, 6 = much worse, and 7 = very much worse).

Analyses. The reliabilities of the Y-BOCS, the NIMH-GOCS, and the SCL-90-OCS were measured using the repeated test scores from each subject for the two drug-free testings. The variability of these scores from the subject’s mean pretreatment value was determined and expressed as a pooled standard deviation. This gives an estimate of the repeatability of the test results for a given subject and is known as the test-retest or within-subject standard deviation (SD) (Snedecor and Cochran, 1980). Because the means and ranges of the instruments differ, the coefficient of variation (CV) was computed for each subject to adjust for the different measurement scales and to allow a statistical comparison of reliability among the Y-BOCS, NIMH-GOCS, and SCL-90-OCS. The intraclass correlation (rI; Bartko and Carpenter, 1976) was calculated for each test as another measure of reliability. The Wilcoxon matched-pairs signed-ranks test (Zar, 1984) was applied to the CVs to compare reliabilities among the three scales. Differences between the two testing sessions that were attributable to an order or a learning effect were checked using a nonparametric approach. For the validity study, the differences between the baseline and termination rating scale scores were used. For the PHGR and PGR, the termination scores were used (change from baseline). Correlations between degree of clinical improvement and change in various rating scale scores were compared. A t test for the significance of the difference between dependent correlation coefficients was obtained (Steiger, 1980), and the Bonferroni correction for multiple comparisons was applied. Statistical
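The three reliability indices named in the Analyses paragraph (pooled within-subject SD, per-subject CV, and intraclass correlation) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the scores are made up, the function names are hypothetical, and the one-way random-effects form of the intraclass correlation is assumed because the text does not say which form from Bartko and Carpenter (1976) was used.

```python
import math

def within_subject_sd(scores):
    """Pooled within-subject SD; `scores` holds one tuple of repeated scores per subject."""
    ss_within = sum(sum((x - sum(s) / len(s)) ** 2 for x in s) for s in scores)
    df_within = sum(len(s) - 1 for s in scores)
    return math.sqrt(ss_within / df_within)

def subject_cvs(scores):
    """Coefficient of variation (SD / mean) per subject; assumes positive subject means."""
    cvs = []
    for s in scores:
        mean = sum(s) / len(s)
        sd = math.sqrt(sum((x - mean) ** 2 for x in s) / (len(s) - 1))
        cvs.append(sd / mean)
    return cvs

def intraclass_correlation(scores):
    """One-way random-effects ICC (an assumed form); every subject needs the same number of scores."""
    n, k = len(scores), len(scores[0])
    means = [sum(s) / k for s in scores]
    grand = sum(means) / n
    ms_between = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    ms_within = sum((x - m) ** 2 for s, m in zip(scores, means) for x in s) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

# Hypothetical baseline and week 1 totals for three subjects (not study data).
example = [(10, 12), (20, 20), (30, 28)]
sd_w = within_subject_sd(example)
cvs = subject_cvs(example)
icc = intraclass_correlation(example)
```

Because the CV divides each subject's SD by that subject's own mean, it is unit-free, which is what lets scales with ranges as different as 0-40 and 0-4 be compared on the same footing.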

Results

Table 2 presents the test-retest reliabilities of the Y-BOCS, NIMH-GOCS, and SCL-90-OCS scores. Comparison of the average CVs for the SCL-90-OCS, NIMH-GOCS, and Y-BOCS scores showed the average CV for the SCL-90-OCS to be significantly greater than those for the NIMH-GOCS (p < 0.01) and the Y-BOCS (p < 0.01) (Wilcoxon matched-pairs signed-ranks test). There was no significant difference between the NIMH-GOCS and the Y-BOCS. The intraclass correlation for the SCL-90-OCS was significantly smaller (p < 0.01, Fisher r-to-Z) (Zar, 1984) than the intraclass correlations for the NIMH-GOCS or Y-BOCS. Within-subject standard deviations were significantly smaller for the NIMH-GOCS than for either the Y-BOCS or SCL-90-OCS (p < 0.01, Wilcoxon test). Within-subject standard deviations did not differ between the Y-BOCS and the SCL-90-OCS. These findings suggest that the Y-BOCS and NIMH-GOCS measures of the severity of obsessive-compulsive symptoms may be more reliable than the SCL-90-OCS. There was no evidence of a significant order effect between the repeated testing sessions for any of the rating scales (Wilcoxon test) (Zar, 1984).
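The Wilcoxon matched-pairs signed-ranks comparison of per-subject CVs between two scales can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function name and the CV values are hypothetical, and the p value uses a plain normal approximation without tie or continuity correction, whereas tables of exact critical values (as in Zar, 1984) may have been used in the study.

```python
import math

def signed_rank_test(x, y):
    """Wilcoxon matched-pairs signed-ranks test with a normal approximation.

    Returns (w_plus, w_minus, two_sided_p). Zero differences are dropped and
    tied absolute differences share their average rank.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the run of tied absolute differences.
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = (i + j) / 2 + 1  # average of ranks i+1 .. j+1
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (min(w_plus, w_minus) - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))
    return w_plus, w_minus, p

# Hypothetical per-subject CVs (in percent) for two scales; not study data.
cv_scale_a = [30, 25, 40, 10, 35, 22]
cv_scale_b = [10, 20, 15, 12, 5, 22]
w_plus, w_minus, p_value = signed_rank_test(cv_scale_a, cv_scale_b)
```

The test works on the sign and rank of each subject's paired difference, so it makes no assumption that the CVs are normally distributed.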

Table 2. Within-subject standard deviation, average coefficient of variation, and intraclass correlation coefficient for the three rating scales (n = 23)

[Table body not recoverable. Columns: SCL-OCS, NIMH-GOCS, Y-BOCS. Rows: within-subject standard deviation, coefficient of variation, intraclass correlation.]

Note. SCL-OCS = Symptom Checklist-90-Obsessive-Compulsive Subscale. NIMH-GOCS = NIMH Global Obsessive-Compulsive Scale. Y-BOCS = Yale-Brown Obsessive-Compulsive Scale.

The correlation coefficients (Table 3) for the rating scale scores (reflecting termination scores minus baseline scores for the Y-BOCS, NIMH-GOCS, and SCL-90-OCS and termination scores for the PHGR and PGR) show poor correlations between the SCL-90-OCS and the PGR and PHGR. The pattern of correlations at baseline was the same (SCL-90-OCS vs. Y-BOCS: r = 0.17; SCL-90-OCS vs. NIMH-GOCS: r = 0.33; and Y-BOCS vs. NIMH-GOCS: r = 0.77). Although both the Y-BOCS and the NIMH-GOCS were highly correlated with the PHGR (p < 0.001), the correlation between the SCL-90-OCS and the PHGR was not statistically significant.

Table 4 shows t tests for the significance of the difference between dependent correlation coefficients of the rating scale scores. The correlation between the Y-BOCS improvement scores from baseline symptom levels and the physician’s estimation of improvement (PHGR) from the baseline symptom level was significantly higher than the correlation between the SCL-90-OCS and the PHGR (p < 0.05, with Bonferroni correction for multiple comparisons). The Y-BOCS improvement score/PHGR correlation, however, did not differ from the NIMH-GOCS/PHGR correlation. This finding suggests that both the Y-BOCS and the NIMH-GOCS sensitively reflect the clinical changes that occur during drug treatment. This finding is corroborated by the fact that the NIMH-GOCS vs. PHGR correlation was significantly higher than the SCL-90-OCS vs. PHGR correlation (p < 0.05, with Bonferroni correction for multiple comparisons). Examination of correlations of the PGR with rating scale change revealed the same pattern. These findings suggest that the SCL-90-OCS is less sensitive than the Y-BOCS or the NIMH-GOCS.

Table 3. Matrix of Spearman correlation coefficients (n = 23)

[Table body not recoverable. Variables: SCL-OCS¹, NIMH-GOCS¹, Y-BOCS¹, PHGR¹, PGR¹. Legible entries: 0.82², 0.82², 0.16⁴, and 1.00 on the diagonal.]

Note. PHGR = Physician’s Global Rating. PGR = Patient’s Global Rating. Y-BOCS = Yale-Brown Obsessive-Compulsive Scale. SCL-OCS = Symptom Checklist-90-Obsessive-Compulsive Subscale. NIMH-GOCS = NIMH Global Obsessive-Compulsive Scale. 1. Correlations were calculated based on improvement from baseline. 2. p < 0.01. 3. p < 0.05. 4. p = NS.

Table 4. Differences between dependent correlation coefficients

Comparison                            p (by t test)
1. (Y-BOCS-PHGR) vs. (SCL-PHGR)       < 0.05
2. (Y-BOCS-PHGR) vs. (NIMH-PHGR)      NS
3. (NIMH-PHGR) vs. (SCL-PHGR)         < 0.05
4. (Y-BOCS-PGR) vs. (SCL-PGR)         < 0.05
5. (Y-BOCS-PGR) vs. (NIMH-PGR)        NS
6. (NIMH-PGR) vs. (SCL-PGR)           < 0.05

Note. Y-BOCS = Yale-Brown Obsessive-Compulsive Scale. SCL = SCL-OCS = Symptom Checklist-90-Obsessive-Compulsive Subscale. NIMH = NIMH-GOCS = National Institute of Mental Health Global Obsessive-Compulsive Scale. PHGR = Physician’s Global Rating. PGR = Patient’s Global Rating. 1. Y-BOCS correlation with PHGR vs. SCL-OCS correlation with PHGR. 2. Y-BOCS correlation with PHGR vs. NIMH-GOCS correlation with PHGR. 3. NIMH-GOCS correlation with PHGR vs. SCL-OCS correlation with PHGR. 4-6. Correlations with PGR instead of PHGR. Correlations were calculated based on the improvement scores of the Y-BOCS, NIMH-GOCS, PHGR, PGR, and SCL-OCS.
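A comparison of two dependent correlations that share one variable (for example, two scales each correlated with the PHGR in the same patients) can be sketched with Williams' t, one of the statistics discussed by Steiger (1980). The text does not state which formula was used, so the choice of Williams' t is an assumption, as are the function names and the correlation values below. The Bonferroni helper simply multiplies each p value by the number of comparisons.

```python
import math

def williams_t(r12, r13, r23, n):
    """Williams' t for H0: rho12 == rho13 when both correlations share variable 1.

    r23 is the correlation between the two non-shared variables.
    Returns (t, degrees_of_freedom) with df = n - 3.
    """
    det = 1 - r12**2 - r13**2 - r23**2 + 2 * r12 * r13 * r23  # determinant of the 3x3 correlation matrix
    r_bar = (r12 + r13) / 2
    denom = 2 * (n - 1) / (n - 3) * det + r_bar**2 * (1 - r23) ** 3
    t = (r12 - r13) * math.sqrt((n - 1) * (1 + r23) / denom)
    return t, n - 3

def bonferroni(p_values):
    """Bonferroni-adjusted p values: each p times the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# Hypothetical correlations (not values from Table 3):
# variable 1 = global rating, variable 2 = scale A change, variable 3 = scale B change.
t_stat, df = williams_t(r12=0.80, r13=0.20, r23=0.30, n=23)
adjusted = bonferroni([0.01, 0.20, 0.50])
```

The resulting t is referred to a t distribution with n - 3 degrees of freedom; the term involving r23 is what accounts for both correlations coming from the same subjects.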

Discussion

We found the SCL-90-OCS to be insensitive in measuring clinical change during drug treatment. We were rather surprised at the extent of the poor performance of the rating scale. We had not expected, for example, the failure of the scale to show a significant correlation with any of the established obsessive-compulsive rating scales. We and others have used the SCL-90-OCS in previous clinical drug trials in patients with obsessive-compulsive disorder (Perse et al., 1987; Kim and Dysken, 1990). Our impression was that the rating scale correctly measured clinical changes that occurred during the drug studies. The studies that have used the SCL-90-OCS as part of the rating instrument have shown that the significant symptom improvement reported by the study patients was corroborated by the statistically significant decrease in SCL-90-OCS scores from the baseline symptom levels. Thus, the present results should be viewed as preliminary until confirmed in further studies.

We also found that the SCL-90-OCS was not as reliable as either the Y-BOCS or the NIMH-GOCS in repeated measures during the baseline placebo period. Steketee and Doppelt (1986) reported a test-retest intraclass correlation coefficient for the SCL-58-OCS that was below acceptable levels (0.56, p < 0.01; mean length of time from the first to second test = 23.6 days, range = 4-60 days). Our study was based on the SCL-90-OCS, which includes two additional items: “having to repeat the same actions, i.e., counting, washing” and “unwanted thoughts that won’t leave your mind.” Also, our patients took the tests on a weekly basis. In our study, the within-subject SD, CV, and rI of the three instruments suggest that the scales are all reliable but that the SCL-90-OCS has the lowest reliability among the three; the Y-BOCS and the NIMH-GOCS seem to have higher reliability.

It is interesting to note that the Y-BOCS avoids asking about individual obsessions or compulsions. Instead, it measures the sum total of all the different obsessions or compulsions. The same principle applies to all the other Y-BOCS items. The NIMH-GOCS, PGR, and PHGR also assess the severity of overall obsessive-compulsive symptoms and have been shown to correlate well with the Y-BOCS (Goodman et al., 19890; Kim et al., 1990). This is in contrast to the SCL-90-OCS, in which a number of specific items ask about individual obsessions or compulsions. Some of the items in the SCL-90-OCS, such as “your mind going blank,” “trouble remembering things,” and “trouble concentrating,” appear to represent depressive symptoms rather than obsessive-compulsive symptoms.
In our study, we accepted only nondepressed patients with obsessive-compulsive disorder, as the HRSD scores show. This may have added to the poor results of the scale. We also believe that the SCL-90-OCS would have lower discriminant validity than the Y-BOCS, especially in a group of depressed patients. The Y-BOCS has been shown to have an acceptable discriminant validity in mildly depressed patients (Goodman et al., 1989~). It is possible that some of the discrepancies we found in the SCL-90-OCS may have resulted from the shortcomings of the instrument cited above.

We need both reliable and sensitive instruments. Since we now have more than one clinically effective drug to treat obsessive-compulsive disorder, the significance of this issue has increased, especially in comparison studies. Unless we have finely tuned, sensitive instruments, a small difference in treatment results between two study drugs may not be reflected in an outcome study. Although instruments are the yardsticks with which we measure changes in clinical symptomatology, in our view, insufficient attention has been paid to this area. One example would be an instrument such as the Leyton Obsessional Inventory (LOI; Cooper, 1970). In spite of some advantages of the LOI, we have found many shortcomings with it and have commented on some of our findings in previous reports (Kim et al., 1989, 1990). Despite its limitations, the LOI has been used extensively in many studies of obsessive-compulsive and related disorders (Kim and Dysken, 1988).

Instruments such as the Maudsley Obsessional-Compulsive Inventory (MOC; Rachman and Hodgson, 1980) are easy to administer, but because they include only inventory items, they may have more value in assessing the presence or absence of obsessive-compulsive symptoms than in assessing symptom severity. The Obsessive-Compulsive Subscale of the Comprehensive Psychopathological Rating Scale (CPRS; Åsberg et al., 1978) has also been used in the past (Kim et al., 1989). Its validity, however, needs to be evaluated further. In our view, there is a tendency to use familiar instruments even when newer, better instruments exist. The quality of experimental results depends heavily on the sensitivity of the measurement instruments used.

