2 Measurement of outcome in rheumatoid arthritis MATTHEW H. LIANG JEFFREY N. KATZ

Baillière's Clinical Rheumatology, Vol. 6, No. 1, February 1992. ISBN 0-7020-1635-7. Copyright © 1992, by Baillière Tindall. All rights of reproduction in any form reserved.

INTRODUCTION

Over the past decade, the goals of measuring outcome in rheumatoid arthritis have expanded from control of synovitis to minimizing side-effects from therapy, maintaining function, improving quality of life and cost-effectiveness (Fries, 1983). This paradigm shift paralleled the view of health as not merely the absence of disease but as a positive attribute, the increased expectations of patients and society from medicine, and the fact that although a variety of agents are useful in treating the disease, its cure is still elusive.

Although the importance of function in rheumatic disease has been recognized since the earliest days of the speciality of rheumatology (Steinbrocker et al, 1949), it was not until 1981 that it was recognized that better understanding and quantitative work would require valid, reliable and sensitive measures (Liang and Jette, 1981). Attempts to assess function and health status or quality of life by psychometrically superior questionnaires have expanded throughout the 1980s to the present. We review advances in measuring function and health status in rheumatoid arthritis, their limitations and future research directions.

DEFINITIONS

The impact of arthritis on the individual can be viewed from several perspectives. 'Impairment' is demonstrable anatomic loss or damage, a physiological state. A limited range of motion or the number of inflamed joints are examples. 'Disability' is the functional limitation caused by an impairment which interferes with what a patient needs or wants to do. 'Physical function' is a complex integrated physical ability, dependent on the physical integrity of the musculoskeletal and neurological systems, to perform tasks needed for activities of daily living (ADL), recreation, work etc. Essential activities in the home and community are termed instrumental or intermediate ADL (IADL). These might include using a telephone or shopping for groceries. Inability to perform IADL suggests a need for special services. Health status or quality of life, an ephemeral concept, embodies the dimensions of physical, social and emotional function. Some would add to this list cognitive functioning, general well-being and life satisfaction.

MEASUREMENT OF FUNCTION

Before 1979, literally hundreds of ad hoc, non-standardized measures of function were developed. Thereafter the field expanded with instruments of proven psychometric properties (Table 1). Some commonly used instruments are described.

Table 1. Measures of physical function in rheumatoid arthritis.

  American Rheumatism Association Functional Classes* (Steinbrocker et al, 1949)
  Functional Status Index (FSI) (Jette, 1980)
  Convery Polyarticular Disability Index (Convery et al, 1977)
  Health Assessment Questionnaire (HAQ), short form (Fries et al, 1980)
  Lee Functional Status Instrument (Lee et al, 1973)
  Toronto Functional Capacity Questionnaire (TFCQ) (Helewa et al, 1982)
  McMaster Toronto Arthritis Patient Preference Disability Questionnaire (MACTAR) (Tugwell et al, 1987)

  * Currently under revision (Hochberg et al, 1990).

The most widely used instrument may be the Health Assessment Questionnaire (HAQ) (Fries et al, 1980), which has been translated for use in the UK (Kirwan and Reeback, 1986), and into Dutch (Slegert et al, 1984) and Swedish (Ekdahl et al, 1988). The HAQ, a self-administered instrument, has two formats: a short form that requires about 5 minutes to complete and a longer version requiring 20 minutes. The long form provides 24 questions on activities of daily living and mobility, and has questions relating to pain, global severity, income, job change, cost of medical care and side-effects of therapy. The short form evaluates physical function alone, with 24 questions. The HAQ is a validated instrument that has been used in clinical trials of rheumatoid arthritis (Bombardier et al, 1986). It is a predictor of health services utilization (McNevitt et al, 1986). A version was incorporated into the second and third US National Health and Nutrition Examination Studies.

The Modified Health Assessment Questionnaire (MHAQ) (Pincus et al, 1983) evaluates physical function through eight questions derived from the HAQ and adds new scales for change in function, satisfaction and pain in the performance of each of these activities. It can be completed in less than 5 minutes. Reliability and validity of the MHAQ and the HAQ are comparable.

The Toronto Functional Capacity Questionnaire (TFCQ) assesses function in personal care, upper extremity activities, mobility, work and leisure activities (Helewa et al, 1982). The instrument is administered by an interviewer, and requires approximately 18 minutes to complete. Weighting of responses is based on preferences derived from panels of occupational and physical therapists and rheumatologists. Reliability and validity have been demonstrated. The instrument has been shown to be sensitive to change in clinical trials (Bombardier et al, 1986).

The Functional Status Index measures dependence, pain and difficulty experienced in the performance of 18 activities of daily living (Jette, 1980). Each activity is rated between 0 and 4 on each dimension. Studies on patients with rheumatoid arthritis show validity and a high degree of interobserver reliability. In self-assessed format, it takes 10-20 minutes to complete.

The McMaster Toronto Arthritis Patient Preference Disability Questionnaire (MACTAR) was developed for use in clinical trials in rheumatoid arthritis (Tugwell et al, 1987). Using a semistructured interview, patients are asked to designate key functional activities based on their own preferences, and the five activities that rank highest are evaluated. On reassessment, patients are asked if their ability to perform the ranked activities has improved, worsened or stayed the same. This goal-attainment technique, used in psychological research and originally in rheumatology in a clinical trial (Liang et al, 1984), may be more sensitive to small changes than conventional standardized questionnaires. The analytical problems in evaluating each patient in a different way are formidable, but this approach merits further investigation.

HEALTH-STATUS AND QUALITY-OF-LIFE MEASUREMENT

The instruments for measuring health status or quality of life in rheumatoid arthritis (Table 2) usually cover the dimensions of physical, social and emotional functioning, and for the most part are in the form of self-administered questionnaires. Some instruments are general health-status measures which have been applied to rheumatic disorders. Others are measures developed specifically for rheumatoid arthritis. Only one instrument assesses patient priorities (Lee et al, 1973). Two instruments, one published and one in abstract form, assess patients' satisfaction with their functional level (Pincus et al, 1983; Meenan et al, 1990).

Table 2. Quality-of-life and health-status measures in rheumatoid arthritis.

General
  Sickness Impact Profile (SIP) (Bergner et al, 1976)
  Index of Well-Being (IWB) (Kaplan et al, 1976)
  Rand Health Insurance Study (Brook et al, 1979)
  Nottingham Health Profile (Hunt et al, 1985)

Rheumatoid arthritis
  McMaster Health Index Questionnaire (MHIQ) (Chambers et al, 1982)
  Health Assessment Questionnaire (HAQ), long form (Fries et al, 1980)
  Arthritis Impact Measurement Scales (AIMS) (Meenan et al, 1980)

A diverse literature demonstrates the potential of health-status measurement in clinical work and in research. These instruments are as reliable and as sensitive as traditional measures of improvement in clinical status, such as 25-yard walk time or grip strength.


Poor function and poor health status predict mortality and correlate with utilization of health services (McNevitt et al, 1986; Mitchell et al, 1986).

The Arthritis Impact Measurement Scales (AIMS) has 48 multiple-choice questions with nine subscales measuring physical, social and mental-health status (Meenan et al, 1980). The possible range of scores on each subscale is 0-10; subscale results are averaged to obtain a total score. The AIMS is self-administered, takes 15-20 minutes to complete, and has been used in a number of studies in rheumatic disease. A version has been anglicized (Hill et al, 1990). A recent revised version keeps the length of the original AIMS but has improved wording in several subscales, attempts to have patients attribute their functional deficit specifically to arthritis, and adds a satisfaction scale (Meenan et al, 1990).

The McMaster Health Index Questionnaire (MHIQ) measures the quality of life in patients with rheumatoid disease (Chambers et al, 1982). It measures physical function in physical activities, self-care activities, mobility, communication and global physical activity. A social index combines general well-being, work performance, material welfare, support and participation with friends and family, and global social function. The emotional index measures feelings about personal relationships, self-esteem, thoughts about the future, critical life events and global emotional function. It contains 59 items, is self-administered and takes about 15-20 minutes to complete.

The Index of Well-Being (IWB) (Kaplan et al, 1976), a general health-status measure, evaluates mobility, physical activity and social activity. An interviewer determines what the patient's status was during the last 6 days. The interview requires a trained assessor and takes at least 20 minutes to complete.
The instrument has been validated, but may not be as sensitive as measures developed specifically for rheumatoid arthritis (Bombardier et al, 1986). In Oregon, the IWB is being used in a major public policy debate to prioritize health services that will be covered by Medicaid. Major limitations are the complexity of the instrument and the requirement for a specially trained interviewer.

The Rand Health Insurance Study batteries assess social, psychological and physical functioning, and general health perceptions (Brook et al, 1979). Specific scales in each area of function include social interaction and social participation in community and family; anxiety, depression and self-control; ADL, role activities, household tasks, leisure and physical activity. The instrument is administered by an interviewer and requires at least 60 minutes to complete. The complete Rand batteries have not been used in the rheumatic diseases, but the General Health Perceptions portion has been used in a clinical trial of auranofin versus placebo and did not show change, while the Health Assessment Questionnaire showed change (Bombardier et al, 1986). The AIMS anxiety and depression subscales are shortened versions of sections of the Rand instrument.

The Sickness Impact Profile (SIP) contains 136 statements related to patient status (Bergner et al, 1976). Patients check items that correctly describe their current health. Scores for items use predetermined weights based upon rater panel estimates of relative severity of the dysfunction. Three of the categories (ambulation, body care and movement, and mobility) may be aggregated into a physical dimension, and four categories (emotional behaviour, social interaction, alertness behaviour and communication) into a psychosocial dimension. Five categories are independent (work, sleep and rest, eating, home management, and recreation and pastimes). The SIP is available in a self-administered form or can be administered in an interview. It requires up to 30 minutes to complete as an interview. It has been used in rheumatoid arthritis and low back pain. A total score is computed by summing categories and then standardizing to a percentage of the maximum. The instrument may accurately reflect change for groups of arthritis patients, but appears to be relatively insensitive to changes in individual patients (Deyo and Inui, 1984).

The Nottingham Health Profile (NHP) and its earlier version, the Nottingham Health Index (NHI), are intended to give a brief measure of perceived physical, social and emotional health. The first form, the Nottingham Health Index, measures a patient's ability to carry on a normal life and is intended for use in primary medical care settings where patients can complete it in the waiting room (McDowell et al, 1978). The second version (Hunt et al, 1985) is meant for population surveys and has been used in clinical studies as well. The NHP is similar to the Sickness Impact Profile except that the NHP asks about feelings and emotional states directly rather than by changes in behaviour, and emphasizes the respondent's subjective assessment of his or her health. The NHP has been used in a variety of settings and is simple, sensitive and has broad coverage. The items are designed to represent rather severe problems, and therefore healthy populations or those with minor conditions will have low scores, making change scores difficult to compare.
The scoring gives a profile of six scores and covers only negative aspects of health, so it cannot show change beyond the absence of relatively severe experiences. Both instruments are self-administered questionnaires that take less than 10 minutes to complete. The first version contains 33 items selected on the basis of an analysis of responses in rehabilitation patients and surgical patients undergoing hip replacement (McDowell et al, 1978). The NHP contains 38 items that can be grouped into six sections covering mobility, pain, sleep, social isolation, emotional reactions and energy level. All items use a yes/no answer format. The NHI has been tested for reliability, content validity and discriminative ability in patients before and after hip replacement surgery. The NHI also discriminates between rehabilitation patients with physical and mental handicaps.

SHORT FORMS

One disadvantage of the instruments described above is their length; most require at least 15-20 minutes to complete. This becomes burdensome if general health status is measured along with disease-specific measures in a study, and may be a problem in daily practice. Shorter measures have been developed which appear to retain the desirable psychometric properties of the longer instruments (Table 3).


Table 3. Short forms.

Health status (quality of life)
  General
    Functional Status Questionnaire (Jette et al, 1986)
    COOP charts (Nelson et al, 1987)
    SF-36 (Ware, 1990)
  Rheumatoid arthritis
    Modified Arthritis Impact Measurement Scale (AIMS) (Wallston et al, 1989)

Function
  Modified Health Assessment Questionnaire (HAQ) (Pincus et al, 1983)

The Modified Health Assessment Questionnaire (Pincus et al, 1983), described above, is one such 'short' form of proven reliability, validity, sensitivity to clinical change and prognostic utility. Also, Wallston et al (1989) have reduced the Arthritis Impact Measurement Scales (AIMS) from 48 questions to 18 by selecting two items from each of the nine AIMS subscales. The reliability, validity and sensitivity of this measure in clinical research have not been tested adequately to date.

Several short, generic, health-status measures are available. Jette et al (1986) developed the Functional Status Questionnaire, a 34-item measure with subscales for physical, psychological and social/role function which requires just 10-12 minutes to complete. This instrument is reliable, valid and sensitive to clinical changes following total hip replacement (Cleary et al, 1991). Ware (1990) has developed a 36-item short form, the SF-36, derived from a larger battery of questions administered in the Medical Outcomes Study. While published evidence of its validity and sensitivity to clinical change is not yet available, the instrument is being used in a variety of settings. In studies of total hip replacement and arthroscopic meniscectomy, we find it easily understood by patients and sensitive to improvement following surgery.

Investigators at Dartmouth have developed the COOP charts, which assess physical, emotional and role function in 1-2 minutes (Nelson et al, 1987). Patients are shown a diagram for each health dimension which illustrates pictorially and verbally a series of graded functional levels. They select the level that best describes their current function. The COOP charts have been tested for reliability and validity, but their sensitivity to clinical change is not known. The COOP charts are simple, short and easily administered, making them suitable for routine office practice.
Short health-status forms offer the promise of decreasing respondent burden in clinical research and integrating health-status measurement into clinical practice. Generic short forms such as the SF-36 will probably soon become the dominant instruments used in research on diverse medical and surgical conditions. These measures have few items relevant to upper extremity function and therefore should be supplemented with disease-specific questions for use in rheumatoid arthritis. While shorter instruments are desirable, sensitivity is probably greater for the longer instruments, and the short forms will require further evaluation before they can be recommended for use in clinical studies.


LIMITATION OF QUALITY-OF-LIFE MEASURES

The available health-status measures are interchangeable to some degree, but the dimensions covered by each vary and have differential sensitivity to change. For instance, the Household Activity Subscale of AIMS is less relevant for male patients and the Activities of Daily Living Subscale may be insensitive to mild disability (Potts and Brandt, 1987). Health-status instruments also differ in their ability to demonstrate changes in subdimensions such as social and global function (Liang et al, 1985, 1990).

Instruments developed in one setting need modification when used in another culture. Functional-status and health-status questionnaires are probably inaccurate in certain groups such as children and elderly patients, the cognitively impaired and the non-English speaking. Also, self-reported function can overestimate or underestimate observed performance (Spiegel et al, 1985).

Function is a complex and relative phenomenon. Disability arises when there is a discrepancy between ability and need, when one's capabilities are not sufficient for independence. This is dependent on whether there is an actual or perceived need for a specific function, a patient's expectations, his/her motivation and the support system. Function changes over the course of a person's development in terms of what they are capable of doing and what they wish or need to do. In children and adolescents, rapid change and maturation of cognitive, behavioural, emotional and psychological function is the rule, whereas in adult life those capacities are stable but life circumstances are changing.

Covering a range of functional activities in a comprehensive questionnaire provides breadth at the sacrifice of depth. The measurement of specific functions by questionnaire is usually too crude for monitoring individual subjects.
Both functional-status and health-status instruments have 'floor' and 'ceiling' effects in that they cannot accurately portray patients at either end of the disability spectrum. The floor effect is demonstrated in Figure 1, which depicts data from 289 elderly community-dwelling subjects who were asked to perform a grip strength test and were simultaneously asked whether they had difficulty doing tasks requiring a power grip, a question from the HAQ. One can see that many subjects had impaired grip but perceived no difficulty.

Questionnaires cannot make a specific aetiological diagnosis or replace clinical evaluation of the subtle, interrelated components of function such as motivation, neuromuscular competency, cognitive ability, joint integrity, availability of environmental modifications and social supports. Even with reliable responses, the information is too general to be useful in the individual patient. For example, in a patient with difficulty in walking, the problem may be due to motivation, pain (structural or inflammatory or both), muscle strength, impairment of the nervous system, stability of the joint etc.

Clinicians cannot assume that the preferences of a group are those of the individual. Inasmuch as function is relative, so are the values placed on it. The preferences expressed by healthy reasonable individuals are not those of the sick and the anxious. Preferences of the sick and anxious change during the vicissitudes of rheumatic illnesses, which are characterized by chronicity and an unpredictable waxing and waning course. Values change with time and with the experience of illness or personal circumstances, because people learn, adjust or accommodate over the course of illness. Despite this, some empirical research suggests that normative preference weighting may be valid. For example, scoring for particular symptoms and levels of function in the Index of Well-Being is based on preference weights derived from normals and has been validated in patients with rheumatoid arthritis (Balaban et al, 1986).

[Figure 1 plot not reproduced. Vertical axis: grip strength (tick marks from 20 to 100). Horizontal axis: Health Assessment Questionnaire response (0 to 3). Plotted letters give the number of coincident observations.]

Figure 1. Grip strength versus response to a Health Assessment Questionnaire question about perceived difficulty in doing tasks requiring a power grip, in 289 elderly community-dwelling subjects. A, one observation; B, two observations; etc.
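
The floor effect described for Figure 1 can be illustrated with a minimal sketch. The observations below are invented for illustration and are not the study data; the impairment threshold, the function name and the variable names are likewise assumptions, since the text gives no cut-off for impaired grip. The sketch counts how many subjects with measurably impaired grip nevertheless report no difficulty (response 0) on the questionnaire item.

```python
from typing import List, Tuple

# Hypothetical cut-off below which measured grip is called "impaired".
# The article gives no threshold; this value is an assumption.
IMPAIRED_GRIP_THRESHOLD = 60.0


def floor_effect_fraction(
    observations: List[Tuple[float, int]],
    threshold: float = IMPAIRED_GRIP_THRESHOLD,
) -> float:
    """Fraction of subjects with impaired measured grip who still
    report no difficulty (questionnaire response 0)."""
    impaired = [response for grip, response in observations if grip < threshold]
    if not impaired:
        raise ValueError("no subjects below the impairment threshold")
    at_floor = sum(1 for response in impaired if response == 0)
    return at_floor / len(impaired)


# Invented (grip strength, questionnaire response 0-3) pairs, illustration only.
example = [
    (90.0, 0), (75.0, 0), (55.0, 0), (50.0, 0),
    (45.0, 1), (30.0, 0), (25.0, 2), (20.0, 3),
]

fraction = floor_effect_fraction(example)
print(f"{fraction:.2f} of subjects with impaired grip report no difficulty")
```

A high fraction on real data would suggest that the questionnaire item cannot separate subjects at the mild end of the disability spectrum, which is the pattern the figure is said to show.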


UTILITY AND PREFERENCE MEASUREMENT

A serious limitation of health-status measurement in chronic disease is that the units of measurement are neither suitable for economic evaluation nor directly comparable with such 'hard' outcomes as mortality. Suppose, for example, that patients with rheumatoid arthritis undergoing total hip replacement improve in total AIMS score by two points. Was the operation worth its cost? Would the $25 000 have been better spent on haemodialysis or cholesterol-lowering programmes which save years of life? To address these questions, measures of the impact of one treatment must be fungible with dollar costs or with the impact of other interventions, and weighting of health status by utility or preference measurement is needed (Froberg and Kane, 1989).

One technique for measuring changes in health status in dollar terms is the willingness-to-pay approach (Thompson et al, 1982), which asks patients how much they would pay to attain a particular health state. This technique is limited conceptually, because poor patients value a dollar more than richer patients, and practically, because some patients may not like or understand the question being asked (Thompson et al, 1981).

Utility theory has been used to measure health outcome with a single, intuitively meaningful scale. 'Utility' refers to the magnitude of an individual's preference for a health state and is scaled from zero (death) to one (perfect health). Thus, 4 years spent in a state with utility of 0.75 becomes three quality-adjusted life-years (QALYs).

A variety of techniques have been employed to measure utility (Torrance, 1987). The rating scale is the simplest. It consists of a line drawn on a page with the least preferred health state (e.g. death) at one end and the most preferred (e.g. perfect health) at the other.
Subjects mark the point on the line they associate with the health state of interest (such as rheumatoid arthritis with end-stage hip disease); the distance from the least preferred state to the condition divided by the total length of the line gives the utility of the condition of interest. The time trade-off technique (Froberg and Kane, 1989) is more complicated but mathematically more appealing. Subjects are asked to choose between living for a specified number of years in the chronic condition of interest or for fewer years in perfect health. The number of years in perfect health is varied until the subject is indifferent between the two choices; at that point the number of years in perfect health divided by the fixed duration of chronic disease yields the utility associated with the chronic condition. The standard gamble is another technique which is more complicated to administer but similar conceptually to the time trade-off (Froberg and Kane, 1989). Utility measurements have been validated in a variety of conditions. Individual scores are imprecise, however, with standard deviations of the order of 0.13 (hence, 95% confidence intervals spanning 0.52 on a 1.0-point scale). Precision of group means can be improved with larger samples. Whether utility measures are sensitive to clinical change requires study.
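
The arithmetic behind the rating scale, the time trade-off and the QALY can be sketched as follows. This is an illustrative sketch only: the function names and the example inputs (other than 4 years at utility 0.75, and a standard deviation of 0.13) are assumptions, the 95% interval is approximated as plus or minus 2 standard deviations to match the 0.52 span quoted above, and the standard-error formula is the usual one for a mean of independent observations rather than anything stated in the text.

```python
import math


def rating_scale_utility(distance_from_worst: float, line_length: float) -> float:
    """Rating scale: distance from the least preferred state to the
    subject's mark, divided by the total length of the line."""
    if line_length <= 0 or not 0 <= distance_from_worst <= line_length:
        raise ValueError("mark must lie on a line of positive length")
    return distance_from_worst / line_length


def time_trade_off_utility(years_perfect_health: float, years_in_condition: float) -> float:
    """Time trade-off: years in perfect health at the indifference point,
    divided by the fixed duration spent in the chronic condition."""
    if years_in_condition <= 0 or not 0 <= years_perfect_health <= years_in_condition:
        raise ValueError("healthy years must lie between 0 and the fixed duration")
    return years_perfect_health / years_in_condition


def qalys(years: float, utility: float) -> float:
    """Quality-adjusted life-years: time spent in a state times its utility."""
    if years < 0 or not 0 <= utility <= 1:
        raise ValueError("years must be non-negative and utility within [0, 1]")
    return years * utility


def approx_95_interval_width(sd: float) -> float:
    """Width of an approximate 95% interval, taken as plus or minus 2 SD."""
    return 4 * sd


def standard_error_of_mean(sd: float, n: int) -> float:
    """Standard error of a group mean from n independent scores."""
    if n <= 0:
        raise ValueError("n must be positive")
    return sd / math.sqrt(n)


print(qalys(4, 0.75))                       # the example in the text: 3.0
print(rating_scale_utility(6.0, 10.0))      # hypothetical mark 6 cm along a 10 cm line
print(time_trade_off_utility(7.0, 10.0))    # hypothetical indifference at 7 of 10 years
print(round(approx_95_interval_width(0.13), 2))
print(round(standard_error_of_mean(0.13, 100), 3))
```

Under these assumptions an individual score with a standard deviation of 0.13 is uncertain over roughly half the scale, whereas the mean of 100 such scores has a standard error of about 0.013, which is why group comparisons can be precise even when individual scores are not.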


SELECTION OF MEASURES FOR CLINICAL STUDIES

The goal of medical care is to do no harm, to relieve pain and suffering, and to improve and maintain one's goals for physical, social and emotional function. Thus, evaluation of any intervention, including medication and surgery, should examine whether these goals are accomplished, to what extent and whether the patient is satisfied with the results. Some interventions relieve symptoms and improve function promptly, whereas others may take time or require that a patient be rehabilitated from some stable level of function. In the same light, known and unknown negative effects can occur immediately or as a delayed or cumulative consequence.

For a medical intervention designed to treat a root cause or primary mechanism of joint destruction (i.e. synovitis), one should see evidence of reduced inflammation. To the degree that psychosocial dysfunction results from the rheumatic condition, one would expect improvement of these parameters. However, an impoverished social environment, prolonged psychological symptoms, or personality traits or styles of coping are not likely to be helped by attention to synovitis alone. In fact, a discrepancy between one's perceived function and objective signs of disease is often a clue to the presence of another disorder which needs attention.

General health-status measures such as the Sickness Impact Profile, the Nottingham Health Profile, the Index of Well-Being and the SF-36 are preferred to arthritis-specific ones for health policy questions in which decision-makers must allocate resources to different conditions or interventions. The use of such scales in a study can help investigators relate their findings to other diseases and aid the policy-maker in understanding the trade-offs in funding allocation. On the other hand, a general health-status measure may not capture the specific outcomes seen in a particular disorder.
The advantage of using an arthritis-specific scale is that clinicians may have a better sense of what changes on the scale mean, and data from other studies may be available for comparison. However, disease-specific scales may omit important consequences of the disease or the interventions used to treat the condition. The best approach is to examine each item to determine whether there is sufficient coverage of all relevant dimensions for a specific disorder, to see whether these items are sufficiently scaled to distinguish individuals on a continuum, and whether the items are symptoms resulting from or a part of the disease. On this latter point, an outcome of disease cannot be a part of the disease syndrome. For instance, inability to get going in the morning, which might be a manifestation of depression, is frequently seen in patients with active rheumatoid arthritis.

There is a temptation to take a subscale from a general measure to be used in a specific study. Unless the subscale has been examined separately for reliability, validity and sensitivity, disaggregating the scale will not ensure these psychometric properties. This problem is particularly germane if the index results in one score; omitting a subscale will not allow one to compute the score. In the actual conduct of trials, leaving out one or two subscales does not materially save time or reduce the burden to the patient.


In selecting outcome measures for studies of rheumatoid arthritis, one needs to enumerate all benefits and side-effects that might be expected, and to use the best measure for each of the positive and negative attributes of the intervention. Traditional anthropometric measures should always be supplemented by measures of physical function and health status. The battery should include a general health-status measure, which is more likely to pick up side-effects of treatment and iatrogenic complications, and is necessary if the results are to be used in health policy. A rheumatoid arthritis-specific health-status measure such as AIMS, MHAQ, HAQ or MACTAR should also be used. The evaluation should include a measure of whether patients are satisfied with their state and a global evaluation of whether they felt better. Lastly, side-effects should be ascertained on an active basis in the short and the long term.

For formal economic evaluation, direct and indirect costs should be evaluated after efficacy has been determined. The most expensive portion of health care is hospitalization or institution-based care, followed by diagnostic tests, radiography, drugs etc. Health-care costs paid for by insurance or out-of-pocket are poorly remembered and there should be validation against objective data. Hospital costs should include information on length of stay and use some relative value scale which reflects actual resource costs rather than charges, which are idiosyncratic and variable.

Using joint replacement surgery as a model, we have studied the performance and relative measurement sensitivity of five health-status instruments: the Functional Status Index (FSI), Sickness Impact Profile (SIP), Index of Well-Being (IWB), Health Assessment Questionnaire (HAQ) and Arthritis Impact Measurement Scales (AIMS) (Liang et al, 1985, 1990). All the instruments correlated highly with one another and demonstrated sensitivity to clinical change.
We found that of the five instruments, the FSI had the most missing data. The AIMS, FSI and SIP were equally efficient in detecting improvement in mobility, but the HAQ and IWB were about half as efficient as the other three instruments. For pain evaluation, the AIMS was more sensitive than the HAQ. The IWB and SIP did not have a pain subscale. With regard to social function, the SIP, IWB and HAQ were more sensitive than the AIMS. For global function the SIP, AIMS and IWB were more efficient than the FSI or HAQ.

In practice, the IWB is an arduous questionnaire to administer, somewhat artificial for patients and requires trained interviewers. It has, however, the advantage of being a ratio scale with a true zero point (which is not terribly relevant in rheumatoid arthritis, in general), thus making it the best instrument for calculating quality-adjusted life-years, a prerequisite for cost-effectiveness studies. In a controlled drug trial in rheumatoid arthritis, the IWB displayed the smallest change of the techniques used to measure change (Bombardier et al, 1986). The omission of pain as a dimension makes it less desirable as a single instrument in rheumatic disorders since pain is a central concern for patients.

The SIP is easier for patients to understand, is self-administered, and has been used in numerous rheumatic disorders. It results in one score. The SIP contains questions related to continence and communication which are usually not relevant to rheumatic disorders. Also, even the most disabled patients with rheumatoid arthritis score in the range of 15-20 on a 100-point scale, indicating that the entire range of disability in ambulatory rheumatoid patients utilizes just 20% of the total scale.

The FSI enables a patient to disaggregate function along three dimensions of dependence, pain and difficulty experienced. This is not always understood or possible for patients. In elderly patients, compliance and response rates with this question were poor (Liang et al, 1984). We found the same in a study of patients undergoing total joint arthroplasty (Liang et al, 1985, 1990).

Because there are no perfect measures of function or health status, careful consideration of likely outcomes and of which instruments provide the best coverage is necessary. Multiple measures, serial assessment of outcome and combining questionnaires with open-ended interviews may be necessary when the sensitivity of an instrument is unknown. Furthermore, no instrument will help a weak study design; blinded evaluation, randomization and definition of 'meaningful' effect are essential.

RESEARCH DIRECTIONS

Questionnaires quantitatively measure physical function and health status in patients with rheumatic disease. Their limitations are inherent in their attempt to circumscribe a boundless and sometimes amorphous dimension of the impact of disease on the average patient. Further development of instrumentation is unlikely to overcome this limitation entirely. Important goals for future research include developing specific items for different types of rheumatic disease disability, and refining the evaluation of patients' priorities and satisfaction with their health status. Research is ongoing in all these areas.

Ultimately, if patient care and outcome are to be improved, attention to function must be incorporated into daily practice. Practical tools for functional assessment and diagnosis of specific functional deficits need to be developed. It is likely that these advances will emerge from improved understanding of the epidemiology of function in the rheumatic diseases, the natural course of functional decline, and identification of critical thresholds at which intervention might make a difference.

Increasingly, policy-makers and administrators are turning to outcome measurement as a way to assess the quality of care delivered by individual physicians or institutions. Hospital-specific mortality rates are insensitive measures of quality, reflecting case-mix more than physician or institution performance (Gree et al, 1990). Variations in outcome in chronic disease are multifactorial and may be similarly related to disease severity (Gordon et al, 1990) or to factors over which the physician has little influence. This is particularly true in the vast majority of rheumatic disorders, where cure is not possible. Health-status measures are reliable and valid, and may be more sensitive to differences in quality than mortality rates, readmissions and other administrative data.
In addition, health-status measures are valid indicators of clinical improvement and thus offer the opportunity to detect salutary as well as deleterious effects of care. It is important to note, however, that these instruments were developed in clinical settings and have not been validated as indicators of physician or institution performance. Even if changes in health status are demonstrated to correlate with independent assessments of the quality of care, careful interpretation will be required. Much of the dysfunction identified in health-status measures involves a complex interaction of disease process, personal and social factors, and the environment. The extent to which physicians can and should influence social and environmental problems is a matter of ongoing debate.

Acknowledgements We gratefully acknowledge the expert assistance of Jacqueline Mazzie, Mary Scamman and Cindy Aiello in the preparation of this manuscript.

REFERENCES
Balaban DJ, Sagi PC, Goldfarb NI & Nettler S (1986) Weights for scoring the quality of well-being (QWB) instrument among rheumatoid arthritics: a comparison to general population weights. Medical Care 24: 973-980.
Bergner M, Bobbitt RA, Pollard WE, Martin DP & Gilson BS (1976) The sickness impact profile: validation of a health status measure. Medical Care 14: 57-67.
Bombardier C, Ware J, Russell IJ, Larson M, Chalmers A, Read JL and the Auranofin Cooperating Group (1986) Auranofin therapy and quality of life in patients with rheumatoid arthritis: results of a multicenter trial. American Journal of Medicine 81: 565-578.
Brook RH, Ware JE, Davies-Avery R et al (1979) Overview of adult health status measures fielded in RAND's health insurance study. Medical Care 17 (supplement): 1-131.
Chambers LW, MacDonald LA, Tugwell P et al (1982) The McMaster health index questionnaire as a measure of quality of life for patients with rheumatoid disease. Journal of Rheumatology 9: 780-784.
Cleary PD, Greenfield S, Mulley AG et al (1991) Variations in length of stay and outcome for six medical and surgical conditions in Massachusetts and California. Journal of the American Medical Association 266: 73-79.
Convery FR, Minteer MA, Amiel D & Connett KL (1977) Polyarticular disability: a functional assessment. Archives of Physical Medicine and Rehabilitation 58: 494-499.
Deyo RA & Inui TS (1984) Towards clinical applications of health status measures: sensitivity of scales to clinically important changes. Health Services Research 19: 275-289.
Ekdahl C, Eberhardt K, Anderson SI & Svensson B (1988) Assessing disability in patients with RA. Scandinavian Journal of Rheumatology 17: 263-271.
Fries JF (1983) Towards an understanding of patient outcome measurement. Arthritis and Rheumatism 26: 697-704.
Fries JF, Spitz P, Kraines RG & Holman HR (1980) Measurement of patient outcome in arthritis. Arthritis and Rheumatism 23: 137-145.
Froberg DG & Kane RL (1989) Methodology for measuring health-state preferences-IV: Progress and a research agenda. Journal of Clinical Epidemiology 42: 675-685.
Gordon SM, Culver DH, Simmons BP & Jarvis WR (1990) Risk factors for wound infections after total knee arthroplasty. American Journal of Epidemiology 131: 905-916.
Gree J, Winfield N, Sherkey P & Passman LJ (1990) The importance of severity of illness in assessing hospital mortality. Journal of the American Medical Association 263: 241-246.
Helewa A, Goldsmith CH & Smyth HA (1982) Independent measurement of functional capacity in rheumatoid arthritis. Journal of Rheumatology 9: 794-797.
Hill J, Bird HA, Lawton CW & Wright V (1990) The Arthritis Impact Measurement Scales: an anglicized version to assess the outcome of British patients with RA. British Journal of Rheumatology 29: 193-196.
Hochberg MC, Chang R, Dwosh I, Lindsey S, Pincus T & Wolfe F (1990) Preliminary revised ACR criteria for functional status (FS) in rheumatoid arthritis (RA). Arthritis and Rheumatism 33: S15.
Hunt SM, McEwen J & McKenna SP (1985) Measuring health status: a new tool for clinicians and epidemiologists. Journal of the Royal College of General Practitioners 35: 185-188.
Jette AM (1980) Functional Status Index: reliability of a chronic disease evaluation instrument. Archives of Physical Medicine and Rehabilitation 61: 395-401.
Jette AM, Davies AR, Cleary PD et al (1986) The Functional Status Questionnaire: reliability and validity when used in primary care. Journal of General Internal Medicine 1: 143-149.
Kaplan RM, Bush JW & Berry CC (1976) Health status: types of validity for an index of well-being. Health Services Research 11: 478-507.
Kirwan JR & Reeback JS (1986) Stanford Health Assessment Questionnaire modified to assess disability in British patients with rheumatoid arthritis. British Journal of Rheumatology 25: 206-209.
Lee P, Jasani MK, Dick WC & Buchanan WW (1973) Evaluation of a functional index in rheumatoid arthritis. Scandinavian Journal of Rheumatology 2: 71-77.
Liang MH & Jette AM (1981) Measuring functional ability in chronic arthritis: a critical review. Arthritis and Rheumatism 24: 80-86.
Liang MH, Partridge AJ, Larson MG et al (1984) Evaluation of comprehensive rehabilitation services for elderly homebound patients with arthritis and orthopedic disability. Arthritis and Rheumatism 27: 258-266.
Liang MH, Larson MG, Cullen KE & Schwartz JA (1985) Comparative measurement efficiency and sensitivity of five health status instruments for arthritis research. Arthritis and Rheumatism 28: 542-547.
Liang MH, Fossel AH & Larson MG (1990) Relative efficacy of five health status instruments for orthopedic evaluation. Medical Care 28: 632-642.
McDowell IW, Martini CJM & Waugh W (1978) A method for self-assessment of disability before and after hip replacement operations. British Medical Journal 2: 857-859.
McNevitt MC, Yelin EH, Henke CJ & Epstein WV (1986) Risk factors for hospitalization and surgery for rheumatoid arthritis: implications for capitated medical payments. Annals of Internal Medicine 105: 421-428.
Meenan RF, Gertman PM & Mason JH (1980) Measuring health status in arthritis: the arthritis impact measurement scales. Arthritis and Rheumatism 23: 146-152.
Meenan RF, Mason JH, Anderson JJ, Kazis LE & Guccione AA (1990) AIMS-2: The content and properties of a revised and expanded AIMS. Arthritis and Rheumatism 33: S15.
Mitchell DM, Spitz PW, Young DY et al (1986) Survival, prognosis, and causes of death in rheumatoid arthritis. Arthritis and Rheumatism 29: 706-714.
Nelson E, Wasson J, Kirk J et al (1987) Assessment of function in routine clinical practice: description of the COOP chart method and preliminary findings. Journal of Chronic Diseases 40 (supplement 1): 555-645.
Pincus T, Summey JA, Soraci SA Jr, Wallston KA & Hummon NP (1983) Assessment of patient satisfaction in activities of daily living using a modified Stanford Health Assessment Questionnaire. Arthritis and Rheumatism 26: 1346-1353.
Potts MK & Brandt KD (1987) Evidence of the validity of the Arthritis Impact Measurement Scales. Arthritis and Rheumatism 30: 93-96.
Siegert CEH, Vleming LJ, Vandenbroucke JP & Cats A (1984) Measurement of disability in Dutch rheumatoid arthritis patients. Clinical Rheumatology 3: 305-309.
Spiegel JS, Hirshfield MS & Spiegel TM (1985) Evaluating self-care activities: comparison of a self-reported questionnaire with an occupational therapist interview. British Journal of Rheumatology 24: 357-361.
Steinbrocker O, Traeger CH & Batterman RC (1949) Therapeutic criteria in rheumatoid arthritis. Journal of the American Medical Association 140: 659-662.
Thompson MS, Read JL & Liang MH (1981) Feasibility of willingness-to-pay measurement in chronic arthritis. Medical Decision Making 4: 195-215.
Thompson MS, Read JL & Liang MH (1982) Willingness-to-pay concepts for societal diseases in health. In Kane RL & Kane RA (eds) Values and Long Term Care, pp 103-105. Lexington, Massachusetts: DC Heath Publishers.

Torrance GW (1987) Utility approach to measuring health-related quality of life. Journal of Chronic Diseases 40: 593-600.
Tugwell P, Bombardier C, Buchanan WW, Goldsmith CH & Grace E (1987) The MACTAR questionnaire: an individualized functional priority approach for assessing improvement in physical disability in clinical trials in rheumatoid arthritis. Journal of Rheumatology 14: 446-451.
Wallston KA, Brown GK, Stein MJ & Dobbins CJ (1989) Comparing the short and long versions of the Arthritis Impact Measurement Scales. Journal of Rheumatology 16: 1105-1109.
Ware JE (1990) A short health status form from the Medical Outcomes Study. (in preparation).
