http://informahealthcare.com/dre ISSN 0963-8288 print/ISSN 1464-5165 online Disabil Rehabil, Early Online: 1–7 © 2015 Informa UK Ltd. DOI: 10.3109/09638288.2015.1027005

RESEARCH PAPER

The Arm Function in Multiple Sclerosis Questionnaire (AMSQ): development and validation of a new tool using IRT methods

Lidwine B. Mokkink1, Dirk L. Knol1, Femke H. van der Linden1,2, Judith M. Sonder1,2, Marie D’hooghe3, and Bernard M. J. Uitdehaag2

1Department of Epidemiology and Biostatistics and 2Department of Neurology, VU University Medical Center, Amsterdam, The Netherlands, and 3National MS Center, Melsbroek, Belgium

Disabil Rehabil Downloaded from informahealthcare.com by Nyu Medical Center on 07/01/15 For personal use only.

Abstract

Purpose: We developed the Arm Function in Multiple Sclerosis Questionnaire (AMSQ) to measure arm and hand function in MS, based on existing scales. We aimed to develop a unidimensional scale containing enough items to be used as an item bank. In this study, we investigated reliability and differential item functioning of the Dutch version. Method: Patients were recruited from two MS Centers and a Dutch website for MS patients. We performed item factor analysis on the polychoric correlation matrix, using multiple fit indices to investigate model fit. The graded response model, an item response theory model, was used to investigate item goodness-of-fit, reliability of the estimated trait levels (θ), differential item functioning, and total information. Differential item functioning was investigated for type of MS, gender, administration version, and test length. Results: Factor analysis results suggested one factor. All items showed p-values of the item goodness-of-fit statistic above 0.0016. The reliability was 0.95, and no items showed differential item functioning on any of the investigated variables. Conclusion: AMSQ is a unidimensional 31-item questionnaire for measuring arm function in MS. Because it fits a graded response model well, it is suitable for further development as a computer adaptive test.

Keywords

Activity limitations, Arm Function in Multiple Sclerosis Questionnaire, differential item functioning, graded response model, item fit, item response model, multiple sclerosis, psychometrics

History

Received 25 August 2014. Revised 2 March 2015. Accepted 4 March 2015. Published online 24 March 2015.

Implications for Rehabilitation

• A new questionnaire for arm and hand function (AMSQ) is recommended for people with multiple sclerosis.
• Scale characteristics make the questionnaire suitable for use in clinical practice and research.
• Good reliability.
• Further development as a computer adaptive test to reduce the burden of (repetitive) testing in patients is feasible.

Introduction

Researchers have set up the International Progressive Multiple Sclerosis (MS) Collaborative to stimulate the development of effective therapies for disease modification and symptom management in progressive MS [1]. However, when monitoring MS disease progression or evaluating the effectiveness of interventions, it is crucial that validated and relevant outcome measures are available. Regulatory bodies, such as the United States Food and Drug Administration [2] and the European Medicines Agency [3], advise using patient-reported outcome measures (PROMs) when measuring symptoms, function and quality of life. PROMs capture the patient’s perspective of his or her health condition, directly and without anyone else adding their interpretation.

Address for correspondence: Dr. Lidwine B. Mokkink, PhD, Department of Epidemiology and Biostatistics, VU University Medical Center, Van der Boechorststraat 7, 1081 BT Amsterdam, The Netherlands. Tel: +31204449813. Fax: +31204444475. E-mail: [email protected]

In addition, these measures are often quick and inexpensive to administer and therefore valuable in large-scale epidemiological studies. PROMs may be generic or disease-specific and measure one or more domains. For instance, the MS-Walking scale is a disease-specific unidimensional PROM focusing on the ability of a patient to walk [4]. Although walking ability receives a lot of attention, restriction in arm function has a huge impact on the MS patient’s ability to perform activities of daily living and thus on his or her general health perception [5]. Arm function can be affected early in the course of the disease, but more often it becomes more apparent later on, particularly in the progressive phase of the disease. Although the MS Functional Composite uses a performance test to quantify hand and arm function, there is currently no validated unidimensional PROM to measure hand and arm function in patients with MS. Self-report questionnaires measuring arm and hand function have been developed for other patient groups. Examples are the Australian/Canadian (AUSCAN) Osteoarthritis Hand Index [6,7] for osteoarthritis of the hand, the ABILHAND for rheumatoid arthritis [8], the Functional


Status Scale (FSS) for carpal tunnel syndrome [9] and the DASH for upper extremity musculoskeletal conditions [10]. In addition, researchers have recently presented the non-disease specific Upper Extremity Function scale as part of the Neuro-QOL project to measure fine motor skills and activities of daily life in patients with neurological conditions [11]. However, these instruments are multidimensional, too short to serve as an item bank, include gender-specific items, or do not focus on the specific difficulties faced by MS patients. It is important that a validated PROM for arm function becomes available, so that it can be used as an outcome measure in clinical studies and as a monitoring tool in clinical practice. In addition, it would be useful if a PROM in the form of an item bank were available. The items could then be presented as a computer adaptive test, potentially reducing testing time and improving tolerability while preserving assessment precision. The Arm Function in Multiple Sclerosis Questionnaire (AMSQ) is a novel instrument. It consists of 31 items and addresses the need for a reliable and validated MS-specific PROM focusing on hand and arm function. Here, we describe the development of the questionnaire via item generation and item reduction. Subsequently, we present item fit, reliability, construct validity by investigating differential item functioning, and total test information in the context of item response theory (IRT) of the Dutch version.

Materials and methods

Patients

We recruited patients via the outpatient clinics of the MS Center of the VU University Medical Center, Amsterdam, the Netherlands, and the National MS Center, Melsbroek, Belgium, and via a Dutch language website aimed at MS patients and their partners, i.e. www.msweb.nl. The majority of patients were treated at one of the two MS centers. Patients enrolled in the study by e-mail and were eligible if they stated that they had MS. The local medical ethics committees of both centers approved the study.

Item generation

First, we conducted a systematic literature search with the aim of finding all existing questionnaires that measure arm or hand functioning. Functioning is defined according to the International Classification of Functioning as activity limitations, which are difficulties an individual may have in executing activities [12]. Our literature search produced 42 English or Dutch language questionnaires measuring arm and hand function. These questionnaires contained a total of 529 items. We used a simple procedure to translate all items that were not available in Dutch into Dutch. We did not use a forward–backward translation procedure. We reformulated all items to start “During the past two weeks, to what extent has MS limited your ability to...” The response categories were “not at all”, “a little”, “moderately”, “quite a bit”, “extremely”, “no longer able to”, or “not applicable”. Five MS experts (a neurologist, a rehabilitation specialist, two physical therapists and an occupational therapist) read all items and removed those that did not measure hand or arm function. At this point, 257 items remained in our study (Figure 1).

Item reduction

We asked 213 patients to complete all 257 items. In the next step, items with a percentage of missing values >10% in either males or females were removed. In this way, 89 items were removed. All remaining 168 items were divided into temporary domains, e.g. washing, dressing, knotting, holding, catching/grabbing, lifting,


Figure 1. Flow chart of item reduction of AMSQ.

pouring, turning/twisting, opening, writing, cutting, and other. The 168-item version was completed electronically by 88 other patients. We analyzed these items for all patients (n = 301), to determine (1) the percentage of respondents who scored “not at all” on an item, (2) the percentage missing on an item, (3) the percentage of respondents who scored “not applicable”, and (4) the number of times that the residual correlation between an item and another item was above 0.20 (see below for details on factor analysis). Based on the relative results of these calculations, and based on the content of the item, we selected two to three items per domain. In total we included 31 items (Appendix), and removed the last response category (i.e. “not applicable”). Answers of “not applicable” on any of the 31 items were later treated as missing at random [13]. The remaining 31 items were completed by another 217 patients, either by computer (n = 45), or by pen-and-paper (n = 172). In total, 518 patients completed the 31 items. See also Figure 1. The AMSQ was developed in Dutch, and afterwards translated using the forward–backward approach [14] into English. All analyses described in this article are based on the Dutch 31-item version of the AMSQ.

Statistical analysis

We obtained the means, medians, and standard deviation of the sum score over all 31 items of the scale, and investigated the frequencies of missing data per patient. In this study, we used IRT methods. In IRT models, the response probabilities of each patient to the individual items are modeled as a function of the trait level (θ) of that patient. A higher trait level (θ) score corresponds to more disability. An IRT model is particularly well-suited for analyzing item fit (i.e. the correspondence between the model predictions and observed data) and differential item functioning. An item shows differential item functioning when respondents of different groups (e.g. males and females) with the same trait level (θ) (allowing for group differences) do not respond similarly to a particular item. In addition, IRT is able to handle missing data. We used the graded response model, which is an IRT model. This model was developed to analyze data from items with ordered polytomous response categories [e.g. 15,16]. There are several assumptions underlying the graded response model, i.e. unidimensionality, local independence, and increasing item step response curves of the response categories, meaning that lower response categories need lower trait level (θ) values to endorse than do higher response categories.
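The graded response model described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the logistic form of the model without a scaling constant, the function name `grm_category_probs` is hypothetical, and the item parameters are the values printed for item 1 in Table 2 (five categories after two adjacent categories were combined). It is not the estimation software used in the study.

```python
import math

def grm_category_probs(theta, a, thresholds):
    """Return P(X = k | theta) for k = 0..m under a graded response model.

    a          : discrimination parameter
    thresholds : ordered threshold parameters b1 < b2 < ... < bm
    (Hypothetical helper; logistic metric is an assumption.)
    """
    # Boundary curves: probability of responding in category k or higher.
    boundary = [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cum = [1.0] + boundary + [0.0]
    # Category probability = difference between adjacent boundary curves.
    return [cum[k] - cum[k + 1] for k in range(len(thresholds) + 1)]

# Item 1 ("Write down a short sentence with a pen"), values as printed in Table 2.
item1_a, item1_b = 2.36, [0.25, 0.87, 1.39, 2.01]
probs_low = grm_category_probs(-1.0, item1_a, item1_b)   # good arm function
probs_high = grm_category_probs(2.5, item1_a, item1_b)   # marked disability
print([round(p, 3) for p in probs_low])
print([round(p, 3) for p in probs_high])
```

With these parameters, a patient with a low θ is most likely to choose the lowest category, whereas a patient with a high θ is most likely to choose the highest one, which is the ordering assumption stated in the text.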


We used item factor analysis to investigate the dimensionality and local independence of the scale [17]. We estimated the model parameters using Mplus version 6.12 [18] with weighted least squares with mean and variance adjustment (WLSMV), and evaluated model fit using the root mean square error of approximation (RMSEA), comparative fit index (CFI), Tucker-Lewis index (TLI), and standardized root mean square residual (SRMR). We considered RMSEA < 0.06, CFI and TLI > 0.95, and SRMR < 0.08 as good [19]. RMSEA between 0.05 and 0.08 was considered reasonable, and RMSEA ≥ 0.1 was taken to suggest poor fit [20]. Local independence was investigated by inspecting the residual correlations between items. If an item showed high residual correlations (>0.20) with several other items, the item was considered for removal. We calculated Guttman’s lambda2 for each factor, which is a measure of the internal consistency or single test administration reliability [21].

The graded response model was used to investigate item fit, the reliability of the estimated trait level (θ), differential item functioning, and total test information. We used marginal maximum likelihood with the expectation-maximization algorithm of Bock and Aitkin [22] (choosing 99 quadrature points) to obtain discrimination parameters (a) and threshold parameters (b), and calculated expected a posteriori (EAP) trait levels (θs) [23]. All IRT analyses were performed in IRTPRO [24]. An item goodness-of-fit statistic (S-X2 test [25,26]) with p < 0.0016 was considered an indication of poor fit of the item. This value was the result of a p-value < 0.05 and the use of a Bonferroni correction to protect against multiple testing, i.e. the significance level was set at 0.05/31 = 0.0016. Subsequently, a reliability coefficient of the estimated trait level (θ) was calculated.

Differential item functioning was identified using a model-based likelihood ratio test approach [27]. We investigated both uniform and non-uniform differential item functioning for the group variables type of MS, gender, administration version, i.e. either an online version or a paper version, and test length, i.e. completion of 31 items within a larger questionnaire (either 168 or 257 potential items), or the final 31-item version (including the revision of the response options, i.e. removing the response category “not applicable”). For type of MS, we classified patients into (1) relapsing remitting MS (RRMS) or clinically isolated syndrome (CIS), and (2) secondary progressive (SPMS) or primary progressive MS (PPMS). We used all items as the anchor items (except the item under study for differential item functioning). First, we investigated whether differential item functioning occurred, i.e. when a likelihood-ratio chi-squared (G2) test had a p-value below 0.0016. Next, we investigated whether uniform differential item functioning or non-uniform differential item functioning was present.

We produced an item information curve for each item, indicating the amount of psychometric information that an item contains at all points along the θ-continuum [16]. We combined the item information curves to obtain the total information curve and the standard error of the estimated θ [28].
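How item information curves combine into a total information curve and a standard error can be illustrated with a short Python sketch. Several things here are assumptions rather than the authors' procedure: the logistic form of the graded response model, the common approximation SE(θ) ≈ 1/√(total information), the hypothetical function names, and the use of only two items (items 1 and 2, with the parameter values printed in Table 2) to keep the example short.

```python
import math

def grm_item_information(theta, a, thresholds):
    """Fisher information of one graded-response item at trait level theta."""
    # Boundary curves P*_0 = 1, P*_1..P*_m, P*_{m+1} = 0.
    cum = [1.0] + [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds] + [0.0]
    # Derivative of each boundary curve with respect to theta.
    dcum = [a * p * (1.0 - p) for p in cum]
    info = 0.0
    for k in range(len(thresholds) + 1):
        p_k = cum[k] - cum[k + 1]          # category probability
        if p_k > 0.0:
            info += (dcum[k] - dcum[k + 1]) ** 2 / p_k
    return info

def total_information(theta, items):
    """Sum of item information over all items (local independence assumed)."""
    return sum(grm_item_information(theta, a, bs) for a, bs in items)

def standard_error(theta, items):
    """Approximate standard error of theta: 1 / sqrt(total information)."""
    return 1.0 / math.sqrt(total_information(theta, items))

# Items 1 and 2, values as printed in Table 2 (illustrative subset only).
items = [
    (2.36, [0.25, 0.87, 1.39, 2.01]),
    (3.10, [0.33, 1.03, 1.70, 2.47]),
]
for theta in (-2.0, 0.0, 1.0, 2.0):
    print(theta, round(total_information(theta, items), 2),
          round(standard_error(theta, items), 2))
```

Evaluating the two functions over a grid of θ values gives curves of the kind plotted in Figure 2: where the thresholds are concentrated, information is high and the standard error is small.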


Table 1. Patient characteristics (N = 518).

Characteristic                          Value
Age (years) (n = 425)
  Mean (SD)                             47.5 (11.2)
  Min–max                               24–81
Disease duration (years) (n = 499)
  Mean (SD)                             11.9 (8.1)
  Min–max                               <1–42
Type of MS (%)
  Relapsing remitting (RR)              43
  Primary progressive (PP)              18
  Secondary progressive (SP)            26
  CIS or unknown                        13
Administration version (%)
  Paper version                         33
  Online version                        67
Gender (%)
  Female                                67
  Male                                  30
  Unknown                               3
Test length (%)
  Larger version                        58
  31-item version                       42

SD, standard deviation; min, minimum; max, maximum.

Results

Table 1 shows the characteristics of the patients. In total, 518 patients participated in this study. For the analyses described in this article, we used the scores of all patients on only the 31 remaining items. The 31 items were fully completed by 408 (79%) patients; 51 patients had one missing item, 34 patients had two or three missing items, and 25 patients had four to 25 missing items. In total, 2.7% of the data were missing, of which 0.9% was due to considering the answer “not applicable” as missing. We used all available data and did not impute any missing values.

The mean sum score of the 408 complete cases on the AMSQ was 24.6 (SD = 25.3) and the median was 15.8 (range 0–100). Higher scores mean worse arm function.

Item factor analysis

Item factor analysis showed a unidimensional structure of the 31 items of the scale, explaining 78% of the variance. Factor loadings of the 31 items were between 0.80 and 0.94. No item pairs showed a residual correlation above 0.20. Fit statistics for the model were reasonable: RMSEA = 0.090, CFI = 0.978, TLI = 0.976 and SRMR = 0.039. Guttman’s lambda2 was 0.99.

IRT analyses

Based on results of the item goodness-of-fit (S-X2) statistics, we combined the adjacent response categories “a little” and “moderately” for all items. All IRT analyses were performed after combining the response categories. None of the items showed p-values of the goodness-of-fit statistics below 0.0016. Discrimination parameters (a) and threshold parameters (b) per item are given in Table 2. The reliability of the estimated trait level (θ) was 0.95. Next, we investigated whether differential item functioning occurred in each item. No differential item functioning was present in any of the 31 items for the group variables type of MS, gender, administration version, and test length. All G2 tests had p-values above 0.0016. Figure 2 shows the total information curve and the standard error of the estimated θ of the AMSQ.

Discussion

We report on a new tool to measure activity limitations due to arm and hand functioning. The Dutch version of this 31-item self-report questionnaire, which is shown to be unidimensional, fits the graded response model appropriately, which paves the way for the development of a computer adaptive test. First evaluations support that the Dutch version of the AMSQ is both a reliable and a valid scale for measuring arm and hand function in patients with MS. The fit indexes CFI, TLI and SRMR for item factor analysis satisfy guidelines [19]. These guidelines are based on simulation


Table 2. Item parameters and goodness-of-fit statistics.

Columns a, b1–b4: item parameters. Columns S-X2, df, p: item goodness-of-fit statistic (S-X2).

No. a     b1    b2    b3    b4      S-X2  df  p      Item
1   2.36  0.25  0.87  1.39  2.01  105.71  91  0.139  Write down a short sentence with a pen
2   3.10  0.33  1.03  1.70  2.47   96.71  62  0.003  Grasp small objects
3   3.67  0.04  1.07  1.47  1.97   82.03  55  0.011  Put on a coat
4   3.78  0.30  0.62  1.00  1.28   82.85  75  0.250  Tie shoelaces
5   3.60  0.34  0.67  1.04  1.36   87.97  68  0.052  Hold a full plate
6   3.50  0.25  0.89  1.27  1.72   84.50  72  0.149  Pour from a bottle into a glass
7   3.17  0.07  1.18  1.62  2.41   68.65  61  0.234  Turn the pages of a book
8   2.42  0.17  1.36  1.79  2.44   69.06  70  0.510  Use a mouse of a computer
9   2.92  0.35  0.80  1.28  1.83   96.17  80  0.105  Use a pen or pencil
10  3.37  0.06  1.12  1.57  2.01   95.57  65  0.008  Turn a key in a lock
11  4.10  0.06  1.04  1.43  1.75   83.31  61  0.030  Cut off a piece of paper with a pair of scissors
12  3.87  0.07  1.05  1.44  1.89   76.03  62  0.108  Fasten a seatbelt in a car
13  4.10  0.71  0.45  0.84  1.30   78.98  74  0.324  Fasten buttons
14  4.19  0.45  0.61  1.08  1.55   94.96  61  0.004  Unbutton your shirt
15  3.81  0.11  1.19  1.58  2.03   71.34  54  0.057  Take off a sweater or T-shirt
16  3.17  0.42  0.90  1.49  2.09   81.07  70  0.172  Pick up coins from the table
17  2.80  0.10  1.17  1.83  2.18   80.04  66  0.115  Use a keyboard
18  4.47  0.14  0.99  1.35  1.75   82.29  57  0.016  Zip up a coat
19  2.47  0.62  0.62  0.99  1.39  120.05  91  0.022  Carry a shopping bag
20  4.37  0.56  1.46  1.78  2.24   53.98  39  0.056  Wash your hands
21  4.16  0.21  0.69  1.03  1.52   74.31  69  0.309  Cut something with a knife
22  3.75  0.19  1.26  1.63  2.16   63.98  51  0.105  Pierce food with a fork
23  3.64  0.08  1.06  1.48  1.84   63.77  60  0.345  Dry off your body
24  2.72  0.62  0.70  1.11  1.62  102.22  82  0.065  Open a bottle that has a screw cap
25  3.16  0.47  0.83  1.21  1.56   92.14  74  0.075  Open a bottle of soft drink
26  3.15  0.24  0.72  1.00  1.32  113.48  74  0.002  Wash the back of your shoulder
27  3.42  0.12  0.92  1.24  1.57   97.42  66  0.007  Wash your hair
28  3.77  0.05  0.92  1.29  1.75   65.98  65  0.444  Open a bag of crisps
29  3.69  0.11  1.20  1.45  1.80   81.06  57  0.020  Bring a full glass or cup to your mouth
30  4.02  0.22  1.16  1.55  1.85   63.68  54  0.172  Put toothpaste on a toothbrush
31  4.13  0.06  0.90  1.34  1.78   91.68  64  0.013  Tuck a (T-)shirt in the back of your trousers using your hand

θ ~ N(0,1); a, discrimination parameters; b, threshold parameters.

Figure 2. Total information curve of the AMSQ and standard error of the estimated thetas. A higher theta corresponds to more disability.
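The item-fit decision rule (Bonferroni-corrected significance level of 0.05/31) can be checked against the p-values reported in Table 2 with a small Python sketch. The variable names are hypothetical; the p-values are copied from the table, and the rule applied is the one stated in the statistical analysis section.

```python
ALPHA = 0.05
N_ITEMS = 31
# Bonferroni-corrected significance level; rounds to the 0.0016 quoted in the text.
threshold = ALPHA / N_ITEMS

# S-X2 p-values for items 1..31, as printed in Table 2.
p_values = [
    0.139, 0.003, 0.011, 0.250, 0.052, 0.149, 0.234, 0.510, 0.105, 0.008,
    0.030, 0.108, 0.324, 0.004, 0.057, 0.172, 0.115, 0.016, 0.022, 0.056,
    0.309, 0.105, 0.345, 0.065, 0.075, 0.002, 0.007, 0.444, 0.020, 0.172,
    0.013,
]

# Item numbers (1-based) whose p-value falls below the corrected level.
misfitting_items = [i + 1 for i, p in enumerate(p_values) if p < threshold]
print(round(threshold, 4), misfitting_items)
```

The smallest reported p-value (0.002, item 26) lies above the corrected level, so no item is flagged, in line with the result stated in the text.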


research on continuous data. The result for the RMSEA fit index was mediocre. In this study, ordinal data were used. As far as we know, the performance of these fit indexes is less clear on ordinal data. Marsh [29] emphasized that Hu and Bentler [19] never suggested that their guidelines should be interpreted as universal golden rules, absolute cut-off values, or highly rigid criteria that were universally appropriate.

A unidimensional scale is usually much shorter. Guttman’s lambda2 calculated on the 31 items in this scale was high, indicating that fewer items may be sufficient. However, we decided not to shorten the scale, because it is our aim to develop a computer adaptive test, based on an item bank containing a sufficient number of items. There are no guidelines on how many items should be included in an item bank. Within the PROMIS item banks, for example, this ranges from fewer than 10 to more than 100 items [30].

The 31 items fit the graded response model appropriately. To estimate item parameters, the inclusion of at least 500 subjects is recommended [31]. In this study, we had a sample size of 518. Some patients (n = 172) completed a pen-and-paper version, and others (n = 346) an online version. In addition, more than half of the patients (n = 301) completed the 31 items as part of a larger set of items, including the response option “not applicable”. Differential item functioning analysis on these group variables (i.e. administration version and test length) showed no differential item functioning on any of the 31 items. Moreover, none of the other investigated group variables, i.e. type of MS and gender, showed differential item functioning. This means that all items in the scale are equivalent for each subgroup of the investigated group variables. This result implies that item parameters do not have to be estimated differently in these groups. It allows, for example, using both the paper version and the online version in one and the same study, without the need to calculate θ scores using different item parameters.

The test information curve and the standard error of the estimated θ give information about the precision of the questionnaire. In Figure 2, it can be seen that the total information for patients with a negative θ (i.e. those having a relatively good arm function) is low, and the standard error is high. The total information for patients with impaired arm function, at the right side of the x-axis, is much higher, and the standard error is smaller. This means that the questionnaire is better able to distinguish between patients with arm function problems (i.e. high total information), and that the θ of these patients can be estimated more precisely (i.e. small standard error) than for patients without arm function problems, i.e. with a negative trait level. When this questionnaire is used in research or clinical practice, one is probably most interested in (change) scores of patients with impaired arm function. In addition, it may be useful as an instrument to detect arm function limitations in early MS.

In this study, it is likely that a considerable part of the included patients had only limited problems with their arm and hand function. A reason for this is that we did not formulate an inclusion criterion for having arm and hand function problems, and that patients were asked to complete the questionnaire themselves. This could have discouraged people with arm and hand problems from participating. When investigating the responsiveness of the AMSQ, it is important to include patients with more severe arm and hand function problems.

Although the response categories 2 and 3 (i.e. “a little” and “moderately”) were combined in the analyses in this study, at the moment we do not recommend using five response categories (i.e. combining options 2 and 3), because this has not been tested yet. In Figure 2, it can be seen that the curve is bimodal. The test information curve for the data without combining response


categories 2 and 3 does not show two modes. This may be related to the Dutch wording of these two response categories and may be different in other languages. Notably, the response categories are formulated exactly the same as the response categories of the Dutch version of the MSIS-29 [32]. It would be interesting to study whether the same goodness-of-fit pattern can be seen in that scale. In the meantime, we have produced an English version of the AMSQ using standard forward–backward translation (see Appendix). This version is now under investigation in an Irish population. In addition, another study is ongoing for translating and validating the German version of the AMSQ. In this article, we use trait level (θ) scores of the patients. A disadvantage of IRT is that these scores are more complex to calculate than sum scores. However, when a patient has missing values, sum scores can be difficult to calculate and often depend on quite arbitrary assumptions. In our study the correlation between the trait level (θ) score and the sum score was 0.93, thus the latter can be used as an alternative. However, for use in research, we recommend using trait level (θ) scores. During the development process of the AMSQ, the Upper Extremity Function (Fine motor, ADL) scale, part of the Neuro-QOL project, became available. This scale was developed for various neurological disorders. Scores on two scales that claim to measure the same construct can be linked to each other [33]. That would imply that in the future trait level (θ) scores obtained with either of the two scales can be estimated on the same latent trait level, and subsequently will be interchangeable. Further research is needed to evaluate whether that is truly the case, or whether there are (subtle) differences with respect to the underlying trait.
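Why a trait level (θ) score remains computable when some items are missing can be illustrated with a minimal expected a posteriori (EAP) sketch in Python. The assumptions are considerable and should be read as such: a logistic graded response model, a standard normal prior, 99 equally spaced quadrature points on a hypothetical range of −6 to 6, responses coded 0 to 4 after combining “a little” and “moderately”, and only two items with the parameter values printed in Table 2. The function names are hypothetical and this is not the IRTPRO implementation.

```python
import math

def grm_category_probs(theta, a, thresholds):
    """Category probabilities under a logistic graded response model."""
    cum = [1.0] + [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds] + [0.0]
    return [cum[k] - cum[k + 1] for k in range(len(thresholds) + 1)]

def eap_theta(responses, items, n_points=99, lo=-6.0, hi=6.0):
    """EAP estimate of theta; a response of None is skipped (treated as missing)."""
    nodes = [lo + (hi - lo) * i / (n_points - 1) for i in range(n_points)]
    weights = []
    for t in nodes:
        w = math.exp(-0.5 * t * t)          # unnormalised N(0,1) prior density
        for resp, (a, bs) in zip(responses, items):
            if resp is None:                # missing item contributes nothing
                continue
            w *= grm_category_probs(t, a, bs)[resp]
        weights.append(w)
    total = sum(weights)
    # Posterior mean over the quadrature grid.
    return sum(t * w for t, w in zip(nodes, weights)) / total

# Items 1 and 2, values as printed in Table 2 (illustrative subset only).
items = [
    (2.36, [0.25, 0.87, 1.39, 2.01]),
    (3.10, [0.33, 1.03, 1.70, 2.47]),
]
theta_complete = eap_theta([3, 4], items)      # both items answered
theta_partial = eap_theta([3, None], items)    # second item missing
theta_none = eap_theta([None, None], items)    # no data: falls back to prior mean
print(round(theta_complete, 2), round(theta_partial, 2), round(theta_none, 2))
```

A sum score for the patient with one missing item would require a decision on how to fill in the gap, whereas the EAP estimate simply uses the answered items and reflects the reduced information in a wider posterior.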

Acknowledgements We would like to thank Rebecca Holman for checking the content of this manuscript on the English spelling and grammar, and Francisca Galindo for her additional statistical help.

Declaration of interest The authors report no conflict of interest. The authors alone are responsible for the content and writing of this article. The study was financially supported by the VU University Medical Center, Amsterdam, the Netherlands.

References

1. Fox RJ, Thompson A, Baker D, et al. Setting a research agenda for progressive multiple sclerosis: the International Collaborative on Progressive MS. Mult Scler 2012;18:1534–40.
2. U.S. Department of Health and Human Services, Food and Drug Administration (FDA), Center for Drug Evaluation and Research (CDER), Center for Biologics Evaluation and Research (CBER), and Center for Devices and Radiological Health (CDRH). 2007. Guidance for Industry patient-reported outcome measures: use in medical product development to support labeling claims. Available from: http://www.fda.gov/cder/guidance/5460dft.pdf [last accessed 26 Feb 2013].
3. European Medicines Agency. Reflection paper on the regulatory guidance for the use of health-related quality of life (HRQL) measures in the evaluation of medical products; 2009. Available from: http://www.ema.europa.eu/docs/en_GB/document_library/Scientific_guideline/2009/09/WC500003637.pdf [last accessed 26 Feb 2013].
4. Hobart JC, Riazi A, Lamping DL, et al. Measuring the impact of MS on walking ability: the 12-Item MS Walking Scale (MSWS-12). Neurology 2003;60:31–6.
5. Mansson E, Lexell J. Performance of activities of daily living in multiple sclerosis. Disabil Rehabil 2004;26:576–85.
6. Bellamy N, Campbell J, Haraoui B, et al. Clinimetric properties of the AUSCAN Osteoarthritis Hand Index: an evaluation of reliability, validity and responsiveness. Osteoarthrit Cartilage 2002;10:863–9.
7. Bellamy N, Campbell J, Haraoui B, et al. Dimensionality and clinical importance of pain and disability in hand osteoarthritis: development of the Australian/Canadian (AUSCAN) Osteoarthritis Hand Index. Osteoarthrit Cartilage 2002;10:855–62.
8. Penta M, Thonnard JL, Tesio L. ABILHAND: a Rasch-built measure of manual ability. Arch Phys Med Rehabil 1998;79:1038–42.
9. Levine DW, Simmons BP, Koris MJ, et al. A self-administered questionnaire for the assessment of severity of symptoms and functional status in carpal tunnel syndrome. J Bone Joint Surg Am 1993;75:1585–92.
10. Hudak PL, Amadio PC, Bombardier C. Development of an upper extremity outcome measure: the DASH (disabilities of the arm, shoulder and hand). Am J Ind Med 1996;29:602–8.
11. Northwestern University, Chicago, IL, USA. Available from: www.neuroqol.org [last accessed 12 May 2014].
12. World Health Organization: ICF. International classification of functioning, disability and health. Geneva: World Health Organization; 2001.
13. Little RJ, Rubin DB. Statistical analysis with missing data. 2nd ed. New York (NY): Wiley; 2002.
14. Beaton DE, Bombardier C, Guillemin F, Ferraz MB. Guidelines for the process of cross-cultural adaptation of self-report measures. Spine 2000;25:3186–91.
15. Van der Linden WJ, Hambleton RK. Handbook of modern item response theory. New York (NY): Springer; 1997.
16. Embretson SE, Reise SP. Item response theory for psychologists. Mahwah (NJ): Lawrence Erlbaum Associates, Inc.; 2000.
17. Wirth RJ, Edwards MC. Item factor analysis: current approaches and future directions. Psychol Methods 2007;12:58–79.
18. Mplus (Version 6.1) [Computer software]. Los Angeles (CA): Muthén & Muthén; 2010.
19. Hu L, Bentler PM. Cutoff criteria for fit indexes in covariance structure analysis: conventional criteria versus new alternatives. Struct Equat Model 1999;6:1–55.
20. University of Connecticut, CT, USA. Available from: www.davidakenny.net/cm/fit.html [last accessed 8 Jun 2014].
21. Sijtsma K. On the use, the misuse, and the very limited usefulness of Cronbach’s alpha. Psychometrika 2009;74:107–20.
22. Bock RD, Aitkin M. Marginal maximum likelihood estimation of item parameters: application of an EM algorithm. Psychometrika 1981;46:443–5.
23. Bock RD, Mislevy RJ. Adaptive EAP estimation of ability in a microcomputer environment. Appl Psychol Measure 1982;6:431–44.
24. Scientific Software International. IRTPRO: user’s guide. Lincolnwood (IL): Scientific Software International, Inc.; 2012.
25. Orlando M, Thissen D. Likelihood-based item-fit indices for dichotomous item response theory models. Appl Psychol Measure 2000;24:50–64.
26. Orlando M, Thissen D. Further investigation of the performance of S-X2: an item fit index for use with dichotomous item response theory models. Appl Psychol Measure 2003;27:289–98.
27. Orlando M, Thissen D, Teresi J, et al. Identification of differential item functioning using item response theory and the likelihood-based model comparison approach: application to the Mini-Mental State Examination. Med Care 2006;44:S134–42.
28. Mokkink LB, Knol DL, Van Nispen RMA, Kramer SE. Improving the quality and applicability of the Dutch scales of the communication profile for the hearing impaired using item response theory. JSLHR 2010;53:556–71.
29. Marsh HW, Hau KT, Wen Z. In search of golden rules: comment on hypothesis-testing approaches to setting cutoff values for fit indexes and dangers in overgeneralizing Hu and Bentler’s (1999) findings. Struct Equat Model 2004;11:320–41.
30. Patient Reported Outcomes Measurement Information System (PROMIS). Available from: https://www.assessmentcenter.net/documents/InstrumentLibrary.pdf [last accessed 12 Jul 2013].
31. Reise SP, Yu J. Parameter recovery in the graded response model using MULTILOG. JEM 1990;27:133–44.
32. Van der Linden FAH, Kragt JJ, Hobart JC, et al. The size of the treatment effect: do patients and proxies agree? BMC Neurol 2009;9:12.
33. Dorans NJ. Linking scores from multiple health outcome instruments. Qual Life Res 2007;16S1:85–94.


Appendix

Arm Function in Multiple Sclerosis Questionnaire (AMSQ)

Copyright of the AMSQ is held by the MS Center Amsterdam of VUmc. For translations in other languages please contact the corresponding author. Please note that the AMSQ was developed in Dutch, and afterwards translated into English. All analyses described in this article are based on the Dutch 31-item version of the AMSQ.

Please read the instructions below carefully before starting on the questions.

All questions are about the past 2 weeks. For each question, please circle one number that best describes your situation.

In case you never perform an activity:

- Choose “no longer able (to)” if you no longer perform the activity because of limitations in the use of your arm.
- When you are asked about an activity you never perform (or performed), please try to imagine whether you are limited in your ability to perform the activity.
- Some questions are about activities that you can perform with one hand. When answering these questions, please choose the arm with which you always performed this activity (before you had any complaints).

If you use aids or adapted equipment to perform an activity, please try to imagine how you would do without these aids.


| Nr. | During the past 2 weeks, to what extent has MS ... | Not at all | A little | Moderately | Quite a bit | Extremely | No longer able to |
|---|---|---|---|---|---|---|---|
| 1 | Limited your ability to write down a short sentence with a pen? | 1 | 2 | 3 | 4 | 5 | 6 |
| 2 | Limited your ability to grasp small objects, for example a key or a ballpoint pen? | 1 | 2 | 3 | 4 | 5 | 6 |
| 3 | Limited your ability to put on a coat? | 1 | 2 | 3 | 4 | 5 | 6 |
| 4 | Limited your ability to tie shoelaces? | 1 | 2 | 3 | 4 | 5 | 6 |
| 5 | Limited your ability to hold a full plate? | 1 | 2 | 3 | 4 | 5 | 6 |
| 6 | Limited your ability to pour from a bottle into a glass? | 1 | 2 | 3 | 4 | 5 | 6 |
| 7 | Limited your ability to turn the pages of a book? | 1 | 2 | 3 | 4 | 5 | 6 |
| 8 | Limited your ability to use a mouse of a computer? | 1 | 2 | 3 | 4 | 5 | 6 |
| 9 | Limited your ability to use a pen or pencil? | 1 | 2 | 3 | 4 | 5 | 6 |
| 10 | Limited your ability to turn a key in a lock? | 1 | 2 | 3 | 4 | 5 | 6 |
| 11 | Limited your ability to cut off a piece of paper with a pair of scissors? | 1 | 2 | 3 | 4 | 5 | 6 |
| 12 | Limited your ability to fasten a seatbelt in a car? | 1 | 2 | 3 | 4 | 5 | 6 |
| 13 | Limited your ability to fasten buttons? | 1 | 2 | 3 | 4 | 5 | 6 |
| 14 | Limited your ability to unbutton your shirt? | 1 | 2 | 3 | 4 | 5 | 6 |
| 15 | Limited your ability to take off a sweater or T-shirt? | 1 | 2 | 3 | 4 | 5 | 6 |
| 16 | Limited your ability to pick up coins from the table? | 1 | 2 | 3 | 4 | 5 | 6 |
| 17 | Limited your ability to use a keyboard? | 1 | 2 | 3 | 4 | 5 | 6 |
| 18 | Limited your ability to zip up a coat? | 1 | 2 | 3 | 4 | 5 | 6 |
| 19 | Limited your ability to carry a shopping bag? | 1 | 2 | 3 | 4 | 5 | 6 |
| 20 | Limited your ability to wash your hands? | 1 | 2 | 3 | 4 | 5 | 6 |
| 21 | Limited your ability to cut something with a knife? | 1 | 2 | 3 | 4 | 5 | 6 |
| 22 | Limited your ability to pierce food with a fork? | 1 | 2 | 3 | 4 | 5 | 6 |
| 23 | Limited your ability to dry off your body? | 1 | 2 | 3 | 4 | 5 | 6 |
| 24 | Limited your ability to open a bottle that has a screw cap? | 1 | 2 | 3 | 4 | 5 | 6 |
| 25 | Limited your ability to open a bottle of soft drink? | 1 | 2 | 3 | 4 | 5 | 6 |
| 26 | Limited your ability to wash the back of your shoulder? | 1 | 2 | 3 | 4 | 5 | 6 |
| 27 | Limited your ability to wash your hair? | 1 | 2 | 3 | 4 | 5 | 6 |
| 28 | Limited your ability to open a bag of crisps? | 1 | 2 | 3 | 4 | 5 | 6 |
| 29 | Limited your ability to bring a full glass or cup to your mouth? | 1 | 2 | 3 | 4 | 5 | 6 |
| 30 | Limited your ability to put toothpaste on a toothbrush? | 1 | 2 | 3 | 4 | 5 | 6 |
| 31 | Limited your ability to tuck a T-shirt/shirt in the back of your trousers using your hand? | 1 | 2 | 3 | 4 | 5 | 6 |
