Patient DOI 10.1007/s40271-014-0058-z

ORIGINAL RESEARCH ARTICLE

Development and Validation of the AFSymp™: An Atrial Fibrillation-Specific Measure of Patient-Reported Symptoms

Jennie Medin • Rob Arbuckle • Linda Abetz • Katarina Halling • Karoly Kulich • Nils Edvardsson • Karin S. Coyne



© Springer International Publishing Switzerland 2014

Abstract

Background Patients with atrial fibrillation (AF) can be severely incapacitated by symptoms, but validated symptom measures are lacking. The aim of this study was to develop an AF-specific symptom questionnaire (AFSymp™).

Methods Following a literature review, qualitative interviews with 91 patients (United States [US], n = 30; United Kingdom [UK], n = 16; France, n = 15; Germany, n = 15; Japan, n = 15) with paroxysmal, persistent, and permanent AF were conducted to identify emergent concepts and to develop the items and response options for the AFSymp™. Clinical experts (n = 21) in the US, the UK, France, Germany, and Japan provided feedback on the most clinically relevant symptoms via an email survey. Cognitive interviews with 30 patients were conducted to evaluate content validity. A prospective, observational, psychometric evaluation study (n = 313) consisting of two study visits was performed at 32 sites across the US.

Electronic supplementary material The online version of this article (doi:10.1007/s40271-014-0058-z) contains supplementary material, which is available to authorized users.

J. Medin • K. Halling • K. Kulich
AstraZeneca R&D, Mölndal, Sweden

R. Arbuckle • L. Abetz
Adelphi Values (known as Mapi Values at the time the work was conducted), Manchester, UK

N. Edvardsson
Sahlgrenska Academy at Sahlgrenska University Hospital, Göteborg, Sweden

K. S. Coyne (✉)
Evidera, 7101 Wisconsin Ave, Suite 600, Bethesda, MD 20814, USA
e-mail: [email protected]

Results After item reduction, the AFSymp™ consisted of 11 items with a 1-week recall period. Exploratory and confirmatory factor analysis resulted in three subscales (heart symptoms, tiredness, chest discomfort) and two items: dizziness and shortness of breath. Internal consistency was strong across subscales (Cronbach’s α 0.82–0.91). The test–retest reliability of items and subscales was acceptable (intraclass correlation [ICC] 0.58–0.78). The reproducibility of the single global score was strong (ICC 0.78). The construct and known-groups validity was acceptable.

Conclusion The AFSymp™ demonstrates evidence of reliability and validity as a comprehensive measure of AF symptoms that can be used to assess patient outcomes in clinical and research settings. More research is needed to evaluate the instrument’s responsiveness.

Key Points

The AFSymp™ is a patient-reported questionnaire consisting of 11 items that can be used to evaluate symptoms of atrial fibrillation (AF). The instrument includes a global score (7 items); three subscales: heart symptoms (4 items), tiredness (3 items), and chest discomfort (2 items); and two items measuring dizziness and shortness of breath that are scored as single items.

The AFSymp™ is a comprehensive measure of AF-related symptoms that is easy for patients to understand and complete and demonstrates good psychometric validity and reliability.

Developed based on qualitative research with patients in multiple countries and cultures and with input from clinicians to ensure all clinically relevant concepts were included, the AFSymp™ is a viable alternative to the few existing AF-specific symptom measures.

J. Medin et al.

1 Background

Atrial fibrillation (AF) is the most common cardiac condition, with prevalence estimates of 1 % among patients younger than 60 years and 8 % among those older than 80 years [1]. Symptoms and their impact on domains of patients’ daily lives are best measured through direct report by patients themselves. Studies of AF symptoms and their health-related quality of life (HR-QOL) impact have most often used the generic Short Form-36 (SF-36) [2] or EuroQoL Five Dimension (EQ-5D) [3], which are not always sensitive to impairments resulting from AF [4]. There are also a number of condition-specific tools [5–8] that evaluate a range of symptoms and AF impacts. However, many of these were not developed in a manner that would satisfy contemporary regulatory standards [7, 8], while others designed to assess symptoms and HR-QOL impact may be too lengthy for routine use. Thus, there is a need for a tool specifically designed to assess AF symptoms in the context of research and clinical practice. The aim of this manuscript is to describe the development of the AFSymp™, a new, AF-specific symptom measure developed to evaluate AF symptoms, to be suitable for use in multiple languages and cultural contexts, and to meet contemporary regulatory standards [7, 8].

2 Methods

The development and validation of the AFSymp™ was conducted in two phases consisting of qualitative and quantitative research (Fig. 1). Independent Research Board approval was obtained prior to initiation of data collection and all participants provided written informed consent. A second instrument, the AFImpact™ score, was developed concurrently to assess the impact of AF-related symptoms on domains of HR-QOL following these research activities and will be described in a separate article.

2.1 Phase I: Concept Elicitation, Instrument Development, and Cognitive Interviewing

The development of the instrument involved a review of the literature and in-depth qualitative research. Peer-reviewed articles on non-valvular AF and atrial flutter were reviewed to identify existing patient-reported outcome (PRO) measures. In-depth, qualitative, concept elicitation interviews were conducted with 91 patients (United States [US], n = 30; United Kingdom [UK], n = 16; France, n = 15; Germany, n = 15; Japan, n = 15) identified from clinical sites with a confirmed diagnosis of paroxysmal, persistent, or permanent AF (n = 30, 30, and 31, respectively) to explore the range of AF symptom experiences.

Fig. 1 Schematic presentation of the development and validation of the AFSymp™. Phase I: concept elicitation, instrument development, and evaluation of content validity; phase II: psychometric evaluation. Flowchart steps: literature review; in-depth qualitative patient interviews (n = 91; US n = 30, UK n = 16, France n = 15, Germany n = 15, Japan n = 15); expert clinician survey (n = 21; US n = 5, UK n = 3, France n = 2, Germany n = 5, Japan n = 6); draft AFSymp™ (28 items); cognitive interviews with US patients (n = 30; 20 ePRO, 10 paper); psychometric evaluation in the US (N = 313); final AFSymp™ (11 items). AFSymp atrial fibrillation-specific symptom questionnaire, ePRO electronic patient-reported outcome, UK United Kingdom, US United States

AF-Specific Symptom Measure

Interviews were conducted by experienced, trained interviewers native to the respective countries following a standardized interview guide and audio recorded with participants’ permission. Interviews started with open-ended questions (e.g., “Tell me about a typical day with your atrial fibrillation”), with more specific direct questioning only used if concepts of interest did not arise. The qualitative data were transcribed verbatim and analyzed in the native language using thematic analysis techniques to identify emergent concepts and to provide qualitative content to support the development of the items for the AFSymp™ using natural patient language. Feedback from expert clinicians regarding the most relevant symptoms to assess for each AF subtype was then obtained via an e-mail survey administered to 21 clinicians in the US (n = 5), the UK (n = 3), France (n = 2), Germany (n = 5), and Japan (n = 6) [9].

Following the development of the initial draft AFSymp™, cognitive interviews were conducted to further evaluate content validity and determine the interpretability, comprehension, and ease of use of the instrument. Thirty patients in the US (ten paroxysmal, ten persistent, ten permanent) were recruited through three clinicians. Twenty participants were administered the questionnaires in an electronic PRO (ePRO) format, and ten were administered them in a pen/paper format. All participants were asked the same questions regarding the length, comprehensiveness, and format of the questionnaires as well as specific questions about their understanding and the relevance of each item.

2.2 Phase II: Psychometric Evaluation

After development of the AFSymp™ in Phase I, a prospective, observational psychometric evaluation study consisting of two study visits, baseline and 14 ± 3 days later, without any investigational medications or procedures, was performed at 32 sites across the US using the ePRO version of the AFSymp™.
Patients with symptomatic AF were recruited through 32 clinical sites in the US and screened by clinical staff to determine study eligibility. Inclusion criteria were age 18–80 years, paroxysmal, persistent, or permanent AF with symptoms in the past 7 days, ability to read and complete questionnaires on the electronic device, and provision of written informed consent. At baseline, the AFSymp™ and other PRO instruments were completed: the SF-36 (1-week recall period) [10–14], the Toronto Atrial Fibrillation Severity Scale (AFSS) (a 19-item, disease-specific measure developed to capture patient ratings of frequency, duration, and severity of episodes) [9, 15], and the AFImpact™. At follow-up, the overall treatment effect (OTE), an instrument with three items addressing whether a patient had improved or deteriorated since last visit [16], was administered together with the AFSymp™ and AFImpact™. As part of the OTE, if patients indicated an improvement or deterioration, they were asked to score the magnitude and the importance of the experienced change on a 7-point scale.

2.3 Statistical Analyses

All analyses were performed using Statistical Analysis System (SAS) software, versions 8.0 and 9.1.3 [17], except for Rasch analysis, which was performed using Rasch Unidimensional Model Measurement (RUMM) software, version 3.1 [18]. Rasch analysis and exploratory and confirmatory factor analyses (EFA, CFA) were used to reduce the items and uncover subscales to inform the scoring for the instrument. Descriptive statistics were used to summarize sample demographic and clinical characteristics. Internal consistency reliability was assessed using Cronbach’s formula for coefficient alpha [19]. Test–retest reliability was assessed in patients whose status was considered stable (defined as patients who reported that their health was “about the same” on the OTE questionnaire at study Visit 2). Construct validity was assessed by examining Pearson’s correlations between the domain scores of the AFSymp™ and those of the SF-36 and AFSS at baseline. Convergent validity is supported when scale score correlations are of magnitude r ≥ 0.30 with items or scales measuring similar concepts, while divergent validity is supported by low correlations between theoretically unrelated dimensions [20]. Values > 0.70 were considered acceptable for estimates of internal consistency reliability, test–retest reliability, Pearson’s correlations, and intraclass correlations (ICCs), while values exceeding the more conservative 0.80 threshold were considered strong [21].

Known-groups validity was assessed by comparing AFSymp™ scores among groups of patients who differed according to the physician-rated severity of their AF symptoms and according to patient-reported AF status (currently in AF or not according to item 3 of the AFSS). Analysis of variance (ANOVA) was used for comparisons among groups, with Scheffé’s adjustment for post hoc pairwise comparisons.
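The internal consistency criterion described in the statistical analyses can be illustrated with a short sketch. The code below is not the authors' SAS implementation; it is a minimal Python rendering of the standard coefficient alpha formula, with the > 0.70 (acceptable) and > 0.80 (strong) cut-offs taken from the text. The function names and the example responses are hypothetical and serve only as illustration.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Coefficient alpha for a list of respondents, each a list of k item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    """
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item (column) across respondents.
    item_variances = [variance(column) for column in zip(*responses)]
    # Sample variance of the summed score per respondent.
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total score has zero variance")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


def classify_reliability(value):
    """Apply the thresholds stated in the text: > 0.80 strong, > 0.70 acceptable."""
    if value > 0.80:
        return "strong"
    if value > 0.70:
        return "acceptable"
    return "below threshold"


if __name__ == "__main__":
    # Hypothetical responses of five patients to a three-item subscale (1-7 scale).
    example = [[1, 2, 1], [3, 3, 4], [5, 5, 6], [7, 6, 7], [4, 4, 3]]
    alpha = cronbach_alpha(example)
    print(round(alpha, 3), classify_reliability(alpha))
```

The same classification helper could be applied to test–retest ICCs, since the text uses identical thresholds for both kinds of estimate.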

3 Results

3.1 Phase I: Concept Elicitation, Instrument Development, and Evaluation of Content Validity

The in-depth qualitative interviews included 60 (66 %) male and 31 (34 %) female AF patients aged <60 (27 %), 60–69 (30 %), and >69 (44 %) years. All patients were currently receiving medication for their AF, and the majority (79 %) had a comorbid condition, with hypertension being the most common (43 %).

A number of themes emerged across the multi-country qualitative interviews. The most commonly reported symptoms in all countries were fatigue, shortness of breath, and awareness of sensations in the heart including a rapid heartbeat, a pounding heartbeat, a skipping heartbeat, an irregular heartbeat, a slow heartbeat, and palpitations. Other less commonly reported symptoms included chest pain, dizziness, feeling faint, and shaking/trembling. A feeling of anxiety or ‘a panic feeling’ was also described. In terms of the impact on patients’ activities of daily living, patients found that they had to stop to rest more often during daily activities and consequently took longer to get things done. All patients reported that their level of energy was impacted. Common descriptions included becoming tired or fatiguing easily, lacking energy, and falling asleep during the day.

AF symptoms that were consistently identified in the literature review, qualitative interviews, and expert clinician survey were used to generate 28 items in US English for the AFSymp™. All items were worded using terms used by patients in order to maximize ease of understanding. Table 1 illustrates the process of mapping representative patient quotations to items by showing the development of one item. Based on the qualitative patient data, a 1-week recall period and a seven-point categorical response scale (none of the time, hardly any of the time, a little of the time, some of the time, a good bit of the time, most of the time, all of the time) [19, 20] were selected.

After the development of the initial item bank, cognitive interviews were conducted with 17 (57 %) male and 13 (43 %) female patients with demographic and clinical characteristics similar to those of the in-depth qualitative study sample (data on file). Both the pen/paper and ePRO versions of the questionnaire (Dana, Invivo Data Inc.) were generally reported to be clear and equally easy to use by most patients. Patient feedback resulted in minor revisions to the wording of nine of the items to improve clarity. Two items (feeling lethargic and cold sweats) were deleted because of difficulties in comprehension and ambiguous interpretation.

3.2 Phase II: Psychometric Evaluation

A total of 313 AF patients were enrolled in the psychometric evaluation study (103 with paroxysmal AF, 100 with persistent AF, and 110 with permanent AF). The majority of patients were men (60 %), with a mean age of 65 years (range 25–80; Table 2).

3.3 Item Reduction and Scoring Algorithm

None of the patients in the psychometric evaluation study had any missing data for the AFSymp™ questionnaires. This was most likely due to the electronic format, which prevented skipping of items. Frequency distributions and Rasch analysis identified items with unfavorable response distributions (heavily skewed distributions, relevant floor–ceiling effects, or bi-modal distributions) or inability of patients to distinguish among response options. Items were eliminated if they could not be systematically assigned to a unidimensional scale, if they had a high floor or ceiling effect (>70 % scoring at floor or ceiling), or if they correlated highly with other items, indicating redundancy. The clinical importance of the items based on the clinical expert feedback and qualitative data obtained in Phase I also guided decisions about item retention/deletion. Of the 28 items that were tested during this process, 17 were deleted due to the reasons noted above. Thus, the final version of the AFSymp™ consists of 11 items.

Results of the EFA and CFA derived and confirmed the appropriateness of three subscales: heart symptoms (four items), tiredness (three items), and chest discomfort (two items). A single global score was also developed, consisting of the seven AF symptoms that were experienced most frequently by patients and were considered most reflective of AF symptoms according to clinicians. Additionally, two items measuring dizziness and shortness of breath were each scored as single items and not included in the single global score or any of the three subscales.

The seven-item global AFSymp™ scale was found to fit the data very well as a hierarchical latent model with excellent fit indices (Table 3). The tiredness subscale exceeded all fit parameters, while the heart symptoms subscale exceeded two of the three fit parameters (Bentler’s Comparative Fit Index and standardized root mean square residual [SRMR]). The chest discomfort subscale, which contains two items, was an over-identified model which was difficult to fit. Importantly, a model’s fit need not meet all pre-designated parameters to be deemed acceptable when the scaling is clinically relevant and important to patients [17]. A copy of the instrument and scoring instructions are provided in the electronic supplementary material A and B.

3.4 Reliability

Cronbach’s alpha coefficients demonstrated strong internal consistency for the multi-item subscale domains and total score: 0.91 (tiredness, three items), 0.82 (heart symptoms, four items), 0.79 (chest discomfort, two items), and 0.87 (single global score). The test–retest reliability (ICC) of the AFSymp™ individual items and subscales in stable patients as defined by OTE was acceptable, with ICCs

Table 1 Representative respondent quotations by concept and supporting information for ‘heart symptoms’ domain

Domain: Heart symptoms

Supporting information: 22 participants said they could feel their heart skip beats (10 paroxysmal, 7 persistent, 5 permanent)

Item: Item 8: How often did you feel your heartbeat skipping?

Representative quotes:
“I can tell like uh-oh something’s wrong and I can feel my pulse myself and feel like it’s skipping a beat.” (0081-021)
“…it’s just like it skips, you can hear the valve…” (0221-044)
“Once in a while – when you lay on the pillow you can hear your heart. Not the regular one but the beat, it skips a beat – oh, my God and then you worry about it when it happens.” (0087-027)
“It’ll go along and then skip a beat and then take off again.” (0083-023)
“I could feel it skip a beat.” (0221-044)
“… if you count by seconds, it’s like you skip one. That’s what it feels like. I would say maybe, out of a whole day, probably three or four times that I really notice.” (0209-042)
“It feels like my heart skips a few beats or something.” (21)
“Every now and then, my heart does skip a beat – maybe two or three beats. I experienced that yesterday. It wasn’t all day long it was just that quick, boom, boom, boom (quickly) and it was over.” (14)
“It’s an arrhythmia … skip beats.” (242)

ranging from 0.58 (item 4, feel weak) to 0.78 (item 9, feel lack of energy) (Table 4). The AFSymp™ seven-item global score demonstrated strong reproducibility with an ICC of 0.78, as did the three multi-item subscale scores: heart symptoms (ICC = 0.74), tiredness (ICC = 0.77), and chest discomfort (ICC = 0.76).

3.5 Construct Validity

3.5.1 Convergent and Discriminant Validity

Correlations among the SF-36 and the AFSymp™ subscales showed that these concepts behaved as expected in relation to one another (r range = -0.38 to 0.72; all p < 0.0001). Scales measuring similar concepts correlated more highly (convergent validity) than scales measuring dissimilar concepts (divergent validity) (e.g., the SF-36 vitality scale correlated more highly with AFSymp™ tiredness than with heart symptoms, r = -0.68 and r = -0.41, respectively) (data on file). The correlations of the AFSymp™ with the AFSS subscales ranged between 0.55 and 0.67 in scales measuring similar concepts.

3.5.2 Known-Groups Validity

The seven-item single global score and all subscales, with the exception of the chest discomfort scale and dizziness item, differentiated between patients who stated they were currently in AF versus those who were not in AF at statistically significant levels. Patients currently in AF had a significantly higher mean AFSymp™ seven-item global score than patients not currently in AF (mean scores of 3.4 vs 2.9, respectively; p < 0.0001). Similar trends were also present in the subscales with the exception of chest discomfort, where there was no significant difference between the two groups. Known-groups validity was also examined by comparison of scores by clinician report of severity. Patients with more clinically severe AF reported significantly higher mean subscale scores than patients with mild to moderate symptoms (Table 5).

4 Discussion

Results presented here support the AFSymp™ as a comprehensive measure of AF-related symptoms with evidence of content validity and psychometric validity and reliability. This 11-item instrument covers five domains and includes one single global score. The instrument’s simple format and brevity (taking less than 5 minutes to complete) make it practical for use in evaluating symptoms in men and women, including those aged 65 and over, in the context of research or in clinical practice. Developed based on qualitative research with patients in multiple countries and cultures and with input from expert clinicians to ensure all clinically relevant concepts were included/retained, the AFSymp™ is a viable alternative to the few existing AF-specific symptom measures. Many existing instruments (e.g., the Symptom Checklist [SCL] [22], AFSS [23], AF6 [24], and AF-QoL [25]) lack sufficient evidence of content validity based on qualitative research with patients according to current regulatory

Table 2 Psychometric evaluation study demographic and clinical characteristics

Characteristic                              All          Paroxysmal   Persistent   Permanent
                                            n = 313      n = 103      n = 100      n = 110

Gender, n (% male)                          189 (60)     63 (61)      55 (56)      71 (64)

Age (years)
  Mean (SD)                                 65.0 (11.3)  62.0 (12.4)  63.9 (11.6)  68.7 (8.8)
  Range                                     25–80        25–80        35–80        40–80

Ethnicity, n (%)
  Caucasian                                 275 (88)     85 (83)      88 (89)      102 (92)
  African-American                          19 (6)       8 (8)        5 (5)        6 (5)
  Hispanic American                         10 (3)       5 (5)        4 (4)        1 (1)
  Asian/Oriental/Pacific Islander           3 (1)        2 (2)        1 (1)        0 (0)
  Other                                     6 (2)        3 (3)        1 (1)        2 (2)

Education, n (%)
  High school or less                       52 (17)      21 (20)      10 (10)      21 (19)
  High school diploma/GED                   105 (34)     32 (31)      35 (35)      38 (34)
  Some college                              48 (15)      12 (12)      20 (20)      16 (14)
  College degree                            69 (22)      27 (26)      20 (20)      22 (20)
  Graduate/postgraduate                     39 (12)      11 (11)      14 (14)      14 (13)

Current work status, n (%)
  Working (FT/PT)                           135 (43)     51 (50)      50 (50)      34 (31)
  Retired (heart condition)                 21 (7)       6 (6)        5 (5)        10 (9)
  Retired (other reason)                    136 (44)     43 (42)      35 (35)      58 (53)
  Never employed/other                      21 (7)       3 (3)        10 (10)      8 (7)

AF episodes during the last month, n (%)
  0                                         210 (67)     0 (0)        100 (100)    110 (100)
  1                                         61 (20)      61 (59)      0 (0)        0 (0)
  2–5                                       37 (12)      37 (36)      0 (0)        0 (0)
  >5                                        5 (2)        5 (5)        0 (0)        0 (0)

Severity of AF symptoms, n (%)
  Very mild                                 41 (13)      25 (24)      7 (7)        9 (8)
  Mild                                      147 (47)     46 (45)      51 (51)      50 (46)
  Moderate                                  114 (36)     31 (30)      35 (35)      48 (44)
  Severe                                    11 (4)       1 (1)        7 (7)        3 (3)

Comorbid health conditions, n (%)
  Hypertension                              207 (66)     57 (55)      63 (63)      87 (79)
  Hypercholesterolemia                      139 (44)     43 (42)      50 (50)      46 (42)
  Diabetes mellitus                         59 (19)      15 (15)      14 (14)      30 (27)
  Anxiety                                   47 (15)      21 (20)      10 (10)      16 (15)
  Heart failure                             46 (15)      12 (12)      15 (15)      19 (17)
  Endocrine disease                         37 (12)      15 (15)      9 (9)        13 (12)
  Pulmonary disease                         36 (12)      8 (8)        14 (14)      14 (13)
  Depression                                38 (12)      13 (13)      13 (13)      12 (11)
  Left ventricular dysfunction              32 (10)      7 (7)        16 (16)      9 (8)
  Valvular disease                          30 (10)      8 (8)        9 (9)        13 (12)

AF atrial fibrillation, FT full time, GED general educational development, PT part time, SD standard deviation

standards [26] and vary in the level of empirical data available supporting their psychometric properties. Moreover, they do not seem to have been developed cross-culturally, raising the possibility of cultural bias in the concepts included. Recently, the development and validation of the Atrial Fibrillation Effect on Quality-of-Life questionnaire (AFEQT) was reported [27]. Advantages of AFEQT include the rigor of the instrument’s development

Table 3 The AFSymp™ domains and single global score options

Item no.  Item description                        Domain               7-item version  CFI    Alpha
1         Feel heart pounding                     Heart symptoms       ✓               0.942  0.824
3         Feel heart racing                       Heart symptoms       ✓
7         Irregular heart beat                    Heart symptoms       ✓
8         Feel heart beat skipping                Heart symptoms
2         Feel chest pain                         Chest discomfort     ✓               1.0    0.793
10        Feeling of pressure in chest            Chest discomfort     ✓
4         Feel weak                               Tiredness                            0.994  0.905
5         Feel tired                              Tiredness            ✓
9         Feel a lack of energy                   Tiredness            ✓
6         Feel dizzy                              Dizziness                            NA     NA
11        Shortness of breath during activities   Shortness of breath                  NA     NA

✓ indicates inclusion in 7-item version, CFI Bentler’s comparative fit index, NA not applicable
CFI > 0.90 good fit; 7-item CFI = 0.998; alpha = 0.866

Table 4 AFSymp™ test–retest reliability (single items, single global score, domains) from psychometric validation study

AFSymp™                                    Mean (SD)  Mean (SD)  Difference^b  t value  p value  Spearman’s r^c  ICC
                                           baseline   visit 2^a

AFSymp™ items
1. Feel heart pounding                     3.4 (1.5)  3.0 (1.6)  -0.4          -1.93    0.062    0.68***         0.70
2. Feel chest pain                         2.5 (1.1)  2.3 (1.3)  -0.2          -1.21    0.222    0.63***         0.67
3. Feel heart racing                       3.5 (1.3)  3.2 (1.5)  -0.4          -1.88    0.070    0.66***         0.66
4. Feel weak                               3.5 (1.5)  3.0 (1.6)  -0.5          -2.16    0.039    0.55***         0.58
5. Feel tired                              3.8 (1.5)  3.7 (1.4)  -0.0          -0.15    0.882    0.65***         0.68
6. Feel dizzy                              2.8 (1.5)  2.4 (1.6)  -0.4          -1.68    0.102    0.54**          0.60
7. Irregular heartbeat                     3.5 (1.5)  3.1 (1.7)  -0.5          -2.14    0.040    0.59***         0.66
8. Feel heartbeat skipping                 3.0 (1.6)  2.7 (1.6)  -0.3          -1.62    0.115    0.66***         0.76
9. Feel a lack of energy                   3.6 (1.6)  3.7 (1.7)  -0.1          0.31     0.757    0.73***         0.78
10. Feeling of pressure in chest           2.8 (1.5)  2.5 (1.5)  -0.2          -1.00    0.322    0.63***         0.66
11. Shortness of breath during activities  3.0 (1.6)  2.7 (1.8)  -0.3          -1.24    0.224    0.63***         0.66

AFSymp™ (7 items)                          3.5 (1.3)  3.2 (1.4)  -0.3          -1.96    0.059    0.66***         0.78

AFSymp™ domains
Tiredness                                  3.6 (1.4)  3.5 (1.4)  -0.2          -0.95    0.350    0.71***         0.77
Chest discomfort                           2.7 (1.3)  2.4 (1.3)  -0.2          -1.45    0.158    0.71***         0.76
Heart symptoms                             3.4 (1.3)  3.0 (1.5)  -0.4          -2.31    0.027    0.71***         0.74

Evaluated in 33 stable patients, defined as having a score of “about the same” by overall treatment effect (OTE)
AFSymp atrial fibrillation-specific symptom questionnaire, ICC intra-class correlation coefficient, SD standard deviation
**p < 0.01; ***p < 0.001
^a 11- to 14-day window
^b Week 3 average score minus week 2 average score
^c Spearman rank order correlations
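To make the test–retest columns of Table 4 concrete, the sketch below computes a mean change score and an intraclass correlation for paired baseline and follow-up scores. The article does not state which ICC form was used, so the two-way random-effects, absolute-agreement, single-measure form (often written ICC(2,1)) is an assumption here, as are the function names and the example scores.

```python
def mean_change(baseline, retest):
    """Average of (retest - baseline) across patients."""
    return sum(r - b for b, r in zip(baseline, retest)) / len(baseline)


def icc_absolute_agreement(baseline, retest):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    Assumed form; the source does not specify the ICC variant it reports.
    """
    n = len(baseline)
    k = 2  # two occasions: baseline and follow-up visit
    if n != len(retest) or n < 2:
        raise ValueError("need paired scores for at least two patients")
    pairs = list(zip(baseline, retest))
    grand = (sum(baseline) + sum(retest)) / (n * k)
    row_means = [(b + r) / k for b, r in pairs]
    col_means = [sum(baseline) / n, sum(retest) / n]
    # Two-way ANOVA decomposition of the total sum of squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for pair in pairs for x in pair)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    denominator = ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    if denominator == 0:
        raise ValueError("scores have no variance")
    return (ms_rows - ms_error) / denominator


if __name__ == "__main__":
    # Hypothetical subscale scores for six stable patients at the two visits.
    visit_1 = [3.5, 2.0, 4.5, 3.0, 5.0, 2.5]
    visit_2 = [3.0, 2.5, 4.0, 3.0, 4.5, 2.0]
    print(mean_change(visit_1, visit_2))
    print(icc_absolute_agreement(visit_1, visit_2))
```

An absolute-agreement form is penalized by a systematic shift between visits, which is one reason an ICC can be lower than the corresponding rank correlation for the same data.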

and psychometric evaluation, which followed recommendations in the PRO guidelines issued by the US Food and Drug Administration (FDA) [26] and demonstrated evidence of responsiveness to change.

The AFSymp™ also followed contemporary FDA guidelines and consisted of multiple phases of qualitative and quantitative research. Strengths of the AFSymp™ in contrast to existing instruments include the cross-cultural and subgroup-specific approach to development; concurrent testing of content validity in both electronic tablet and paper modes of administration; and utilization of a 1-week recall period. First, the AFSymp™ item generation was informed by research with clinicians and patients in five different countries, ensuring that the items assess concepts

Table 5 Known-groups validity of the AF symptoms according to severity assessed by the clinician

Severity of atrial fibrillation symptoms, mean (SD)

                      Very mild   Mild        Moderate    Severe      Pairwise comparisons^a
                      n = 41      n = 147     n = 114     n = 11

AFSymp™ (7 items)     2.5 (0.9)   3.0 (1.0)   3.6 (1.1)   3.6 (1.0)   1*, 2***, 3*, 4**

AFSymp™ subscales
Tiredness             2.8 (1.3)   3.2 (1.3)   3.6 (1.3)   3.9 (1.1)   2**
Chest discomfort      1.8 (1.0)   2.2 (1.1)   2.6 (1.2)   2.8 (1.2)   2**
Heart symptoms        2.3 (1.0)   2.9 (1.1)   3.5 (1.2)   3.3 (1.3)   1*, 2***, 4***
Dizziness             2.0 (1.1)   2.4 (1.2)   2.7 (1.3)   2.8 (1.3)   2*
Shortness of breath   2.1 (1.2)   2.7 (1.4)   3.3 (1.4)   3.9 (1.6)   2***, 3**, 4*

General linear model (PROC GLM) by clinician’s assessment of severity
AF atrial fibrillation, AFSymp atrial fibrillation-specific symptom questionnaire, SD standard deviation
^a Pairwise comparisons between means were performed using Scheffé’s test adjusting for multiple comparisons. Comparisons are 1 = very mild vs mild, 2 = very mild vs moderate, 3 = very mild vs severe, 4 = mild vs moderate, 5 = mild vs severe, and 6 = moderate vs severe; *p < 0.05, **p < 0.01, ***p < 0.001
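The known-groups comparison in Table 5 can be approximated from the published summary statistics alone. The sketch below rebuilds a one-way ANOVA F statistic and Scheffé pairwise statistics from group sizes, means, and standard deviations; it is not the authors' PROC GLM run, and because the table values are rounded, the results are only approximate. The function names are hypothetical. Significance would additionally require an F-distribution quantile, which is left out here.

```python
def anova_from_summary(groups):
    """One-way ANOVA from a list of (n, mean, sd) tuples.

    Returns (F, df_between, df_within, ms_within).
    """
    k = len(groups)
    total_n = sum(n for n, _, _ in groups)
    grand_mean = sum(n * mean for n, mean, _ in groups) / total_n
    # Between-group sum of squares uses the group means only.
    ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups)
    # Within-group sum of squares is recovered from each group's SD.
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    df_between = k - 1
    df_within = total_n - k
    ms_within = ss_within / df_within
    f_value = (ss_between / df_between) / ms_within
    return f_value, df_between, df_within, ms_within


def scheffe_statistic(group_a, group_b, ms_within, k):
    """Scheffe pairwise statistic, to be compared with an F(k-1, df_within) quantile."""
    n_a, mean_a, _ = group_a
    n_b, mean_b, _ = group_b
    return (mean_a - mean_b) ** 2 / (ms_within * (1 / n_a + 1 / n_b) * (k - 1))


if __name__ == "__main__":
    # AFSymp 7-item global score by clinician-rated severity (Table 5): (n, mean, SD).
    very_mild, mild, moderate, severe = (41, 2.5, 0.9), (147, 3.0, 1.0), (114, 3.6, 1.1), (11, 3.6, 1.0)
    groups = [very_mild, mild, moderate, severe]
    f_value, df_b, df_w, ms_w = anova_from_summary(groups)
    print(f_value, df_b, df_w)
    print(scheffe_statistic(very_mild, moderate, ms_w, len(groups)))
```

Working from summary statistics keeps the example tied to values reported in the table, at the cost of rounding error that patient-level data would avoid.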

that are relevant clinically as well as to patients, use wording patients understand, minimize cultural bias, and facilitate translation and cultural equivalence (i.e., carry the same meaning across countries). Second, in all stages of the development and psychometric validation of the AFSymp™, care was taken to include equal numbers of patients with paroxysmal, persistent, and permanent AF to ensure validity in all AF subpopulations. Third, administering the questionnaire on an electronic tablet precluded skipped responses; of note, despite the elderly population (mean age 65 [SD 11] years), all patients were able to complete the questionnaire and few had any problems using the touch screen. Finally, the use of a 1-week recall period reflects the patient preference for recall period, allows for a clear recall of the symptom experience, and limits the instrument’s vulnerability to response bias, in contrast to other instruments that use longer recall periods (e.g., AFEQT, which uses a 4-week recall period).

Several limitations should be acknowledged. An important weakness of the present study is that it did not permit full evaluation of the responsiveness of the AFSymp™ to treatment changes over time as there was no intervention. Second, two of the items comprising this scale (dizziness and shortness of breath) did not demonstrate adequate factor loadings and, thus, were excluded from the subscale scores and global AFSymp™ score and are recommended for use as descriptive items only. Additionally, only seven of the remaining nine items were found to best represent the global AFSymp™ scale. This scoring algorithm may be less straightforward for clinical use than a simple summation of all of the items; however, given that these items were found to be clinically relevant and important to patients, it was decided to retain the items until further evaluation in intervention studies could be conducted. A third limitation is that, despite conducting qualitative research in multiple countries, the psychometric validation study was conducted in a primarily Caucasian sample in community settings in the US. Lastly, it is possible that additional subgroup analyses to evaluate the performance of this tool (e.g., by age or gender) would have been informative. Psychometric validation of the questionnaire in other populations and countries will be essential to confirm the broader validity of the instrument. It should also be noted that while both paper and electronic versions of the questionnaire were tested for acceptability to patients in cognitive interviews, only the electronic version was psychometrically validated.

5 Conclusion

In conclusion, the AFSymp™ was developed and psychometrically evaluated using rigorous methodology according to FDA guidelines. Initial evidence presented here supports its reliability and validity for use in evaluating symptoms in patients with different subtypes of AF. In the continued preparations for the clinical use of AFSymp™, documentation of responsiveness of the AFSymp™ over time in patients in whom restoration and maintenance of sinus rhythm is achieved would aid the interpretation of changes in scores in future studies. Additionally, while the AFSymp™ was evaluated using an electronic application, further validation using other electronic media (e.g., web-based versions using tablets, iPhones, Android devices) is needed.

Acknowledgments

We would like to acknowledge Anders Ingelgård and Daniel Eek, PRO scientists at AstraZeneca, who contributed to conception and design of the studies. Edits and review of the manuscript were provided by Chris Sexton, PhD, research scientist at Evidera.

Competing interest

Conflicts of interest: JM, KH, and KK are or were at the time of this work employees of AstraZeneca; RA and LA were employees of Mapi Values (now known as Adelphi Values), a health outcomes consultancy, and were contracted as consultants to AstraZeneca to conduct the research; and KC is an employee of Evidera (formerly United BioSource Corporation) and a scientific consultant to AstraZeneca in connection with the development of this measure. Financial support: The study was funded by AstraZeneca.

Authors’ Contribution

JM drafted the manuscript and performed analysis and interpretation of the data. RA, LA, KH, KK, NE, and KC all participated in the design and delivery of the studies and questionnaire, performed analysis, and interpreted data. KC will act as the guarantor on behalf of all authors. All authors read and approved the final manuscript.

References

1. Singer DE, Albers GW, Dalen JE, Go AS, Halperin JL, Manning WJ. Antithrombotic therapy in atrial fibrillation: the Seventh ACCP Conference on Antithrombotic and Thrombolytic Therapy. Chest. 2004;126(3 Suppl.):429S–56S.
2. Ware JE Jr, Kosinski M, Keller SD. SF-36 physical and mental health summary scales: a user’s manual. Boston: Health Assessment Lab, New England Medical Center; 1994.
3. Brooks R. EuroQol: the current state of play. Health Policy. 1996;37(1):53–72.
4. Roalfe AK, Bryant TL, Davies MH, Hackett TG, Saba S, Fletcher K, et al. A cross-sectional study of quality of life in an elderly population (75 years and over) with atrial fibrillation: secondary analysis of data from the Birmingham Atrial Fibrillation Treatment of the Aged study. Europace. 2012;14(10):1420–7.
5. Steg PG, Alam S, Chiang CE, Gamra H, Goethals M, Inoue H, et al. Symptoms, functional status and quality of life in patients with controlled and uncontrolled atrial fibrillation: data from the RealiseAF cross-sectional international registry. Heart. 2012;98(3):195–201.
6. Thrall G, Lane D, Carroll D, Lip GY. Quality of life in patients with atrial fibrillation: a systematic review. Am J Med. 2006;119(5):448.e1–19.
7. European Medicines Agency (EMA; formerly EMEA). Reflection paper on the regulatory guidance for the use of health-related quality of life (HRQL) measures in the evaluation of medicinal products; EMEA/CHMP/EWP139391/2004. London; 2004.
8. Food and Drug Administration (FDA). Guidance for Industry. Patient-Reported Outcome Measures: Use in Medical Product Development to Support Labeling Claims. 2009;74(235). http://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/UCM193282.pdf.
9. Guedon-Moreau L, Capucci A, Denjoy I, Morgan CC, Perier A, Leplege A, et al. Impact of the control of symptomatic paroxysmal atrial fibrillation on health-related quality of life. Europace. 2010;12(5):634–42.
10. McHorney CA, Kosinski M, Ware JE Jr. Comparisons of the costs and quality of norms for the SF-36 health survey collected by mail versus telephone interview: results from a national survey. Med Care. 1994;32(6):551–67.
11. McHorney CA, Tarlov AR. Individual-patient monitoring in clinical practice: are available health status surveys adequate? Qual Life Res. 1995;4(4):293–307.
12. McHorney CA, Ware JE Jr, Lu JF, Sherbourne CD. The MOS 36-item Short-Form Health Survey (SF-36): III. Tests of data quality, scaling assumptions, and reliability across diverse patient groups. Med Care. 1994;32(1):40–66.
13. McHorney CA, Ware JE Jr, Raczek AE. The MOS 36-Item Short-Form Health Survey (SF-36): II. Psychometric and clinical tests of validity in measuring physical and mental health constructs. Med Care. 1993;31(3):247–63.
14. Ware JE Jr, Sherbourne CD. The MOS 36-item short-form health survey (SF-36). I. Conceptual framework and item selection. Med Care. 1992;30(6):473–83.
15. Hays RD, Stewart AL. Sleep measures. In: Stewart AL, Ware JE Jr, editors. Measuring functioning and well-being: the medical outcomes study approach. Durham: Duke University Press; 1992.
16. Juniper EF, Guyatt GH, Willan A, Griffith LE. Determining a minimal important change in a disease-specific Quality of Life Questionnaire. J Clin Epidemiol. 1994;47(1):81–7.
17. Hatcher L. A step-by-step approach to using the SAS system for factor analysis and structural equation modeling. Cary: SAS Institute, Inc.; 1994.
18. Andrich D, Lyne A, Sheridan B, Luo G. RUMM2020. Rumm Laboratory Pty Ltd.; 2003. http://www.rummlab.com.
19. Cronbach LJ. Coefficient alpha and the internal structure of tests. Psychometrika. 1951;16:297–334.
20. Nunnally JC, Bernstein IH. Psychometric theory. 3rd ed. New York: McGraw-Hill; 1994.
21. Frost MH, Reeve BB, Liepa AM, Stauffer JW, Hays RD. What is sufficient evidence for the reliability and validity of patient-reported outcome measures? Value Health. 2007;10(2):S94–105.
22. Hays RD, Sherbourne CD, Mazel R. User’s manual for the medical outcomes study (MOS) core measures of health-related quality of life. RAND Corporation; 1995. http://www.rand.org/pubs/monograph_reports/MR162.html.
23. Jaeschke R, Singer J, Guyatt GH. A comparison of seven-point and visual analogue scales. Data from a randomized trial. Control Clin Trials. 1990;11(1):43–51.
24. Bubien RS, Knotts-Dolson SM, Plumb VJ, Kay GN. Effect of radiofrequency catheter ablation on health-related quality of life and activities of daily living in patients with recurrent arrhythmias. Circulation. 1996;94(7):1585–91.
25. Dorian P, Jung W, Newman D, Paquette M, Wood K, Ayers GM, et al. The impairment of health-related quality of life in patients with intermittent atrial fibrillation: implications for the assessment of investigational therapy. J Am Coll Cardiol. 2000;36(4):1303–9.
26. Harden M, Nystrom B, Kulich K, Carlsson J, Bengtson A, Edvardsson N. Validity and reliability of a new, short symptom rating scale in patients with persistent atrial fibrillation. Health Qual Life Outcomes. 2009;7:65.
27. Arribas F, Ormaetxe JM, Peinado R, Perulero N, Ramirez P, Badia X. Validation of the AF-QoL, a disease-specific quality of life questionnaire for patients with atrial fibrillation. Europace. 2010;12(3):364–70.
