
Methods Article

Depicting Changes in Multiple Symptoms Over Time

Western Journal of Nursing Research 2015, Vol. 37(9) 1214–1228. © The Author(s) 2014. Reprints and permissions: sagepub.com/journalsPermissions.nav. DOI: 10.1177/0193945914542163. wjn.sagepub.com

Rebecca J. Muehrer1, Roger L. Brown1, and Dorothy M. Lanuza1

Abstract

Ridit analysis, an acronym for Relative to an Identified Distribution, is a method for assessing change in ordinal data and can be used to show how individual symptoms change or remain the same over time. The purposes of this article are to (a) describe how to use ridit analysis to assess change in a symptom measure using data from a longitudinal study, (b) give a step-by-step example of ridit analysis, (c) show the clinical relevance of applying ridit analysis, and (d) display results in an innovative graphic. Mean ridit effect sizes were calculated for the frequency and distress of 64 symptoms in lung transplant patients before and after transplant. Results were displayed in a bubble graph. Ridit analysis allowed us to maintain the specificity of individual symptoms and to show how each symptom changed or remained the same over time. The bubble graph provides an efficient way for clinicians to identify changes in symptom frequency and distress over time.

Keywords

ridit analysis, symptom change, bubble graph, transplant, lung transplant

1 University of Wisconsin–Madison, USA

Corresponding Author: Rebecca J. Muehrer, School of Nursing, University of Wisconsin–Madison, Cooper Hall, 701 Highland Avenue, Madison, WI 53792, USA. Email: [email protected]

Downloaded from wjn.sagepub.com at University of Sydney on November 14, 2015

Symptoms are subjective sensations that are perceived by the individual to be indications of a physical or psychological disorder (Hegyvary, 1993). Many tools use ordinal scales (e.g., a 5-point Likert-type scale) to measure patients’ subjective reports of symptoms. As ordinal data are not considered normally distributed, analyses using the parametric approach (e.g., t tests) may not be appropriate (Gibbons, 1993), and chi-square-based approaches are less efficient. Ridit analysis is an approach used to compare two or more sets of ordered data that takes into consideration the inherent ordering of ordinal data (Beder & Heim, 1990; Bross, 1958, 1978; Fleiss, 1981; Fleiss, Chilton, & Wallenstein, 1979). Ridit analysis has been suggested as a method for assessing change in ordinal data because it is based on the observed distribution rather than the normal distribution.

Ridit analysis has been used to assess symptom change in the lung (Dobbels et al., 2008), heart (Moons, De Geest, Abraham, Cleemput, & Van Vanhaecke, 1998), kidney (Dobbels et al., 2008; Koller et al., 2010; Moons et al., 2003), and liver (Drent, Moons, De Geest, Kleibeuker, & Haagsma, 2008) transplant populations. Ridit analysis has also been used to assess differences in pain scores among treatment groups (Donaldson, 1998) and to examine patient and nurse satisfaction (Goossen et al., 2001). Furthermore, mean ridits may be converted into a d effect size (i.e., effect size differences between groups), and simulation studies have demonstrated that effect sizes based on ridit analysis are a closer approximation to Cohen’s d effect size for ordered normally distributed data than Cliff’s d nonparametric effect size based on the Mann–Whitney U value (Brown, 2011). Although other methods are available to analyze ordinal data, the authors chose ridit analysis because it provided a method for assessing change in each specific symptom.
The purposes of this article are to (a) describe how to use ridit analysis to assess change over time in a symptom measure, using data from a longitudinal study with lung transplant patients; (b) give an in-depth example of ridit analysis using the symptom “nausea/upset stomach” at pre-transplant and 6 months post-transplant time points; (c) show the clinical relevance of applying ridit analysis to a large multisymptom longitudinal data set; and (d) display the results in an innovative, at-a-glance graphic.

Method

Ridits are measures relative to an empirical distribution. The calculation of individual ridit effect sizes for analysis of a multisymptom data set maintains the specificity of each individual symptom and shows how it changes or remains the same over time in an aggregate population. To help illustrate the method, we use the symptom frequency of nausea/upset stomach from a lung transplant study (Lanuza et al., 2012). In this example, patients were asked to rate the frequency with which they experienced the symptom of nausea/upset stomach before lung transplant (reference scores) and then again 6 months after lung transplant (comparison scores). The selection of the reference group requires considerable care (Bross, 1958). As the overall purpose of the Lanuza study was to examine pre- to post-transplant changes in symptoms that lung transplant recipients experienced during their first year post-transplant, the patients’ pre-transplant symptoms were used as the reference group to which the post-transplant symptoms were compared.

A ridit score may be calculated for each response category (1, ..., k) for the symptom nausea/upset stomach. The ridit score for each response category is the percentile rank of that category. It is equal to the frequency of responses in all of the lower response categories plus one-half the frequency of responses in the specific response category, divided by the total sample size. Although the ridits $r_1, \ldots, r_k$ have a probability interpretation, they also may be viewed as a set of reference scores (nausea/upset stomach frequency before transplant) for the k categories in making comparisons with another set of scores (nausea/upset stomach frequency 6 months after transplant). The ridit effect size for each symptom category may be taken as the value of a dependent variable for symptom change comparison, using the normal distribution family of statistics (e.g., means, standard deviations; Graubard & Korn, 1987; Selvin, 1977). The mean ridits may be considered approximately normal as n → ∞. The interpretation of the mean ridit for a comparison group is as follows: Let X denote the frequency of the symptom of nausea/upset stomach before transplant (reference score). Let Y denote the frequency of the symptom of nausea/upset stomach 6 months after transplant (comparison score).
The comparison mean ridit is then an estimate of the probability that a randomly selected rating before transplant (reference) is lower than a randomly selected rating 6 months after transplant (comparison), with ties counted as one half, that is, $P(X < Y) + \tfrac{1}{2}P(X = Y)$. The reference mean ridit is always .5 under this definition. Let $f_{1j}$ be the frequency in the jth response category for the before transplant (reference) group, so that the ridit ($r_j$) for the jth category may be defined as

$$ r_j = \frac{\dfrac{f_{1j}}{2} + \sum_{i=1}^{j-1} f_{1i}}{N_1}, \tag{1} $$

where the summation is the lagged, accumulated before transplant response frequency of all categories below j (see Table 1, column 5), and $N_1$ is the total frequency count for the before transplant (reference) group.

Table 1. Before Transplant Calculation of Mean Ridit for the Frequency of the Symptom of Nausea/Upset Stomach.

| Response Category | Response Frequencies ($f_1$) | Halved Response Frequency | Frequency of Responses in All Lower Response Categories | Accumulated Sum of the Halved Response Frequency and Frequency Lagged Once | Ridit | Product of Individual Ridit Times the Response Frequency of Each Category |
|---|---|---|---|---|---|---|
| Never | 20 | 10.0 | | 10.0 | 0.060 | 1.20 |
| Rarely | 65 | 32.5 | 20 | 52.5 | 0.316 | 20.54 |
| Sometimes | 55 | 27.5 | 85 | 112.5 | 0.678 | 37.29 |
| Often | 18 | 9.0 | 140 | 149.0 | 0.898 | 16.16 |
| Always | 8 | 4.0 | 158 | 162.0 | 0.976 | 7.81 |
| Total ($N_1$) | 166 | | | | | |
| Sum of the product of $f_1$ and $r_j$ | | | | | | 83.0 |
| M ridit ($\bar{r}$) | | | | | | 0.500 |

Again, if $N_1$ denotes the total frequency of the symptom of nausea/upset stomach before transplant, and $N_2$ denotes the total frequency of the after transplant (comparison) group’s symptom of nausea/upset stomach, then the mean ridit for the after transplant (comparison) group is

$$ \bar{r} = \frac{\sum_{j=1}^{J} r_j f_{2j}}{N_2}, \tag{2} $$

where $f_2$ is the response frequency for the after transplant group (see Table 2, column 2). If the mean ridit for the after transplant (comparison) group is greater than .50, then one would infer that the frequency of the symptom of nausea/upset stomach was greater after transplant than before transplant. If the mean ridit for the after transplant (comparison) group is less than .50, then one would infer that the frequency of the symptom of nausea/upset stomach was less after transplant than before transplant. The estimate of the standard error of the mean ridit is approximately

$$ se(\bar{r}) = \frac{1}{2\sqrt{3N_2}} \sqrt{1 + \frac{N_2 + 1}{N_1} + \frac{1}{N_1(N_1 + N_2 - 1)} - \frac{\sum_{j=1}^{J} (f_{1j} + f_{2j})^3}{N_1(N_1 + N_2)(N_1 + N_2 - 1)}}, \tag{3} $$

where $N_1$ is the size of the before transplant (reference) group, and $N_2$ is the size of the after transplant (comparison) group. The significance of the difference between the after transplant (comparison) mean ridit and the standard value of .5 may be tested assuming $N(0, \sigma^2)$ using

$$ Z = \frac{\bar{r} - .5}{se(\bar{r})}. \tag{4} $$
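Equations 1 through 4 can be sketched in a few lines of Python. This is a minimal illustration that assumes the category counts from Tables 1 and 2. The function names are our own and do not reproduce the program by Brown (2010), and the two-sided p value assumes that Z is referred to a standard normal distribution.

```python
import math

def ridits(ref_counts):
    """Ridit of each ordered category in the reference group (Equation 1)."""
    n1 = sum(ref_counts)
    below = 0  # accumulated frequency of all lower categories
    scores = []
    for f in ref_counts:
        scores.append((below + f / 2) / n1)
        below += f
    return scores

def mean_ridit(ref_counts, comp_counts):
    """Mean ridit of the comparison group (Equation 2)."""
    r = ridits(ref_counts)
    return sum(rj * f2 for rj, f2 in zip(r, comp_counts)) / sum(comp_counts)

def se_mean_ridit(ref_counts, comp_counts):
    """Approximate standard error of the mean ridit (Equation 3)."""
    n1, n2 = sum(ref_counts), sum(comp_counts)
    cubes = sum((f1 + f2) ** 3 for f1, f2 in zip(ref_counts, comp_counts))
    inside = (1 + (n2 + 1) / n1 + 1 / (n1 * (n1 + n2 - 1))
              - cubes / (n1 * (n1 + n2) * (n1 + n2 - 1)))
    return math.sqrt(inside) / (2 * math.sqrt(3 * n2))

def ridit_z_test(ref_counts, comp_counts):
    """Z statistic (Equation 4) with a two-sided normal p value."""
    rbar = mean_ridit(ref_counts, comp_counts)
    se = se_mean_ridit(ref_counts, comp_counts)
    z = (rbar - 0.5) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return rbar, se, z, p

before = [20, 65, 55, 18, 8]    # Never ... Always, before transplant
after = [13, 32, 47, 20, 16]    # Never ... Always, 6 months after transplant
rbar, se, z, p = ridit_z_test(before, after)
print(f"mean ridit = {rbar:.3f}, se = {se:.4f}, Z = {z:.2f}, p = {p:.4f}")
```

Only the standard library is used, so the sketch can be adapted to any symptom by replacing the two lists of category counts.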

The standard deviation of the mean ridit is then

$$ sd(\bar{r}) = se(\bar{r})\sqrt{\frac{N_1 N_2}{2}}. \tag{5} $$

A standardized difference in the average (mean) ridit divided by the standard deviation of the difference, a measure similar to that of Cohen’s d-family of effect sizes, may be calculated as

$$ d(\bar{r}) = \frac{\bar{r} - .5}{sd(\bar{r})}. \tag{6} $$

A computer program (Brown, 2010) that calculates the mean ridit, significance tests, 95% confidence intervals, and effect sizes is available from the authors.


Table 2. After Transplant Calculation of Mean Ridit for Frequency of Nausea/Upset Stomach.

| Symptom Scale | Response Frequency (After Transplant Group) $f_2$ | Reference Group Ridit (Before Transplant) $r_j$ | Product of Response Frequency and Reference Group Ridit ($f_2 \times r_j$) |
|---|---|---|---|
| Never | 13 | 0.060 | 0.78 |
| Rarely | 32 | 0.316 | 10.11 |
| Sometimes | 47 | 0.678 | 31.87 |
| Often | 20 | 0.898 | 17.96 |
| Always | 16 | 0.976 | 15.62 |
| Total ($N_2$) | 128 | | |
| Sum of the product of $f_2$ and $r_j$ | | | 76.33 |
| M ridit ($\bar{r}$) | | | 0.596 |


Study Example

The symptom data used in this example were derived from a multicenter, longitudinal quality of life study conducted with a convenience sample of lung transplant patients (Lanuza et al., 2012). After institutional review board (IRB) approval was obtained, participants’ consent was obtained. Symptom data were then collected before lung transplant (baseline) and at 1, 3, 6, 9, and 12 months post-transplant using the investigator-developed Transplant Symptom Inventory (TSI). The TSI consists of 64 symptoms that both male and female lung transplant patients may experience before and/or after transplant. Participants were asked to rate both the frequency and level of distress of each symptom (e.g., “nausea/upset stomach”) at each time point. The TSI has evidence for both reliability (Lanuza et al., 2012) and validity (Lanuza, McCabe, Norton-Rosko, Corliss, & Garrity, 1999). The investigators were interested in examining changes in frequency and distress of each symptom from pre-transplant to 1, 3, 6, 9, and 12 months post-transplant. In the ridit analyses performed in the Lanuza et al. (2012) study, all available data were used. The investigators did not impute values for missing data. For a more detailed explanation of this study, please see Lanuza and colleagues (2012).

As noted above, ridit analysis is based on an observed distribution rather than a normal distribution (Beder & Heim, 1990; Bross, 1958, 1978; Fleiss, 1981). For this example, we focus only on the reported frequency of the symptoms, scored as “never,” “rarely,” “sometimes,” “often,” and “always.” The mean value for frequency of each symptom reported by the participants before lung transplant (baseline) was used as the initial observation. Then, ridit analysis was used to determine the effect size change for each symptom from pre-transplant to 6 months post-transplant. The calculation of individual ridit effect sizes allowed us to show how each symptom changed or remained the same over time.
As stated earlier, a ridit for a given category of an ordinal variable is the proportion of individuals responding to a lower response category plus one-half the proportion of individuals in the response category itself. Table 1 illustrates how the mean ridit is calculated for the frequency of the symptom of nausea/upset stomach before lung transplant. For example, the ridit for the response category “rarely” is calculated as follows: Add the frequency of responses for all lower response categories (Never = 20) to the halved response frequency for Rarely (32.5) and then divide by the total sample size (N1 = 166). The resulting ridit is 0.316. Ridits are calculated similarly for each response category (i.e., “never,” “rarely,” “sometimes,” “often,” and “always”). The ridits then may be viewed as adjusted median quantile ranks. If one calculates the product of the ridit and the response frequency, and averages across the products for each category (Equation 2), one obtains the mean ridit for the reference distribution, which will equal 0.50.

Table 2 illustrates how the mean ridit is calculated for the frequency of the symptom of nausea/upset stomach 6 months after lung transplant. To compare the frequency of the post-transplant group with the frequency of the pre-transplant group for a specific symptom, it is only necessary to compute the average ridit for the post-transplant group by weighting the ridits of the reference group by the response frequency of each category in the comparison group. For example, the product of the response frequency and reference group ridit for the response category “rarely” is calculated as follows: Multiply the post-transplant response frequency of the category “rarely” (f2 = 32) by the pre-transplant reference group ridit (rj = 0.316). The resulting product is 10.11. Products are calculated similarly for each response category. The resulting mean ridit is 0.596. If the mean ridit for the post-transplant group is greater than .50, then more than half of the time a randomly selected patient from the post-transplant group will have a more extreme symptom frequency value than a randomly selected patient from the pre-transplant group. In our example, the post-transplant frequency mean ridit of 0.596 indicated that nausea/upset stomach occurred more frequently 6 months after lung transplant than pre-transplant. The same steps would be followed to calculate change in the distress ratings for the symptom of nausea/upset stomach.
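The step-by-step calculation in Tables 1 and 2 can be retraced with the short Python sketch below. It is illustrative only: the variable names are our own, and the products are computed from unrounded ridits, so some may differ in the second decimal from the tabled values, which use ridits rounded to three places. The sketch also counts pairs of patients directly to check the probability interpretation of the mean ridit, with ties counted as one half.

```python
categories = ["Never", "Rarely", "Sometimes", "Often", "Always"]
before = [20, 65, 55, 18, 8]    # f1, reference group (Table 1)
after = [13, 32, 47, 20, 16]    # f2, comparison group (Table 2)
n1, n2 = sum(before), sum(after)

# Build one row per response category, following the columns of Tables 1 and 2.
rows = []
lower = 0  # frequency of responses in all lower response categories
for name, f1, f2 in zip(categories, before, after):
    halved = f1 / 2
    accumulated = lower + halved
    ridit = accumulated / n1
    rows.append((name, f1, halved, lower, accumulated, ridit, f2 * ridit))
    lower += f1

for name, f1, halved, low, acc, ridit, product in rows:
    print(f"{name:<10}{f1:>4}{halved:>7.1f}{low:>5}{acc:>8.1f}"
          f"{ridit:>7.3f}{product:>8.2f}")

# Mean ridit for the after transplant group (Equation 2).
mean_ridit_after = sum(row[6] for row in rows) / n2

# Direct pairwise count: P(before < after) + 0.5 * P(before == after).
pairs = 0.0
for i, f1 in enumerate(before):
    for j, f2 in enumerate(after):
        if i < j:
            pairs += f1 * f2
        elif i == j:
            pairs += 0.5 * f1 * f2
pairwise_probability = pairs / (n1 * n2)

print(f"mean ridit = {mean_ridit_after:.3f}")
print(f"pairwise probability = {pairwise_probability:.3f}")
```

The two printed quantities agree because weighting the reference ridits by the comparison frequencies is algebraically the same as counting, over all before and after pairs, how often the before rating is lower.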

Issues Related to Ridit Analysis

As studies of symptom change commonly explore numerous symptoms, one encounters two issues. The first is multiple testing. In this example, ridits would be calculated to determine effect size changes for both the frequency and distress ratings of 64 symptoms. To control for possible Type I errors due to multiple testing, we used the false discovery rate (FDR) approach (Benjamini & Hochberg, 1995, 2000). FDR may be defined as the expected ratio of the number of erroneous rejections (Type I errors) to the total number of rejections. Table 3 is an abbreviated example of how the FDR is calculated assuming 64 symptoms with an error rate of p = .05. To calculate FDR, the p values of each symptom are ranked from largest to smallest. The largest p value (p = .10 in this example) remains as it is. The second largest p value (p = .06) is then multiplied by the total number of symptoms (n = 64) and then divided by its rank (63). The resulting p value is .0610. As this value is not ≤ .05, the symptom change is not significant. As one can see from the example, the correction becomes more stringent as the p value decreases; subsequently the error rate is a proportion of the number of symptoms. Although a less stringent correction than Bonferroni, FDR provides a good balance between discovering statistically significant changes and limiting false positive occurrences. For a more detailed description of FDR, the reader is directed to Keselman, Cribbie, and Holland (2002).

Table 3. FDR Calculation Example.

| Symptom Number | Unadjusted p Value (Largest to Smallest) | Rank | Correction | Is the Symptom Significant After Correction? |
|---|---|---|---|---|
| 1 | 0.10 | 64 | None | No |
| 2 | 0.06 | 63 | (64/63) × 0.06 = 0.0610 | No |
| 3 | 0.04 | 62 | (64/62) × 0.04 = 0.0413 | Yes |
| ... | ... | ... | ... | ... |
| 64 | 0.001 | 1 | (64/1) × 0.001 = 0.064 | No |

Note. FDR = false discovery rate.

The second issue is how to present a set of findings that contains a large number of contrasts (e.g., changes in both frequency and distress of 64 symptoms over five time points). What we propose is the conversion of symptom changes into ridit effect sizes. Changes in ridit effect sizes may indicate increases, decreases, or no change in symptom frequency and distress. One way to present these data is in a table. Table 4 contains data for 8 of the 64 symptoms from the Lanuza and colleagues’ (2012) study. Pre-transplant means and standard deviations as well as ridit effect sizes and adjusted probability tests based on FDRs for all five post-transplant time points are presented. A negative ridit effect size indicates a decrease (i.e., improvement), whereas a positive ridit effect size indicates an increase (i.e., worsening) of symptom frequency or distress. For example, the symptom “shortness of breath (SOB) with activity” was markedly improved at all five post-transplant time points compared with pre-transplant (i.e., it occurred less frequently and was less distressing). In contrast, “tremors” is an example of a symptom that significantly increased in frequency at all five post-transplant time points and significantly increased in distress at three of the five time points (i.e., 1 month, 3 months, and 6 months). “Nausea/upset stomach” is another example of a symptom that got worse after transplant. It occurred significantly more frequently at four of the five time points (i.e., 1 month, 3 months, 6 months, and 9 months) and was significantly more distressing at three of the five time points (i.e., 1 month, 3 months, and 6 months).
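The rank-based correction illustrated in Table 3 can be sketched in Python as follows. This is a minimal illustration of the per-rank scaling described in the text, using only the four rows shown in Table 3. The function name is our own. Note that the standard Benjamini and Hochberg step-up procedure additionally declares every p value smaller than the largest one that passes to be significant, and this sketch does not apply that step.

```python
def fdr_scaled_p(p_value, rank, n_tests):
    """Scale a p value by n_tests / rank, where the largest p value has rank n_tests."""
    return p_value * n_tests / rank

N_SYMPTOMS = 64
ALPHA = 0.05

# (unadjusted p value, rank) pairs taken from the rows shown in Table 3.
table3_rows = [(0.10, 64), (0.06, 63), (0.04, 62), (0.001, 1)]

results = []
for p_value, rank in table3_rows:
    corrected = fdr_scaled_p(p_value, rank, N_SYMPTOMS)
    significant = corrected <= ALPHA
    results.append((p_value, rank, corrected, significant))
    print(f"p = {p_value:<6} rank = {rank:<3} corrected = {corrected:.4f} "
          f"significant = {'Yes' if significant else 'No'}")
```

For the largest p value the scaling factor is 64/64, so the value is left unchanged, which corresponds to the entry “None” in the Correction column.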
In


Table 4 (fragment): 0.5549, 0.7748, 0.7748, 0.8343, 0.0001, 0.0001, 0.0001, 0.0005, 0.3304, 0.0001
