Qual Life Res (2015) 24:565–566 DOI 10.1007/s11136-015-0920-z

COMMENTARY

Response shift in the presence of missing data

D. L. Fairclough

Accepted: 7 January 2015 / Published online: 28 January 2015
© Springer International Publishing Switzerland 2015

D. L. Fairclough (✉)
Department of Biostatistics and Informatics, Colorado School of Public Health, Aurora, CO, USA
e-mail: [email protected]

The authors of these three papers [1–3] have undertaken a very challenging task by simultaneously attempting to address two very difficult problems: response shift (RS) and missing data. Part of the challenge is that approaches to both problems often rely on untestable assumptions. These papers also illustrate other challenges.

The first challenge is that there is no gold standard for detecting RS. This is illustrated in two of the articles. Sajobi et al. [2] and Guilleux et al. [3] both include different methods for estimating RS. The challenge is to determine how much of the variability in the results is due to the different methods for detecting RS and how much is due to differences in the methods for handling missing data. For example, Guilleux et al. [3] contrast IRT and SEM methods. It would be interesting to know how consistent these methods are when the data are complete, then when the data are missing at random (MAR), and finally, when the data are not MAR.

The second challenge is to conceptually and explicitly identify the linkage between the RS and the missing data mechanisms, and the interaction between them. This is critical to understanding the results of any analysis as well as to choosing an appropriate method for imputing missing data. For example, Verdam et al. [1] examine a model where the changes in associations measured in a structural equation model are associated with survival (or number of assessments). The premise and the implied underlying hypotheses are valid. However, it is difficult to interpret the results from the tables. A creative graphical presentation would facilitate both understanding of the proposed model and interpretation. In Sajobi et al. [2], the RS model hypothesized that differences in stroke severity would be associated with reprioritization by caregivers of the domains of the SF-12 over a 6-month period. Missing data were observed to be associated with greater severity of the stroke. What does this suggest, however, in terms of the association of the missing caregiver SF-12 scores with the initial severity? Do the missing values reflect a weaker (or stronger) association between stroke severity and some domains relative to other domains?

The third challenge is to adapt the concepts of missing data that were originally formulated for assessing means to those that assess associations. It may be necessary to consider a conceptual variation on the classical definitions of missing data proposed by Little and Rubin [4]. Most methods used to identify RS examine relationships between two or more variables over different conditions. For example, in Sajobi et al. [2], recalibration is assessed based on the relationship between caregiver SF-36 domain scores (y) and an indicator of stroke severity (x). For the purposes of this discussion, I will designate the relationship as the correlation between x and y, $\rho_{xy}$. If either x or y is missing, then $\rho_{xy}$ is missing.

When the probability that $\rho_{xy}$ is missing is unrelated to $\rho_{xy}$ or other data z, then the value is missing completely at random (MCAR). In this case, the estimate of the relationship is the same for the individuals with missing or observed data ($\rho_{xy}^{mis} = \rho_{xy}^{obs}$) and the observed data can be used to obtain an unbiased estimate of the relationship for the entire sample. If missingness is related to observed values of $\rho_{xy}$ or other data z, then the value is MAR. In this case, estimation of the relationship is unbiased for certain methods (e.g., maximum likelihood) when the observed $\rho_{xy}$ or other data z are included in the estimation process. Finally, if the probability is related to the unobserved values of $\rho_{xy}$, then the value is missing not at random (MNAR). In this case, $\rho_{xy}^{mis} \neq \rho_{xy}^{obs}$ and it is not possible to get an unbiased estimate of $\rho_{xy}$ unless we can identify auxiliary data (z) that can be used to convert MNAR to MAR. It is MNAR that we are most worried about, and unfortunately, it is impossible to prove that missing data are not MNAR because that proof would require that we know the values of the missing x or y measures.

I would like to add a couple of comments about imputation. The first is a caution about simple imputation techniques. Simple imputation methods that use a predicted value will underestimate the true variance. They falsely increase the power of statistical tests as the amount of missing data increases. Thus, any comparisons based on statistical significance are biased. The second concern relates to the second and third challenges described above. Specifically, if the relationship that is being tested is not explicitly incorporated into the imputation algorithm, the results are biased toward the null hypothesis. For example, in Sajobi et al. [2], the model associates change in scores over time with stroke severity. If this association is not incorporated into the imputation model, the analyses of the imputed datasets will underestimate the associations.
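
The two cautions about imputation can be illustrated with a small simulation. The sketch below is not taken from any of the three papers; it assumes standard normal x and y with a true correlation of 0.5, 40% of y deleted completely at random, and single (not multiple) imputation. The function name `compare_imputations` and all parameter values are hypothetical choices made for illustration only.

```python
import numpy as np


def compare_imputations(n=20000, rho=0.5, frac_missing=0.4, seed=0):
    """Return {method: (variance of y, correlation of x and y)}.

    Illustrative assumptions: bivariate normal data, MCAR missingness
    in y only, and one imputed dataset per method.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)
    y = rho * x + np.sqrt(1.0 - rho**2) * rng.standard_normal(n)

    # Delete y completely at random (chosen only to keep the sketch simple).
    miss = rng.random(n) < frac_missing
    obs = ~miss

    # Regression of y on x, fitted to the observed cases only.
    slope, intercept = np.polyfit(x[obs], y[obs], 1)
    predicted = intercept + slope * x
    resid_sd = np.std(y[obs] - predicted[obs], ddof=2)

    imputed = {
        # Ignores x entirely: the tested relationship is left out.
        "mean": np.where(miss, y[obs].mean(), y),
        # Uses x, but fills in the predicted value with no residual noise.
        "predicted": np.where(miss, predicted, y),
        # Uses x and adds a random residual to each predicted value.
        "stochastic": np.where(
            miss, predicted + resid_sd * rng.standard_normal(n), y
        ),
    }

    results = {"complete": (np.var(y, ddof=1), np.corrcoef(x, y)[0, 1])}
    for name, y_imp in imputed.items():
        results[name] = (np.var(y_imp, ddof=1), np.corrcoef(x, y_imp)[0, 1])
    return results


if __name__ == "__main__":
    for name, (var_y, r_xy) in compare_imputations().items():
        print(f"{name:>10}: var(y) = {var_y:.3f}, r_xy = {r_xy:.3f}")
```

Under these assumptions, mean imputation shrinks both the variance of y and the estimated correlation, which corresponds to the bias toward the null described above. Filling in the predicted value keeps the relationship with x but still understates the variance. Adding a residual draw roughly restores both quantities, although a single stochastic imputation still does not propagate imputation uncertainty the way multiple imputation does.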


In summary, these three manuscripts attempt to address the issue of RS in the presence of missing data. They represent a good first step. Given the extensive literature on missing data in simpler settings, RS in the presence of missing data will continue to be an important area of research.

References

1. Verdam, M. G. E., Oort, F. J., van der Linden, Y. M., & Sprangers, M. A. G. (2015). Taking into account the impact of attrition on the assessment of response shift and true change: A multigroup structural equation modeling approach. Quality of Life Research. doi:10.1007/s11136-014-0829-y.
2. Sajobi, T. T., Lix, L. M., Singh, G., Lowerison, M., Engbers, J., & Mayo, N. E. (2015). Identifying reprioritization response shift in a stroke caregiver population: A comparison of missing data methods. Quality of Life Research. doi:10.1007/s11136-014-0824-3.
3. Guilleux, A., Blanchin, M., Vanier, A., Guillemin, F., Falissard, B., Schwartz, C. E., Hardouin, J.-B., & Sébille, V. (2015). RespOnse Shift ALgorithm in Item response theory (ROSALI) for response shift detection with missing data in longitudinal patient-reported outcomes studies. Quality of Life Research. doi:10.1007/s11136-014-0876-4.
4. Little, R. J. A., & Rubin, D. B. (2002). Statistical analysis with missing data (2nd ed.). Hoboken, NJ: Wiley.
