Vol. 50 No. 2 August 2015

Journal of Pain and Symptom Management 241

Original Article

Validation of the Patient-Reported Outcome Mortality Prediction Tool (PROMPT)

Christine W. Duarte, PhD, Adam W. Black, Kimberly Murray, Amy E. Haskins, Lee Lucas, Sarah Hallen, and Paul K.J. Han, MD, MA, MPH

Center for Outcomes Research and Evaluation (C.W.D., A.W.B., K.M., L.L., P.K.J.H.), Maine Medical Center Research Institute, Portland; Department of Family Medicine (A.E.H.), Maine Medical Center, Portland; Geriatric Medicine (S.H.), Maine Medical Center, Portland, Maine, USA

Abstract

Context. The Patient-Reported Outcome Mortality Prediction Tool (PROMPT) estimates six-month mortality risk in elderly patients with declining health, but its external validity has not been established.

Objectives. To prospectively validate the PROMPT in an independent patient cohort and explore its clinical utility.

Methods. The study cohort comprised a diverse sample of 467 patients aged 65 years and older. Model calibration and discrimination were assessed on the original PROMPT and in two updated models. Clinical utility of the final updated PROMPT was examined using decision curve analysis.

Results. The validation cohort had a lower six-month mortality rate than the derivation cohort (6.9% vs. 15.0%). Discrimination was virtually unchanged (area under the curve 0.73 compared with 0.75), but calibration was suboptimal (P < 0.05 for the Hosmer-Lemeshow test). The PROMPT, therefore, was updated with a new intercept and slope parameter that significantly improved calibration (Hosmer-Lemeshow statistic of 0.66). Specificity of the PROMPT was high (92% and 97%, respectively, at the 10% and 20% mortality risk thresholds), although sensitivity was modest (53% and 44% at the corresponding thresholds), consistent with diagnostic performance in the derivation sample. Decision curve analysis demonstrated greater net benefit of the updated PROMPT than “treat all” or “treat none” strategies, especially at low to moderate risk thresholds.

Conclusion. The PROMPT demonstrated good discrimination but poor calibration in an independent heterogeneous clinical population. Model updating improved calibration and diagnostic performance and decision curve analysis demonstrated potential clinical utility of the PROMPT for initiating advance care planning rather than hospice referrals.

J Pain Symptom Manage 2015;50:241-247. © 2015 American Academy of Hospice and Palliative Medicine. Published by Elsevier Inc. All rights reserved.

Key Words

Predictive modeling, clinical prediction models, end-of-life care, hospice care, decision curves, net benefit

Address correspondence to: Christine W. Duarte, PhD, Center for Outcomes Research and Evaluation, Maine Medical Center Research Institute, 509 Forest Avenue, Suite 200, Portland, ME 04101, USA. E-mail: [email protected]

Accepted for publication: February 14, 2015.

0885-3924/$ - see front matter http://dx.doi.org/10.1016/j.jpainsymman.2015.02.028

Introduction

Estimates of six-month mortality risk are important determinants of the quality of end-of-life (EOL) care in the U.S., given their critical role in triggering advance care planning (ACP) activities and determining access to hospice services under the Medicare Hospice Benefit.1,2 Yet accurate, evidence-based methods of estimating six-month mortality risk have historically been lacking and physicians have had to rely on their own prognostic judgments, which have been shown to be inaccurate.3-5 The lack of accurate prognostic tools for estimating six-month mortality is thus a major barrier to high-quality EOL care. This problem has prompted a growing number of efforts to develop clinical prediction models (CPMs) for estimating six-month mortality. One CPM recently developed by our team, the Patient-Reported Outcome Mortality Prediction Tool (PROMPT), uses patient-reported clinical and health-related quality of life (HRQOL) data to estimate six-month mortality risk in elderly patients with declining health.6 The PROMPT was derived and internally validated using a large national sample of ambulatory community-dwelling adults aged 65 years and older (n = 21,870) participating in the Medicare Health Outcomes Survey (MHOS) from 1998 to 2003. The model demonstrated favorable calibration and discrimination, comparable or superior to existing CPMs developed for specific patient populations (e.g., elderly dementia patients, nursing home residents).

Before the PROMPT can be applied to clinical practice, however, its external validity and transportability to other patient populations need to be prospectively assessed, and its clinical utility needs to be determined. Like other existing CPMs for estimating six-month mortality,7-10 the PROMPT demonstrated relatively low sensitivity, particularly at thresholds of high mortality risk. Therefore, if the PROMPT was used as a basis for hospice referral decisions, a large proportion of deaths would be missed. The PROMPT did demonstrate high specificity at most risk thresholds, which is valuable given the negative consequences of false-positive prognostic estimates (inappropriate hospice referral, withdrawal of life-sustaining interventions). There may thus be a net benefit of using the PROMPT to screen patients for hospice eligibility or to identify patients for whom ACP is appropriate. Exactly how large this potential benefit is, and what risk thresholds offer the most acceptable tradeoffs between false negatives (FNs) and false positives (FPs), however, remain to be determined. The purpose of the present study was to prospectively validate the PROMPT model and begin to assess the clinical utility of the model.
We conducted a prospective validation study in an independent sample of patients, aged 65 years and older, representing a broad spectrum of illness and receiving care in diverse clinical settings. We assessed the PROMPT’s performance in estimating six-month mortality using conventional measures (calibration and discrimination) and updated the model to improve its prognostic performance in the new population. Finally, we assessed the PROMPT’s clinical utility using decision curve analysis, an innovative technique that allows quantification of the potential net benefit of using a CPM at different risk thresholds.
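To make the idea of net benefit at a risk threshold concrete, the following Python sketch computes it for a model and for the “treat all” strategy (“treat none” has a net benefit of zero by definition). It is an illustration only and not the authors' code: the function names, the toy risks and outcomes, and the rule that a patient counts as test-positive when predicted risk is at or above the threshold are all assumptions made for this example.

```python
def net_benefit(risks, died, threshold):
    """Net benefit of acting on every patient whose predicted risk >= threshold.

    True positives are credited in full; false positives are penalised by the
    odds of the threshold, threshold / (1 - threshold).
    """
    n = len(risks)
    tp = sum(1 for r, d in zip(risks, died) if r >= threshold and d)
    fp = sum(1 for r, d in zip(risks, died) if r >= threshold and not d)
    weight = threshold / (1.0 - threshold)
    return tp / n - (fp / n) * weight


def net_benefit_treat_all(died, threshold):
    """Net benefit of acting on everyone, regardless of predicted risk."""
    prevalence = sum(died) / len(died)
    weight = threshold / (1.0 - threshold)
    return prevalence - (1.0 - prevalence) * weight


# Hypothetical toy data: predicted six-month mortality risks and outcomes (1 = died).
risks = [0.02, 0.05, 0.08, 0.12, 0.30, 0.45, 0.03, 0.07, 0.25, 0.60]
died = [0, 0, 0, 0, 1, 1, 0, 0, 0, 1]

for pt in (0.10, 0.20):
    print(pt, net_benefit(risks, died, pt), net_benefit_treat_all(died, pt))
```

Plotting these quantities over a range of thresholds gives a decision curve; a model is useful at a given threshold when its net benefit exceeds both the “treat all” value and zero.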

Methods

Data Source and Sample Population

The population comprised a convenience sample of patients (n = 467) recruited from several inpatient and outpatient care settings at Maine Medical Center, a major 637-bed tertiary referral hospital in Portland, ME. Care settings included the adult inpatient medicine service (n = 116), medical intensive care unit (n = 39), advanced heart failure consultation service (n = 6), palliative care consultation service (n = 11), multidisciplinary geriatric assessment clinic (n = 189), skilled nursing facility (n = 45), and outpatient hemodialysis unit (n = 61). These specific care settings were selected to obtain a sample with sufficient heterogeneity to assess the PROMPT’s transportability and with sufficiently high risk for mortality to warrant routine clinical use of the tool. Patient recruitment was entirely opportunistic and conducted by clinical staff at each setting between July 2010 and April 2012; nonparticipation was not assessed. The planned duration of recruitment at each site was one year; however, the actual duration for individual sites ranged from four weeks to six months because of limitations in clinical staff resources. Patients who were unconscious or lacking in decisional capacity were excluded if there was no identifiable proxy decision maker capable of completing the survey.

The PROMPT survey took approximately 15 minutes to complete and assessed sociodemographic characteristics, health and lifestyle behaviors, comorbidities, activities of daily living, and HRQOL as assessed by the Medical Outcomes Survey Short-Form-36 Health Survey (SF-36®, Version 1). A copy of the full survey is found in the Appendix (available at jpsmjournal.com). Mortality status at six-month follow-up was ascertained through vital records from the State of Maine Department of Data, Research, and Vital Statistics. The study was approved by the Maine Medical Center Institutional Review Board (protocol numbers 3738 and 4038).

Working Definitions

Throughout this article, “original PROMPT” refers to the PROMPT model before any updates and does not by itself refer to the MHOS data set. The term “derivation” implies that we are referring to the MHOS sample.

Statistical Analyses and Model Evaluation

We restricted the analysis to people aged 65 years and older as in the original MHOS sample. Variable coding was performed consistent with the specification in the original PROMPT model derivation,6 although contrary to the approach in that study, we did not restrict the sample based on answers to the health transition question from the SF-36, “Compared to one year ago, how would you rate your health in general now?” Age was modeled with the same restricted cubic spline used in the original PROMPT model. Model discrimination and calibration were assessed by applying the PROMPT to our present study sample. Missing data totaled less than 10% and were handled using multiple imputation, which generated 10 predicted probabilities for each individual; their mean was treated as the model’s predicted risk. To assess discrimination, we calculated the c-statistic or area under the receiver operating characteristic curve. To assess calibration, we compared the observed and expected mortality of patients in categories stratified by quintiles of predicted mortality risk and by calculating the Hosmer-Lemeshow statistic. Finally, we assessed performance characteristics of the PROMPT (sensitivity, specificity, positive and negative predictive values, and positive and negative likelihood ratios) at different mortality risk thresholds.

Model updating was conducted by repeating the model assessment procedures after fitting: 1) a new intercept to the validation sample, and 2) a scaling parameter, or slope term, to the overall linear predictor from the original PROMPT model. These two approaches are the recommended updating methods for refining a predictive model and addressing miscalibration when applying CPMs to new populations.11 All analyses were conducted using SAS v. 9.2 (SAS Institute Inc., Cary, NC) or R v. 3.0.3 (R Foundation for Statistical Computing, Vienna, Austria).

To assess clinical utility, we conducted decision curve analysis of the updated PROMPT using the method of Vickers and Elkin.12 This technique measures a model’s clinical usefulness by incorporating the harms associated with FPs and FNs into the assessment without needing to directly estimate these harms, based on the assumption that the choice of threshold probability implies a subjective judgment about the relative harm of FPs compared with that of FNs. A threshold under 50% implies that FNs are considered worse than FPs, whereas a threshold above
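The two model-updating steps named in the Methods (a new intercept, and a new intercept plus a slope on the original linear predictor) can be sketched as a one-predictor logistic regression. The Python below is a minimal illustration and not the authors' code, which was run in SAS or R: the function name `recalibrate`, the Newton-Raphson fitting routine, and the toy linear predictors and outcomes are assumptions made for this example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def recalibrate(lp, died, fit_slope=True, iters=50):
    """Fit died ~ a + b * lp by Newton-Raphson maximum likelihood.

    lp is the linear predictor (log-odds) from the original model.
    With fit_slope=False the slope stays fixed at 1 and only the
    intercept is updated.
    """
    a, b = 0.0, 1.0
    for _ in range(iters):
        p = [sigmoid(a + b * x) for x in lp]
        w = [pi * (1.0 - pi) for pi in p]
        # Score (gradient of the log-likelihood).
        ga = sum(y - pi for y, pi in zip(died, p))
        gb = sum((y - pi) * x for y, pi, x in zip(died, p, lp))
        # Information matrix entries.
        haa = sum(w)
        hab = sum(wi * x for wi, x in zip(w, lp))
        hbb = sum(wi * x * x for wi, x in zip(w, lp))
        if fit_slope:
            det = haa * hbb - hab * hab
            da = (hbb * ga - hab * gb) / det
            db = (haa * gb - hab * ga) / det
        else:
            da, db = ga / haa, 0.0
        a, b = a + da, b + db
        if abs(da) + abs(db) < 1e-10:
            break
    return a, b


# Hypothetical toy data: original-model log-odds and observed outcomes (1 = died).
lp = [-3.0, -2.5, -2.0, -1.5, -1.0, -0.5, 0.0, 0.5, -2.2, -0.8]
died = [0, 0, 1, 0, 0, 1, 0, 1, 0, 0]

a1, b1 = recalibrate(lp, died, fit_slope=False)  # update 1: intercept only
a2, b2 = recalibrate(lp, died, fit_slope=True)   # update 2: intercept and slope
updated_risks = [sigmoid(a2 + b2 * x) for x in lp]
print(a1, b1, a2, b2)
```

After either update, the mean predicted risk in the sample equals the observed event rate, which is why an intercept update corrects for a cohort whose baseline mortality differs from that of the derivation cohort. A fitted slope below 1 shrinks predictions toward the mean, and a slope above 1 spreads them out.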

Table 1
Distribution of PROMPT Model Variables for the MMC Validation Sample in Comparison to the Original MHOS Study Sample

                          MMC (N = 467)                   MHOS (N = 21,870)
Categorical Variables     n      %       95% CI           %                    Pr > χ²
Six-month mortality
  Died                    32     6.9     4.6-9.1          15.1
  Survived                435    93.2    90.9-95.4        84.9

Remaining rows (values not present in this extract):
Sex (Male, Female, Other); Any cancer (Yes, No, Missing); Congestive heart failure (Yes, No, Missing); COPD (Yes, No, Missing); Smoking status (Never smoked, Former smoker, Current smoker, Missing); Proxy status (Other person, Addressee, Missing)
Continuous Variables: Age; ADLs; HRQOL (General health perceptions, Social functioning, Energy/fatigue)
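The discrimination and calibration measures named in the Methods (the c-statistic, and the Hosmer-Lemeshow statistic over quintiles of predicted risk) can be sketched in a few lines of Python. This is an illustration under stated assumptions and not the authors' code: the function names and toy data are hypothetical, groups are formed by splitting the risk-sorted sample into five equal parts, and the comparison of the statistic with a chi-square distribution to obtain a P value is omitted.

```python
def c_statistic(risks, died):
    """Probability that a randomly chosen decedent has a higher predicted
    risk than a randomly chosen survivor (ties count one half)."""
    pos = [r for r, d in zip(risks, died) if d]
    neg = [r for r, d in zip(risks, died) if not d]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def hosmer_lemeshow(risks, died, groups=5):
    """Sum over risk-ordered groups of (observed - expected)^2 divided by
    n_g * mean_risk_g * (1 - mean_risk_g)."""
    pairs = sorted(zip(risks, died))
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        size = len(chunk)
        observed = sum(d for _, d in chunk)
        expected = sum(r for r, _ in chunk)
        mean_risk = expected / size
        stat += (observed - expected) ** 2 / (size * mean_risk * (1.0 - mean_risk))
    return stat


# Hypothetical toy data: predicted risks and observed outcomes (1 = died).
risks = [0.02, 0.05, 0.08, 0.12, 0.30, 0.45, 0.03, 0.07, 0.25, 0.60]
died = [0, 0, 0, 0, 0, 1, 0, 0, 1, 1]

c = c_statistic(risks, died)
hl = hosmer_lemeshow(risks, died)
print(c, hl)
```

With only two patients per group, the toy Hosmer-Lemeshow value is not meaningful in itself; the example is meant only to show how observed and expected deaths are compared within each quintile.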
