Methods for Performing Survival Curve Quality-of-Life Assessments.

ORIGINAL ARTICLE

Walton Sumner, MD, Eric Ding, PhD, Irene D. Fischer, Michael D. Hagen, MD

Background. Many medical decisions involve an implied choice between alternative survival curves, typically with differing quality of life. Common preference assessment methods neglect this structure, creating some risk of distortions. Methods. Survival curve quality-of-life assessments (SQLA) were developed from Gompertz survival curves fitting the general population’s survival. An algorithm was developed to generate relative discount rate-utility (DRU) functions from a standard survival curve and health state and an equally attractive alternative curve and state. A least mean squared distance algorithm was developed to describe how nearly 3 or more DRU functions intersect. These techniques were implemented in a program called X-Trade and tested. Results. SQLA scenarios can portray realistic treatment choices. A side effect scenario portrays one prototypical choice, to extend life while experiencing some loss, such as an amputation. A risky treatment scenario portrays procedures with an initial mortality risk. A time trade scenario mimics conventional time tradeoffs. Each SQLA scenario yields DRU functions with distinctive shapes, such as sigmoid curves or vertical lines. One SQLA can imply a discount rate or utility if the other value is known and both values are temporally stable. Two SQLA exercises imply a unique discount rate and utility if the inferred DRU functions intersect. Three or more SQLA results can quantify uncertainty or inconsistency in discount rate and utility estimates. Pilot studies suggested that many subjects could learn to interpret survival curves and do SQLA. Limitations. SQLA confuse some people. Compared with SQLA, standard gambles quantify very low utilities more easily, and time tradeoffs are simpler for high utilities. When discount rates approach zero, time tradeoffs are as informative as SQLA and easier to do. Conclusions. SQLA may complement conventional utility assessment methods. Key words: quality of life; utility assessment; survival curve; discount rate. (Med Decis Making 2014;34:787–799)

Many medical decisions involve choosing between options with well-documented survival curves. Patients presumably desire to compare the combined value of survival and quality of life, discounted over time, among available options. Common preference assessment methods do not reflect this structure. The standard gamble poses immediate mortality risk without describing later risks.1 The closest analogy is to operative risks, but real operations involve small immediate mortality risks followed by additional risk accruing during ensuing months. These risks are balanced by expected survival and quality-of-life benefits. Time tradeoff (TTO) involves losing distant, potentially highly discounted, years of life without acknowledging intervening risk and usually without assessing individuals’ discount rate. In contrast, developed countries usually ban or tightly regulate treatments that shorten average life span while improving

Received 18 January 2013 from the Department of Internal Medicine, Washington University School of Medicine, St. Louis, MO, USA (WS, IDF); American Board of Family Medicine, Inc., Lexington, KY, USA (ED, MDH); and Department of Family and Community Medicine, University of Kentucky College of Medicine, Lexington, KY, USA (MDH). Financial support for this study was provided entirely by a contract with the American Board of Family Medicine. The funding agreement ensured the authors’ independence in designing the study, interpreting the data, and writing and publishing the report. The following authors are employed by the sponsor: Eric Ding, Michael D. Hagen. Revision accepted for publication 2 November 2013. © The Author(s) 2014 Reprints and permission: http://www.sagepub.com/journalsPermissions.nav DOI: 10.1177/0272989X13514775

Supplementary material for this article is available on the Medical Decision Making Web site at http://mdm.sagepub.com/supplemental. Address correspondence to Walton Sumner, MD, Department of Internal Medicine, Washington University School of Medicine, Box 8005, 660 South Euclid Avenue, St. Louis, MO, USA; telephone: (314) 454-8164; fax: (314) 454-5113; e-mail: [email protected].

MEDICAL DECISION MAKING/AUGUST 2014


SUMNER AND OTHERS

quality.2–5 Even narcotic pain relief, which is structured like a TTO, is better represented as a choice between survival curves. Many treatments extend life while impairing quality: examples range from effective drugs with side effects to resection of diseased body parts. Willingness-to-pay assessment results vary with wealth and risk, implying heterogeneity that complicates pooling of data.

The decision analytic literature documents a long debate regarding the merits of these and other preference assessment techniques.6 Subjects in formal studies assert diverse outcome utilities7 and discount rates.8 Discount rates can diverge from the Standard Reference Model.9 Studies document anchoring,10 transitivity violations, and framing,11 casting doubt on validity. Because subjects construct preferences imperfectly during the assessment process,12–14 these effects can be, and have been, manipulated in economic studies to cause preference reversals that harm subjects.15 Prospect theory posits that irrational events such as preference reversals result from widely used decision-making shortcuts.16 Prospect theory is descriptive, not prescriptive: most decision makers want to be rational. Kahneman and Tversky16(p277) state,

“These departures from expected utility theory must lead to normatively unacceptable consequences, such as inconsistencies, intransitivities, and violations of dominance. Such anomalies of preference are normally corrected by the decision maker when he realizes that his preferences are inconsistent, intransitive, or inadmissible. In many situations, however, the decision maker does not have the opportunity to discover that his preferences could violate decision rules that he wishes to obey. In these circumstances the anomalies implied by prospect theory are expected to occur.”

Even without an opportunity to discover violations, preference reversals are much less common when economic questions relate to real, looming, important decisions.17 Using realistic scenarios, such as choices between relevant survival curves, for preference elicitation could reduce inconsistencies. Preference surveys also might benefit from providing opportunities to discover and explore normative violations.

In the early 1990s, “continuous risk utility assessments” (survival curves) were proposed as the basis for attractively realistic preference assessments.18 That effort was abandoned because lay subjects had difficulty understanding graphs (G.B. Hazen, personal communication). Nevertheless, the idea of using survival curves to assess utilities remains appealing. A series of survival curve trading exercises could provide insight into the utility of a health state, the discount rate applied while in that health state, and the precision of those estimates. This report describes the development of core Survival curve Quality-of-Life Assessments (SQLA) methods, SQLA implementation in X-Trade (a preference assessment software platform that supplants UTiter19), and observations from pilot tests.

METHODS

We developed a list of general SQLA scenarios with exemplary baseline and alternative survival curve functions. In these scenarios, a subject would identify one of several alternative survival curves, which, if experienced in health state A, has a quality-adjusted life expectancy (QALE) closest to the QALE of the baseline curve experienced in health state B. We examined discount rate-utility (DRU) functions implied by the results of each scenario and developed one method for illustrating convergence and quantifying divergence in DRU curves. Preliminary studies that guided software development are reported in the online appendix, along with software designs. An ongoing utility assessment study using SQLA prompted refinements and observations regarding appropriate use of SQLA. The Human Research Protections Office of the Washington University School of Medicine approved studies involving human subjects.

Curve Functions

We developed survival curve equations for 4 general scenarios:

1. Time trade, analogous to conventional TTO, in which one shortens average life expectancy to improve average quality, as in chronic pain treatment
2. Side effect, in which one loses function to extend life, as in surgical interventions with immediate functional loss, such as an amputation
3. Risky treatment, in which one takes an initial risk for a chance to extend life, as in surgical interventions with perioperative mortality balanced by eventual mortality benefits
4. Carpe diem, in which one takes an initial benefit at the price of delayed risk, approximating toxic treatments for immediately threatening problems



SURVIVAL CURVE QUALITY-OF-LIFE ASSESSMENTS

We used 2004 US population survival tables20 to derive a broadly applicable Gompertz survival curve function21 using version 1.3 of the CurveExpert program (www.curveexpert.net). From this function, we derived age-appropriate curves by taking the ratio of survival at later ages to survival at initial age. The derived survival curve was shifted left by increasing the age terms or right by decreasing the age terms. Survival curves were stretched or compressed by slowing or accelerating passing time in the numerator age term (e.g., by multiplying time increments by values below or above 1, respectively). For the carpe diem scenario, we manually specified a sigmoid curve that had the desired shape relative to a realistic Gompertz curve, fit a Gompertz equation to that curve, and then tested it over a range of input ages.

We plotted the range of DRU functions implied by representative survival curve families to determine the range of discount rates and utilities that can be inferred from pairs of SQLA scenarios.

Preference Survey

A series of preference assessment surveys was begun to evaluate subjects’ interactions with SQLA. An introductory survival curve presentation was developed external to X-Trade, and within X-Trade, a review, comprehension test, and a series of conventional and SQLA assessments were implemented. Subjects recruited from a volunteer registry were observed interacting with SQLA tasks.

RESULTS

Inferring Utilities and Discount Rates

The result of a conventional TTO is typically given as a ratio, between 0 and 1, of a span of time in better health to a longer span in an impaired health state, given that the different time span–health combinations are equally attractive. For instance, a subject might assert that 7 y in perfect health is as attractive as living 10 y with blindness, so that the ratio is 0.7. This ratio is often treated as a utility. However, the ratio is the subject’s utility only if the subject’s discount rate on these 2 square wave survival curves is zero. If the discount rate is positive, then the ratio will underestimate the utility of the state: the higher the discount rate, the less years 7 through 10 contribute to the subject’s QALE estimation. Conversely, if the discount rate were negative, then the ratio would overestimate the utility of the state: years 7 through 10 contribute significantly to QALE, even in impaired health. Thus, unless discount rates are routinely zero, the expected result of a conventional TTO is a function that predicts utility based on discount rate. We will refer to this as a discount rate-utility function, or DRU function.

Algorithms were developed for the following tasks related to DRU functions:

1. Given a pair of equally desirable survival curves (i.e., 2 curves representing equal QALE), in which one curve is asserted to have utility 1, generating a DRU function. Conventional TTO uses 2 survival step functions that can be treated as survival curves and converted to DRU functions.
2. Identifying intersections of DRU functions.
3. Rapidly estimating the distance from each pixel in the DRU space to a DRU function.
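The 7 y versus 10 y example can be turned into a small DRU function. The sketch below is illustrative only: the function name `tto_utility` is hypothetical, and it assumes continuous time with the discount factor (1/(1 + d))^t and a constant discount rate d > −1. For two step-function survival curves, the utility that equalizes discounted time is the ratio of the two discounted areas.

```python
def tto_utility(t_healthy, t_impaired, d):
    """Utility of the impaired state implied by indifference between
    t_healthy years at utility 1 and t_impaired years in the impaired state,
    at annual discount rate d (assumed constant, d > -1)."""
    if abs(d) < 1e-12:
        # With no discounting the utility is the plain TTO ratio.
        return t_healthy / t_impaired
    r = 1.0 / (1.0 + d)
    # The integral of r**t from 0 to T is (r**T - 1) / ln(r);
    # ln(r) cancels when the two integrals are divided.
    return (r ** t_healthy - 1.0) / (r ** t_impaired - 1.0)

# Sample the DRU function for the 7 y / 10 y example at a few discount rates.
dru = [(d / 100, tto_utility(7, 10, d / 100)) for d in range(-10, 11, 5)]
for d, u in dru:
    print(f"d={d:+.2f}  u={u:.3f}")
```

Under these assumptions the inferred utility equals 0.7 only at d = 0, lies above 0.7 for positive discount rates, and lies below 0.7 for negative ones, which is the pattern described in the text.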

Curve-Generating Functions

Base Case

One form of the Gompertz mortality formula is21

$$\text{Mortality} = a \exp(-\exp(b - c \times \text{age})).$$

We obtained these coefficients:

a = 11.085803
b = 2.9573519
c = 0.019627862

These coefficients fit the unstratified U.S. population well (r = 0.999). We will refer to a Gompertz mortality function with these coefficients as G(age), where age is an age-related term. A general survival curve function was obtained as the ratio of survival values at initial and later ages. A formula that generates a survival curve (S) is

$$S = \frac{1 - G(\text{age})}{1 - G(\text{now})}. \tag{1}$$

The variable now is the initial age in years. Age is the x-axis, in years, starting from now. Each SQLA scenario requires a baseline survival curve and a family of alternative survival curves. The subject will identify one of the alternative curves that is about as attractive as the baseline survival curve. To avoid ambiguity, the family of alternative curves must maintain a stable QALE-based sorting sequence from best to worst for any credible, fixed discount rate and any positive, fixed utility. In each family of X curves (curve 1, curve 2, ... curve X), N = 1 and N = X give worst and best survival, respectively. The number of the curve, N, was a convenient variable to embed in template formulae. Each SQLA scenario has a typical baseline curve, assumed to represent survival in a state with utility <1 unless otherwise specified.
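The base case can be sketched in a few lines. This is a minimal illustration, not the X-Trade implementation: it assumes the Gompertz form a × exp(−exp(b − c × age)) with the coefficients reported above, and the names `G`, `survival`, and `curve` are chosen here for convenience.

```python
import math

# Coefficients reported in the text for the general population fit.
A, B, C = 11.085803, 2.9573519, 0.019627862

def G(age):
    """Gompertz mortality term, assuming the form a * exp(-exp(b - c * age))."""
    return A * math.exp(-math.exp(B - C * age))

def survival(age, now):
    """Equation 1: survival at `age`, relative to survival at the initial age `now`."""
    return (1.0 - G(age)) / (1.0 - G(now))

# Baseline curve for a hypothetical 50-year-old, sampled every 5 years.
now = 50
curve = [(age, survival(age, now)) for age in range(now, 106, 5)]
```

With this form, G exceeds 1 once the age term passes 105, so the curve is only meaningful up to that age.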

Time Trade

A conventional TTO uses baseline and alternative survival curves with 100% survival until falling sharply to zero. In this analogous SQLA scenario, age is increased by some value k that varies inversely with N. For instance, a formula yielding a family of worse-than-expected survival curves for N = 1 to X is

$$S = \frac{1 - G(\text{age} + k)}{1 - G(\text{now} + k)}, \tag{2}$$

where k = f(N) and k > 0. Reasonable formulae for k are k = X + 1 – N and k = (105 – now) × (X + 1 – N)/X. Formula 2 is sensitive to inputs: G(age) is greater than 1 for ages greater than 105, causing S values greater than 1. Setting k to zero generates a realistic baseline survival curve for comparison.

Side Effect

Curves can be offset by some constant k < 0, with the result that curves plot the survival of younger patients. In this case, k = m × N produces aesthetically reasonable curves, where m is a small number less than zero (e.g., –5 < m < 0). In this scenario, the baseline represents time spent in perfect health. The longer alternative curves represent time spent with a functional loss due to treatment. The user identifies the worst acceptable alternative curve. A baseline comparable to the subject’s age-based expected survival curve is distracting because some alternative curves will reach currently absurd ages. Baselines therefore need to be shortened for SQLA side effect scenarios.

Risky Treatment

A curve with an initial sharp decline below baseline survival is followed by a Gompertz-shaped curve that crosses the baseline later. It can be generated by taking the maximum of a steeply declining function and a Gompertz ratio. For instance, a steeply declining function for N = 1 to X is

$$S_1 = 1 - t \times \frac{X + 1 - N}{5 \times X}, \tag{3}$$

where t is time in years since now. The survival curve is constrained by limiting maximum survival to some ceiling level and stretched to extend beyond the baseline survival curve, as in the following formula, obtained by trial:

S = ceiling term × stretched survival function

S2 = 0.49 + N/(2 × X) ×

1 N= 3 A 1 G @now 1 15 1 t 3 max t; N=3 : 1 Gðnow 1 15Þ 1

ð4Þ

The complete risky treatment curve formula is then

$$S = \max(S_1, S_2). \tag{5}$$

Carpe Diem

These Gompertz formula coefficients generate survival curves with initially high survival followed by a steep decline:

a = 1.113022
b = 7.8255897
c = 0.092484551

We will refer to a Gompertz mortality formula using these coefficients as F(a). For instance, this family of functions will cross baselines derived from risky treatment formulae, or baseline formula 1 when now is 40 to 50:

$$S = \frac{1 - F(\text{age} + 15 - N \times 20/X)}{1 - F(\text{now} + 15 - N \times 20/X)}. \tag{6}$$
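A family of carpe diem curves can be generated as follows. This is a sketch under stated assumptions: the mortality term is taken to have the same Gompertz form as G, a × exp(−exp(b − c × age)), the age term is written as now + t, and the names `F`, `carpe_diem`, and `family`, as well as the choices now = 45 and X = 10, are illustrative.

```python
import math

# Carpe diem coefficients reported in the text.
FA, FB, FC = 1.113022, 7.8255897, 0.092484551

def F(age):
    """Gompertz mortality term with the carpe diem coefficients (assumed form)."""
    return FA * math.exp(-math.exp(FB - FC * age))

def carpe_diem(t, now, N, X):
    """Equation 6 evaluated t years after `now` for curve N of X."""
    shift = 15.0 - N * 20.0 / X
    return (1.0 - F(now + t + shift)) / (1.0 - F(now + shift))

# Ten curves for a hypothetical 45-year-old, sampled every 5 years for 40 years.
family = {N: [carpe_diem(t, 45, N, 10) for t in range(0, 41, 5)]
          for N in range(1, 11)}
```

Each curve starts at 1 and declines over time, and a larger N shifts the age term toward younger ages, so the curve for N = X lies above the curve for N = 1.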

If the baseline and alternative curves in risky treatment and carpe diem scenarios represent the same state of health, then a pair of equally attractive, intersecting baseline and alternative curves identifies a discount rate.

Credula postero: One can invert the carpe diem assertions, making the alternative curve the lower utility curve. In this situation, improved future survival follows a long period of risk.

Constraints

Some limitations in these survival curve formulas should be noted. First, the age term in G(age) cannot exceed 106, because 1 – G(age) becomes negative and survival curve results become nonsensical. This limits left shifts, especially at older starting ages. Second, age-sensitive curve formulas generate quite different-looking curves at different ages. Potentially stark visual differences between survival curves starting from disparate ages are easily eliminated, if desired, by removing age terms, rescaling time terms, and rescaling the x-axis. The displayed shapes of survival curves are then consistent, whereas the time horizons are specific to the subject. For instance, this revision of equation 1 generates a consistently shaped baseline survival curve for all ages:

$$S = \frac{1 - G(70 + 35t/d)}{1 - G(70)}, \tag{7}$$

where d, the time duration to be illustrated on the x-axis, is a function of age.

Inferring Utilities and Discount Rates

Estimation of utilities and discount rates from survival curve choices requires integration over time of 2 or more equally desirable survival curves. Consider a survival function f(t) in a state with utility u. Survival curve f(t) with utility u is as appealing as survival curve g(t) in perfect health (utility = 1). QALE can be defined as

$$\text{QALE} = \int_{t=0}^{LE_{now}} g(t) \times \left(\frac{1}{1+d}\right)^{t} dt = \int_{t=0}^{LE_{now}} u \times f(t) \times \left(\frac{1}{1+d}\right)^{t} dt. \tag{8}$$

Rearrangement of this function allows estimation of utility as a function of discount rate. If u and d are not temporal functions, then these can be treated as real numbers. In the simplest case, the health states are the same; therefore, utility is equal on both survival curves, and plausible discount rates can be tested systematically to find a rate where the integrals match. This could be a subject’s constant discount rate. The ratio of the 2 integrals gives the ratios of the utilities: if one of the utilities is known or asserted, then the other can be determined. For example, given a linear survival curve with maximum survival time T2 in a health state with utility u, if T1 is the maximum survival time in an equally appealing straight-line survival curve in perfect health, the discount rate is d, and T1 < T2, then substitution of linear functions for g(t) and f(t) in the above formula and integration by parts yields

$$u = \frac{\dfrac{1}{T_1 \times (1+d)^{T_1}} - \ln\left(\dfrac{1}{1+d}\right) - \dfrac{1}{T_1}}{\dfrac{1}{T_2 \times (1+d)^{T_2}} - \ln\left(\dfrac{1}{1+d}\right) - \dfrac{1}{T_2}}. \tag{9}$$
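The closed form in formula 9 can be checked against direct numerical integration of equation 8, in the spirit of the cross-check mentioned in the text. The sketch below is an assumption-laden illustration rather than the authors’ integration routine: it uses linear curves g(t) = 1 − t/T1 and f(t) = 1 − t/T2, a simple midpoint rule, and hypothetical function names.

```python
import math

def utility_closed_form(T1, T2, d):
    """Formula 9 for linear survival curves ending at T1 and T2 (d > -1, d != 0)."""
    L = math.log(1.0 / (1.0 + d))
    num = 1.0 / (T1 * (1.0 + d) ** T1) - L - 1.0 / T1
    den = 1.0 / (T2 * (1.0 + d) ** T2) - L - 1.0 / T2
    return num / den

def discounted_area(T, d, steps=20000):
    """Midpoint-rule integral of (1 - t/T) * (1/(1+d))**t for t from 0 to T."""
    h = T / steps
    r = 1.0 / (1.0 + d)
    return h * sum((1.0 - (i + 0.5) * h / T) * r ** ((i + 0.5) * h)
                   for i in range(steps))

def utility_numeric(T1, T2, d):
    """Utility from equation 8: ratio of the two discounted areas."""
    return discounted_area(T1, d) / discounted_area(T2, d)

# Sample a DRU function for T1 = 7 y and T2 = 10 y at several discount rates.
dru = [(d, utility_closed_form(7, 10, d)) for d in (-0.2, -0.05, 0.03, 0.5)]
```

As the discount rate approaches zero, both versions approach T1/T2, the undiscounted ratio of the areas under the two linear curves.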

A range of discount rates from –99% to 100% can be tested systematically to calculate corresponding utilities. The result is a DRU function. Although linear survival curves are rarely, if ever, realistic or relevant, formula 9 is useful for cross-checking utilities calculated by programmed integration algorithms. Each SQLA scenario generates a set of characteristic DRU functions, each corresponding to indifference between the baseline curve and one of the alternative survival curves (Figure 1). Conventional TTO and SQLA time trade, side effect, and carpe diem scenarios offer alternative curves with shorter life with higher utility than the baseline. These generate monotonically rising sigmoid DRU functions (Figure 2). On the left side of the DRU function, a negative discount rate implies a high value for lost future years on the alternative curve, so utility in the baseline curve must be low to balance the loss. On the right, a large positive discount rate implies little or no value for future years, so utility on the baseline curve is relatively close to utility on the alternative curve. These DRU curves are often steep when midrange utilities, discount rates near zero, and longer life expectancy are combined. In these situations, a small change in discount rate may change the inferred utility substantially. When utilities are very high (e.g., in excess of 0.95), SQLA resolution is clearly inferior to TTO. TTO can easily represent trades of months, days, or hours at the end of life. Other SQLA scenarios generate nonsigmoid DRU functions (Figure 3). Risky treatment scenarios with familiar degrees of risk imply relatively high utilities across a wide range of discount rates. On the left side of the DRU function, a large negative discount rate implies a high value for preserved future years on the alternative curve, so utility in the shorter baseline curve must be high to balance the gain. 
On the right, a large positive discount rate implies that the primary value is immediate survival, so utility on the baseline curve must be low to balance the immediate risk accepted in the alternative. Risky treatment and carpe diem scenarios can reveal negative and positive discount rate estimates, respectively, if the utilities in the baseline and alternative curves are equal. Utilities obtained from standard gambles are horizontal lines. Only standard gambles reliably traverse low utilities at positive discount rates.


Figure 1 Mapping Survival curve Quality-of-Life Assessments (SQLA) exercises to discount rate-utility (DRU) functions. A SQLA time trade scenario is illustrated on the left, with a dashed baseline survival curve SB and 3 representative alternative survival curves, S1, SX/2, and SX. A subject indicates indifference between SB with utility u, and one of the alternative curves, S1 . . . SX with utility 1. Corresponding DRU functions are illustrated on the right. If SB and one of the darkest alternatives (e.g., S1), are equally attractive, then utility is low regardless of discount rate, as shown by the dark DRU functions. Conversely, if SB and one of the lighter alternatives (e.g., SX), are equally attractive, the utility is high, as shown by the light DRU functions. DRU functions illustrate where utilities are a steep function of discount rates.

Health state utility assessments can combine SQLA scenarios that generate qualitatively different DRU curves. For instance, combining one SQLA scenario that generates sigmoid DRU functions and another SQLA scenario that generates nonsigmoid DRU functions may identify intersecting DRU curves. The intersection estimates utility and discount rate values if the person adheres to underlying assumptions. A third SQLA, TTO, or standard gamble will generate another DRU function that intersects the first two, yielding additional estimates of discount rate and utility. The most informative quartet of assessments would include one SQLA from the sigmoid DRU group, a risky treatment SQLA, a discount rate SQLA, and a standard gamble. In general, the first 2 DRU should create an “X”-shaped intersection, the last 2 DRU always create a “+”-shaped intersection, and the intersections ideally would be superimposed.

Appropriately selected SQLA may fail to generate intersecting DRU functions in several situations. SQLA may not generate DRU functions through the correct point: the illustrated risky treatment scenarios do not traverse positive discount rates at low utility. Confused subjects might not understand the task at all, might not attend to the task, might evaluate the wrong outcome, or might not notice a change in the time frame.22 Finally, subjects could violate assumptions: for instance, their values might vary over time, or the discount rate might be a function of utility. SQLA can generate nonsense DRU functions in these situations, although deliberate interview designs can detect some of these problems.

An alternative for inferring utilities and discount rates when 3 or more DRU do not intersect at one point is to find the pixels on a DRU plane that are most consistent with a list of DRU functions (Figure 4). The least mean squared distance from each pixel to each DRU function is estimated. The set of points with the least mean squared distances from the curves are candidate DRU pairs. Figure 4 illustrates a rapid algorithm for estimating these pairs. The DRU plane is divided into a grid of cells of size defined by the investigator. For each DRU function, the coordinates of the cells that the function traverses lie at zero distance from the DRU function. Isobars are drawn at single-cell increments around the zero-distance cells until every cell in the plane has an estimated distance from every DRU curve. These distances are then squared and averaged for each cell. Cell(s) may be color coded by average squared distance.

Preference Survey

We surveyed 125 volunteers having diseases associated with shortness of breath, such as asthma, emphysema, and angina, and on comorbid conditions such as depression. Their median age was 55 y, with an interquartile range of 35 to 65 y. Sixty percent were white, and 62% were female. They had completed a median of 14 y of school (interquartile range, 14–16). Some noteworthy events occurred during these interviews. Two subjects were too confused by SQLA to complete the survey. On the other hand, some subjects with engineering backgrounds or mathematics skills achieved remarkable internal consistency on multiple stringent internal consistency assessments. Subjects’ comments frequently called temporal assumptions into question.
One depressed subject had a definite desire to live only about 10 more years, which made a choice between 4 y in good health and 20 y with impairments harder than a choice between 4 good years and 10 impaired years. Similarly, some older subjects declared indifference between 2 healthy years and 20 impaired years but preferred 10 impaired years to 2 healthy years. These subjects explained that they do not wish to live long past a maximum time horizon. They are logical but likely violate a normative assumption about time: more time is not necessarily better. One particularly ill subject, expecting to be in a nursing home within 3 y, found the time frames unrealistic but also wished that his doctors would show treatment-specific survival curves to him.

Figure 2 Survival curve Quality-of-Life Assessment (SQLA) scenarios with sigmoid discount rate-utility (DRU) functions. SQLA and DRU functions are illustrated following the pattern explained in Figure 1. SQLA scenarios are illustrated in column 2. Dotted lines indicate a fixed baseline survival curve. Black lines in column 2 correspond to the worst alternative, dark gray to the midrange alternative, and light gray to the best alternative. Column 3 illustrates 10 of the 41 possible DRU functions for an SQLA trading game. The black lines correspond to worse survival curves, dark gray lines to midrange survival curves, and light gray lines to the best survival curves. The utility axis range is 0 to 1.2: values greater than 1 could indicate that an assumed preference order should be reversed. The discount rate range is –0.99 to 1.00/y. Other SQLA that generate sigmoid shapes include the carpe diem scenario and risky treatment scenarios in which baseline utility is 1 and alternative utility is unknown.

Figure 3 Survival curve Quality-of-Life Assessment (SQLA) scenarios with nonsigmoid discount rate-utility (DRU) functions. The color coding has the same meaning as in Figure 2. In general, to obtain a discount rate and utility estimate when neither is known, 2 SQLA exercises are needed: one that generates a sigmoid DRU and one that generates a nonsigmoid DRU. These need to intersect in the region of interest in the DRU plane. For instance, the risky treatment scenario as illustrated does not traverse utilities less than 0.4 at any discount rate and would need modification before assessing quality of life in terrible health states. The credula postero scenario, an inversion of the carpe diem scenario, also generates nonsigmoid DRU curves.

Figure 4 Estimating least mean square distances for pixels in a discount rate-utility plane. A grid of cells with investigator-defined height and width is laid over the discount rate-utility plane. Curves in the plane have distance 0 from the cells they traverse. Distance to other cells is estimated by counting along roughly orthogonal rays. Distances to missed peripheral cells are filled by translating the last orthogonal ray (horizontal arrows on right edge). Least mean square distances to every cell can be estimated quickly. If the curves converge, color coding cells by least mean squared distance creates target-like shapes.
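The grid-based search for the cells most consistent with several DRU functions can be sketched as follows. This is a simplified stand-in, not the isobar and orthogonal-ray procedure of Figure 4: it computes the distance from every cell to the nearest traversed cell by brute force, measures distance in cell units, marks only one traversed cell per discount rate column, and clamps curves that leave the utility range to the edge row. The function name and the three straight-line "curves" are hypothetical.

```python
def mean_squared_distance(curves, d_grid, u_grid):
    """Return a [row][col] table of mean squared cell distances from each cell
    of the discount rate-utility grid to each curve u = f(d)."""
    n_rows, n_cols = len(u_grid), len(d_grid)
    total = [[0.0] * n_cols for _ in range(n_rows)]
    for f in curves:
        # Cells traversed by this curve: nearest utility row in each column.
        traversed = []
        for col, d in enumerate(d_grid):
            u = f(d)
            row = min(range(n_rows), key=lambda i: abs(u_grid[i] - u))
            traversed.append((row, col))
        # Squared distance from every cell to the nearest traversed cell.
        for i in range(n_rows):
            for j in range(n_cols):
                total[i][j] += min((i - r) ** 2 + (j - c) ** 2
                                   for r, c in traversed)
    return [[s / len(curves) for s in row] for row in total]

# Three illustrative lines that meet at d = 0, u = 0.5 (not real DRU functions).
d_grid = [-0.5 + 0.025 * j for j in range(41)]
u_grid = [0.025 * i for i in range(41)]
curves = [lambda d: 0.5 + d, lambda d: 0.5 - d, lambda d: 0.5]
msd = mean_squared_distance(curves, d_grid, u_grid)
best = min(((i, j) for i in range(41) for j in range(41)),
           key=lambda ij: msd[ij[0]][ij[1]])
```

When the curves share an intersection, the cell containing it has mean squared distance zero; when they do not, the lowest-valued cells mark the candidate discount rate and utility pairs, and the spread of low values gives a rough picture of the inconsistency.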

DISCUSSION

Survival curves have some appealing attributes as preference assessment tools. First, survival curves are intelligible, informative, and relevant. Survival curves have been used successfully in patient-oriented decision support systems and risk communication studies.23–25 Subjects look at intermediate points, not just endpoints, when interpreting survival curves, and they will take short-term risks to improve long-term survival.26 Preferences may shift with increasingly detailed explanations of short- and long-term risks, which survival curves convey succinctly.27–29 Superimposed survival curves illustrate absolute survival benefit, which is more easily understood than relative risk reduction, absolute risk reduction, or number needed to treat.30,31 The slope of a survival curve is an illustration of event rates, which may aid understanding, once the concept is understood.32 Survival curves intrinsically illustrate the surviving part of an initially whole population, helping subjects attend to that ratio.33 Survival curves can be incorporated in most preference assessment methods, as has been done occasionally since at least 1980, when survival curves were used to describe baseline prospects in an operative scenario.34 Relevant survival curves are often available and should reduce inconsistencies among numerate subjects expressing real preferences.

Second, it is technically practical to construct interactive preference assessment programs using survival curves. We have implemented and tested SQLA, using interface elements preferred by potential subjects and efficient search and analysis algorithms (see the online appendix). A small set of survival curve equations defined scenarios that mirror many medical decisions. A majority of subjects understood survival curve mechanics after a basic explanation. SQLA presented as trading tasks support relatively clear questions, high resolution, and inferences at upper and lower bounds. Interview dropout rates are less than 5%, and about 10% of subjects meet the most stringent internal consistency checks. Realistic survival curve scenarios coupled with incentives may create sufficiently real, looming, and important decisions that serious anomalies are minimized.

Third, the intersections of DRU curves generated by carefully selected SQLA scenarios provide valuable estimates of utilities and discount rates. By comparing 2 or more utility and discount rate estimates, investigators will often discover inconsistencies. These may be those “anomalies of preference [that] are normally corrected by the decision maker when he realizes that his preferences are inconsistent”16 but that can escape our attention in preference surveys. Subjects could reconcile such anomalies by altering some responses. However, anomalies also might result from invalid analytic assumptions: in particular, DRU functions could be shapes in a time-discount rate-utility space rather than curves on a DRU plane. That space may be hard to probe, but comments such as, “I would not want to live that long,” suggest a temporal dimension for discount rates, utilities, or both. Finally, modest discrepancies suggest subjective uncertainty: intersections could define a best estimate and ranges of utilities and discount rates that deserve testing in sensitivity analysis from the subject’s perspective.

Fourth, SQLA data may inform individual perspective quality-adjusted life-years (QALY) calculations and sensitivity analyses. DRU functions for midrange conventional TTO results are steep at a discount rate of zero, especially when life expectancy is long. Discount rate correction therefore could be important for some TTO utilities. If utilities or discount rates are temporal functions, then additional SQLA could be used to estimate the shape of the function. Individual perspective QALY calculation is a potentially interesting policy tool.
The usual practice is to collect (or estimate) preferences from a sample of N individuals and then pool their discount rates and utilities into distributions that are sampled as needed to run a decision model M × N times. An alternative is to collect preferences from N individuals and then, for each individual, use their set of discount rates and utilities throughout the model for M runs. The decisions and expected utilities would be pooled after the model is run. The exercise should describe the range of actions that a well-informed patient population would pursue.

Survival curves may have appeal as clinical tools. Survival curves are well suited to eliciting patient expectations regarding treatment benefits. Survival curves are excellent for explaining to patients and families the consequences of different treatment options. Survival curve trading games are a better foundation for informed consent processes than other preference assessment methods.

Limitations

SQLA does not resolve problems intrinsic to calculating QALE. We concur with others’ observations that assumptions of utility independence, constant proportional tradeoff, and risk neutrality are all suspect.35–42 SQLA may complement methods being developed to calculate utilities and QALY without such strong assumptions.43

Other drawbacks of SQLA include common problems of preference assessment: the tasks are time-consuming, unfamiliar to patients and research subjects, and difficult to contemplate. As a 2-dimensional trading task, it is even more complex than 1-dimensional standard gambles and TTOs. Although SQLA do not present risk as starkly as standard gambles do, risk is more obvious in SQLA than in TTO. Evidence of risk seeking and risk aversion in SQLA probably will follow previously reported patterns, with prospect theory remaining a relevant explanatory model.16,44 Some discount rates could lie beyond the limits we used.

In principle, SQLA could portray a series of health states rather than the idealized single health state, but the mathematics will be substantially more complex. This is a weakness shared by standard gambles and TTO. For preference studies that do not seek to quantify utilities or discount rates, evolving state SQLA may be attractive. For instance, an investigator might build a decision model to predict the average health state and mortality progression conditioned on treatment strategy, illustrate these progressions as annotated survival curves, and allow subjects to select among the survival curves.

Related decision support literature describes psychometric issues with survival curves.
Health literacy and numeracy are sometimes intractable challenges.45,46 Specific problems include difficulty converting or equating probabilities presented in different formats, which has been called representational fluency,33 and graphical literacy, the ability to interpret a survival curve.47 Practice exercises improve comprehension.33 Subjects may prefer simple graphical interfaces, such as bar charts, to more complex representations such as survival curves.33 Survival curves probably need to be presented as simply as possible.

Subjects’ inattention to assessment details is a potential pitfall. Changes in time scales should be avoided because subjects may not notice.22,48 This is occasionally challenging. For instance, practice curves are easier to develop with fixed time scales than age-specific time scales, and initial risk is easier to see on a short time scale than a lifetime scale. When time scales do change, the subject’s attention should be drawn to the change.

Framing effects are evident. Subjects make different choices when randomized to review mortality or survival curves presenting the same information.49,50 Comprehension is comparable when viewing survival curves alone or mortality and survival curves, whereas mortality curves alone are difficult to understand, especially for subjects with a high school education or less.50 Introduction and practice tasks should emphasize the mortal implications of the fall in a survival curve. At present, we plan to use survival curves alone in SQLA.

Subjects may not want to see realistic survival curves. More than one-third of subjects in a surgery study did not want to know survival prospects.51 Some interview protocols may need to deemphasize the realism of age- and comorbidity-adjusted SQLA.

There may be psychological limits on mathematically workable SQLA. Tempting manipulations of baseline survival curves may affect results. For example, SQLA side effect scenarios look most sensible when baseline survival is below average. Consider an assessment of blindness using the side effect scenario. If the baseline survival curve is based on your cohort’s prospects, the alternative offer must be to accept an acquired state of blindness, starting now, in order to outlive your peers. In addition to feeling unrealistic, the scenario suggests self-imposed isolation and loneliness. However, if the baseline survival curve is shorter than your cohort’s prospects, then the alternative offer is to acquire blindness, starting now, in order to live among your peers a bit longer. This asymmetry again raises the concern that discount rates and utilities might be functions of time or events.
The report by Fagerlin and others, “If I’m better than average, then I’m ok?” suggests that comparative outcomes influence perceptions and decisions,52 a result replicated in other studies.53–55 SQLA in which either the baseline or the alternatives diverge significantly from realistic expectations for a subject’s cohort should be treated with caution until the effects of such divergence are established.
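The pooled and individual-perspective designs contrasted in the discussion of QALY calculation can be sketched in Python as follows. Everything here is a hypothetical illustration: the toy decision model, the duration ranges, and the function names are assumptions chosen only to show how the two sampling designs differ, and do not come from the study.

```python
import math
import random


def discounted_years(years, r):
    """Present value of `years` of life at constant discount rate r."""
    return years if abs(r) < 1e-12 else -math.expm1(-r * years) / r


def run_model(r, u, rng):
    """Hypothetical toy model: return (chosen strategy, expected utility)."""
    shorter = rng.uniform(8, 12)   # years in full health without treatment
    longer = rng.uniform(14, 18)   # years in the impaired state with treatment
    q_none = discounted_years(shorter, r)
    q_treat = u * discounted_years(longer, r)
    return ("treat", q_treat) if q_treat > q_none else ("none", q_none)


def pooled_design(prefs, M, rng):
    """Usual practice: sample rates and utilities from pooled distributions."""
    rates = [r for r, _ in prefs]
    utils = [u for _, u in prefs]
    return [run_model(rng.choice(rates), rng.choice(utils), rng)
            for _ in range(M * len(prefs))]


def individual_design(prefs, M, rng):
    """Alternative: keep each person's (rate, utility) pair together for M runs."""
    results = []
    for r, u in prefs:
        results.extend(run_model(r, u, rng) for _ in range(M))
    return results  # decisions and utilities are pooled only after the runs


def share_treat(results):
    """Fraction of runs in which treatment was the preferred strategy."""
    return sum(1 for strategy, _ in results if strategy == "treat") / len(results)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical (discount rate, utility) pairs for N = 3 subjects.
    prefs = [(0.0, 0.9), (0.03, 0.6), (0.08, 0.75)]
    print(share_treat(pooled_design(prefs, 1000, rng)))
    print(share_treat(individual_design(prefs, 1000, rng)))
```

The pooled design breaks the pairing between each subject's discount rate and utility, whereas the individual design preserves it. Comparing the two shares gives a rough sense of how much that pairing matters for a given model.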

CONCLUSIONS

SQLA tools can be implemented on paper or in software. Early results suggest that at least some subjects performing trading exercises generate internally consistent results. Literature suggests that effective instruction, which is likely to require investment of several minutes of time, will result in more meaningful preference assessments. These tools can generate plausible, individualized discount rate and utility ranges from clinically realistic scenarios and may therefore be useful in researching various preference topics. For direct preference assessments, however, conventional TTOs are more convenient if discount rates are known to be near zero or if utilities are very high and if the lack of realism does not distort results.

COMPETING INTERESTS

Drs. Ding and Hagen are employed by the ABFM, and Dr. Sumner receives salary support from the ABFM. X-Trade is copyrighted by Washington University School of Medicine and is available through Dr. Sumner. Dr. Sumner provides support for X-Trade through contracts with the School of Medicine. SQLA is not protected.

AUTHORS’ CONTRIBUTIONS

Dr. Sumner developed the SQLA concept, user interface and pilot tests, and X-Trade program. Dr. Ding helped solve calculus equations related to utility calculations. Ms. Fischer administered surveys to volunteers and recorded their reactions. Dr. Hagen supervised the work and provided decision analytic perspective.

ACKNOWLEDGMENTS

The American Board of Family Medicine supported this work financially. Steve Kymes provided helpful insights regarding applications of SQLA.

REFERENCES

1. Farquhar P. Utility assessment methods. Manage Sci. 1984;30:1283–300.
2. Cardiac valvulopathy associated with exposure to fenfluramine or dexfenfluramine: U.S. Department of Health and Human Services interim public health recommendations, November 1997. MMWR Morb Mortal Wkly Rep. 1997;46(45):1061–6.
3. Rector TS, Tschumperlin LK, Kubo SH, et al. Use of the Living With Heart Failure questionnaire to ascertain patients’ perspectives on improvement in quality of life versus risk of drug-induced death. J Card Fail. 1995;1(3):201–6.
4. Thompson CA. Alosetron withdrawn from market. Am J Health Syst Pharm. 2001;58(1):13.


5. FDA announces discontinued marketing of GI drug, Zelnorm, for safety reasons. FDA News Release. 30 March 2007. Contract No. P07-55.
6. Hunink M, Siegel J, Weeks J, Pliskin J, Elstein A, Weinstein M. Valuing outcomes. In: Hunink MGM, Glasziou PP, eds. Decision Making in Health and Medicine: Integrating Evidence and Values. Cambridge, UK: Cambridge University Press; 2001. p 88–127.
7. Nease RF Jr, Kneeland T, O’Connor GT, et al. Variation in patient utilities for outcomes of the management of chronic stable angina: implications for clinical practice guidelines. Ischemic Heart Disease Patient Outcomes Research Team. JAMA. 1995;273(15):1185–90.
8. Ganiats TG, Carson RT, Hamm RM, et al. Population-based time preferences for future health outcomes. Med Decis Making. 2000;20(3):263–70.
9. Gold M, Siegel J, Russell L, Weinstein M. Cost-Effectiveness in Health Care and Medicine. New York: Oxford University Press; 1996.
10. Tversky A, Kahneman D. Judgment under uncertainty: heuristics and biases. Science. 1974;185(4157):1124–31.
11. Bernstein LM, Chapman GB, Elstein AS. Framing effects in choices between multioutcome life-expectancy lotteries. Med Decis Making. 1999;19(3):324–38.
12. Slovic P. The construction of preference. Am Psychologist. 1995;50:364–71.
13. Sumner W II, Nease RF Jr. Choice-matching preference reversals in health outcome assessments. Med Decis Making. 2001;21(3):208–18.
14. Bennett J, Nease RF Jr, Sumner W II. Evidence for the rapid construction of preference during utility assessments. Proc AMIA Symp. 2002:41–5.
15. Lichtenstein S, Slovic P. Reversals of preferences between bids and choices in gambling decisions. In: Slovic L, ed. The Construction of Preference. New York: Cambridge University Press; 2008. p 52–68.
16. Kahneman D, Tversky A. Prospect theory: an analysis of decision under risk. Econometrica. 1979;47(2):263–92.
17. Bohm P. Time preference and preference reversal among experienced subjects: the effects of real payments. Econ J. 1994;104:1370–8.
18. Hazen GB, Hopp WJ, Pellissier JM. Continuous-risk utility assessment in medical decision making. Med Decis Making. 1991;11(4):294–304.
19. Sumner W, Nease R, Littenberg B. U-titer: a utility assessment tool. Proc Annu Symp Comput Appl Med Care. 1991:701–5.
20. Arias E. United States life tables, 2004. In: Centers for Disease Control and Prevention, ed. National Vital Statistics Reports. 2007;56(9).
21. Wilson DL. The analysis of survival (mortality) data: fitting Gompertz, Weibull, and logistic functions. Mech Ageing Dev. 1994;74(1–2):15–33.
22. Zikmund-Fisher BJ, Fagerlin A, Ubel PA. What’s time got to do with it? Inattention to duration in interpretation of survival graphs. Risk Anal. 2005;25(3):589–95.
23. Armstrong K, Weber B, Ubel PA, Peters N, Holmes J, Schwartz JS. Individualized survival curves improve satisfaction with cancer risk management decisions in women with BRCA1/2 mutations. J Clin Oncol. 2005;23(36):9319–28.
24. Lewiecki EM. Risk communication and shared decision making in the care of patients with osteoporosis. J Clin Densitom. 2010;13(4):335–45.
25. Davis CR, McNair AG, Brigic A, et al. Optimising methods for communicating survival data to patients undergoing cancer surgery. Eur J Cancer. 2010;46(18):3192–9.
26. Mazur DJ, Hickam DH. Interpretation of graphic data by patients in a general medicine clinic. J Gen Intern Med. 1990;5(5):402–5.
27. McNeil BJ, Weichselbaum R, Pauker SG. Fallacy of the five-year survival in lung cancer. N Engl J Med. 1978;299(25):1397–401.
28. Mazur DJ, Hickam DH. The effect of physician’s explanations on patients’ treatment preferences: five-year survival data. Med Decis Making. 1994;14(3):255–8.
29. Mazur DJ, Hickam DH. Five-year survival curves: how much data are enough for patient-physician decision making in general surgery? Eur J Surg. 1996;162(2):101–4.
30. Chao C, Studts JL, Abell T, et al. Adjuvant chemotherapy for breast cancer: how presentation of recurrence risk influences decision-making. J Clin Oncol. 2003;21(23):4299–305.
31. Epstein RM, Alper BS, Quill TE. Communicating evidence for participatory decision making. JAMA. 2004;291(19):2359–66.
32. Trevena LJ, Davey HM, Barratt A, Butow P, Caldwell P. A systematic review on communicating with patients about evidence. J Eval Clin Pract. 2006;12(1):13–23.
33. Ancker JS, Senathirajah Y, Kukafka R, Starren JB. Design features of graphs in health risk communication: a systematic review. J Am Med Inform Assoc. 2006;13(6):608–18.
34. Spiegelhalter D, Smith A. Decision analysis and clinical decisions. In: Bithell J, Coppi R, eds. Perspectives in Medical Statistics. London: Academic Press; 1980. p 103–31.
35. Pliskin JS, Shepard DS, Weinstein MC. Utility functions for life years and health status. Oper Res. 1980;28(1):206–24.
36. Krabbe PF, Bonsel GJ. Sequence effects, health profiles, and the QALY model: in search of realistic modeling. Med Decis Making. 1998;18(2):178–86.
37. Duru G, Auray JP, Beresniak A, Lamure M, Paine A, Nicoloyannis N. Limitations of the methods used for calculating quality-adjusted life-year values. Pharmacoeconomics. 2002;20(7):463–73.
38. Weyler EJ, Gandjour A. Empirical validation of patient versus population preferences in calculating QALYs. Health Serv Res. 2011;46(5):1562–74.
39. Martin AJ, Glasziou PP, Simes RJ, Lumley T. A comparison of standard gamble, time trade-off, and adjusted time trade-off scores. Int J Technol Assess Health Care. 2000;16(1):137–47.
40. van der Pol M, Roux L. Time preference bias in time trade-off. Eur J Health Econ. 2005;6(2):107–11.
41. Attema AE, Brouwer WB. On the (not so) constant proportional trade-off in TTO. Qual Life Res. 2010;19(4):489–97.
42. Oliver A, Cookson R. Analysing risk attitudes to time. Health Econ. 2009;19(6):644–55.
43. Attema AE, Bleichrodt H, Wakker PP. A direct method for measuring discounting and QALYs more easily and reliably. Med Decis Making. 2012;32(4):583–93.




44. Tversky A, Kahneman D. Advances in prospect theory: cumulative representation of uncertainty. J Risk Uncertain. 1992;5(4):297–323.
45. Kim SP, Knight SJ, Tomori C, et al. Health literacy and shared decision making for prostate cancer patients with low socioeconomic status. Cancer Invest. 2001;19(7):684–91.
46. Amalraj S, Starkweather C, Nguyen C, Naeim A. Health literacy, communication, and treatment decision-making in older cancer patients. Oncology (Williston Park). 2009;23(4):369–75.
47. Armstrong K, FitzGerald G, Schwartz JS, Ubel PA. Using survival curve comparisons to inform patient decision making: can a practice exercise improve understanding? J Gen Intern Med. 2001;16(7):482–5.
48. Zikmund-Fisher BJ, Fagerlin A, Ubel PA. Mortality versus survival graphs: improving temporal consistency in perceptions of treatment effectiveness. Patient Educ Couns. 2007;66(1):100–7.
49. McNeil BJ, Pauker SG, Sox HC Jr, Tversky A. On the elicitation of preferences for alternative therapies. N Engl J Med. 1982;306(21):1259–62.
50. Armstrong K, Schwartz JS, Fitzgerald G, Putt M, Ubel PA. Effect of framing as gain versus loss on understanding and hypothetical treatment choices: survival and mortality curves. Med Decis Making. 2002;22(1):76–83.
51. Clarke MG, Kennedy KP, MacDonagh RP. Discussing life expectancy with surgical patients: do patients want to know and how should this information be delivered? BMC Med Inform Decis Mak. 2008;8:24.
52. Fagerlin A, Zikmund-Fisher BJ, Ubel PA. “If I’m better than average, then I’m ok?”: comparative information influences beliefs about risk and benefits. Patient Educ Couns. 2007;69(1–3):140–4.
53. Klein WM. Objective standards are not enough: affective, self-evaluative, and behavioral responses to social comparison information. J Pers Soc Psychol. 1997;72(4):763–74.
54. French DP, Sutton SR, Marteau TM, Kinmonth AL. The impact of personal and social comparison information about health risk. Br J Health Psychol. 2004;9(pt 2):187–200.
55. Mason D, Prevost AT, Sutton S. Perceptions of absolute versus relative differences between personal and comparison health risk. Health Psychol. 2008;27(1):87–92.
