The effects of nutrition labeling on consumer food choice: a psychological experiment and computational model.

Ann. N.Y. Acad. Sci. ISSN 0077-8923

ANNALS OF THE NEW YORK ACADEMY OF SCIENCES
Issue: Paths of Convergence for Agriculture, Health, and Wealth

Peter Helfer1 and Thomas R. Shultz2

1 Department of Psychology and Integrated Program in Neuroscience, McGill University, Montreal, Quebec, Canada. 2 Department of Psychology and School of Computer Science, McGill University, Montreal, Quebec, Canada

Address for correspondence: Peter Helfer, Department of Psychology, McGill University, 1205 Penfield Avenue, Montreal QC, Canada H3A 1B1. [email protected]

The widespread availability of calorie-dense food is believed to be a contributing cause of an epidemic of obesity and associated diseases throughout the world. One possible countermeasure is to empower consumers to make healthier food choices with useful nutrition labeling. An important part of this endeavor is to determine the usability of existing and proposed labeling schemes. Here, we report an experiment on how four different labeling schemes affect the speed and nutritional value of food choices. We then apply decision field theory, a leading computational model of human decision making, to simulate the experimental results. The psychology experiment shows that quantitative, single-attribute labeling schemes have greater usability than multiattribute and binary ones, and that they remain effective under moderate time pressure. The computational model simulates these psychological results and provides explanatory insights into them. This work shows how experimental psychology and computational modeling can contribute to the evaluation and improvement of nutrition-labeling schemes.

Keywords: nutrition labeling; decision making; computational modeling

Introduction

The widespread availability of calorie-dense food is believed to be a contributing cause of an epidemic of obesity and associated diseases throughout the world.1–3 In the United States, more than 35% of adults are obese4 and thus at increased risk for several chronic diseases. The trend toward obesity threatens to bankrupt medical systems in even the wealthiest nations.5 One possible countermeasure is to empower consumers to make healthier food choices with useful nutrition labeling. If many consumers opt for healthier food, this could ultimately motivate the food industry to provide healthier products. Nutrition-labeling schemes have recently emerged from many sources, including manufacturers, food sellers, health organizations, other nonindustry companies, and government-sponsored programs. The aim is to communicate nutritional quality and facilitate comparisons of food products within and between food categories. Nutrition labeling has the potential to enhance consumer knowledge and promote more healthful choices, thereby helping to combat the obesity epidemic. A recent worldwide survey shows that many people are ready and eager to enhance their health and control their body weight by choosing more nutritious food with the aid of nutrition information.6 However, it is important to assess nutrition-labeling schemes on their validity and usability. In this paper, we concentrate on the usability criterion. Two extensive reviews concluded that nutrition labeling is positively associated with selecting foods high in some beneficial nutrients (such as fiber) and avoiding foods high in some harmful components (such as sugar, fat, and cholesterol).7,8 Consumers choose rather poorly when they have to manipulate quantitative nutrient information. The consumers more likely to use nutrition labeling include those with higher education, an interest in nutrition and lack of time pressure, and women, although not meal planners. Most of this evidence is correlational, and thus the direction of causality is

doi: 10.1111/nyas.12461


© 2014 New York Academy of Sciences. Ann. N.Y. Acad. Sci. 1331 (2014) 174–185

Helfer & Shultz

difficult to determine. Natural experiments comparing food choices before and after the introduction of nutrition labeling found no discernible effects for either %Daily Value (%DV)7 or Traffic Light9 labels, both of which are defined later and used in this study. An online experiment measured people’s ability to compare pairs of foods on nutrient levels, based on nutrition labels, and to estimate amounts of saturated fat, sugar, sodium, fiber, and protein in the foods.10 Those with Traffic Light labels did better on this quiz than those with Facts Up Front labels, and both labeled groups did better than a control group that was given no labels. Facts Up Front displays grams and %DV information for various nutrients. Consumers show considerable difficulty dealing with quantitative aspects of nutrition labels, especially recommended daily values and serving sizes, and with having to compare these across food items.11 Here, we report an Internet-based food-choice experiment where participants make choices based on product images, taste scores, and one of four types of nutrition labeling. We find that the Nutrition Facts label,12 currently required in the United States, Canada, and a few other countries, is relatively ineffective in guiding participants toward nutritious choices, whereas some alternative schemes that present nutrition information in a more condensed form perform significantly better. Our results also show that participants make less healthy food choices when presented with product photos rather than generic clip art, and, unsurprisingly, that taste plays an important role in food choices. We also present a computational model of food choice that simulates and helps to explain our psychological data. We hypothesize that the most usable labeling schemes (1) resolve nutrition conflicts, (2) quantify nutritional variation, and (3) are quick and easy to use.
The paper consists of three parts: the first describes the online experiment, the second presents our computational model, and the last discusses our results.

Internet-based food-choice experiment: method

Our food-choice experiment evaluates the ability of four nutrition-labeling schemes to produce nutritious choices. The four schemes vary substantially in their

Nutrition labeling and food choice

amount of information and likely ease of use: (1) the Nutrition Facts (%DV) label, required on most packaged foods in some countries;12 (2) the Traffic Light Signpost label, specified by the Food Standards Agency of the United Kingdom;13 (3) the NuVal label, developed by NuVal LLC and available in some U.S. supermarkets;14 and (4) the Heart label, a binary labeling scheme designed for this study. The Heart scheme is similar to existing binary schemes like Canada Health Check,15 the Swedish National Food Agency’s Keyhole label,16 and several others, in presenting a symbol that certifies a food item as nutritious. Because we were unable to obtain the precise certification criteria used in these binary schemes, or obtain permission to use them, we simulate them generically by displaying a Heart icon if a product’s NuVal score is above 50. Examples of each labeling method are pictured in Figure 1. Each nutrition-labeling scheme is described to participants using the text in Table S1. We implement our experiment in custom software that presents the food alternatives and records a person’s choices and time to respond to the nearest 100 ms. Unlike many commercially available testing programs, our software enables people to participate directly through their web browser, without having to download any additional software. Our pilot work revealed that many potential participants would drop out as soon as they encountered an instruction to download software. Without that potential limitation, Internet experiments can be just as valid as laboratory experiments for many topics, while providing more representative samples.17 NuVal scores for 1148 cereals and 848 yogurts sold at Hy-Vee retail stores in the United States are used to estimate the nutritional value of each product. Their distributions are plotted in Figure S1.
These NuVal scores serve as our principal measure of nutritional value because the NuVal nutrition scoring system is the only one to have been validated against health and body-weight outcomes. Long-term consumption of food scoring high on the NuVal scheme was negatively related to body mass index, chronic diseases, and all-cause mortality.18 That study used systematic food diaries and health outcomes from over 100,000 male and female healthcare workers. Five sets of four cereals each are created by randomly selecting one cereal from each quadrant of NuVal scores (1–25, 26–50, 51–75, and 76–100) for



Figure 1. Labeling schemes. From left to right, %Daily Value, Traffic Light, NuVal, and Heart.

each set. Five sets of four yogurts are created in the same way. These two food categories are chosen for our first experiment because they show a relatively large variation in NuVal score. Large variation in nutrition should improve the chances of finding significant effects of food labeling. Each participant is presented with 10 scenarios, each offering a choice between four food items of the same category, whether cereals or yogurts. Each food item is presented as an image, accompanied by its taste score and nutrition label. Participants are shown either product photos or generic clip art images, in order to examine the possible effect of product familiarity on the effectiveness of nutrition labeling. Finally, participants are given either unlimited time to make their choices or a 20-s time limit, in order to investigate the possible effect of moderate, realistic time pressure. Examples of two such choices are shown in Figure 2. The choice scenario on the left is between four different yogurts, each presented in a photo and accompanied by its %DV label and taste score. In the condition shown, there is no time limit on responding. The scenario on the right of Figure 2 offers a choice between four different cereals, portrayed with a generic clip-art drawing representing the food category, and each accompanied by its Traffic Light nutrition label and its taste score. In this condition, response time is limited to 20 s, and a countdown timer shows the remaining time. Choice is signaled by clicking anywhere on one of the four images. Our paraboloid model for generating taste scores is inspired by experiments showing that hedonic taste is an interactive, curvilinear function of fat and sugar content, and that taste is more sensitive to variation in sugar than fat.19,20 Hedonic taste is $Z = -\frac{x^2}{a^2} - \frac{y^2}{b^2} - \frac{xy}{c^2} + 7$, where $x$ = normalized fat, $y$ = normalized sugar, $a = c = \sqrt{3}$, and $b = 1$.
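As a rough illustration of the taste model, the sketch below evaluates the paraboloid under the assumption that it reads Z = -x^2/a^2 - y^2/b^2 - xy/c^2 + 7 with a = c = sqrt(3) and b = 1. The function names are hypothetical, and rounding with Python's built-in `round` is only a stand-in for the rounding step mentioned in the text.

```python
import math

# Assumed constants: a = c = sqrt(3), b = 1 (our reading of the formula).
A = math.sqrt(3)
B = 1.0
C = math.sqrt(3)


def hedonic_taste(fat_z, sugar_z, a=A, b=B, c=C):
    """Assumed paraboloid: Z = -x^2/a^2 - y^2/b^2 - x*y/c^2 + 7.

    fat_z and sugar_z are normalized (standard-score) fat and sugar values.
    """
    return -(fat_z ** 2) / a ** 2 - (sugar_z ** 2) / b ** 2 - (fat_z * sugar_z) / c ** 2 + 7.0


def displayed_taste_score(fat_z, sugar_z):
    """Round the hedonic value to an integer (illustrative rounding only)."""
    return round(hedonic_taste(fat_z, sugar_z))


if __name__ == "__main__":
    # A one-unit change in sugar lowers taste more than a one-unit change in fat,
    # because b is smaller than a.
    print(hedonic_taste(1.0, 0.0), hedonic_taste(0.0, 1.0))
```

Under these assumptions the surface peaks at 7 when both normalized values are zero, and a smaller constant means stronger curvature along the corresponding axis.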

The constants a, b, and c control the amount of curvature in the quadric surface, due to variation in the associated variables x, y, and the interaction xy, respectively (Fig. S2). Fat and sugar amounts per 100 g of each food item are taken from %DV entries on each product’s Nutrition Facts label. These amounts are converted to standard scores +7 and rounded to the nearest integer so that taste scores range from 0 to 8. We recruited 192 participants from the United States and Canada through online ads and randomly assigned each of them to one of the 16 between-subject conditions of the experiment: nutrition labels (4), time (2), and visual information (2). Comparison of product photos to generic clip art enables testing the possible effects of familiarity and brand loyalty. Within-subject measures of the nutritional value of chosen product and response time are generated for 10 choices (2 food categories × 5 instances). Some sociodemographic characteristics of our sample are provided in Table S2. A nonrecording view of our experiment is at http://lnsclab.org/FoodPref/View.html.

Internet-based food-choice experiment: results

Response times and the nutritional value and taste scores of choices are subjected to mixed analysis of variance (ANOVA) with between-subject factors of nutrition labels (4), time (2), and visual image (2), and repeated-measures factors of food category (2) and instances (5). The main findings are as follows.

1. The NuVal scheme is more successful than any of the other schemes in promoting choices with high nutritional value (F(3, 176) = 10.0, P < 0.001). Tested with the least significant




Figure 2. Two screenshots from the Internet-based experiment. Left: %Daily Values with product photos and no time limit. Right: Traffic Light with clip art images and 20-s countdown timer.

difference (LSD) method, NuVal labels yield higher nutrition than the other three labels; Traffic Light labels perform better than %DV labels, which are tied with Heart labels as lowest in nutritional value (P < 0.05; Fig. 3A). When product photos are displayed, participants choose less nutritious products (mean of 56.7) than when the items are represented by generic clip art images (mean of 63.1) (F(1, 176) = 14.1, P < 0.001).
2. There is a main effect of nutrition label on response latencies (F(3, 176) = 10.2, P < 0.001). Tested with the LSD technique, latencies are greater with %DV labels than with the other three labels. The Heart condition is faster than Traffic Light but no different from NuVal (P < 0.05; Fig. 3C). The 20-s time limit produces shorter decision times (mean of 7.8 s) than when no time limit is imposed (mean of 13.7 s) (F(1, 176) = 42, P < 0.001), but has no significant effect on the nutritional value of chosen products.
3. Displayed taste scores have a large main effect on decisions: the mean taste score of chosen products is 6.0 versus 5.6 for unchosen products (F(1, 176) = 68.0, P < 0.001).

We use linear multiple regression to assess the relation of participants’ individual characteristics to the three dependent measures in our study: means for the nutrition, latency, and taste of chosen items. We include only those predictors with significant (P < 0.05) Pearson’s correlation coefficients and β coefficients with their predicted variable. Body

measures are excluded as predictors because there are so many unrealistic data points, probably due to careless data entry. All of the standard precautions are satisfied for ensuring validity of the regressions, including checks for linearity, normality, homoscedasticity, independence, multicollinearity, and outliers. Summary results are presented in Table S3. Nutrition of chosen items is predicted positively by educational level and how much nutrition information is valued, and negatively by how much taste and price information are valued. Taste of chosen items is predicted positively by how much taste is valued and negatively by age and amount of exercise in leisure time. The success of the NuVal scheme in promoting nutritious choices supports our hypothesis that a simple labeling scheme that quantifies nutritional value and resolves nutritional conflict is the most usable. Nevertheless, we considered an alternative explanation: it is possible that using NuVal score as our measure of nutritional value provides an unfair advantage to the NuVal labeling scheme. To investigate this possibility, we include alternative scoring methods based on each of the other three labeling schemes, as well as a composite score that weighs all four schemes equally. A product’s NU_Z score is its NuVal score converted to a standard score (µ = 0, σ = 1). TL_Z is the sum of the four Traffic Light components (fat, saturated fat, sugar, and salt) counted as 0 for red, 1 for amber, and 2 for green, and converted to standard scores. HE_Z is 1 for heart and 0 for no heart, converted to standard scores. %DV_Z is calculated by converting the nine %DV components (calories, fat, saturated fat,



Figure 3. Comparison of results from the human experiment and the computational model. (A) and (B) Mean nutrition scores of products selected under the different labeling schemes. (C) and (D) Mean decision times under the different labeling schemes. The error bars indicate standard error.

cholesterol, sodium, carbohydrates, fiber, sugar, and protein, each expressed as %DV per serving) to standard scores, adding them, and converting the results to standard scores. The amounts of protein and fiber are counted as positive attributes, the others as negative. As shown in Figure 4, regardless of which scoring method is used, NuVal labeling performs as well as or better than the other labeling schemes. When using the composite average of the four scores (COMP_Z in the diagram), NuVal labeling again produces the highest nutrition scores. We report most of our results using the NuVal score rather than the composite score as our measure of nutritional value, because it is the only one validated against health and body-weight outcomes.18

Computational decision-making model: method

It is difficult, based on the psychological data alone, to draw conclusions about the cognitive mechanisms underlying decisions. One way to further explore a cognitive phenomenon is to build a computational model of a hypothetical causal mechanism and study its behavior. A successful replication of human performance may provide insights that lead to new theorizing. Here, we report the results of applying a decision-making model to our psychological data. The theory of decision making has a long history, starting with the writings of Pascal and Fermat21 in the 17th century and progressing through a succession of ever more accurate models.22 Whereas earlier stages consisted of improvements in the mathematical formulas describing human decision making, recent decades have seen progress in reflecting psychological processes underlying observed decisions.22 Decision field theory (DFT), currently one of the most successful models of human decision making,23 is based on the idea that decision making is a sequential sampling process: the available options are repeatedly considered and evaluated according to the degree to which they satisfy one or more of a decision maker’s objectives. As a result of this process, preference levels gradually develop for the different options, and when the preference for




Figure 4. Mean nutritional value of products selected by participants. The colors indicate the labeling system used, and the five groups of bars represent different scoring methods. The error bars indicate standard error.

one of the options reaches a decision threshold, a choice is made favoring that option. As shown in Table S4, DFT is able to account for a wide spectrum of preference reversals and other subtleties of decision making that prove difficult for other models to reproduce.24 We base our model on the multialternative version of DFT24 because of its ability to represent choices between options (in our case, food products) that differ in their values on multiple attributes (nutritional values, taste scores, and visual information). The DFT model is implemented as a three-layered connectionist network, following Roe et al.24 Figure 5 illustrates the model for a hypothetical choice situation with three options, A, B, and C, and two attributes, 1 and 2. Our food-choice simulations have four options (food products) and between three and 12 attributes, depending on which labeling scheme is being tested (taste, visual appeal, and 1–9 nutrition label attributes); Figure 5 has fewer options and attributes for greater clarity. Each option, if chosen, delivers a certain value for each attribute. For example, if option B in Figure 5 is chosen, then the value A1B is delivered for attribute 1. The decision-making process is modeled in a series of time steps, during which attention shifts randomly between the attributes such that the probability of the model focusing on any one attribute i at any particular time step is proportional to that attribute’s attention weight wi . The processing that takes place during each time step is as follows.

1. The momentary activation level of each utility unit is set to the corresponding option’s value for the attribute currently being focused on.
2. The activation level of each valence is computed as the difference between the corresponding utility level and the average of the other options’ utility levels. To achieve this, the connection weights between utility and valence units are configured with weights 1.0 within options and –1.0/(n – 1) across options, where n is the number of options. A small random fluctuation is then added to each utility value, to simulate the effect of minor attributes not explicitly modeled and other distractions.
3. The activation levels of the preference units are updated by combining the input from the valence units with self-feedback and lateral inhibition. The self-feedback connections have weights close to 1.0, so that the model remembers its accumulating preferences across time steps. The cross-feedback connections generate preference competition between the options. They have small negative weights that depend on the similarity of options (pairwise Euclidean distance in attribute space). All the simulations described here use a self-feedback value of 0.98 and cross-feedback values varying from –0.1 between very similar options to near zero for very dissimilar ones. These feedback strengths are used because they have been shown to produce a realistic model.
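The three processing steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `dft_choose`, the Gaussian form used to map Euclidean distance to cross-feedback strength, and the step cap are all hypothetical choices. The self-feedback of 0.98, the maximum inhibition of 0.1, the noise standard deviation of 0.1, and the threshold of 8 are taken from the text.

```python
import math
import random


def dft_choose(values, weights, threshold=8.0, self_feedback=0.98,
               max_inhibition=0.1, noise_sd=0.1, max_steps=10_000, rng=None):
    """Run one multialternative DFT-style decision.

    values[i][j] is option i's (standardized) value on attribute j;
    weights[j] is the attention weight of attribute j.
    Returns (index of chosen option, number of time steps used).
    """
    rng = rng or random.Random()
    n, m = len(values), len(weights)

    # Feedback matrix: self-feedback on the diagonal, lateral inhibition elsewhere.
    # ASSUMPTION: inhibition decays with squared distance as exp(-d^2); the text
    # only says it ranges from -0.1 (similar options) to near zero (dissimilar).
    feedback = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for k in range(n):
            if i == k:
                feedback[i][k] = self_feedback
            else:
                d = math.dist(values[i], values[k])
                feedback[i][k] = -max_inhibition * math.exp(-d * d)

    pref = [0.0] * n
    for step in range(1, max_steps + 1):
        # Attention lands on one attribute, with probability proportional to its weight.
        j = rng.choices(range(m), weights=weights)[0]
        # Step 1: utilities for the attended attribute, plus a small random fluctuation.
        util = [values[i][j] + rng.gauss(0.0, noise_sd) for i in range(n)]
        # Step 2: valence = own utility minus the mean utility of the other options.
        total = sum(util)
        valence = [util[i] - (total - util[i]) / (n - 1) for i in range(n)]
        # Step 3: preferences combine feedback from the previous state with valence.
        pref = [sum(feedback[i][k] * pref[k] for k in range(n)) + valence[i]
                for i in range(n)]
        best = max(range(n), key=pref.__getitem__)
        if pref[best] >= threshold:
            return best, step
    # No option reached threshold within the cap: return the current leader.
    return max(range(n), key=pref.__getitem__), max_steps


if __name__ == "__main__":
    # Four hypothetical products described by two standardized attributes.
    products = [[2.0, 2.0], [0.0, 0.0], [-1.0, -1.0], [-1.0, -1.0]]
    print(dft_choose(products, [1.0, 1.0], rng=random.Random(0)))
```

With the same random seed, lowering `threshold` can only shorten the run, because the preference trajectory is identical up to the earlier stopping point.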



Figure 5. Multialternative DFT network. Arrows represent weighted connections that (like brain synapses) transmit signals from one unit (or neuron) to another.

The only additional tunable parameter in the model is the variability of the random fluctuation that is added to the utilities in step 1 above. A normally distributed fluctuation with a mean of 0.0 and standard deviation of 0.1 is used.

Attributes

The input attributes to the model are the nutrition values as represented on the nutrition labels, the taste scores, and the values of a visual appearance attribute (explained later).

Visual appearance. To account for the impact of product photos versus clip art, a visual-appearance attribute is defined as follows. Each product’s visual-appearance attribute value is calculated as the number of times human participants choose the product in the photo condition minus the number of times it is chosen in the clip art condition. Thus, a product chosen more often when its photo is displayed gets a positive visual appearance value.

Nutrition attributes. In the %DV condition, each product has nine nutrition attributes: calories, fat, saturated fat, cholesterol, sodium, carbohydrates, fiber, sugar, and protein. Each of these attributes is expressed as the label’s %DV. A high attribute value counts against a food item, except for protein and fiber, which count as positive. The Traffic Light condition has four nutrition attributes (fat, saturated fat, sugar, and salt), each of which has one of the values 0, 1, or 2, representing high, medium, and low, respectively. The NuVal condition has a single nutrition attribute, the NuVal score, with a value in the range 1–100. The Heart label is represented as a single attribute with the values 1 for a heart and 0 for no heart.

Serving size. The serving sizes indicated on the %DV labels are presented to the model, and used to normalize those attributes that are expressed as quantities or %DV per serving.a

Taste score. The taste score attribute is the same value as presented to the human subjects, a number in the range 1–8.
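The visual-appearance attribute defined earlier in this section is a simple difference of choice counts. The sketch below shows one way to compute it from a log of choices; the log format, the condition labels `"photo"` and `"clipart"`, and the function name are assumptions made for illustration.

```python
from collections import Counter


def visual_appearance(choice_log):
    """Return {product_id: photo-condition choices minus clip-art-condition choices}.

    choice_log is an iterable of (product_id, condition) pairs, where condition
    is "photo" or "clipart" (hypothetical labels).
    """
    log = list(choice_log)  # allow generators, since the log is scanned twice
    photo = Counter(product for product, cond in log if cond == "photo")
    clipart = Counter(product for product, cond in log if cond == "clipart")
    # Counter returns 0 for products never chosen in a given condition.
    return {product: photo[product] - clipart[product]
            for product in set(photo) | set(clipart)}


if __name__ == "__main__":
    example = [("c09", "photo"), ("c09", "photo"), ("c09", "clipart"), ("c10", "clipart")]
    print(visual_appearance(example))
```

A positive value marks a product chosen more often when its photo is shown; like the other attributes, these raw differences would then be converted to standard scores.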

Normalization. All attribute values are converted to standard scores, putting them on an equal footing, before applying the attention weights.

Attention weights

The attention weights for the nutrition attributes are set so that the model spends the same

a Specified serving sizes vary considerably, especially for the cereals, where they range from 15 to 59 g. This probably makes on-the-spot comparison quite challenging for most consumers. It may be interesting to re-run the human experiment with standardized serving sizes on the Nutrition Facts labels, to see if this would help subjects make better use of this type of label.



proportion of time attending to the label for each of the four labeling schemes, regardless of the number of nutrition attributes. Thus, the attention weight assigned to the single nutrition attribute in the NuVal and Heart conditions is nine times the weight assigned to each of the nine attributes in the %DV condition, and four times the weight assigned to each of the four Traffic Light attributes. It is possible that some labels are more salient than others, causing consumers to pay more attention to them; in the absence of any evidence for this, we give equal weight to each type of nutrition label. The weights for the taste and visual-appearance attributes are each set equal to half of the combined weight of the nutrition attributes. This allows these attributes to have significant effects without overpowering the nutrition information.

Time limit

When the 20-s time limit is applied in the human experiment, average decision time is reduced in all conditions, even though very few trials (1%) actually time out (i.e., subjects make their decisions more quickly when there is a time limit and a countdown timer is displayed). To simulate this time pressure, we run the simulation with a lower preference threshold for decision, 6 rather than 8. The unit for preference threshold does not have any obvious interpretation, although its magnitude is related to that of the attribute values, which in our model are standard scores, with a mean of 0.0 and standard deviation of 1.0. As with the psychology experiment, there are thus 16 conditions in the simulation: four labeling schemes, visual-appearance attribute on or off, and decision threshold high or low.

Computational decision-making model: results

The model is presented with the same 10 scenarios that are used in the human experiment, and this is repeated 192 times evenly distributed across the 16 conditions, to yield a body of simulation data of the same size as that obtained with the human subjects.
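The attention-weight rule described under Attention weights can be made concrete with a short sketch. It assumes the combined nutrition weight is fixed at an arbitrary value of 1.0 (only ratios matter for attention probabilities); the function and key names are hypothetical.

```python
def attention_weights(n_nutrition, include_visual=True, nutrition_total=1.0):
    """Build attention weights for one labeling condition.

    n_nutrition is the number of nutrition attributes on the label
    (9 for %DV, 4 for Traffic Light, 1 for NuVal or Heart).
    The nutrition attributes share nutrition_total equally; taste and
    visual appearance each get half of that combined weight.
    """
    per_attribute = nutrition_total / n_nutrition
    weights = {f"nutrition_{i + 1}": per_attribute for i in range(n_nutrition)}
    weights["taste"] = nutrition_total / 2
    if include_visual:  # the visual attribute is switched off in the clip-art conditions
        weights["visual"] = nutrition_total / 2
    return weights


if __name__ == "__main__":
    for label, n in [("%DV", 9), ("Traffic Light", 4), ("NuVal", 1), ("Heart", 1)]:
        w = attention_weights(n)
        share = sum(v for k, v in w.items() if k.startswith("nutrition_")) / sum(w.values())
        print(label, round(share, 3))
```

Under these assumptions the label as a whole attracts the same share of attention in every scheme, so differences between schemes arise from how that attention is divided among attributes rather than from its total amount.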
Response times and the nutritional value and taste scores of choices are subjected to mixed ANOVA with between-subject factors of nutrition labels (4), decision threshold (2), and visual appearance (2), and repeated-measures factors of food category (2) and instances (5).


We find that the model reproduces the relative ability of the four labeling schemes to promote nutritious choices, with NuVal being more effective than any of the other schemes, and Traffic Light better than %DV (F(3, 176) = 291, P < 0.001; Fig. 3B). As in the human experiment, %DV produces longer decision times than NuVal or Heart, and Traffic Light is slower than either NuVal or Heart (F(1, 176) = 213, P < 0.001; Fig. 3D). The effect of visual cues on nutritional value is also captured: human subjects make less nutritious choices when product photos are displayed, and the model exhibits the same behavior when the visual-appearance attribute is included in the input (F(1, 176) = 33.9, P < 0.001). As in the human experiment, products chosen by the model have a higher mean taste score (6.1) than unchosen ones (5.5) (F(1, 176) = 858, P < 0.001). Finally, the simulation reproduces the effect of time pressure (modeled as a lowered decision threshold): with human subjects, the 20-s time limit reduces the average decision time considerably for all labeling schemes and with both image types. Lowering the model’s decision threshold has the same effect (F(1, 176) = 354, P < 0.001). Interestingly, although time pressure accelerates decision making, it has no significant effect on nutritional value in the human experiment or in the model.

Discussion

In this study, we investigated the ability of four different styles of nutrition labeling to guide consumers toward healthier food choices. As hypothesized, we found that the most usable labeling scheme, in this case NuVal, is one that quantifies nutritional information, presents it in a way that is quick and easy to use, and resolves nutritional conflicts. NuVal labels are fast to use and yield nutritious choices. This does not necessarily imply that NuVal labeling should be used everywhere. It is possible that alternate labeling schemes with these same characteristics could do as well as or better than NuVal labels.
Heart labels are also fast to use, but produce choices that are not especially nutritious. With a more realistic higher proportion of foods certified as nutritious, binary labeling schemes could be expected to do even more poorly in nutrition than we find here. In more realistic scenarios, binary labeling would produce many more ties between food items with a substantially wider range of nutritive value. Traffic



Light labels take more time to use and yield only moderate increases in nutrition. The widely used %DV labels take the most time and yield the least nutritious choices. These findings are robust across all our variations of experimental conditions: products presented with or without photos, and limited versus unlimited decision time. It is somewhat surprising to find that an imposed time limit, while speeding decisions, does not affect the nutritional quality of choices. Regarding the negative nutritional impact of showing product photos, we hypothesize that this effect is due to a tendency for participants to prefer products that they have purchased in the past, a documented phenomenon known as “consumer inertia,”25 and that this effect competes with nutritional information. Other explanations are possible, but deciding between these possibilities is beyond the scope of this paper. It is, however, interesting to note that the effects of labeling scheme and of taste score are present whether product photos are displayed or not. Our computer simulation successfully captures and provides potential explanations for the main findings from the human experiment. In the simulation, the nutrition label attributes have the same combined attention weight, regardless of which labeling scheme is used. The greater nutritional success of NuVal, as compared to Traffic Light and %DV, is partly explained by the multiplicity of attributes in Traffic Light and %DV labels (four and nine, respectively). Compared to NuVal, the other three labels also suffer from the fact that they create decisional conflicts. For example, in a particular choice situation, one product may have lower sugar and salt content, but higher fat content and calorie count. It is nontrivial for a human shopper to resolve such a conflict, and the DFT model represents this as a struggle between preferences, as attention shifts back and forth between the conflicting attributes. 
If some other attribute (e.g., taste score or visual appearance) strongly favors a particular choice, it can easily dominate over conflicting nutrition information. In contrast, a single-attribute scheme like NuVal resolves such nutrition conflicts, rather than highlighting them, thus providing more guidance for decision making. Thus, our model provides a potential explanation for the higher usability of NuVal as compared with Traffic Light or %DV: if human decision making is

underpinned by a sequential sampling process, as in the DFT model, then labels that present nutritional information as potentially conflicting multiple attributes are disadvantaged. This can be graphically illustrated by plotting the model’s evolving preferences when presented with different nutrition labels. Figure 6A shows a run where NuVal labels are used; a preference quickly develops for the product with the highest nutrition score. In Figure 6B, the %DV label is used; here the model takes almost three times as long to reach a decision and selects a product which has a lower nutrition score but a higher taste score than the one selected in Figure 6A. The relatively poor performance of the Heart scheme is accounted for by its lack of granularity. With only two possible values, it does not provide enough information for a consumer (or computational model) to make informed choices. The many ties created in a binary labeling scheme present decisional conflicts that are difficult to resolve in favor of greater nutrition. When product photos are displayed, rather than generic clip art images, subjects tend to choose products with lower average nutritional value. Our model reproduces this effect in a straightforward manner, with an attribute representing preferred visual appearance, thereby supporting the hypothesis that there is something important about the actual product, rather than an alternative explanation that product images are a mere distraction from the task. In the human experiment, the 20-s time limit for decisions causes participants to decide faster, with very few (1%) trials actually reaching the time limit. The explanation provided by the model is that a lower preference threshold is used when making decisions under time pressure. What about the finding that time pressure does not affect the nutritional quality of decisions? To understand this, it is instructive to examine a trace of the model’s decision-making process. 
Figure 7 shows the evolving preferences for the four available options during a typical choice scenario. As can be seen, after some initial turmoil, a favorite option emerges quite early, and the rest of the time is spent accumulating this preference until it reaches the threshold. In this context, a moderate lowering of the threshold (e.g., from 8 to 6 in the example) would speed up the process without affecting the decision taken.
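The threshold argument can be made concrete with a minimal sketch of a sequential-sampling run. This is an illustrative simplification, not the model used in the study: it omits the decay and lateral-inhibition terms of full DFT, and the function name `simulate_dft`, the product scores, the attention weights, and the step size are all hypothetical. The same random attention sequence is run against thresholds of 8 and 6.

```python
import random


def simulate_dft(values, attention, threshold, seed, step=0.5, max_steps=10_000):
    """Simplified sequential-sampling run (no decay, no lateral inhibition).

    values[i][k] is option i's score on attribute k, scaled to [0, 1].
    attention[k] is the relative probability of attending to attribute k.
    Returns (index of the chosen option, number of time steps used).
    """
    rng = random.Random(seed)
    n_options = len(values)
    n_attrs = len(attention)
    preference = [0.0] * n_options
    best = 0
    for t in range(1, max_steps + 1):
        # Attention shifts at random to one attribute per time step.
        k = rng.choices(range(n_attrs), weights=attention)[0]
        for i in range(n_options):
            others = [values[j][k] for j in range(n_options) if j != i]
            # Valence: advantage of option i over the average competitor.
            valence = values[i][k] - sum(others) / len(others)
            preference[i] += step * valence
        best = max(range(n_options), key=lambda i: preference[i])
        if preference[best] >= threshold:
            return best, t
    # No option reached the threshold: fall back to the current leader.
    return best, max_steps


# Hypothetical products; columns are nutrition, taste, visual appeal.
products = [
    [0.9, 0.4, 0.5],
    [0.6, 0.7, 0.6],
    [0.3, 0.8, 0.7],
    [0.5, 0.5, 0.4],
]
attention = [0.5, 0.3, 0.2]  # assumed attention weights

choice_hi, steps_hi = simulate_dft(products, attention, threshold=8, seed=1)
choice_lo, steps_lo = simulate_dft(products, attention, threshold=6, seed=1)
print("threshold 8:", choice_hi, steps_hi)
print("threshold 6:", choice_lo, steps_lo)
```

Because both runs share a seed, they follow the same preference trajectory, so the run with the lower threshold stops no later than the other. Whether the chosen option is also the same depends on how early a clear favorite emerges.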




Figure 6. Two example runs of the DFT model. (A) Nutritional information for each product consists of a single NuVal score. (B) The nine %DV attributes are provided for each product. The c09–c12 labels identify the four cereal products used in this particular run.
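One way to picture the contrast between the two panels is to compare the momentary evidence that each label format supplies when attention lands on it. The sketch below is only an illustration under stated assumptions: the nutrient scores are invented, and the single summary score is taken to be the plain mean of the nutrient scores, which is not how NuVal scores are actually computed.

```python
from statistics import mean, pstdev

# Hypothetical nutrient scores (higher = healthier) for two products on
# four nutrients; the products trade off against each other.
product_a = [0.9, 0.2, 0.8, 0.5]
product_b = [0.3, 0.7, 0.4, 0.6]


def valences(a, b):
    """Momentary advantage of product A over product B, one value per attribute."""
    return [x - y for x, y in zip(a, b)]


# Multiattribute label: attention lands on one nutrient at a time.
multi = valences(product_a, product_b)

# Single-score label: ASSUMED to be the mean of the nutrient scores.
summary_a, summary_b = mean(product_a), mean(product_b)
single = valences([summary_a], [summary_b])

print("multiattribute valences:", multi)
print("single-score valence:   ", single)
print("mean evidence per sample:", mean(multi), "vs", mean(single))
print("spread of evidence:      ", pstdev(multi), "vs", pstdev(single))
```

Under these assumptions, both formats carry the same average evidence for product A, but the multiattribute samples change sign from one nutrient to the next and largely cancel, whereas the single score always points the same way. This is one plausible reading of the conflict described in the text, not a reproduction of the runs in Figure 6.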

The multiple regressions of our three dependent measures (nutrition, latency, and taste) onto individual characteristics of our subjects produced sensible and unsurprising results. Nutrition of food choices is positively predicted by educational level and self-rated value of nutrition information, and negatively predicted by the self-rated value of taste and price. Taste is positively predicted by the self-rated value of taste information and negatively predicted by age and the amount of exercise in leisure time. Much of this can be explained by the idea that people’s values are consistently reflected in their food choices. Such effects would fit naturally into our computational model by making the payoffs for delivery of food characteristics proportional to how they are valued by individuals, implemented

by attribute values A in Figure 5. The small amount of variance in our dependent measures that is accounted for by individual characteristics reflects the relative power of our experimental manipulations in determining food choice. This bodes well for the capacity of usable nutrition information to help mitigate the obesity epidemic. Usable nutrition labels could help virtually anyone. Demographic comparisons indicate that our sample is more female and has attained more educational degrees than the general U.S. population. Given that education is positively related to more nutritious choices, it is possible that the effects of nutrition labeling would not be as dramatic in more general populations as they are in our experiment. However, a recent worldwide survey indicates that



Figure 7. An example DFT run. The model repeatedly samples the attribute values of four food products, randomly shifting attention between nutrition values, taste scores, and visual appeal. At each time step, the preference for each option increases or declines, depending on which attribute is currently in focus, until the preference for one option reaches the threshold and a decision is made.

a majority of respondents want better nutrition information to improve their health and control their body weight.6 As better and more usable nutrition information becomes more widely available and populations become more familiar with nutritional issues, our results could become increasingly generalizable. A direction for future work is to apply our computational model to the evaluation of new labeling schemes. Being able to replace some or all of the psychological field testing of a proposed scheme with computer simulations would provide considerable savings in both time and expense. In addition to the visual appearance of packaging, a real shopping experience includes other factors that may affect product choices (e.g., price, advertising, and in-store placement). The impact of such factors, and their interaction with nutrition labeling, is beyond the scope of this study, but could be addressed with similar methodology in the future. Any existing or proposed nutrition-labeling scheme should be evaluated on (at least) two important criteria: validity and usability. Validity is the extent to which a labeling scheme is associated with food choices related to health outcomes, including body weight, longevity, and chronic diseases. Currently, validity can be studied with large-scale data sets that longitudinally track what and how much people eat, along with their weight and health outcomes.18 Usability is the extent to which

a labeling scheme is easy enough and clear enough for people to actually use in their food choices. Scientific studies could be used to inspire and evaluate nutrition-labeling schemes that encourage validity and usability. Knowledge about food nutrients is essential, but not sufficient, to produce the best labeling. That knowledge needs to be translated into information that people can understand and use rather quickly. Policy makers in government, industry, and communities need to be aware of relevant, emerging research along these lines.

Acknowledgments

This work is supported by a grant to TRS from the Natural Sciences and Engineering Research Council of Canada. We are grateful to David Katz and Laurette Dubé for inspiration and stimulating discussions, and to Allison Yan and Akash Venkat for pilot work leading up to this experiment and simulation.

Conflicts of interest

The authors declare no conflicts of interest.

Supporting Information

Additional supporting information may be found in the online version of this article.

Figure S1. Distribution of NuVal scores of cereals and yogurts in a major U.S. supermarket chain.


Figure S2. A paraboloid model of taste, used to generate a realistic taste score for each product.

Table S1. Explanations of the nutrition-labeling schemes

Table S2. Sociodemographic information for the psychological experiment

Table S3. Multiple regression results

Table S4. Ability of DFT and competing computational models to account for seven well-known phenomena in decision making22

References

1. Swinburn, B.A. et al. 2011. The global obesity pandemic: shaped by global drivers and local environments. Lancet 378: 804–814.
2. Cutler, D.M., E.L. Glaeser & J.M. Shapiro. 2003. Why have Americans become more obese? J. Econ. Perspect. 17: 93–118.
3. Rosengren, A. & L. Lissner. 2008. The sociology of obesity. In Obesity and Metabolism. Vol. 36. M. Korbonits, Ed.: 260–270. Karger. Basel.
4. Flegal, K.M., M.D. Carroll, B.K. Kit & C.L. Ogden. 2012. Prevalence of obesity and trends in the distribution of body mass index among US adults, 1999–2010. J. Am. Med. Assoc. 307: 491–497.
5. Cawley, J. & C. Meyerhoefer. 2012. The medical care costs of obesity: an instrumental variables approach. J. Health Econ. 31: 219–230.
6. Reports and insights | Healthy eating trends around the world | Nielsen. 2013. Cited April 22, 2014. http://www.nielsen.com/us/en/reports/2012/healthy-eating-trendsaround-the-world.html.
7. Drichoutis, A., P. Lazaridis & R.M. Nayga, Jr. 2006. Consumers’ use of nutritional labels: a review of research studies and issues. Acad. Market. Sci. Rev. 10.
8. Campos, S., J. Doxey & D. Hammond. 2011. Nutrition labels on pre-packaged foods: a systematic review. Public Health Nutr. 14: 1496–1506.
9. Sacks, G., M. Rayner & B. Swinburn. 2009. Impact of front-of-pack ‘traffic-light’ nutrition labelling on consumer food purchases in the UK. Health Promot. Int. 24: 344–352.
10. Roberto, C.A. et al. 2012. Facts up front versus traffic light food labels: a randomized controlled trial. Am. J. Prev. Med. 43: 134–141.
11. Daly, P.A. 1976. The response of consumers to nutrition labeling. J. Consum. Aff. 10: 170–178.
12. Wartella, E.A., A.H. Lichtenstein, A. Yaktine & R. Nathan. 2011. Front-of-Package Nutrition Rating Systems and Symbols: Promoting Healthier Choices. National Academies Press. Washington, D.C.
13. Food Standards Agency. Front of pack nutritional signpost labelling technical guidance. 2007. Cited April 22, 2014. http://www.food.gov.uk/multimedia/pdfs/frontofpackguidance2.pdf.
14. NuVal LLC. How It Works. 2011. Cited April 22, 2014. http://www.nuval.com/How.
15. Health Check program. 2010. Cited April 22, 2014. http://www.healthcheck.org./.
16. Keyhole symbol – Livsmedelsverket. 2007. Cited April 22, 2014. http://www.slv.se/en-gb/Group1/Food-labelling/Keyhole-symbol./.
17. Dandurand, F., T.R. Shultz & K.H. Onishi. 2008. Comparing online and lab methods in a problem-solving experiment. Behav. Res. Methods 40: 428–434.
18. Chiuve, S.E., L. Sampson & W.C. Willett. 2011. The association between a nutritional quality index and risk of chronic disease. Am. J. Prev. Med. 40: 505–513.
19. Drewnowski, A. 1997. Taste preferences and food intake. Annu. Rev. Nutr. 17: 237–253.
20. Drewnowski, A. & M. Greenwood. 1983. Cream and sugar: human preferences for high-fat foods. Physiol. Behav. 30: 629–633.
21. David, F.N. 1962. Games, Gods and Gambling: The Origins and History of Probability and Statistical Ideas from the Earliest Times to the Newtonian Era. Hafner Pub. Co. New York.
22. Busemeyer, J.R. & J.G. Johnson. 2008. Micro-Process Models of Decision Making. In Cambridge Handbook of Computational Cognitive Modeling. R. Sun, Ed.: 302–321. Cambridge University Press. New York.
23. Busemeyer, J.R. & J.T. Townsend. 1993. Decision field theory: a dynamic-cognitive approach to decision making in an uncertain environment. Psychol. Rev. 100: 432–459.
24. Roe, R.M., J.R. Busemeyer & J.T. Townsend. 2001. Multialternative decision field theory: a dynamic connectionist model of decision making. Psychol. Rev. 108: 370.
25. Dube, J.-P., G.J. Hitsch & P.E. Rossi. 2010. State dependence and alternative explanations for consumer inertia. Rand J. Econ. 41: 417–445.

