Risk Analysis, Vol. 35, No. 1, 2015

DOI: 10.1111/risa.12262

False Alarms and Missed Events: The Impact and Origins of Perceived Inaccuracy in Tornado Warning Systems

Joseph T. Ripberger,1,∗ Carol L. Silva,2 Hank C. Jenkins-Smith,3 Deven E. Carlson,4 Mark James,5 and Kerry G. Herron6

Theory and conventional wisdom suggest that errors undermine the credibility of tornado warning systems and thus decrease the probability that individuals will comply (i.e., engage in protective action) when future warnings are issued. Unfortunately, empirical research on the influence of warning system accuracy on public responses to tornado warnings is incomplete and inconclusive. This study adds to existing research by analyzing two sets of relationships. First, we assess the relationship between perceptions of accuracy, credibility, and warning response. Using data collected via a large regional survey, we find that trust in the National Weather Service (NWS; the agency responsible for issuing tornado warnings) increases the likelihood that an individual will opt for protective action when responding to a hypothetical warning. More importantly, we find that subjective perceptions of warning system accuracy are, as theory suggests, systematically related to trust in the NWS and (by extension) stated responses to future warnings. The second half of the study matches survey data against NWS warning and event archives to investigate a critical follow-up question—Why do some people perceive that their warning system is accurate, whereas others perceive that their system is error prone? We find that subjective perceptions are—in part—a function of objective experience, knowledge, and demographic characteristics. When considered in tandem, these findings support the proposition that errors influence perceptions about the accuracy of warning systems, which in turn impact the credibility that people assign to information provided by systems and, ultimately, public decisions about how to respond when warnings are issued.

KEY WORDS: Severe weather; tornadoes; warning response

1 Center for Risk and Crisis Management, Cooperative Institute for Mesoscale Meteorological Studies, University of Oklahoma, OK, USA.
2 Department of Political Science, Center for Risk and Crisis Management, University of Oklahoma, OK, USA.
3 Department of Political Science, Center for Energy, Security, and Society, University of Oklahoma, OK, USA.
4 Department of Political Science, University of Oklahoma, OK, USA.
5 Department of Political Science, Center for Risk and Crisis Management, University of Oklahoma, OK, USA.
6 Center for Risk and Crisis Management, Center for Energy, Security, and Society, University of Oklahoma, OK, USA.
∗ Address correspondence to Joseph T. Ripberger, 120 David L Boren Blvd, Room 3106, Norman, OK 73072, USA; tel: +(405) 325-5218; [email protected]/[email protected].

1. INTRODUCTION

Folklore,(1) backed by decades of theoretical development(2) and conventional wisdom within the forecast and emergency management community,(3,4) suggests that errors undermine the credibility of tornado warning systems and consequently decrease the probability that individuals will comply (i.e., engage in protective action) when future warnings are issued. The logic behind this contention is rather simple—when faced with severe weather, individuals consider (among other things) the probability that a tornado will occur in their area. If the probability is judged to be high, then people will engage in protective action. If the probability is thought to be low, then people will shy away from protective action because it is costly in terms of time, effort, and loss of productivity/leisure.(5) Warning systems are important because they provide information about the probability that a tornado will occur. For perfectly accurate systems, a warning indicates that the probability of a tornado is 1, whereas no warning indicates that the probability is 0. Unfortunately, modern tornado warning systems are not perfectly accurate—like other warning systems, they are plagued by type I (false alarm) and type II (missed event) errors. As indicated in Table I, a false alarm occurs when a warning system issues a warning and no tornado occurs. A missed event occurs when a tornado occurs absent the issuance of a warning.

Recognizing these imperfections, individuals may consider the accuracy of their warning system before using the information it provides to make a decision. If the warning system is perceived to be inaccurate, then individual confidence in that system will be low, causing people to ignore or devalue the information it provides. In such cases, assuming that taking protective action is costly,7 people will be more likely to opt against protective action when warnings are issued. If the warning system is perceived to be accurate, by comparison, then individual trust in the information will be higher, and people will be more likely to use the information provided by the system and heed warnings by engaging in protective action when they are issued.

© 2014 Society for Risk Analysis 0272-4332/15/0100-0044$22.00/1

Table I. Tornado Warning Error Matrix

                                 Event Status
Warning status          Tornado occurs        No tornado
Warning issued          Verified warning      False alarm
No warning issued       Missed event          No event

2. EMPIRICAL EVIDENCE

7 Taking protective action requires interrupting “normal” behaviors, including work, recreation, and rest, to take action that increases safety. While estimating the costs of these interruptions and actions is not undertaken in this article, we, like Sutter and Erickson,(5) assume them to be positive, resulting in an increasing propensity of individuals to avoid taking protective action as perceived warning accuracy declines.

2.1. The Influence of Warning System Accuracy on Responses to Future Warnings

Intuition and theory aside, empirical research on the influence of warning system accuracy on public responses to tornado warnings is incomplete and inconclusive. It is incomplete because little research exists on the extent to which missed events influence individual assessments of warning system credibility and/or future warning responsiveness. This is a significant omission because forecasters operate in an environment wherein reducing the frequency of false alarms can lead to increases in the frequency of missed events.(6) Thus, information about the relative implications of missed events versus false alarms for public credibility and warning response may help forecasters determine which type of error is less costly (in terms of lost credibility) and therefore more tolerable.

There is, however, a considerable and growing body of research on the “cry wolf” or false alarm effect. Unfortunately, the results generated from this research are inconclusive. Laboratory experiments, for example, suggest that recurrent false alarms do—as suggested by the theory—reduce individual willingness to respond to future warnings.(2) A number of field studies, on the other hand, have challenged the external validity of this oft-cited finding. In their study of evacuation behavior, for example, Dow and Cutter(7) found little if any evidence that the widely publicized Hurricane Bertha false alarm (July 1996) influenced the decisions of South Carolina residents to evacuate in advance of Hurricane Fran (August/September 1996), which occurred only two months later. A subsequent study by Carsell(8) on the accidental issuance of a dam-failure warning in Ventura, California, came to a similar conclusion; contrary to popular belief, the false alarm did not reduce public confidence in the warning process.
Extending these findings to tornado warnings, survey research conducted by Schultz et al.(9) finds no relationship between false alarms and the confidence that Austin, Texas residents have in their warning system. Similarly, recent interviews by Donner et al.(10) suggest that false alarms play a relatively minor role in the decisions that Louisiana, Missouri, and Tennessee residents make when confronted with tornado warnings. These studies tempt one to conclude that false alarms are relatively inconsequential and that the

“cry wolf” effect may exist in the lab, but not in the real world. Recent research by Simmons and Sutter,(11,12) however, should temper such conclusions. In this work, Simmons and Sutter indirectly examine the link between false alarms and individual responses to warnings by systematically comparing casualty and injury rates to false alarm ratios (FARs) in the areas impacted by tornadoes. They find that tornadoes that occur in areas with relatively high FARs tend to kill and injure more people than tornadoes that occur in areas with relatively low FARs. The false alarm effect, they argue, provides a rather convincing explanation for this phenomenon. In areas where FARs are high, residents assign relatively little credibility to warnings, which makes them less likely to engage in protective action when warnings are issued and more likely to be injured or killed in the event that a tornado occurs. The opposite is true in areas where false alarms are relatively uncommon.

Simmons and Sutter are careful to note that their findings provide an indirect test of the false alarm effect. Their test is indirect for two reasons: first, they use injury and death as a proxy for individual behavior in response to a warning and the credibility that individuals assign to that warning. The basis for this proxy is rather simple—all else equal, protective action should be negatively correlated with injury and death. Thus, the authors assume that high casualty and injury rates provide a reasonably valid indicator that a sizable portion of the affected population decided against protective action when a warning was issued. Moreover, one could plausibly assume that protective action decisions are correlated with the credibility that people assign to a given warning. If this assumption holds true, then high casualty and injury rates would also indicate that some fraction of the affected population did not take protective action because they assigned lower credibility to that warning.
Second, Simmons and Sutter use objective FARs (calculated via National Weather Service [NWS] warning records) as a proxy for subjective perceptions of warning system accuracy. Their justification for doing so rests on the untested assumption that objective and subjective estimates of accuracy are correlated—all else equal, people who live in high FAR areas will recognize and therefore perceive that the warning system in their area is prone to false alarms, whereas people who live in low FAR areas will perceive that their warning system is relatively accurate. While the assumptions made by Simmons and Sutter seem reasonable, a more direct evaluation

of the relationship between error and warning response would test the theory by directly measuring subjective perceptions of warning system accuracy, individual assessments of warning system credibility, and warning response. For the sake of completeness, such a study would also account for perceptions about both types of error—missed events and false alarms—when measuring perceived accuracy. In the first half of the article that follows, we use public responses to a large regional survey to accomplish this. If theory is correct, then people who believe that the warning system in their area is relatively inaccurate will, on average, discount the credibility of their warning system and, as a result, be less likely to take protective action when future warnings are issued. In the second half of the article, we move on to explore the relationship between subjective perceptions of warning system accuracy and objective indicators of warning system performance. Does objective accuracy influence perceived accuracy?

3. DATA

To test this conjecture and answer this question, we designed, fielded, and analyzed responses to an Internet survey of residents who live in tornado-prone regions of the United States. The survey instrument we designed contains 144 questions that gauge perceptions about weather, tornadoes, and warnings, as well as a variety of sociodemographic characteristics, like geographic location, residential situation, income, and education. It was fielded in eight weekly waves between September 12, 2012 and November 1, 2012. In each wave, we collected responses from 500 different members of an online survey panel that is recruited and maintained by Survey Sampling International (SSI).
Because we are interested in individual perceptions about and responses to tornadoes, we geographically conditioned our selection of potential respondents from this panel such that the people asked to take our survey had to reside in a “tornado-prone” region of the United States. Members of the panel were considered to live in a tornado-prone region if the address they registered with SSI is located in one of the high-vulnerability regions listed by Ashley(13) in his climatological study of significant and fatal tornadoes between 1880 and 2005. We oversampled members of the panel who reside in rural settings so as to maximize geographic coverage and combat the urban bias associated with Internet access and participation in web-based surveys(14) and used quotas

to ensure that our respondents are demographically representative of the target population.8

This process yielded data from 4,004 cross-sectional surveys completed by individual respondents who reside at the locations depicted in Fig. 1. Thirty-two percent of the respondents indicated that they lived in an “urban” area within the incorporated boundaries of a city or town that provides emergency services such as fire, rescue, and storm warnings for their residence; 39% reported that they live near or in a “suburb or town” that provides emergency services such as fire, rescue, and storm warnings for their residence; and 29% said that they live in a “rural” area outside of the incorporated boundaries of a city or town, where emergency services such as fire, rescue, and storm warnings are provided by county, state, or federal entities. Respondents closely matched the demographic characteristics of the target population, with 52% being female, 36% being college graduates, and 72% reporting their race as white, non-Hispanic. The median income category was $30,000 to $40,000 and the mean age was 48.

4. SUBJECTIVE PERCEPTIONS OF WARNING SYSTEM ACCURACY, CREDIBILITY, AND WARNING RESPONSE

As explained in Section 2, the first stage of our analysis examines the relationship between subjective perceptions of warning system accuracy, credibility, and responsiveness to future warnings. This examination requires that we operationalize and measure each of these variables, which we accomplish using the survey questions described below.

8 Note that our respondents are, in general, demographically and geographically representative of the target population but—like the majority of samples that are drawn from online panels—not randomly drawn from that population (probabilistic). As such, we follow the advice of the American Association for Public Opinion Research (AAPOR) by avoiding estimates of sampling error and precise population values.(15) The same is true of response rates. Like most samples that are drawn from online panels, our respondents consist of volunteers from a large panel that constantly fluctuates, rather than a finite list of eligible respondents who were invited to participate in the survey. Accordingly, it is not possible to reliably calculate a response rate. However, as noted by AAPOR,(16) we can calculate a “completion” or “participation” rate by dividing the number of people who completed the survey (4,004) by the total number of people who requested participation by clicking on the survey link (7,440). In this case, the completion rate is 53.8% (4,004/7,440).
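The completion-rate arithmetic described in this footnote can be reproduced directly; a minimal sketch:

```python
# Participation ("completion") rate: completed surveys divided by the
# number of panel members who clicked the survey link.
completed = 4004
clicked = 7440

participation_rate = completed / clicked
print(f"{participation_rate:.1%}")  # 53.8%
```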

4.1. Measures

4.1.1. Subjective Perceptions of Warning System Accuracy

Historically, the NWS has used two metrics to evaluate the accuracy of a given warning system—the probability of detection (POD) and the FAR. The POD ratio for a given system indicates the probability that the system will issue a warning prior to the occurrence of a verified tornado. It is calculated by dividing the number of verified warnings issued by the system by the number of verified tornadoes that occurred within the jurisdiction of that system over the same period of time. The missed event ratio (MER) for a given system is the complement of the POD; it is calculated by subtracting POD from one, or by dividing missed events by total events. The FAR for a given warning system, by comparison, indicates the probability that the system will issue a tornado warning that is not followed by a verified tornado within the geospatial and temporal frame encompassed by the warning. It is calculated for a given system by dividing the number of false alarms issued by the total number of warnings issued.

Using a similar logic, we operationalize subjective perceptions of warning system accuracy by calculating a perceived FAR (pFAR) and perceived MER (pMER) for each respondent using the following questions:

Q61: Based on your general impressions over the past three years, when tornado WARNINGS have been issued for your local area, about what percent of the time were tornadoes sighted or caused damages within the warning area?

Q62: Based on your general impressions over the past three years, when tornadoes have been sighted or have caused damages in your local area, about what percent of the time was a tornado WARNING issued prior to the occurrence of the tornado?

Upon receiving these questions, respondents were given two options—specify a percentage or indicate that the question is not applicable because no tornado warnings were issued or no tornadoes occurred in the area over the past three years.
To calculate pFAR and pMER, we transformed percentage responses to Q61 and Q62, respectively, by subtracting the number that respondents listed from 100 and then dividing that value by 100 to make it a proportion. In theory, both measures can range from 0, indicating a perception that the warning system is perfect (no false alarms/no missed events), to 1, indicating a perception that the system is completely flawed (all false alarms/all missed events). In our sample, pFAR ranges from 0 to 1, has a mean of 0.51 and a standard deviation of 0.32. By comparison, pMER ranges from 0 to 1, but has a significantly lower mean of 0.28 and a standard deviation of 0.33. For purposes of comparison, NWS performance statistics indicate that the national false alarm and missed event ratios during the three-year period prior to the survey were actually 0.73 and 0.27, respectively, which means that the average respondent underestimated the frequency of false alarms, but was reasonably accurate in his or her perception of the frequency of missed events.

Fig. 1. Approximate location of survey respondents.
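To make the definitions in this section concrete, the sketch below computes the objective ratios (FAR, POD, MER) and the Q61/Q62 transformation. The function names and the warning/event counts are ours, invented for illustration:

```python
def far(false_alarms, warnings_issued):
    """False alarm ratio: share of issued warnings not followed by a tornado."""
    return false_alarms / warnings_issued

def pod(verified_warnings, tornado_events):
    """Probability of detection: share of tornado events preceded by a warning."""
    return verified_warnings / tornado_events

def mer(verified_warnings, tornado_events):
    """Missed event ratio: the complement of POD (1 - POD)."""
    return 1 - pod(verified_warnings, tornado_events)

def perceived_ratio(reported_percent):
    """Transform a Q61/Q62 percentage response into pFAR or pMER:
    (100 - reported percentage) / 100."""
    return (100 - reported_percent) / 100

# A hypothetical warning system: 100 warnings, 73 of them false alarms;
# 30 tornado events, 22 of them warned.
print(far(73, 100))         # 0.73
print(mer(22, 30))          # about 0.27
# A respondent who says tornadoes followed warnings 49% of the time:
print(perceived_ratio(49))  # 0.51
```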

4.1.2. Credibility

We operationalize warning system credibility by asking respondents to indicate how much they trust broadcasts from the NWS, which (as they are told) is the official entity responsible for issuing tornado watches and warnings. We accomplish this by way of the following prelude and question:9

Tornado WATCHES and WARNINGS issued by the NWS are provided to the public by various sources. In some cases, national weather information is supplemented with regional or local information from observations and radar.

Q52: Using a scale from zero to ten, where zero means no trust and ten means complete trust, how much do you trust the accuracy of weather radio broadcasts for your local area from the National Weather Service?

Responses to this question ranged from 0 to 10, but the mean response was 8.64, which is significantly higher than the midpoint on this scale (5), suggesting that the NWS is, on average, viewed as a trustworthy organization that is likely to issue credible products (i.e., warnings).

9 Respondents are presented with this information and asked this question before they were asked to evaluate the accuracy of their warning system.

4.1.3. Responsiveness to Future Warnings

Consistent with previous research,(9) we use intended behavior in response to a hypothetical warning as a rough indicator of future warning response. More specifically, we asked respondents to indicate how they would respond if they were to receive the following warning:

Q88: While you are at home during daylight hours, if you were to learn that the National Weather Service has issued a tornado WARNING for [randomize: “light”; “moderate”; “significant”; “severe”; “devastating”; “incredible”] tornadoes in your local area, which of the following most accurately describes what you would do?10

0–Nothing; continue on as before the warning was received
1–Move to the most sheltered part of your residence, but do not leave your residence
2–Move to a specially constructed storm shelter on your property
3–Move to a nearby location or building that you consider to provide better shelter
4–Leave your residence and drive away from the tornado warning area

To operationalize the respondent’s intended response, we create a dichotomous protective action variable, where zero denotes that no protective action would be taken and one indicates that some type of protective action (i.e., options 1, 2, 3, or 4) would be taken if the respondent were to receive such a warning in the future. When measured in this way, 91% of the respondents who completed our survey said that they would take some sort of protective action in response to the hypothetical future warning, whereas 9% indicated that they would do nothing.
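The recoding described above can be sketched as a simple mapping from the five Q88 response categories to the binary protective action indicator; the response vector below is hypothetical:

```python
# Q88 response categories: 0 = do nothing; 1-4 = some form of protective action.
def protective_action(q88_response):
    """Collapse the five Q88 options into a 0/1 protective action variable."""
    if q88_response not in (0, 1, 2, 3, 4):
        raise ValueError("Q88 responses are coded 0 through 4")
    return 0 if q88_response == 0 else 1

responses = [0, 1, 2, 3, 4, 1, 0, 3]       # hypothetical respondents
actions = [protective_action(r) for r in responses]
print(sum(actions) / len(actions))         # share taking protective action: 0.75
```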

4.2. Analytical Strategy and Hypotheses

To assess the relationships between these variables, we estimate a set of four models. As depicted in Fig. 2, model 1 (M1) uses logistic regression to assess the relationship between credibility and response by regressing intended behavior on trust in the NWS. If theory is correct, then (H1) there will be a positive and statistically discernible relationship between trust and intended response: as trust in the NWS increases, so does the likelihood that an individual will respond to the hypothetical future warning with some sort of protective action. Models 2 and 3 assess the relationship between perceived accuracy and credibility by independently regressing trust in the NWS on pFAR and pMER. To ensure that our results are robust to methodological differences, we estimate these models using both ordinary least squares (OLS) regression (M2OLS and M3OLS) and ordered logistic regression (M2ORD and M3ORD).11 If theory is correct (H2/H3), there will be a negative and statistically significant relationship between pFAR and pMER and trust in the NWS—as pFAR and pMER increase, trust will decrease. Model 4 regresses trust in the NWS on pFAR and pMER simultaneously, again using both OLS regression (M4OLS) and ordered logistic regression (M4ORD). Doing so allows us to isolate the influence of one type of perceived error on trust while controlling for the other.

In line with previous research,(18–20) we control for tornado risk perceptions in the response model (M1) and basic demographic differences in all four models (M1–M4), including age, education, gender, and race/ethnicity. We include these controls to validate the quality of our data and to ensure that our inferences about the relationship between perceived accuracy, trust in the NWS, and intended behavior are not biased by demographic differences in our sample. These controls are not meant to provide an exhaustive list of the many other factors that may affect trust and intended behavior. Such an analysis is beyond the scope of this work.

4.3. Findings

Table II presents the estimates derived from model 1. Consistent with H1, these estimates show that survey respondents who reported higher levels of trust in the NWS were, on average, more likely to select protective action in response to the hypothetical future warning than respondents who reported lower levels of trust. The substantive effect of this relationship is rather pronounced—according to the model, an increase in trust from the 5th to 95th percentile is predicted to increase the probability of intended action by 0.05, which is sizable considering that the baseline (mean) probability of intended action in response to the hypothetical warning is rather close to the ceiling (0.91). Consistent with previous research,(21) model 1 also indicates that respondents who perceive that tornadoes pose a relatively high risk to them or their family were more likely to select protective action than respondents who perceive that tornadoes pose

10 We randomly varied the wording of Q88 so as to explore the effect of impact-based messages on warning responses; results of that analysis are presented in Ripberger et al.(17) Because respondents were randomly exposed, variation with respect to question wording is orthogonal to variation on the variables of interest in this study. Accordingly, it was not necessary to account for wording variation in the models we estimated for this article.

11 In theory, ordered logistic regression should be used when working with dependent variables that are measured on ordinal scales. In practice, however, researchers often use OLS regression when working with dependent variables that are measured on ordinal scales that include a large number of categories, like our 0–10 trust scale, which includes 11 categories.
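To make the footnote's distinction concrete, here is a minimal sketch of the cumulative (ordered) logit link: category probabilities come from differences of logistic CDFs evaluated at estimated thresholds, whereas OLS treats the 0–10 scale as interval. The thresholds and linear predictor below are invented for illustration, not estimated from the paper's data:

```python
import math

def logistic_cdf(z):
    """Standard logistic cumulative distribution function."""
    return 1.0 / (1.0 + math.exp(-z))

def ordered_logit_probs(xb, thresholds):
    """Category probabilities under a cumulative logit model:
    P(Y <= j) = logistic(tau_j - x*beta); cell probabilities are
    successive differences of these cumulative probabilities."""
    cum = [logistic_cdf(t - xb) for t in thresholds] + [1.0]
    return [cum[0]] + [cum[j] - cum[j - 1] for j in range(1, len(cum))]

# Invented thresholds for a four-category ordinal outcome and a
# linear predictor (x*beta) of 0.8.
taus = [-1.0, 0.5, 2.0]
p = ordered_logit_probs(0.8, taus)
print([round(x, 3) for x in p])
print(round(sum(p), 6))  # the category probabilities sum to 1
```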


Fig. 2. Diagram of the models used to test Hypotheses 1–5.

Table II. Predictors of Intended Response to Hypothetical Tornado Warning

                                    M1
Trust in the NWS                    0.12*** (0.03)
Age (years)                         0.00 (0.00)
Education (1 = college+)            0.17 (0.12)
Gender (1 = male)                   −0.39** (0.12)
Race/ethnicity (1 = minority)       0.06 (0.13)
Risk perception                     0.16*** (0.02)
Intercept                           0.42 (0.30)
Log-likelihood                      −1110.92
AIC                                 2235.84
N                                   3,789

Logistic regression; ***p < 0.001, **p < 0.05, *p < 0.10 in two-tailed hypothesis tests. Standard errors listed in parentheses.

a relatively low risk. The opposite is true for male respondents, who were less likely to choose a protective action than female respondents. Again, this result is consistent with previous research on warning response,(22,23) which enhances our confidence in the validity of our data and, consequently, the inferences drawn in the sections that follow.

Table III presents the estimates derived from models 2, 3, and 4. The estimates derived from models 2 and 3 are consistent with H2 and H3. On average, respondents who scored highly on the pFAR and pMER indicators reported lower levels of trust in the NWS than respondents who perceived their warning system to be more accurate. Model 4 indicates that the substantive effects of pFAR and pMER on trust are relatively similar, with a slight edge toward pMER. According to the estimates derived from M4OLS, an increase in the perceived FAR from the 5th to 95th percentile of the observed distribution is predicted to decrease trust in the NWS by 0.37 units on the trust scale, a change that corresponds to 0.20 standard deviations. A similar shift in the perceived MER from the 5th to 95th percentile produces a slightly larger decrease in predicted trust—0.53 units on the trust scale, or 0.29 standard deviations. The coefficients derived using ordered logistic regression rather than OLS regression tell a similar story—there is a negative and statistically significant relationship between pFAR, pMER, and trust. Again, however, the combined model (M4ORD) suggests that the substantive impact of perceived missed events is slightly more pronounced than the impact of perceived false alarms on trust in the NWS.

The results of models 2, 3, and 4 also indicate that respondent demographics are related to trust in the NWS. On average, older respondents placed more trust in the NWS than younger respondents. By comparison, male and college-educated respondents were somewhat less trusting than female respondents and respondents who have not completed a college degree.

5. THE ORIGINS OF PERCEIVED ACCURACY

The results presented above provide evidence that perceptions of warning system accuracy matter—they influence the level of trust that people have in the NWS, which significantly impacts intended responses to future warnings. Motivated by this finding, we turn now to a critical follow-up question—Where do subjective perceptions of warning system accuracy come from? Why do some people perceive that the warning system in their area is accurate, while others perceive that their system is error prone? We propose that two broadly defined factors will interact to explain some of this variation—local experience (objective accuracy) and knowledge about severe weather.

5.1. Local Experience

Prior research suggests that subjective perceptions are, in part, a function of empirical observation (objective reality). Perceptions about the weather,


Table III. Predictors of Trust in the National Weather Service

                                Linear (OLS) Regression                                    Ordered Logistic Regression
                                M2OLS            M3OLS            M4OLS                    M2ORD            M3ORD            M4ORD
Perceived false alarm ratio     −0.63*** (0.10)                   −0.38*** (0.11)          −0.58*** (0.10)                   −0.38** (0.12)
Perceived missed event ratio                     −0.73*** (0.10)  −0.55*** (0.11)                           −0.65*** (0.10)  −0.47*** (0.12)
Age (years)                     0.01*** (0.00)   0.01*** (0.00)   0.01*** (0.00)           0.01*** (0.00)   0.01*** (0.00)   0.01*** (0.00)
Education (1 = college+)        0.12 (0.06)      −0.14* (0.06)    −0.14* (0.06)            −0.22** (0.07)   −0.23*** (0.07)  −0.23*** (0.07)
Gender (1 = male)               −0.27*** (0.06)  −0.28*** (0.06)  −0.27*** (0.06)          −0.31*** (0.07)  −0.33*** (0.07)  −0.32*** (0.07)
Race/ethnicity (1 = minority)   0.09 (0.07)      0.14 (0.07)      0.14 (0.07)              0.18* (0.08)     0.22** (0.08)    0.22** (0.08)
Intercept                       8.61*** (0.13)   8.51*** (0.12)   8.66*** (0.13)
Log-likelihood                  −6568.12         −6559.27         −6485.24                 −5085.57         −5071.05         −5016.85
AIC                             13150.25         13132.53         12986.48                 10201.13         10172.10         10065.70
R2                              0.03             0.03             0.04
N                               3,299            3,291            3,265                    3,299            3,291            3,265

OLS and ordered logistic regression; ***p < 0.001, **p < 0.05, *p < 0.10 in two-tailed hypothesis tests. Standard errors listed in parentheses. Intercept thresholds for ordinal logistic regression models are available upon request.

for example, are—in part—based on local weather patterns.(19,24) In much the same way, one might surmise that an individual’s perceptions about warning system accuracy are influenced by the ratio of errors to nonerrors produced by the system responsible for issuing warnings for that individual’s location. In other words, objective accuracy is likely to influence perceived accuracy. If this were true, then individuals who live in relatively error-prone warning areas would report higher pFAR and pMER scores than people who live within the jurisdiction of relatively accurate warning systems. It is also possible that people disproportionately weight recent experiences when evaluating the frequency with which events occur.(25) In the context of warning system accuracy, for example, this so-called recency bias suggests that individuals may rely upon their most recent experience rather than the whole of their experience when formulating perceptions about the accuracy of their warning system. Was the most recent warning a false alarm? Was a warning issued before the last tornado occurred? If the recency bias holds true, then individuals who answer “yes” to the first question and/or “no” to the second question would report higher pFAR and/or pMER scores than people whose most recent experience was an accurate warning.
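As a stylized illustration of the recency logic above (the warning history here is invented by us, not drawn from the survey or NWS records), a respondent who weights only the most recent warning can report a very different pFAR than one who averages over the full record:

```python
# Hypothetical verification history of the last five local warnings:
# True = verified (a tornado occurred), False = false alarm.
history = [True, True, False, True, False]

# Full-record perception: share of warnings that were false alarms.
pfar_full = sum(1 for verified in history if not verified) / len(history)

# Recency-biased perception: only the most recent warning counts.
pfar_recent = 0.0 if history[-1] else 1.0

print(pfar_full)    # 0.4
print(pfar_recent)  # 1.0: the most recent warning was a false alarm
```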

5.2. Knowledge About Severe Weather

Like experience, knowledge is a broad concept that is likely to influence perceptions about warning system accuracy in a number of different ways. People who are unfamiliar with the difference between severe weather watches and warnings, for example, may overestimate the number of errors produced by their warning systems. The logic behind this conjecture is rather simple—by design, tornado watches are issued with less certainty and more frequency than tornado warnings; accordingly, people who conflate watches and warnings will overestimate the frequency with which warnings are issued, which (all else equal) will lead to higher ratios of perceived error.

5.3. Measures

5.3.1. Local Experience

In the United States, the NWS operates 122 Weather Forecast Offices (WFOs) that are responsible for issuing severe weather warnings for the counties included in their jurisdictions. By virtue of this responsibility, they represent and maintain the systems tasked with issuing tornado warnings to the public. As such, we use WFO-level tornado warning


performance to capture the variety of local experiences that respondents have vis-à-vis the accuracy of their warning systems. To accomplish this, we asked respondents to give us the five-digit zip code of their primary residence. Using this information, we match each respondent to a WFO. Then, we use data from the NWS Performance Branch (Severe Weather Verification Database) to assign a three-year FAR and MER score to each of the WFOs to which our respondents were matched.[12] This yields objective local accuracy or “experience” scores for each respondent that range from 0.36 to 1.00 (FAR) and 0.00 to 1.00 (MER), have means of 0.73 and 0.29, and standard deviations of 0.11 and 0.17, respectively.

5.3.2. Most Recent Experience

To capture each respondent’s most recent experience, we create two dichotomous (0/1) variables. The first categorizes respondents according to the most recent warning issued by their WFO: if it was a false alarm, respondents receive a 1; if it was verified (a tornado occurred), they receive a 0. The second classifies respondents according to the most recent tornado that touched down in their WFO region: if the tornado was warned, respondents receive a 0; if it was not warned (a missed event), they receive a 1. When measured in this way, 86% of our respondents experienced a false alarm the last time their WFO issued a tornado warning; 45% experienced a missed event the last time their region was impacted by a tornado.

5.3.3. Knowledge About Severe Weather

We measure knowledge by assessing respondents’ ability to distinguish between watches and warnings. We accomplish this by way of a randomized split design, wherein 50% of our respondents were asked to identify one of the two following advisories as a tornado WATCH or WARNING:

Q25a: This advisory is issued when severe thunderstorms and tornadoes are possible in and near the area. It does not mean that they will occur. It only means they are possible.

Q26b: This advisory is issued when a tornado is imminent. When this advisory is issued, seek safe shelter immediately.

[12] We obtained these data here: https://verification.nws.noaa.gov/stats/severe/request.aspx.
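The zip-to-WFO matching and FAR/MER scoring described above can be sketched as follows. The crosswalk, office identifiers, and verification counts below are hypothetical stand-ins for the NWS Severe Weather Verification Database; the two ratios follow the standard definitions (FAR = false alarms / warnings issued; MER = missed events / tornado events).

```python
# Sketch of the respondent-to-WFO matching described in Section 5.3.1.
# All lookup tables are hypothetical stand-ins for the NWS verification data.

# Hypothetical crosswalk from five-digit zip codes to Weather Forecast Offices.
ZIP_TO_WFO = {"73019": "OUN", "67202": "ICT"}

# Hypothetical three-year verification counts for each WFO.
WFO_COUNTS = {
    "OUN": {"warnings": 200, "false_alarms": 150, "tornadoes": 60, "missed": 10},
    "ICT": {"warnings": 100, "false_alarms": 70, "tornadoes": 40, "missed": 12},
}

def accuracy_scores(zip_code: str) -> tuple[float, float]:
    """Return (FAR, MER) for the WFO serving a respondent's zip code.

    FAR = false alarms / warnings issued; MER = missed events / tornadoes.
    """
    counts = WFO_COUNTS[ZIP_TO_WFO[zip_code]]
    far = counts["false_alarms"] / counts["warnings"]
    mer = counts["missed"] / counts["tornadoes"]
    return far, mer

far, mer = accuracy_scores("73019")
print(round(far, 2), round(mer, 2))  # → 0.75 0.17
```

Each respondent inherits the scores of his or her WFO, which is why these “experience” measures vary across but not within warning regions.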

The other 50% were asked to identify one of the two following advisories as a severe thunderstorm WATCH or WARNING:

Q27c: This advisory is issued when severe thunderstorms are possible in and near the area. It does not mean that they will occur. It only means they are possible.

Q28d: This advisory is issued when severe thunderstorms are occurring or imminent in the area.

Using responses to this question, we create a dichotomous knowledge measure, where correct answers receive a 1, and incorrect or “Don’t Know” answers receive a 0.[13] When measured in this way, 78% of our respondents were able to distinguish between a severe weather watch and warning, whereas 22% were not.

[13] In an alternatively specified model, knowledge about tornado and severe thunderstorm advisories was treated as separate indicators. As expected, their influence on pFAR and pMER was virtually identical, so we combined them into a single general knowledge variable.

5.4. Analytical Strategy and Hypotheses

To discern the influence of local experience and knowledge about severe weather on perceived accuracy, we estimate two linear models. The first model (M5) uses OLS regression to assess the relationship between these variables and pFAR. The second model (M6) uses the same procedure to assess the relationship between these variables and pMER. If our propositions about local experience are correct, then (H6) there will be a positive and statistically discernible relationship between objective experience and subjective perceptions: respondents who live in high FAR (MER) regions will report higher pFAR (pMER) scores than respondents who live in low FAR (MER) regions. If the recency bias holds true, then (H7) there will be a positive and statistically significant relationship between most recent experience and perceptions: respondents whose last experience was a false alarm (missed event) will perceive that their warning system issues a higher rate of false alarms (misses a higher proportion of events) than people whose most recent experience with their warning system was accurate. If knowledge influences the perceived frequency of errors, then (H8) we will find a negative and statistically significant relationship between knowledge and perceptions: respondents who are unable to distinguish between watches and warnings will overestimate the frequency with which their system produces errors. Like models 1 through 4, models 5 and 6 contain controls for demographic attributes, including age, education, gender, and race/ethnicity.

Table IV. Predictors of Perceived False Alarm and Missed Event Ratios

                                           M5 (pFAR)          M6 (pMER)
Three-year WFO false alarm ratio           0.24*** (0.05)
Most recent warning (1 = false alarm)     −0.01 (0.02)
Three-year WFO missed event ratio                             0.15*** (0.03)
Most recent tornado (1 = missed event)                        0.03** (0.01)
Knowledge (1 = knows warn. vs. watch)     −0.04** (0.01)     −0.10*** (0.01)
Age (years)                               −0.002*** (0.00)   −0.002*** (0.00)
Education (1 = college+)                  −0.02 (0.01)       −0.04*** (0.01)
Gender (1 = male)                          0.05*** (0.01)     0.01 (0.01)
Race/ethnicity (1 = minority)              0.00 (0.01)        0.06*** (0.01)
Intercept                                  0.46*** (0.04)     0.39*** (0.03)
R2                                         0.03               0.05
N                                          3317               3310

OLS regression; ***p < 0.001, **p < 0.05, *p < 0.10 in two-tailed hypothesis tests. Standard errors in parentheses.
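The two specifications estimated here can be written compactly as follows. The notation is ours, not the authors': w(i) denotes respondent i's WFO, the "Recent" terms are the two most-recent-experience indicators, and X_i collects the demographic controls (age, education, gender, race/ethnicity).

```latex
\mathrm{pFAR}_i = \beta_0 + \beta_1\,\mathrm{FAR}_{w(i)} + \beta_2\,\mathrm{RecentFA}_i
               + \beta_3\,\mathrm{Know}_i + \mathbf{X}_i'\boldsymbol{\gamma} + \varepsilon_i

\mathrm{pMER}_i = \delta_0 + \delta_1\,\mathrm{MER}_{w(i)} + \delta_2\,\mathrm{RecentME}_i
               + \delta_3\,\mathrm{Know}_i + \mathbf{X}_i'\boldsymbol{\lambda} + \nu_i
```

H6 concerns the sign of β1 and δ1, H7 the sign of β2 and δ2, and H8 the sign of β3 and δ3.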

5.5. Findings

The estimates we derived from models 5 and 6 are summarized in Table IV. Consistent with H6, the coefficients from both models indicate that objective local experience significantly influences subjective perceptions about warning system accuracy. Controlling for recent experience, knowledge, and demographics, the estimates derived from M5 indicate that a shift in the FAR from the 5th to the 95th percentile of the observed distribution is predicted to increase pFAR by 0.09, an increase of 0.29 standard deviations. A similar shift in the MER from the 5th to the 95th percentile leads to a slightly smaller change in pMER: 0.06, which corresponds to 0.18 standard deviations on the pMER scale.
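The back-of-envelope arithmetic behind this effect size can be reproduced as follows. The 5th and 95th percentiles of the WFO FAR distribution are not reported directly, so we approximate them from the reported mean (0.73) and standard deviation (0.11) under an assumed normality of the distribution; the predicted change in pFAR is then the coefficient times the percentile range.

```python
# Back-of-envelope check of the M5 effect size reported in Section 5.5.
# The 5th/95th FAR percentiles are approximated from the reported mean and SD
# assuming a roughly normal distribution -- an assumption of ours, not the paper's.

coef_far = 0.24            # M5 coefficient on the three-year WFO FAR (Table IV)
mean_far, sd_far = 0.73, 0.11
z95 = 1.645                # standard-normal 95th-percentile z-score

p05 = mean_far - z95 * sd_far   # ≈ 0.549
p95 = mean_far + z95 * sd_far   # ≈ 0.911

# Predicted change in pFAR for a shift from the 5th to the 95th FAR percentile.
effect = coef_far * (p95 - p05)
print(round(effect, 2))  # → 0.09, matching the reported shift
```

Dividing the 0.09 shift by the reported 0.29 standard deviations implies a pFAR standard deviation of roughly 0.31 on this scale.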

Testing for a recency bias (H7) produced mixed results. Controlling for three-year FAR/MER, knowledge, and demographics, pFAR was no different for respondents whose most recent experience with a tornado warning was a false alarm than for those whose most recent warning was verified. There was, however, a slight but statistically significant difference in pMER between respondents whose most recent tornado was warned and respondents who experienced a missed event the last time a tornado occurred in their region. On average, the pMER among respondents who experienced a missed event was 0.03 (0.09 standard deviations) higher than the pMER among respondents who received a warning in advance of the last tornado.

Moving on to H8, the partial regression coefficient associated with knowledge is significant in both models, which is consistent with our proposition that individuals who are unable to distinguish between watches and warnings will, on average, maintain inflated perceptions of the frequency with which their system produces errors. The differences in pFAR and pMER between those who can and cannot make this distinction are roughly −0.04 and −0.10, which equate to 0.13 standard deviations on the pFAR scale and 0.30 standard deviations on the pMER scale.

Last, these models clearly show that demographics are related to perceptions about warning system accuracy, though these relationships vary in nature. Age, for example, exerts a uniformly negative influence on both pFAR and pMER. All else equal, older respondents perceive that their warning system is more accurate than younger respondents—even if they live in the same warning region, have the same level of knowledge, and share the same set of demographic characteristics. Race/ethnicity and education, by comparison, only influenced pMER. Compared to otherwise similar non-Hispanic white respondents, Hispanic and nonwhite respondents perceive that their tornado warning system missed a higher proportion of events. The same is true of respondents who did not complete college. Differences with respect to gender, on the other hand, were only associated with pFAR. Controlling for experience, knowledge, and the other demographic characteristics, male respondents reported significantly higher pFAR scores than female respondents.

6. SUMMARY AND CONCLUSIONS

When considered in tandem, the findings reported in Sections 4.3 and 5.5 of this article provide


Fig. 3. The empirical influence of false alarms and missed events on intended responses to tornado warnings. Estimated via maximum likelihood with a logit link function for the response-on-trust regression; N = 3,146; *p < 0.001.

a revealing test of the warning system error-perception-response model discussed in Section 1. To illustrate this point, we estimated a path model that simultaneously estimates models 1, 4, 5, and 6 (described above). Fig. 3 displays the coefficient estimates we derived from this model.[14]

[14] Most recent experience, knowledge, and demographics were included as exogenous predictors of pFAR and pMER, but excluded from Fig. 3 to facilitate a comparison with Fig. 1. Results are available upon request.

Folklore, conventional wisdom, and theory suggest that errors influence individual perceptions of the accuracy of warning systems, which affect the credibility that people assign to information provided by systems, and—by extension—public decisions about how to respond when warnings are issued. As illustrated in Fig. 3, our empirical findings are consistent with each step of this logic. Relatively frequent false alarms and missed events appear to generate heightened perceptions of inaccuracy; high levels of perceived inaccuracy correspond with less trust in the NWS; and low levels of trust diminish the likelihood of intended action in response to the hypothetical warning. If intended behavior predicts future behavior, then these findings indicate that people who live in relatively error-prone areas are systematically less likely to engage in protective action in response to future tornado warnings than people who live in regions of the United States that are monitored and warned by relatively accurate WFOs.

Unfortunately, extant research says relatively little about the relationship between intended and actual responses to future tornado warnings. There is, however, an extensive research program in psychology on the link between intended behavior and actual behavior.(26,27) Meta-analytic reviews of this program consistently reveal a positive and statistically significant relationship between intended and actual future behavior.(28–30) Moreover, recent research indicates that this relationship holds in
high-stress situations created by natural disasters, like hurricanes.(31) Accordingly, we are confident that our results provide some insight into the influence of warning system accuracy on public responses to tornado warnings. Nevertheless, we recognize the need for future research on the relationship between intended and actual responses to tornado warnings.

The theory of planned behavior(27) may provide a useful framework for commencing this program of study. For example, the theory submits that—in addition to intention—behavior is strongly influenced by the extent to which people have (or think they have) control over their actions. In the context of tornado warnings, situational context may affect the amount of actual control that individuals have over their behavior. For instance, a person may plan on going to his or her shelter when a warning is issued but be unable to do so because he or she is stuck in traffic at the time. In this instance, perceptions about warning system accuracy are rendered insignificant by the situation. In other instances, perceptions about accuracy might matter because they influence the amount of control that people think they have over their ability to protect themselves from tornadoes and, consequently, their behavior when tornado warnings are issued. A person who believes that his or her warning system is highly inaccurate may perceive less control over his or her situation than a person who believes that the system is highly accurate. Again, however, these are mere conjectures that warrant additional research.

As we continue to study responsiveness to warnings, it is important to remember that the error-perception-response model is predicated on the notion that protective action is costly in that it prevents people from engaging in “normal” behaviors, like work, recreation, and/or rest. In some cases, these costs are monetary (e.g., hourly wages) and

in other cases they represent opportunity costs for interrupted or deferred activities (sleep, recreation, etc.). Either way, the model suggests that sensitivity to these costs explains why people are, on average, hesitant to heed subjectively inaccurate warnings by taking protective action. When false alarms are relatively frequent, the expected net costs of taking protective action within a given area are likely to exceed the net benefits, which is why people decide against action. Our findings are consistent with this explanation. However, these costs may differ substantially across individuals (and households), which may explain why some people are more or less likely to act upon potentially inaccurate tornado warnings. Thus, we urge additional research on the factors that influence these costs and the extent to which differing costs moderate the relationship between error perceptions and warning responsiveness.

We also encourage additional investigation into the antecedents of perceived accuracy. Our results indicate that objective accuracy is but one of many factors that influence subjective perceptions of warning system accuracy. Other factors, like knowledge and demographic characteristics, are influential as well, suggesting that perceived accuracy is a complex construct that differs within and across warning system boundaries. It may be, for instance, that culture, social networks, and/or basic values like trust in technology are important predictors of perceptions about accuracy. Similarly, it is possible that the way in which individuals receive tornado warnings influences their perceptions about the accuracy of the warning system and, as a result, the credibility they assign to it.
People who rely on television news programs that may exaggerate the risk of tornadoes, for example, may overestimate the number of errors produced by their system. This is especially likely if those people fail to differentiate between official tornado warnings issued by the NWS and cautionary statements delivered by local broadcasters.

Our research is, of course, limited by the nature of the data employed. Survey responses, based on personal recall or hypothetical scenarios depicting future events, are imperfect indicators of actual responses to real severe weather events. Indeed, given the survey data employed, the significant and sizable effects of externally measured local WFO FAR and MER rates on perceived forecasting errors are quite remarkable. However,

the functional form and magnitude of these effects, and the potential effects of contextual factors and conditions, will require additional research effort.

Future research notwithstanding, our findings carry relatively straightforward theoretical and practical implications. On the theoretical side, our results provide strong empirical support for the proposition that both types of warning system error—false alarms and missed events—influence warning response. This finding is consistent with Simmons and Sutter(11,12) and inconsistent with the majority of other studies that have empirically examined this relationship.(7–10) A number of factors may contribute to this discrepancy. For example, Simmons and Sutter focus on the long-term effect of multiple false alarms on warning response, whereas the other studies focus on the short-term effect of one or two false alarms on warning system credibility and response. Our findings suggest that there are both long-term and short-term effects, but that the short-term effect is considerably smaller than the long-term average effect. It may be that Dow and Cutter,(7) Carsell,(8) and similar studies have failed to find a relationship between false alarms and warning response because the effect of an isolated error or two is, in practice, rather negligible, whereas the effect of multiple errors over longer periods of time is more pronounced. If so, extant theory about the relationship between warning system accuracy and warning response should be refined to account for the difference between repeated and isolated exposure to false alarms and missed events.

On the practical side, our results suggest that one approach to increasing public responsiveness to tornado warnings would involve an attempt to change perceptions regarding the relative accuracy of NWS warning systems.
One way to accomplish this would involve a technical campaign to improve the objective accuracy of warning systems across the country, which—given the results presented above—would likely prompt modest increases in the credibility of warning systems as well as public responsiveness. It is important to reiterate, however, that objective accuracy is only one of the factors that influences perceived accuracy. Thus, efforts to increase perceived accuracy are more likely to be successful if technical campaigns are coupled with social campaigns. For example, the NWS could launch an information campaign recounting the dramatic improvements in warning system accuracy that have occurred over the past couple of decades and/or the accuracy of

warning systems vis-à-vis highly salient events such as the 2013 tornadoes in Oklahoma. Such actions would have the potential to influence individual perceptions about the accuracy of tornado warning systems, which—if our results are correct—will translate into a higher probability of individuals taking protective action when warnings are issued.

ACKNOWLEDGMENTS

Partial funding for this project was provided by NOAA/Office of Oceanic and Atmospheric Research under NOAA-OU Cooperative Agreement #NA11OAR4320072, U.S. Department of Commerce, and the OU Office of the Vice President for Research. The statements, findings, conclusions, and recommendations are those of the authors and do not necessarily reflect the views of NOAA, the U.S. Department of Commerce, or the University of Oklahoma.

REFERENCES

1. Aesop. Aesop’s Fables: Complete, Original Translation from Greek. Charleston, SC: Forgotten Books, 2007.
2. Breznitz S. Cry Wolf: The Psychology of False Alarms. Lawrence Erlbaum Associates, 1984.
3. Gruntfest E, Carsell K. The warning process toward an understanding of false alarms. Department of Geography and Environmental Studies, University of Colorado at Colorado Springs, 2000.
4. Weaver J, Gruntfest E, Levy G. Two floods in Fort Collins, Colorado: Learning from a natural disaster. Bulletin of the American Meteorological Society, 2000; 81(10):2359–2366.
5. Sutter D, Erickson S. The time cost of tornado warnings and the savings with storm-based warnings. Weather, Climate, and Society, 2010; 2(2):103–112.
6. Brooks H. Tornado-warning performance in the past and future. Bulletin of the American Meteorological Society, 2004; 85(6):837–844.
7. Dow K, Cutter S. Crying wolf: Repeat responses to hurricane evacuation orders. Coastal Management, 1998; 26(4):237–252.
8. Carsell K. Impacts of a false alarm: The January 29, 2000 Ventura, California experience (M.S. thesis). Department of Geography and Environmental Studies, University of Colorado at Colorado Springs, 2001.
9. Schultz D, Gruntfest E, Hayden M, Benight C, Drobot S, Barnes L. Decision making by Austin, Texas, residents in hypothetical tornado scenarios. Weather, Climate, and Society, 2010; 2(3):249–254.
10. Donner W, Rodriguez H, Diaz W. Tornado warnings in three southern states: A qualitative analysis of public response patterns. Journal of Homeland Security and Emergency Management, 2012; 9(2):Art. 5. DOI: 10.1515/1547-7355.1955.
11. Simmons K, Sutter D. False alarms, tornado warnings, and tornado casualties. Weather, Climate, and Society, 2009; 1(1):38–53.
12. Simmons K, Sutter D. Economic and Societal Impacts of Tornadoes. Boston, MA: American Meteorological Society, 2011.
13. Ashley W. Spatial and temporal analysis of tornado fatalities in the United States: 1880–2005. Weather and Forecasting, 2007; 22(6):1214–1228.
14. Couper MP. Web surveys: A review of issues and approaches. Public Opinion Quarterly, 2000; 64(4):464–494.
15. Baker R, Blumberg SJ, Brick JM, Couper MP, Courtright M, Dennis JM, Dillman D, Frankel MR, Garland P, Groves RM, Kennedy C, Krosnick J, Lavrakas PJ, Lee S, Link M, Piekarski L, Rao K, Thomas RK, Zahs D. Research synthesis: AAPOR report on online panels. Public Opinion Quarterly, 2010; 74(4):711–781.
16. AAPOR. Standard definitions: Final dispositions of case codes and outcome rates for surveys, 2011. Available at: http://aapor.org/Content/NavigationMenu/AboutAAPOR/StandardsampEthics/StandardDefinitions/StandardDefinitions2011.pdf, Accessed June 5, 2014.
17. Ripberger J, Silva C, Jenkins-Smith H, James M. The Influence of Consequence-Based Messages on Public Responses to Tornado Warnings. Norman, OK: University of Oklahoma Center for Risk and Crisis Management, 2014.
18. Fothergill A, Maestas EGM, Darlington JD. Race, ethnicity and disasters in the United States: A review of the literature. Disasters, 1999; 23(2):156–173.
19. Goebbert K, Jenkins-Smith H, Klockow K, Nowlin M, Silva C. Weather, climate, and worldviews: The sources and consequences of public perceptions of changes in local weather patterns. Weather, Climate, and Society, 2012; 4(2):132–144.
20. Nagele D, Trainor J. Geographic specificity, tornadoes, and protective action. Weather, Climate, and Society, 2012; 4(2):145–155.
21. Kalkstein A, Sheridan S. The social impacts of the heat–health watch/warning system in Phoenix, Arizona: Assessing the perceived risk and response of the public. International Journal of Biometeorology, 2007; 52(1):43–55.
22. Comstock R, Mallonee S. Comparing reactions to two severe tornadoes in one Oklahoma community. Disasters, 2005; 29(3):277–287.
23. Bateman J, Edwards B. Gender and evacuation: A closer look at why women are more likely to evacuate for hurricanes. Natural Hazards Review, 2002; 3(3):107–117.
24. Hamilton L, Keim B. Regional variation in perceptions about climate change. International Journal of Climatology, 2009; 29(15):2348–2352.
25. Bjork R, Whitten W. Recency-sensitive retrieval processes in long-term free recall. Cognitive Psychology, 1974; 6(2):173–189.
26. Fishbein M, Ajzen I. Belief, Attitude, Intention and Behavior: An Introduction to Theory and Research. Reading, MA: Addison-Wesley, 1975.
27. Ajzen I. The theory of planned behavior. Organizational Behavior and Human Decision Processes, 1991; 50(2):179–211.
28. Armitage C, Conner M. Efficacy of the theory of planned behaviour: A meta-analytic review. British Journal of Social Psychology, 2001; 40(4):471–499.
29. Schulze R, Wittmann W. A meta-analysis of the theory of reasoned action and the theory of planned behavior: The principle of compatibility and multidimensionality of beliefs as moderators. Pp. 219–250 in Schulze R, Holling H, Böhning D (eds). Meta-Analysis: New Developments and Applications in Medical and Social Sciences. Seattle, WA: Hogrefe & Huber, 2003.
30. McEachan R, Conner M, Taylor N, Lawton R. Prospective prediction of health-related behaviours with the theory of planned behaviour: A meta-analysis. Health Psychology Review, 2011; 5(2):97–144.
31. Kang J, Lindell M, Prater C. Hurricane evacuation expectations and actual behavior in Hurricane Lili. Journal of Applied Social Psychology, 2007; 37(4):887–903.
