The Laryngoscope © 2015 The American Laryngological, Rhinological and Otological Society, Inc.

Effect of Environmental Factors on Internet Searches Related to Sinusitis

Thomas J. Willson, MD; Joshua Lospinoso, PhD; Erik K. Weitzel, MD; Kevin C. McMains, MD

Objectives/Hypothesis: Sinusitis significantly affects the population of the United States, exacting direct cost and lost productivity. Patients are likely to search the Internet for information related to their health before seeking care by a healthcare professional. Utilizing data generated from these searches may serve as an epidemiologic surrogate.

Study Design: A retrospective time series analysis was performed.

Methods: Google search trend data from the Dallas–Fort Worth metro region for the years 2012 and 2013 were collected from www.google.com/trends for terms related to sinusitis based on literature outlining the most important symptoms for diagnosis. Additional terms were selected based on common English language terms used to describe the disease. Twelve months of data from the same time period and location for common pollutants (nitrogen dioxide, ozone, sulfur dioxide, and particulates), pollen and mold counts, and influenza-like illness were also collected. Statistical analysis was performed using Pearson correlation coefficients, and potential search activity predictors were assessed using autoregressive integrated moving average.

Results: Pearson correlation was strongest between the term congestion and influenza-like illness (r = 0.615), and between the term sinus and both influenza-like illness (r = 0.534) and nitrogen dioxide (r = 0.487). Autoregressive integrated moving average analysis revealed ozone, influenza-like illness, and nitrogen dioxide levels to be potential predictors for sinus pressure searches, with estimates of 0.118, 0.349, and 0.438, respectively. Nitrogen dioxide was also a potential predictor for the terms congestion and sinus, with estimates of 0.191 and 0.272, respectively.
Conclusions: Google search activity for related terms follows the pattern of seasonal influenza-like illness and nitrogen dioxide. These data highlight the epidemiologic potential of this novel surveillance method.

Key Words: Internet search, sinusitis, pollution, allergy, Google, big data.

Level of Evidence: NA

Laryngoscope, 125:2447–2450, 2015

INTRODUCTION

Sinusitis is a disease affecting a large portion of the population of the United States. It is broadly divided into two subsets: acute bacterial rhinosinusitis (ABRS) and chronic rhinosinusitis (CRS). Nearly 20 million cases of ABRS are reported each year in the United States, and the chronic subtype maintains a 14% to 16% lifetime prevalence.1–3 The true incidence of the disease is unknown, given the number of cases in individuals who never seek medical attention. Tracking disease incidence and/or prevalence can be a difficult and tedious task through epidemiological methods. Recently, investigations have probed the utility of using Internet search data to survey geographic populations for disease, most notably Google’s Flu Trends.4–7 Although imperfect, the utility of this novel epidemiologic tool has been demonstrated.

Prior reports suggest a possible link between poor air quality and an increased prevalence of sinusitis.8,9 Sulfur dioxide (SO2) and nitrogen dioxide (NO2) have been linked with coal burning and automobile exhaust, respectively. These pollutants have also been associated with both hay fever and the prevalence of sinusitis.2 Given these associations, it is unsurprising that proximity to major thoroughfares may be considered a marker for increased exposure to particulates and prevalence of sinusitis.2

In this study, we examined the relationship between measured levels of airborne pollutants, aeroallergens, and influenza-like illness (ILI) activity and Internet search activity for terms related to sinusitis in the Dallas–Fort Worth (DFW) metro region.

From the Department of Otolaryngology–Head and Neck Surgery (T.J.W., E.K.W.), San Antonio Military Medical Center, San Antonio, Texas; Portia Statistical Consulting (J.L.), San Antonio, Texas; and Department of Otolaryngology–Head and Neck Surgery (K.C.M.), South Texas Veterans Medical Center, San Antonio, Texas, U.S.A. Editor’s Note: This Manuscript was accepted for publication May 12, 2015. Presented at the Triological Society Combined Spring Meeting, Las Vegas, Nevada, U.S.A., May 15, 2014. The opinions and assertions contained herein are the private views of the authors, and are not to be construed as official, or as reflecting the views of the Department of the Army, Department of the Air Force, or Department of Defense. The authors have no funding, financial relationships, or conflicts of interest to disclose. Send correspondence to Thomas Joseph Willson, MD, San Antonio Military Medical Center, Otolaryngology, 3551 Roger Brooke Dr., San Antonio, TX 78234. E-mail: [email protected] DOI: 10.1002/lary.25420

Laryngoscope 125: November 2015

Willson et al.: Internet Searches Related to Sinusitis

Fig. 1. Sinus-related search terms with trending peaks that cluster around a spike in the regional influenza-like illness (ILI) level. Note this trend recurs in the following year at far right. CDC = US Centers for Disease Control and Prevention.

MATERIALS AND METHODS

Approval for this research was obtained from the San Antonio Military Medical Center Institutional Review Board. Internet search trend activity data were collected exclusively through Google at google.com/trends. Terms were selected based on common patient sinusitis symptom complaints (congestion, sinus pressure, snot), as well as the terms sinus and sinusitis. Search data and aeroallergen counts included activity from January 2012 through December 2013 for the DFW metro region. Pollutant and ILI data were limited to 12 months and collected for the year 2012. Two years of allergen data allowed capture of complete allergen seasons, which bridge evaluated time periods.

National Allergy Bureau daily pollen and mold counts were obtained from archived records. Because Google Trends data are produced in weekly intervals, the pollen and mold data were averaged over each week to correspond with Google data sets. Linear transformation to a 100-point scale was applied to the pollen and mold data sets to constrict the range and match the search trend data scale. The maximum value for each data set was set to 100, and the remaining data within each set were converted proportionately.

Air quality index (AQI) data were retrieved from the Environmental Protection Agency (EPA) website (www.epa.gov) utilizing sites located in the DFW metro region. AQI is calculated by the EPA based on national air quality standards and reported on a scale from 0 to 500. Based on this index, 0 to 50 is considered “good,” 51 to 100 “moderate,” 101 to 150 “unhealthy for sensitive groups,” 151 to 200 “unhealthy,” 201 to 300 “very unhealthy,” and 301 to 500 “hazardous.” This allows for a standardized way to classify differing levels of pollutants in the air.

Because DFW data on ILI were not available during the study, data for the State of Texas were obtained from the US Centers for Disease Control and Prevention (CDC) website (www.CDC.gov).

All data were maintained in spreadsheet format using Excel (Microsoft, Redmond, WA). A graphical analytic approach was used to explore the temporal relationships between search term trends, actual aeroallergen trends, AQI trends, and ILI. Pearson correlation, using SPSS (IBM, Armonk, NY), was computed between search trend data, environmental data, and ILI data to assess for relationships. A two-tailed t-test for significance of correlation was applied.
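
The data preparation and correlation steps described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' workflow (they report using Excel and SPSS). All function names are hypothetical, the 7-day grouping is an assumed reading of "averaged over each week," and the block stops at the t statistic: a two-tailed P value would additionally require the t distribution with n - 2 degrees of freedom.

```python
from math import sqrt
from statistics import mean


def weekly_means(daily, days_per_week=7):
    """Average consecutive daily counts into weekly values (assumed 7-day blocks)."""
    return [mean(daily[i:i + days_per_week])
            for i in range(0, len(daily), days_per_week)]


def rescale_to_100(values):
    """Set the maximum of the series to 100 and scale the rest proportionately."""
    peak = max(values)
    return [100.0 * v / peak for v in values]


# Upper bound of each AQI band and its label, as listed in the text.
AQI_BANDS = [
    (50, "good"),
    (100, "moderate"),
    (150, "unhealthy for sensitive groups"),
    (200, "unhealthy"),
    (300, "very unhealthy"),
    (500, "hazardous"),
]


def aqi_category(aqi):
    """Map an AQI value on the 0 to 500 scale to its category label."""
    if aqi < 0:
        raise ValueError("AQI is reported on a scale from 0 to 500")
    for upper, label in AQI_BANDS:
        if aqi <= upper:
            return label
    raise ValueError("AQI is reported on a scale from 0 to 500")


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing zero correlation, with n - 2 degrees of freedom."""
    return r * sqrt((n - 2) / (1 - r * r))
```

In use, a daily pollen series would pass through `weekly_means` and then `rescale_to_100` before being paired week by week with a search-term series in `pearson_r`.
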
Using autoregressive integrated moving average (ARIMA), a time series modeling technique, models were constructed for the search terms and the independent variables (environmental factors) to assess the ability of those variables to predict search activity. An ARIMA model predicts a value in a dependent time series based on its past values, past errors, and current and past values of other time series.10
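
The idea of predicting a dependent series from its own past and from another series can be illustrated with a deliberately simplified stand-in. The sketch below is not ARIMA and not the authors' model: it omits differencing and the moving average (past error) terms, and fits only y[t] = c + phi * y[t-1] + beta * x[t] by ordinary least squares. The function names and the model form are assumptions made for illustration.

```python
def solve_linear(a, b):
    """Solve a small square system a @ x = b by Gauss-Jordan elimination."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_lagged_regression(y, x):
    """Least-squares fit of y[t] = c + phi * y[t-1] + beta * x[t].

    y is the dependent series (for example, a weekly search index) and x is
    an external series on the same weekly grid (for example, a pollutant AQI).
    """
    if len(y) != len(x):
        raise ValueError("series must have the same length")
    # One design row per week from the second week on: [1, y[t-1], x[t]].
    rows = [[1.0, y[t - 1], x[t]] for t in range(1, len(y))]
    target = y[1:]
    k = 3
    # Normal equations: (X'X) coef = X'y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, target)) for i in range(k)]
    const, phi, beta = solve_linear(xtx, xty)
    return {"const": const, "phi": phi, "beta": beta}
```

Here `beta` plays a role loosely analogous to the predictor estimates reported for the environmental variables, but the values are not comparable with the published ones. A full ARIMA fit with external regressors would normally be done in a dedicated statistical package.
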

RESULTS

Graphical analysis of the time series showed a similar overall trend in the activity levels of the three selected sinus symptom terms (congestion, sinus pressure, snot) and ILI activity (Fig. 1). This was observed chiefly in November through March of 2012 and appeared to be trending again in November into December 2013. There is a well-defined peak in search activity for the three terms, which occurs at the height of the influenza season, in January. On analysis of graphed pollutant time series, the pollutants did not appear to exert any visible effect on the search activity.

Pearson correlation (Table I) showed several statistically significant correlations when evaluated by two-tailed t-test. Moderate to strong positive correlations were observed between the search term sinus and both ILI activity (r = 0.534; P < 0.01) and NO2 AQI (r = 0.487; P < 0.01). The search term congestion showed a strong and significant correlation with ILI (r = 0.615; P
