Environmental Research 134 (2014) 466–473

Contents lists available at ScienceDirect

Environmental Research journal homepage: www.elsevier.com/locate/envres

Multiple imputation for assessment of exposures to drinking water contaminants: Evaluation with the Atrazine Monitoring Program Rachael M. Jones a,n, Leslie T. Stayner b, Hakan Demirtas b a Division of Environmental and Occupational Health Sciences, School of Public Health, University of Illinois at Chicago, 2121 W. Taylor St. (M/C 922), Chicago, IL 60612, United States b Division of Epidemiology and Biostatistics, School of Public Health, University of Illinois at Chicago, United States

art ic l e i nf o

a b s t r a c t

Article history: Received 4 June 2013 Received in revised form 12 May 2014 Accepted 30 July 2014 Available online 29 October 2014

Background: Drinking water may contain pollutants that harm human health. The frequency of pollutant monitoring may occur quarterly, annually, or less frequently, depending upon the pollutant, the pollutant concentration, and community water system. However, birth and other health outcomes are associated with narrow time-windows of exposure. Infrequent monitoring impedes linkage between water quality and health outcomes for epidemiological analyses. Objectives: To evaluate the performance of multiple imputation to fill in water quality values between measurements in community water systems (CWSs). Methods: The multiple imputation method was implemented in a simulated setting using data from the Atrazine Monitoring Program (AMP, 2006–2009 in five Midwestern states). Values were deleted from the AMP data to leave one measurement per month. Four patterns reflecting drinking water monitoring regulations were used to delete months of data in each CWS: three patterns were missing at random and one pattern was missing not at random. Synthetic health outcome data were created using a linear and a Poisson exposure–response relationship with five levels of hypothesized association, respectively. The multiple imputation method was evaluated by comparing the exposure–response relationships estimated based on multiply imputed data with the hypothesized association. Results: The four patterns deleted 65–92% months of atrazine observations in AMP data. Even with these high rates of missing information, our procedure was able to recover most of the missing information when the synthetic health outcome was included for missing at random patterns and for missing not at random patterns with low-to-moderate exposure–response relationships. Conclusions: Multiple imputation appears to be an effective method for filling in water quality values between measurements. & 2014 Elsevier Inc. All rights reserved.

Keywords: Multiple imputation Drinking water quality Atrazine Atrazine Monitoring Program

1. Introduction The quality of drinking water provided by community water systems (CWSs) is determined by periodic measurement of contaminants in the CWS's finished drinking water. Measurement frequencies are specified in the National Primary Drinking Water Regulations (NPDWR), promulgated by the US EPA under authority of the Safe Drinking Water Act. Measurement frequencies vary among contaminants, with contaminant concentration, and with features of the CWS. Typically, water quality is measured quarterly, annually, or once every several years. Drinking water quality data are contained in the Safe Drinking Water Information System (SDWIS), and are available to the public upon request from state agencies. Recently, SDIWS data have been integrated into the Environmental Public n

Corresponding author. E-mail address: [email protected] (R.M. Jones).

http://dx.doi.org/10.1016/j.envres.2014.07.027 0013-9351/& 2014 Elsevier Inc. All rights reserved.

Health Tracking (EPHT) Network, which will increase their accessibility to the public and researchers. SDWIS data have been utilized in several epidemiological studies in the United States (Rinksy et al., 2012; Ochoa-Acuna et al., 2009; Ward et al., 2007; Weyer et al., 2001; Munger et al., 1997), and similar data have been used in other countries (Migeot et al., 2013; Chang et al., 2010; Villanueva et al., 2005). Since the water quality measurements are relatively infrequent in a calendar year, epidemiological studies have generally averaged data across years and CWSs (Rinksy et al., 2012; Weyer et al., 2001) or restricted the study population to metropolitan areas served by a single CWS, with or without supplemental monitoring programs (Munger et al., 1997; Ochoa-Acuna et al., 2009; Chang et al., 2010). These approaches, however, have significant limitations that are particularly difficult to overcome in EPHT applications. The objective of EPHT is to utilize existing data collection systems to survey and evaluate environmental health, including

R.M. Jones et al. / Environmental Research 134 (2014) 466–473

changes in environmental quality, population health and linkages between environmental quality and health. In the EPHT framework, supplemental targeted data collection is not feasible. For example, additional water quality monitoring cannot be conducted to fill in gaps between regulatory-required monitoring. Cohort epidemiological studies require large populations, so the study cannot be restricted to small areas with the most robust data without losing power to detect associations. In addition, long-term averages may introduce significant exposure misclassification in the context of non-chronic health effects, such as adverse birth outcomes (e.g., low birth weight or pre-term birth) for which the critical time-window of exposure is measured in months (Rogers and Kavlock, 2008). These limitations led us to consider statistical methods by which to estimate drinking water quality between measurements. The specific objective of this work is to evaluate whether multiple imputation is a feasible approach for estimating water quality in each calendar month when water quality is monitored quarterly, annually or less frequently. Multiple imputation is one of many statistical methods that can be used to fill in missing values, but was selected for evaluation for two reasons. Firstly, there has been relatively limited application and evaluation of the method for environmental data (Hopke et al., 2001; Le et al., 2007; Nieh et al., 2014). Though previous research has found multiple imputation to perform relatively well, even in the absence of structured time-series imputation models for time-series data (Hopke et al., 2001), the high frequency of missing data in the context of monthly drinking water quality warrants further evaluation. Secondly, multiple imputation has advantages relative to single value imputations (e.g., mean value substitution) and common likelihood-based methods (e.g., harmonic analysis or forecasting by Becker et al., 2006; Dilmaghani et al., 2007; Weerasinghe, 2010) used to fill in missing values in environmental data. Single-value imputations reduce variance in the water quality data: exploratory analysis in these data found that filling missing observations with CWS-specific annual average atrazine concentrations underestimated the exposureresponse association, even when data are missing at random (data not shown). A particular advantage of multiple imputation with respect to likelihood-based methods is that the method identifies missing data as a source of random variation distinct from ordinary sampling variability, which maintains appropriately wide standard errors for inference (Demirtas and Hedeker, 2007). The specific multiple imputation method employed in this study was Multivariate Imputation by Chained Equations (MICE) in which unique imputation models are specified for each variable with missing values and imputed sequentially (van Buuren, 2012). This method of multiple imputation has not been previously applied to environmental data, to our knowledge. The MICE algorithm is implemented as follows (White et al., 2011). Consider a data set with x1 ; x2 ; …xk variables. Initially, all missing values are replaced by random sampling from observed values. For variable xi with missing values, xi is regressed on the other variables using the specified Table 1 Summary of AMP data used in the simulation study. The geometric mean (GM) and the geometric standard deviation (GSD) were estimated using the method of maximum likelihood with consideration for censoring at detection limits 0.05 μg/L (2007–2009) and 0.1 μg/L (2006).

Estimated Maximum

467

imputation model, and the missing values in xi are replaced by draws from the posterior predictive distribution of xi. In one cycle, this is repeated for all of the xi, i ¼ f1; 2; …; kg, variables with missing values; and the cycle is repeated several times to produce an imputed data set. This process is repeated m times, to give m imputed data sets. MICE has the advantage of not requiring a joint distribution, such that different types of variables can be multiply imputed using appropriate regression models and subsets of the variables. The objective was approached through simulation. To allow the performance of multiple imputation to be evaluated relative to real data, the simulation was performed using data from the Atrazine Monitoring Program (AMP), rather than from SDWIS. The AMP is a special program managed by the US EPA, which measures the concentration of atrazine in CWSs known to be heavily impacted by atrazine approximately every two weeks. There are thousands of CWS in the Midwest (Jones et al., 2014) of which 90 CWSs in Illinois, Indiana, Iowa, Missouri and Ohio participated in the AMP at some time during the years 2006– 2009. Owing to the relatively high-frequency of measurement in the AMP CWSs, it was possible to create missing values by deleting measurements in these data, and to compare inferences made from multiply imputed data to inferences from the real data.

2. Methods 2.1. Atrazine Monitoring Program data The AMP monitors the concentration of atrazine and related chemicals in CWSs vulnerable to atrazine pollution. Data are available online from the EPA Office of Pesticide Programs. The AMP continues monitoring campaigns conducted by private companies and professional associations (Graziano et al., 2006). Participating CWSs monitor finished (treated) and raw (untreated) drinking water approximately every two weeks: frequency may increase in the spring and summer, and decrease in the winter. In general, atrazine concentrations decrease over time within each CWS, until the CWS is removed from the AMP. The AMP data include the CWS name, and the county and state in which each CWS is located. For the years 2006–2009, 90 CWSs in Illinois, Indiana, Iowa, Missouri and Ohio ever participated in the AMP. Not every CWS participated in the AMP every year 2006–2009. The years 2006–2009 and CWSs in Midwestern states were selected because they fall into the time period and geographical area of interest for our larger study, and have consistent analytical detection limits in each CWS. During this period, the AMP consistently included the agents atrazine, simazine, deethlatrazine (DEA), and deisopropylatrazine(DIA). Simazine is a related pesticide, while DEA and DIA are metabolites of atrazine and simazine. The limit of detection for all chemicals in 2006 was 0.1 μg/L , and in 2007–2009 was 0.05 μg/L. The AMP data were prepared as follows. Random deletion was used to retain only one value in each calendar month for each CWS. In each calendar year, a CWS was excluded if there were fewer than 6 months with observations; if 6–11 months had observations rows were added to indicate missing values. As a result, there were 289 CWS-years, and 143 rows (4.2%) with missing observations of chemicals in finished water and 186 rows (5.4%) with missing observations in raw water. This approach was taken because elimination of CWS-years with observations missing in any month would reduce the data substantially, and introduce bias if the missing observations were not missing at random. The data used in the simulation study are summarized in Table 1. Lognormality of the chemical concentrations is indicated by the high GSD values in Table 1, and was confirmed by quartile–quartile plots (not shown). Many of chemicals are Table 2 Pearson's correlation of log-transformed chemical concentrations of AMP data used in the simulation study. Atrazine and simazine were measured in finished (F) and raw (R) drinking water. All correlation tests have p o 0:05.

Chemical

Water Type

GM (μg/L)

GSD

(μg/L)

N

Percent Censored

Chemicals

Atrazine-F

Atrazine-R

Simazine-F

Simazine-R

DIA

DEA

Atrazine Atrazine Simazine Simazine DIA DEA

Finished Raw Finished Raw Finished Finished

0.159 0.338 0.002 0.048 0.034 0.062

4.29 4.10 8.82 7.89 4.30 3.39

17.8 34.0 15.98 21.6 2.51 2.20

3325 3282 3325 3282 3325 3325

18 8 68 55 66 51

Atrazine-F Atrazine-R Simazine-F Simazine-R DIA DEA

1.00 0.70 0.20 0.11 0.44 0.72

1.00 0.04 0.12 0.24 0.49

1.00 0.57 0.47 0.18

1.00 0.40 0.08

1.00 0.47

1.00

468

R.M. Jones et al. / Environmental Research 134 (2014) 466–473

correlated (Table 2), which suggests that they will be useful predictors in the imputation model.

2.2. Patterns of missingness Four patterns of missing values — three missing at random (MAR) and one missing not at random (MNAR) mechanisms — were defined to reflect the monitoring frequencies for atrazine and nitrate specified in the NPDWR and observed in SDWIS data (Jones et al., 2014). Atrazine monitoring is typically required quarterly, but may be decreased to once every three years when concentrations are consistently below the maximum contaminant level, 3 μg/L. Nitrate monitoring is typically required annually. Variations to these monitoring frequencies are allowed for some CWSs. We consider the nitrate monitoring frequency because our larger study is concerned with both atrazine and nitrogen (nitrate and/or nitrite) in drinking water, and the two chemicals reflect commonly required monitoring schedules. Pattern A is an MAR pattern introduced into the real monthly data by deleting concentrations of each chemical (atrazine and simzine in raw and finished water, DIA and DEA) for two randomly selected months in each calendar quarter for each CWS. Pattern A leaves one observation per calendar quarter, which is the most frequent monitoring required by NPDWR for atrazine. This pattern produces approximately 65% missingness for atrazine. Pattern B is another MAR pattern introduced into the real monthly data by deleting atrazine concentrations in raw and finished water for 11 randomly selected months in each calendar year for each CWS. The other chemicals were deleted as in Pattern A. Pattern B leaves one atrazine observation in finished water per calendar year, which is the most frequent monitoring required by NPDWR for nitrate. This pattern produces approximately 92% missingness for atrazine and approximately 65% missing for simazine, DIA and DEA. Pattern C is another MAR pattern introduced into the real monthly data by deleting concentrations of each chemical (atrazine and simzine in raw and finished water, DIA and DEA) for 11 randomly selected months in each calendar year for each CWS. Pattern C leaves one observation per calendar year, which is the most frequent monitoring required by NPDWR for nitrate. This pattern produces approximately 92% missingness for all chemicals. Pattern C, relative to Pattern B, explores the influence of the frequency of measurement of other water contaminants on the imputation of atrazine. Pattern D is an MNAR pattern designed to reflect a two-tier regulation in which the dependence of monitoring frequency on atrazine concentration: less frequent monitoring is required of CWSs with lower atrazine concentrations. The CWSs were divided into two tiers based on the overall mean of the real atrazine concentration in each CWS. Tier 1 included 50% of CWSs with overall mean atrazine concentrations equal to or greater than the median, where the mean atrazine concentration was calculated after substitution of censored values: Pattern A was introduced to atrazine in raw and finished water in these data. Tier 2 included the remaining CWSs, and Pattern B was introduced to atrazine in raw and finished water in these data. For the other chemicals, Pattern C was introduced, leaving one observation per calendar year per CWS. This pattern produces approximately 79% missingness for atrazine. Pattern D is a MNAR pattern because better water quality (lower contaminant concentrations) in a CWS results in more missing values for the respective CWS. Though the focus of the imputation is atrazine, missing values were introduced for the other chemicals to reflect that other water contaminants are also measured on specified time intervals — e.g., quarterly or annually. The random deletion of observations was done separately for each chemical so that values for each chemical were not always present (or always missing) in the same calendar months.

3.5

2.3. Synthetic health outcomes Linear regression and Poisson regression were used to develop synthetic health outcomes, denoted Y. A synthetic health outcome was used because it was the only way to control the strength of association with water quality data. In addition, real health outcome data and co-variates were not available from our larger study because the unit of measure is the county, whereas the unit of measure of water quality is CWSs. Each county may contain multiple CWSs. These two regression methods were chosen to reflect approaches used for continuous rates and for count outcomes. In our larger study, for example, Poisson regression is used to model the observed number of adverse birth outcomes in each county (see Almberg et al. 2014). Both the linear and Poisson regression models were implemented in their simplest forms, and in epidemiological applications more complex error structures and other modifications may be appropriate. The linear regression model was Y ¼ β0 þ β1 X þ ϵ

ð1Þ

where β0 ¼ 1, β1 is the real association between water quality and adverse health, X is a vector containing the logarithm of the real atrazine concentrations in finished water (ln μg/L) measured in each month in each CWS, and the random error ϵ  Nðμ ¼ 0; σ ¼ 2Þ. We considered β1 ¼ f0; 0:3; 0:6; 0:9; 1:2g. The Poisson regression model of the mean values of a series of Poisson random variables, m, was m ¼ expðβ0 þ β1 X þ ϵÞ

ð2Þ

where β0 ¼ 0, β1 is the real association between water quality and adverse health, X is a vector containing the logarithm of the real atrazine concentrations (ln μg/L) measured in each month in each CWS, and the random error ϵ  Nðμ ¼ 0; σ ¼ 0:1Þ. The integer health outcome variable Y was determined by randomly sampling from a Poisson distributed variable with mean m. We considered β1 ¼ f0; 0:1; 0:2; 0:3; 0:4g. In both models, for the 4.2% of CWS-months in which no real atrazine concentrations were available, the health outcome was randomly selected with replacement from the distribution of health outcomes estimated for the other CWS-months.

2.4. Multiple imputation implementation Multiple imputation (MI) was implemented using Multivariate Imputation by Chained Equations in the R Project for Statistical Computing (van Buuren, 2012; van Buuren and Groothuis-Oudshoorn, 2012). We performed imputation of the logarithm of all chemical concentrations using linear regression with predictive mean matching (ppm method) and a fully specified imputation model. A linear regression with bootstrap (norm.boot method) was preferred, but yielded singularity errors. The order of imputation was DIA, DEA, simazine in finished water, atrazine in raw water, simazine in raw water and atrazine in finished water. Atrazine in finished water was imputed last because that was the primary variable of interest for inference, and by being imputed last, the imputed values of the other variables could be utilized. A total of m¼5 imputed data sets were generated. Predictor variables in the imputation model were identified based on hypothesized associations with atrazine concentrations. Predictor variables used in the final models include fraction of county land used for agriculture, fraction of county land used for corn production, average atrazine concentration in groundwater in the county over 1990–2000, month (factor), month-standardized, peak season indicator (equal to 1 if sample was collected between May and August, and 0 otherwise), CWS-year identification (random effect), the synthetic health outcome, and the

System A System B

Atrazine Concentration (ppb)

3.0 2.5 2.0 1.5 1.0 0.5 0.0 2005

2006

2007

2008

2009

2010

Calendar Time (years)

Fig. 1. Atrazine concentrations in finished drinking water exhibit temporal trends in two randomly selected community water systems.

R.M. Jones et al. / Environmental Research 134 (2014) 466–473

469

Table 3 Simulation results for multiple imputation method with 400 replicates, with linear regression health model.

β1

β1

Raw bias

Relative bias (%)

Deletion Pattern A 0.0  0.001 0.3 0.288 0.6 0.579 0.9 0.087 1.2 1.18

 0.001 0.012  0.021  0.027  0.021

–  3.97  3.43  3.00  1.78

Deletion Pattern B 0.0  0.003 0.3 0.262 0.6 0.509 0.9 0.789 1.2 1.06

 0.003  0.038  0.090  0.111  0.135

Deletion Pattern C 0.0  0.003 0.3 0.252 0.6 0.499 0.9 0.782 1.2 1.06 Deletion Pattern D 0.0 0.001 0.3 0.288 0.6 0.576 0.9 0.894 1.2 1.21

Real

Standardized bias (%)

λ

Coverage rate (%)

Mean CI width

Mean

RMSE

4.19 38.3 66.7 85.9 67.2

96.3 93.5 92.8 87.5 90.8

0.151 0.143 0.139 0.136 0.129

0.524 0.502 0.490 0.465 0.424

0.028 0.023 0.033 0.037 0.034

–  12.6  15.0  12.3  11.3

4.72 58.8 150 181 247

94.5 93.3 80.0 70.3 55.0

0.340 0.326 0.315 0.311 0.293

0.859 0.849 0.834 0.822 0.798

0.061 0.067 0.097 0.113 0.130

 0.003  0.048  0.101  0.118  0.140

–  15.9  16.8  13.1  11.7

4.62 69.4 158 177 231

94.3 92.0 74.8 64.3 57.3

0.362 0.336 0.320 0.332 0.304

0.865 0.855 0.836 0.828 0.811

0.060 0.075 0.106 0.121 0.136

0.001  0.012  0.024  0.006 0.013

–  3.87  4.19  0.641 1.05

2.57 23.9 53.5 13.2 31.4

94.5 95.8 94.5 92.8 90.0

0.251 0.247 0.221 0.184 0.158

0.732 0.735 0.684 0.595 0.512

0.046 0.045 0.046 0.039 0.038

Table 4 Simulation results for multiple imputation method with 400 replicates, with the Poisson regression health model.

β1

β1

Raw bias

Relative bias (%)

Deletion Pattern A 0.0 0.001 0.1 0.097 0.2 0.193 0.3 0.292 0.4 0.388

0.001  0.003  0.007  0.008 0.020

–  2.80  3.56  2.71  3.10

Deletion Pattern B 0.0 0.001 0.1 0.089 0.2 0.185 0.3 0.279 0.4 0.372

 0.001  0.011  0.015  0.021  0.028

Deletion Pattern C 0.0 0.001 0.1 0.082 0.2 0.166 0.3 0.251 0.4 0.333 Deletion Pattern D 0.0  0.001 0.1 0.097 0.2 0.191 0.3 0.294 0.4 0.397

Real

Standardized bias (%)

λ

Coverage rate

Mean CI width

Mean

4.64 16.7 36.5 39.9 63.2

95.5 96.0 93.5 96.3 92.5

0.091 0.081 0.084 0.093 0.093

0.553 0.528 0.500 0.517 0.473

0.018 0.015 0.019 0.020 0.021

–  10.7  7.32  7.07  7.03

1.59 39.2 47.5 66.5 80.9

95.3 93.8 92.8 92.3 90.8

0.157 0.143 0.154 0.162 0.174

0.789 0.782 0.776 0.770 0.771

0.028 0.026 0.031 0.034 0.040

0.001  0.017  0.034  0.049  0.067

–  17.5  17.2  16.3  16.8

3.11 49.1 85.8 119 165

96.5 93.5 89.8 85.8 75.8

0.210 0.185 0.199 0.217 0.210

0.870 0.859 0.853 0.860 0.832

0.035 0.035 0.047 0.057 0.070

 0.001  0.003  0.009  0.006  0.003

–  2.64  4.27  2.11  0.642

4.34 10.0 29.9 20.7 8.15

96.5 96.3 93.8 94.0 95.0

0.148 0.137 0.142 0.146 0.151

0.744 0.737 0.702 0.684 0.668

0.025 0.024 0.027 0.028 0.028

chemical concentrations in water. Agricultural land use was included as atrazine is applied in commercial agriculture, particularly to corn crops. Calendar time variables were included to reflect temporal variation in atrazine concentration in finished drinking water (Fig. 1) and application to crops. Administrative variables reflect differences among CWSs owing to water treatment practices. We also considered the predictors year, quarter, state (factor), county (factor), and source water type (factor); but excluded these variables because they provided no

RMSE

additional information or were not found to be associated with finished water quality and exhibited little variability (e.g., source water type was surface water in 4 90% of CWSs). The average atrazine concentration in groundwater was calculated for each county using the regression model of Stackelberg et al. (2012), which considers local geology, hydrology, rainfall and other environmental factors. A single groundwater value was used for all years and CWSs in each county because the model was

470

R.M. Jones et al. / Environmental Research 134 (2014) 466–473

Table 5 Simulation results for multiple imputation implemented with the linear regression health model at different levels of random error:

β1

β1

σ ¼0.2 and σ ¼ 8. λ

Raw bias

Relative bias (%)

Standardized bias (%)

Coverage rate

Mean CI width

Mean

Deletion Pattern A, σ ¼0.2 0.0 0.000 0.6 0.599 1.2 1.02

0.000  0.001  0.184

–  0.102 14.3

5.49 15.8 1220

93.8 95.0 0.0

0.037 0.018 0.089

0.526 0.611 0.623

0.007 0.004 0.165

Deletion Pattern A, σ ¼8 0.0  0.004 0.6 0.569 1.2 1.16

 0.004  0.031  0.040

–  4.15  3.32

3.38 24.2 32.7

96.8 94.5 96.3

0.575 0.584 0.579

0.504 0.515 0.513

0.109 0.117 0.114

Real

not designed to be time-varying: the primary variable in the model that can be time-varying is rainfall, but the model was developed and implemented by the investigators at the USGS using aggregated data over 10 years. We considered updating the rainfall data, but found that this would not substantially alter the estimated atrazine concentrations. The synthetic health outcome was included based on the philosophy that all relevant information should be included in the imputation model: exclusion of an important predictor of missing variables may bias the imputation, and produce incorrect inference. Were this to occur, then the analysis model would not be congenial (Meng, 1994; Schaffer, 1999).

2.5. Simulation Each replication of the MI simulation included the following steps: (1) replacement of values r LOD in the complete real data, (2) synthesis of health outcome data, (3) imposition of the missingness pattern on the chemical concentrations, (4) imputation of missing values with m¼ 5, (5) fitting of generalized linear regression model (Eqs. (1) and (2)) to predict the health outcome from imputed data, and (6) pooling of regression coefficient estimates using Rubin's rules (Rubin, 1987). To explain step (1) in more detail, replacement of values r LOD followed the approach of Lubin et al. (2004): (a) a random sample of the complete real data was drawn with replacement, (b) a lognormal distribution was fit by maximum likelihood considering the censored values using the cenmle method of Helsel (2005), (c) censored observations were replaced by values selected from the corresponding percentiles of the lognormal distribution. This approach generated unique replacements for censored observation in each replication. Simulation across 400 replications yielded 400 average estimates of the exposure–response regression coefficient,

β1,

1

2

400

denoted B1 ¼ fβ^ 1 ; β^ 1 ; …; β^ 1 g. The mean and standard

deviation of the B1 estimates are denoted β 1 and SD ½B1 , respectively. Performance of MI was summarized using the metrics outlined by Demirtas (2004): average confidence interval width, average fraction of missing information (λ), average root-mean-square error, coverage rate, raw bias, relative bias, and standardized bias. The parameter λ is a measure of the variation in β^ attributable 1

to the missing atrazine data (van Buuren, 2012). As λ increases, more information about β^ 1 is drawn from the imputed values than the observed values. Root-meansquare error is a combined measure of bias and variance, and is averaged across the qffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi ^i 2 1 ∑400 replications: RMSE ¼ 400 i ¼ 1 ðβ 1  β 1 Þ . The coverage rate is the percentage of i

simulation replicates in which the 95% confidence interval of β^ 1 includes β1. Coverage rates of approximately 95% indicate appropriate control of Type I error (Demirtas,

2004).

Raw

bias

is

the

difference

β 1  β1 . Relative bias is

100%  jβ 1  β 1 j=β1 , while standardized bias is 100%  jβ1  β 1 j=SD½B1 .

3. Results Simulation results for the linear and Poisson health regression models are shown in Tables 3 and 4, respectively. It is appropriate that the coverage rate when β1 ¼ 0 equal approximately 95%, to control for type I error: by random chance an association may be observed even when there is no real association. Comparing results among deletion patterns A–C indicates the influence of the frequency of missing observations on the imputation performance for observations MAR. For both linear and Poisson exposure–response models, performance of imputation was best for Pattern A, indicated by the relatively low bias, high

RMSE

coverage rates and low RMSE. This makes sense given that the data are relatively abundant in Pattern A, with one observation for each chemical in each calendar quarter. Patterns B and C have atrazine observations once per calendar year, but have observations for the other chemicals once per quarter or once per calendar year, respectively. The presence of more observations among the other chemicals improves the imputation performance for higher values of β1. The imputation performed well in the MNAR pattern, Pattern D: the RMSEs were similar to those for Pattern B, while the bias and coverage rates were consistent with (or better than) results for Pattern A. The performance of the imputation relative to the magnitude of exposure–response relationships was tested using a range of real β1 values in both the linear and Poisson exposure–response models. The magnitude of the relative bias was relatively consistent for all values of β1 within each deletion pattern, though the absolute magnitude of the raw bias increased with β1. In addition, the coverage rates decreased with increased β1, particularly Patterns B and C. Figs. 2 and 3 compare the distributions of the estimated exposure-response relationship coefficients with the real values, and show that the variance and bias in the estimated coefficients increase with more missing observations. These trends suggest that the imputation model may not be sufficiently rich, or the data may not have sufficient variability. As the exposure– response relationship increases, more information is available in the imputation model, which can reduce the variance due to missingness and result in smaller between-imputation variability. This can be addressed by increasing the number of predictor variables in the imputation model, or by increasing the variability in the predictor variables. Additional predictor variables were not identified for testing in this study. The impact of the latter approach was demonstrated by changing the variability in the linear synthetic health data from σ ¼ 2 to σ ¼0.2 or σ ¼8 (Table 5). Increased values of σ increase the confidence interval width and coverage rate with little influence on the accuracy of β 1 . Overall, while there are instances where the coverage rates are below 90%, the bias in β 1 is relatively small for all deletion patterns. While some metrics suggest that the performance of the multiple imputation was better for the Poisson than for the linear exposure–response model, the results should not be directly compared because the models have different health outcome measures and different values for β1. Overall, the performance of the multiple imputation method is strikingly good given that 65–92% of values were missing.

4. Discussion Our objective was to evaluate the performance of multiple imputation for the estimation of drinking water quality at time points between measurements. Evaluation of the method for this

R.M. Jones et al. / Environmental Research 134 (2014) 466–473

Deletion Pattern B Estimated Coefficients

Estimated Coefficients

Deletion Pattern A 1.0 0.5 0.0 0

0.3 0.6 0.9 Real Association

1.0 0.5 0.0 0

1.2

1.0 0.5 0.0 0.3 0.6 0.9 Real Association

0.3 0.6 0.9 Real Association

1.2

Deletion Pattern D Estimated Coefficients

Estimated Coefficients

Deletion Pattern C

0

471

1.0 0.5 0.0

1.2

0

0.3 0.6 0.9 Real Association

1.2

Fig. 2. Distribution of estimated exposure-response coefficients in the linear regression model. The gray triangles denote the real hypothesized associations, β1 = {0, 0.3, 0.6, 0.9, 1.2}.

Deletion Pattern B Estimated Coefficients

Estimated Coefficients

Deletion Pattern A 0.6 0.4 0.2 0.0 -0.2 0

0.1 0.2 0.3 Real Association

0.6 0.4 0.2 0.0 -0.2

0.4

0

0.6 0.4 0.2 0.0 -0.2 0

0.1 0.2 0.3 Real Association

0.4

Deletion Pattern D Estimated Coefficients

Estimated Coefficients

Deletion Pattern C

0.1 0.2 0.3 Real Association

0.4

0.6 0.4 0.2 0.0 -0.2 0

0.1 0.2 0.3 Real Association

0.4

Fig. 3. Distribution of estimated exposure-response coefficients in the Poisson regression model. The gray triangles denote the real hypothesized associations, 0.2, 0.3, 0.4}.

application is necessary because the NPDWRs require periodic monitoring on quarterly, annual, or multi-year time-scales, though more frequent values may be of interest for epidemiological studies. These regulations leave many months without measurements, and these months are not randomly distributed throughout the calendar year. Our approach was to evaluate the performance of multiple imputation using water quality data collected through the AMP. In the AMP, water quality is typically measured more frequently then monthly, so real values were available and could be compared to imputed values. The AMP data have been used by other investigators for the evaluation of statistical methods intended for SDIWS

β1 = {0, 0.1,

data, and found to provide equivalent results (Mosquin et al., 2012). Relative, however, to the 4 7000 CWSs in eight Midwestern states, CWSs participating in the AMP tend to have higher atrazine concentrations and exhibit more temporal variability than nonparticipating CWSs (Jones et al., 2014). The MAR and MNAR patterns of missingness introduced into AMP data for multiple imputation are based on NPDWRs for atrazine and nitrates, and leave 65–92% of months without water quality measurements. This rate of missingess is very high for multiple imputation applications. Hopke et al. (2001), for example, addressed missingness rates of 7–20% (and left-censoring rates of 0–53%) in Arctic air pollution data.

472

R.M. Jones et al. / Environmental Research 134 (2014) 466–473

The primary finding of this study is that the multiple imputation method can effectively fill in missing values in drinking water quality data. Even with the large fraction of missing information, our procedure was able to recover most of the missing information in the AMP data, even when values are MNAR. The imputation model employed included other chemical contaminants related to atrazine — simazine, DIA, and DEA — and atrazine measurements in raw drinking water. In preliminary analyses, we had not included these other variables in the atrazine model, and in the context of a linear exposure–response relationship we observed similar performance of the imputation method. In general, however, a richer imputation model is preferred to capture associations among variables, and to ensure congeniality with the analysis model (Meng, 1994; Schaffer, 1999). Towards this end, we sought additional time-varying environmental variables for the imputation model, such as rainfall. Rainfall, however, is a challenging variable to develop because the relationship between rainfall and atrazine concentrations is complex and highly dependent upon local hydrogeology and climate, such that the best source of data and rainfall metric is not obvious (Borggaard and Gimsing, 2008; Kookana et al., 2010; Stackelberg et al., 2012). A related limitation to the imputation model in this simulation is the lack of health-related variables, like risk factors, that would be available in an epidemiological study, and used in the analysis model. These variables should be included in the imputation model when available, particularly when they will be included in the analysis model. The multiple imputation method did not explicitly model the data as a time-series, despite the fact that atrazine concentrations in CWSs in the AMP generally exhibit seasonality. An explicit timeseries model was considered for the imputation model, but was not implemented for several reasons. First, seasonality is highly dependent upon the contaminant in drinking water, and even when a contaminant exhibits seasonal variations, the magnitude and timing of the seasonal changes may vary among CWSs and years owing to local hydrogeology, source water, and rainfall. Second, while a time-series model may be more conceptually appropriate, it has not been found to always improve imputation or inference (Hopke et al., 2001). Third, the infrequency of measurements in the SDWIS data for selected contaminations (annually or greater) were judged to be insufficient to support modeling of seasonal changes in water quality. And finally, to the extent that successive measurements are correlated, the multiple imputation model implemented in this study considers data in other months to impute a value missing in one month. Ultimately, our objective is to fill in monthly values of drinking water quality for atrazine and nitrate among 4 7000 CWSs in eight Midwestern states. Our results suggest that the multiple imputation method is a feasible approach, and will perform well as long as the monthly values are MAR, or are MNAR and the exposure–response relationship is modest in size (e.g., β 1 r0:9). Review of atrazine measurements in SDWIS data indicates that the data are MNAR owing to the concentration-dependency of the monitoring frequency in the NPDWRs (Jones et al., 2014). Strong associations are not expected in our epidemiologic analysis owing to the ecological design, and modest associations observed in previous studies (Rinksy et al., 2012; Ochoa-Acuna et al., 2009; Munger et al., 1997). The SDWIS data pose an additional challenge not present in the AMP data: the rate of left-censoring of atrazine concentrations (89%) is too high to apply maximum likelihood estimation to obtain numerical values for left-censored results (Jones et al., 2014; Lubin et al., 2004). We have found that the majority of CWSs always report undetectable concentrations of atrazine, and the atrazine concentration in some of these CWSs is plausibly zero due to the types of crops grown. This suggests that a two-tier

approach to multiple imputation may be necessary for the SDWIS data in which data from CWSs that ever quantified atrazine in finished drinking water are multiply imputed separately from those CWSs that never quantified atrazine. The application of the multiple imputation method to drinking water quality data in SDWIS remains non-trivial. The success of the multiple imputation method in this evaluation, however, indicates that our procedure can be adapted to address the problem of missing values in drinking water quality data, and to environmental pollutants more generally.

5. Conclusions To summarize, our multiple imputation method was able to recover information about the concentration of atrazine in finished drinking water despite high rates of missing values (65–92%) and when values were missing not at random. This finding suggests that multiple imputation is an appropriate, effective method for filling in environmental pollutant data at time points between measurements.

Acknowledgments We would like to acknowledge Kirsten Almberg and Raja Kaliappan, who prepared the crop density and groundwater quality variables, respectively; and Fei Shi, who assisted with the computer programming in the early stages of this research. Funding sources: This work was funded by the Centers for Disease Control and Prevention Contract #200-2010-37442. References Almberg, K., Turyk, M., Jones, R.M., Anderson, R., Graber, L., Waller, L., Gibson, R., Stayner, L.T., 2014. A linkage study of agricultural land use in the Midwest and adverse birth outcomes. Environ. Res., in press. Becker, S., Halsall, C.J., Tych, W., Hung, H., Attewell, S., Blanchard, P., Li, H., Fellin, P., Stern, G., Billeck, B., Frisen, S., 2006. Resolving the long-term trends of polycyclic aromatic hydrocarbons in the Canadian arctic atmosphere. Environ. Sci. Technol. 40, 3217–3222. Borggaard, O., Gimsing, A., 2008. Fate of glyphosate in soil and the possibility of leaching to ground and surface waters: a review. Pest Manag. Sci. 64, 441–456. Chang, C., Chen, C., Wu, D., Yang, C., 2010. Nitrates in drinking water and risk of death from rectal cancer: does hardness in drinking water matter? J. Toxicol. Environ. Health A 73, 1337–1347. Demirtas, H., 2004. Simulation driven inferences for multiply imputed longitudinal datasets. Stat. Neerl. 58, 466–482. Demirtas, H., Hedeker, D., 2007. Gaussianization-based quasi-imputation and expansion strategies for incomplete correlated binary responses. Stat. Med. 26, 782–799. Dilmaghani, S., Henry, I., Soonthornnanoda, P., Christensen, E., Henry, R., 2007. Harmonic analysis of environmental time series with missing data or irregular sample spacing. Environ. Sci. Technol. 41, 7030–7038. Graziano, N., McGuire, M., Roberson, A., Adams, C., Jiang, H., Blute, N., 2006. 2004 National atrazine occurrence monitoring program using the Abraxis ELISA method. Environ. Sci. Technol. 40, 1163–1171. Helsel, D., 2005. Nondetects and Data Analysis: Statistics for Censored Environmental Data. John Wiley and Sons, Hoboken, NJ. Hopke, P., Liu, C., Rubin, D., 2001. Multiple imputation for multivariate data with missing and below-threshold measurements: time-series concentrations of pollutants in the Arctic. Biometrics 57, 22–33. Jones, R., Graber, J., Anderson, R., Rockne, K., Turyk, M., Stayner, L., 2014. Community drinking water quality monitoring data: utility for public health research and practice. J. Public Health Pract. Manag. 20 (2), 210–219. Kookana, R., Holz, G., Barnes, C., Bubb, K., Fremlin, R., Boardman, B., 2010. Impact of climatic and soil conditions on environmental fate of atrazine used under plantation forestry in Australia. J. Environ. Manag. 91, 2649–2656. Le, H., Batterman, S., Wahl, R., 2007. Reproducibility and imputation of air toxics data. J. Environ. Monit. 9, 1358–1372. Lubin, J., Colt, J., Damann, D., Davis, S., Cerhan, J., Severson, R., Bernstein, L., Hartage, P., 2004. Epidemiologic evaluation of measurement data in the presence of detection limits. Environ. Health Perspect. 112, 1691–1696. Meng, X., 1994. Multiple-imputation inferences with uncongenial sources of input. Stat. Sci. 9, 538–573.

R.M. Jones et al. / Environmental Research 134 (2014) 466–473

Migeot, V., Albouy-Llaty, M., Carles, C., Limousi, F., Strezlec, S., Dupuis, A., Rabouan, S., 2013. Drinking water exposure to a mixture of nitrate and low-dose atrazine metabolites and small-for-gestational age (SGA) babies: a historic cohort study. Environ. Res. 122, 58–65. Mosquin, P., Whitmore, R., Chen, W., 2012. Estimation of upper centile concentrations using historical atrazine monitoring data from community water systems. J. Environ. Qual. 41, 834–844. Munger, R., Isacson, P., Hu, S., Burns, T., Hanson, J., Lynch, C., Cherryholmes, K., Dorpe, P.V., Hausler Jr., W.J., 1997. Intrauterine growth retardation in Iowa communities with herbicide-contaminated drinking water supplies. Environ. Health Perspect. 105, 308–314. Nieh, C., Dorevitch, S., Liu, L., Jones, R., 2014. Evaluation of imputation methods for microbial surface water quality studies. Environ. Sci. Process. Impacts 16, 1145–1153. Ochoa-Acuna, H., Frankenberger, J., Hahn, L., Carbajo, C., 2009. Drinking-water herbicide exposure in Indiana and prevalence of small-for-gestational-age and preterm delivery. Environ. Health Perspect. 117, 1619–1624. Rinksy, J., Hepnhayan, C., Golla, V., Browning, S., Bush, H., 2012. Atrazine exposure in public drinking water and preterm birth. Public Health Rep. 127, 72–80. Rogers, J., Kavlock, R., 2008. Developmental toxicology. In: Kalsen, C. (Ed.), Casarett and Doull's Toxicology, The Basic Science of Poisons, 7th ed. McGraw Hill, Illinois, pp. 415–452. Rubin, D., 1987. Multiple Imputation for Nonresponse in Surveys. Wiley, New York.

473

Schaffer, J., 1999. Multiple imputation: a primer. Stat. Methods Med. Res. 8, 3–15. Stackelberg, P., Barbash, J., Billiom, R., Stone, W., Wolock, D., 2012. Regression models for estimating concentrations of atrazine plus deethylatrazine in shallow groundwater in agricultural areas of the United States. J. Environ. Qual. 41, 479–494. van Buuren, S., 2012. Flexible Imputation of Missing Data. Chapman & Hall/CRC Press, Boca Raton, FL. van Buuren, S., Groothuis-Oudshoorn, K., 2012. Multivariate imputation by chained equations: Package ‘mice’ version 2.15 〈http://cran.r-project.org/web/packages/ mice/mice.pdf〉. Villanueva, C., Durand, G., Coutte, M., Cevrier, C., Cordier, S., 2005. Atrazine in municipal drinking water and risk of low birth weight, preterm delivery, and small-for-gestational-age status. Occup. Environ. Med. 62, 400–405. Ward, M., Rusiecki, J., Lynch, C., Cantor, K., 2007. Nitrate in public water supplies and risk of renal cell carcinoma. Cancer Causes Control 18, 1141–1151. Weerasinghe, S., 2010. A missing values imputation method for time series data: an efficient method to investigate the health effects of sulphur dioxide levels. Environmetrics 21, 162–172. Weyer, P., Cerhan, J., Kross, B., Hallberg, G., Kantamneni, J., Breuer, G., et al., 2001. Municipal drinking water nitrate level and cancer risk in older women: the Iowa Women's Health Study. Epidemiology 11, 317–338. White, I., Royston, P., Wood, A., 2011. Multiple imputation using chained equations: issues and guidance for practice. Stat. Med. 30, 377–399.

Multiple imputation for assessment of exposures to drinking water contaminants: evaluation with the Atrazine Monitoring Program.

Drinking water may contain pollutants that harm human health. The frequency of pollutant monitoring may occur quarterly, annually, or less frequently,...
431KB Sizes 0 Downloads 6 Views