Journal of Environmental Management 149 (2015) 253–262


Predictive capability of chlorination disinfection byproducts models

Evan C. Ged, Paul A. Chadik, Treavor H. Boyer*

Department of Environmental Engineering Sciences, Engineering School of Sustainable Infrastructure & Environment (ESSIE), University of Florida, P.O. Box 116450, Gainesville, FL 32611-6450, USA

Article info

Article history: Received 18 June 2013 Received in revised form 14 August 2014 Accepted 13 October 2014 Available online

Keywords: Trihalomethanes (THM); Haloacetic acids (HAA); Dissolved organic carbon (DOC); Bromide; Uniform formation conditions; Chlorination; Mathematical modeling

Abstract

There are over 100 models that have been developed for predicting trihalomethanes (THMs), haloacetic acids (HAAs), bromate, and unregulated disinfection byproducts (DBPs). Until now no publication has evaluated the variability of previous THM and HAA models using a common data set. In this article, the standard error (SE), Marquardt's percent standard deviation (MPSD), and linear coefficient of determination (R²) were used to analyze the variability of 87 models from 23 different publications. The most robust models were capable of predicting THM4 with an SE of 48 µg L⁻¹ and HAA6 with an SE of 15 µg L⁻¹, both achieving R² > 0.90. The majority of models were formulated for THM4. There is a lack of published models evaluating total HAAs, individual THM and HAA species, bromate, and unregulated DBPs. © 2014 Elsevier Ltd. All rights reserved.

* Corresponding author. Tel.: +1 352 846 3351; fax: +1 352 392 3076. E-mail address: thboyer@ufl.edu (T.H. Boyer).
http://dx.doi.org/10.1016/j.jenvman.2014.10.014

1. Introduction

The development of mathematical models that predict the formation of disinfection byproducts (DBPs) under different water quality and treatment conditions is of wide interest and usefulness to the drinking water field (Lu et al., 2011). The most studied DBPs are the trihalomethanes (THMs) and haloacetic acids (HAAs), two classes of compounds that are regulated by the U.S. EPA (USEPA, 2013) and have guidelines in Canada, Australia, the European Union, and the World Health Organization. Although more than 100 DBP formation models have been published in the last 30 years, many of these models are not applicable to a range of drinking water sources and can often underestimate DBP formation by excluding key explanatory variables such as bromide. The objective of the research described herein was to critically evaluate existing DBP models by assessing the variability of the models using two external data sets and statistical metrics.

2. Background on DBP modeling

A recent review article discusses existing DBP models from 48 articles: 42 models focus on THMs, 8 on HAAs, 5 on bromate, and 1 on chlorite (Chowdhury et al., 2009). For a description of existing DBP models, and their advantages and limitations, the reader is referred to Chowdhury et al. (2009) and an earlier review article (Sadiq and Rodriguez, 2004). Both articles provide a chronological summary of existing models, demonstrating that several of the models achieved linear correlations with R² > 0.90 based on results from their respective studies.

In addition to the handful of review articles, many of the publications are research articles that discuss the development of new models for total THMs (TTHMs or THM4) (Morrow and Minear, 1987; Golfinopoulos and Arhonditsis, 2002; Hong et al., 2007), HAAs (Lekkas and Nikolaou, 2004; Sohn et al., 2004), brominated THM and HAA species (Siddiqui et al., 1998; Fabbricino and Korshin, 2009; Chowdhury et al., 2010), and unregulated and emerging classes of DBPs (Chen and Westerhoff, 2010). However, many of the models are not applicable to a wide range of source waters because of the experimental design or methodology involved in the model formulation. For example, some researchers have correlated explanatory variables (e.g., dissolved organic carbon (DOC) or UV absorbance at 254 nm (UV254)) to DBP formation potential using low-bromide waters, thereby producing inaccurate predictions when the models are applied to waters that contain higher levels of bromide (Rathbun, 1996; Nikolaou et al., 2004).

A study by Chen and Westerhoff (2010) generated models for predicting THM4, HAA9 (the summation of all 9 HAA species), as well as unregulated nitrogenous DBPs. These multivariate power law models resulted in moderately strong positive correlations, with R² values ranging from 0.62 to 0.88 across all DBP species, and used a large number of samples in the experimental design, ranging from 134 to 210. Although the Chen and Westerhoff (2010) models demonstrated reasonable accuracy during model validation, they only incorporated three explanatory variables: DOC, UV254, and bromide ion (Br⁻). The methodology used by Chen and Westerhoff (2010) simplified the model development process by creating a uniform set of DBP formation conditions including specified pH, temperature, chlorine dose, and reaction time (all variables that influence DBP formation), but in turn limits the application of the models to other disinfection scenarios.

For many DBP formation potential studies, models such as those of Chen and Westerhoff (2010) can be used and may give accurate results under controlled disinfection conditions; however, the impacts of real drinking water treatment plants and distribution systems cannot be evaluated. For the chlorination experiments in the Chen and Westerhoff (2010) study, the uniform formation conditions (UFC) specified a 24 h reaction time (Summers et al., 1996), but in reality DBPs can form for much longer times in a distribution system, resulting in an underestimation of DBP formation if the model does not include reaction time as an explanatory variable. Another variable important to full-scale systems is temperature, which affects reaction rates. For water utilities in the southern U.S., and other parts of the world, it is realistic to expect temperatures to exceed 30 °C in the summer and residence times approaching 7 d in large distribution systems.
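The consequence of omitting reaction conditions can be illustrated with a minimal sketch of a three-variable power-law model. The functional form, the function name `power_law_thm4`, and every coefficient below are hypothetical placeholders chosen for illustration; they are not the published Chen and Westerhoff (2010) coefficients. Because reaction time and temperature are not inputs, the sketch returns the same THM4 value for a 24 h UFC test and for a 7 d distribution-system scenario.

```python
def power_law_thm4(doc_mg_l, uv254_per_cm, br_ug_l,
                   k=20.0, a=1.0, b=0.3, c=0.05):
    """Hypothetical power-law model: THM4 = k * DOC^a * UV254^b * Br^c.

    All coefficients are illustrative placeholders, not published values.
    """
    return k * doc_mg_l ** a * uv254_per_cm ** b * br_ug_l ** c


# Two disinfection scenarios as (reaction time in h, temperature in deg C).
# The model has no argument for either quantity, so they cannot affect it.
scenarios = {
    "UFC test": (24, 20),
    "warm distribution system": (7 * 24, 30),
}

predictions = {
    name: power_law_thm4(doc_mg_l=3.0, uv254_per_cm=0.08, br_ug_l=100.0)
    for name in scenarios
}

for name, (hours, temp_c) in scenarios.items():
    print(f"{name}: {hours} h at {temp_c} C -> {predictions[name]:.1f} ug/L")
```

Both scenarios print the same concentration, which is the limitation described above: a model calibrated at fixed formation conditions has no way to respond when those conditions change.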

External factors such as temperature and residence time decrease the predictive capability of DBP models when the models do not include chlorine dose, reaction time, or temperature. To gain valuable insights on the accuracy and applicability of existing DBP models, it is necessary to compare all published models using a common data set. Furthermore, this approach can evaluate whether DBP models developed for local conditions can be applied on a global scale. The key limitation of the DBP modeling literature is the lack of evaluation using external data sets and of a cross comparison of predictability based on statistical metrics. Accordingly, the research objective of this work was to compare DBP models published over the last 30 years using two external data sets on DBP formation and to assess the variability of the models using three statistical metrics.

Fig. 1. Measured versus predicted THM4 concentrations for (a) raw waters, models including Br⁻, (b) treated waters, models including Br⁻, (c) raw waters, models excluding Br⁻, and (d) treated waters, models excluding Br⁻. Measured data from Boyer and Singer (2005). n is number of individual waters per model. Models 26 and 27 are excluded from (c) and (d).

3. Research approach

3.1. DBP models

A thorough literature review was conducted to compile published DBP models. For the purpose of this study, models were excluded if they included explanatory variables for fulvic acid, chlorophyll a, fluorescence, geographic region, seasons, kinetic rate constants, flow rate and/or tank volume, or any of several dummy variables, because values for these variables are not typically available. In addition, models that predicted total DBP concentration in µmol L⁻¹ were excluded because it was not possible to convert to mass-based units without knowing the DBP speciation. A complete list of the models considered in this study is tabulated in Table S1 in supplementary material. A total of 87 models were selected for this study. All models were based on chlorine disinfection because chlorination is the most widely published form of DBP modeling and chlorine is the most widely used disinfectant.
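The reason a molar total cannot be converted to mass-based units without speciation can be shown with a short worked example. The sketch below assumes the molar concentration is expressed in µmol L⁻¹ and uses the molar masses of the four THM species; the function name `molar_to_mass` and the example fractions are illustrative assumptions, not part of the study.

```python
# Molar masses (g/mol) of the four THM species that make up THM4.
THM_MOLAR_MASS = {
    "CHCl3": 119.38,    # chloroform
    "CHBrCl2": 163.83,  # bromodichloromethane
    "CHBr2Cl": 208.28,  # dibromochloromethane
    "CHBr3": 252.73,    # bromoform
}


def molar_to_mass(total_umol_per_l, molar_fractions):
    """Convert a total THM concentration from umol/L to ug/L.

    molar_fractions maps species name to its share of the molar total;
    the shares must sum to 1. Since umol/L times g/mol gives ug/L, the
    result depends directly on which species are present.
    """
    if abs(sum(molar_fractions.values()) - 1.0) > 1e-9:
        raise ValueError("molar fractions must sum to 1")
    return sum(
        total_umol_per_l * fraction * THM_MOLAR_MASS[species]
        for species, fraction in molar_fractions.items()
    )


# The same 1 umol/L total under two extreme (hypothetical) speciations.
all_chloroform = molar_to_mass(1.0, {"CHCl3": 1.0})
all_bromoform = molar_to_mass(1.0, {"CHBr3": 1.0})
print(f"all chloroform: {all_chloroform:.2f} ug/L")
print(f"all bromoform:  {all_bromoform:.2f} ug/L")
```

The same molar total corresponds to a mass concentration more than twice as high when the THMs are fully brominated, so a model that reports only the molar sum cannot be compared against mass-based measurements.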

3.2. DBP formation data sets

Two data sets were used to assess the variability of the existing DBP models. The first data set comprised four surface waters that represented a range of DOC and bromide concentrations (Boyer and Singer, 2005). DBP formation experiments were conducted on all four raw waters as well as on the same waters under various treatment scenarios. The chlorination procedure involved uniform formation conditions (UFC), which yielded a 1 mg L⁻¹ chlorine residual after 24 h of incubation in the dark at 20 °C (Summers et al., 1996). The treatment scenarios included alum coagulation, magnetic ion exchange (MIEX) treatment, and a combination of alum and MIEX treatment.

The second data set was from a comprehensive survey of bromide in drinking water (Amy et al., 1993), which included 100 U.S. water utilities from a variety of raw water sources including surface water and groundwater, resulting in 145 discrete samples. Additionally, the survey spanned four seasons over 18 months, generating four subsets of data. A list of the water chemistry parameters recorded for the two data sets is included in Table S2 in supplementary material. The data used for DBP model validation for Boyer and Singer (2005) are included as supplementary data (Tables S3 and S4); the data used for DBP model validation for Amy et al. (1993) can be found in the original report.

3.3. Statistical analysis

Model fit and predictive capability were assessed using standard error (SE), Marquardt's percent standard deviation (MPSD) (Marquardt, 1963), and the linear coefficient of determination (R²). The SE is an absolute error and gives a model's deviation from measured results; the MPSD is a relative error, defined as a modified percent standard deviation that accounts for the number of explanatory variables used in the model; and the R² value quantifies how well the model predictions fit the data. The three statistical metrics provide a concise analysis of model performance by quantifying errors in prediction, imposing a penalty on models that use more explanatory variables than necessary, and demonstrating how well predicted and measured values are linearly correlated.

4. Results

4.1. Trihalomethanes

4.1.1. THM4

Models were classified as including or excluding a term for Br⁻ to evaluate the effect of neglecting Br⁻ on THM formation and speciation. Although models excluding Br⁻ may still be accurate for waters containing low concentrations of Br⁻, it was expected that models including a term for Br⁻ would exhibit a higher degree of accuracy across a wide range of Br⁻ concentrations when compared with models that excluded a term for Br⁻. The four individual chlorine- and bromine-containing THM species were measured in the Boyer and Singer (2005) data set, whereas the Amy et al. (1993) data set only reported THM4.

Results for THMs are illustrated with figures of measured versus predicted concentrations, with the y = x line showing 1:1 correspondence between predicted and measured values. Fig. 1(a)–(d) are for the Boyer and Singer (2005) data set, showing model results for raw waters and treated waters separately. Fig. 1(a) and (b) are for models that include a term for Br⁻, whereas Fig. 1(c) and (d) are from the same data set but show results for models excluding a Br⁻ term. Accompanying Fig. 1(a)–(d) are the results of the statistical analysis (Table 1) on the measured versus predicted THM4 concentrations.

Table 1. Descriptive statistics for THM4 models for (a) models including a term for Br⁻ and (b) models excluding a term for Br⁻. The values for SE, MPSD, and R² are the average of 16 data points from measured (Boyer and Singer, 2005) and predicted concentrations.

(a) Models including a term for Br⁻

Model      SE (µg L⁻¹)   MPSD (%)   R²
5          47.7          36         0.933
10         159           120        0.967
15/20a     67.4          58         0.976
16/21a     85.6          56         0.709
17/22a     77.2          61         0.915
24         57.1          50         0.950
31         122           75         0.242
33         84.6          45         0.594
34         117           68         0.855
35         151           107        0.501
36         88.3          74         0.790
38         54.8          30         0.694
Average    92.7          65         0.761

(b) Models excluding a term for Br⁻ (b)

Model      SE (µg L⁻¹)   MPSD (%)   R²
2          108           75         0.916
6          137           80         0.958
7          136           95         0.617
8          146           96         0.575
9          80.6          51         0.607
11         74.3          58         0.937
12         140           93         0.962
13         90.1          52         0.926
14         130           69         0.558
18         51.3          50         0.900
19         53.5          35         0.730
25         101           58         0.050
28         166           114        0.384
29         108           62         0.729
30         86.9          49         0.791
32         78.5          90         0.108
Average    105           70         0.672

a Models 15/20, 16/21, and 17/22 are grouped as raw/treated water models for DOC, UV254, and DOC + UV254 based models, respectively. The values for SE, MPSD, and R² are the average of 16 data points (12 treated waters and 4 raw waters).
b Models 26 and 27 excluded from calculations because of high error.

Using the Boyer and Singer (2005) data set, the average SE for THM4 models that include Br⁻ was 92.7 µg L⁻¹, and for models excluding Br⁻ the average SE was 105 µg L⁻¹. Models 26 and 27 were excluded from the figures and from the calculation of average SE for models excluding Br⁻ because the error predicted in these models exceeded 10⁸ µg L⁻¹. Using the same data set to compare R² results, models including Br⁻ had an average R² value of 0.761, whereas models excluding Br⁻ had an average R² value of 0.672. Data in all four parts of Fig. 1 show that the majority of DBP models underpredicted THM4 formation.

Based on the lowest SE, the best models for predicting THM4 in Fig. 1 were models 5, 24, and 38 among models including Br⁻ and models 18 and 19 among models that did not include Br⁻. All of these models had SE < 60 µg L⁻¹ and MPSD ≤ 50%. These results suggest that the models are capable of predicting THM4 within 60 µg L⁻¹ and exhibit a percent standard deviation of no more than 50% across a range of water chemistry inputs. This is a remarkable result considering that the models were calibrated with data sets specific to the individual studies. A common feature of all five models, with the exception of model 38, was that they incorporated one or more DBP precursors as inputs (i.e., organic carbon concentration, UV254, and bromide) as well as disinfection conditions (i.e., Cl2 dose, pH, reaction time, and temperature). Model 38, generated by Chen and Westerhoff (2010), included terms for DOC, UV254, and Br⁻. The formation conditions included a pH, temperature, reaction time, and chlorine dose similar to those used in Boyer and Singer (2005), which likely contributed to the accurate model predictions. Although there was a spread in the predictive capability of the models in Fig. 1, the average of the three statistical metrics across all models favors models including Br⁻, as shown by lower SE and MPSD and higher R² (Table 1).

Also included in the THM4 modeling effort was the Amy et al. (1993) data set, which used 145 discrete raw water samples to cover more than 100 different source waters across the U.S. In addition, the data set from Amy et al. (1993) represented a national data set, whereas the data set from Boyer and Singer (2005) is more representative of a local data set. To illustrate the spread in model prediction, Fig. 2(a) and (b) show the measured versus predicted concentrations of THM4 for models including a Br⁻ term and models excluding a Br⁻ term, respectively. In contrast to Fig. 1, the data in Fig. 2 are scattered above and below the y = x line, indicating both overprediction and underprediction. This is likely the result of the Amy et al. (1993) data set being nationally representative.

Fig. 2. Measured versus predicted THM4 concentrations for (a) models including Br⁻ (model 36 excluded) and (b) models excluding Br⁻ (models 25, 26, 27, and 29 excluded). Measured data from Amy et al. (1993). n is number of individual waters per model.

Individual plots were also generated for the most accurate models using the same data (Fig. 3(a)–(d)). Three of the four models in Fig. 3 include Br⁻ (models 15, 17, and 34) and one does not (model 30). Table 2 shows the results of the statistical analysis for Fig. 2. Models that included a term for Br⁻ performed better in SE and MPSD but slightly worse for R². The lower R² is because two of the models have poor R² values ( […] 10¹³ µg L⁻¹ and model 67 was not plotted because all predictions were essentially 0 µg L⁻¹.

Fig. 3. Measured versus predicted THM4 concentrations for (a) model 15, (b) model 17, (c) model 30, and (d) model 34. Measured data from Amy et al. (1993). n is number of individual waters per model.

Model 65 included only pH, contact time, and chlorine dose as explanatory variables and did not include DBP precursors (Nikolaou et al., 2004), which likely explains the low predictive capability of the model. Model 67 included chlorination conditions and bromide (Lekkas and Nikolaou, 2004); however, the model was calibrated from laboratory chlorination experiments that spiked bromide at 1–30 mg L⁻¹. This very high bromide concentration explains why model 67 predicted essentially no HAA formation in natural waters with bromide […] 50% of HAA5 (Uyak et al., 2007). In addition, the SE for models 77, 80, and 81 was low because these HAA species are typically formed at low concentrations.
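The three metrics quoted throughout these results can be sketched as follows. The exact formulas are not reproduced in the text, so the definitions here are assumptions: SE is taken as the root-mean-square residual, MPSD as 100 times the square root of the summed squared relative residuals divided by (n − p), where p is the number of explanatory variables, and R² as the squared Pearson correlation between measured and predicted values. The input values are made up for illustration and are not taken from either data set.

```python
import math


def standard_error(measured, predicted):
    """Root-mean-square residual (assumed definition of SE)."""
    n = len(measured)
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)


def mpsd(measured, predicted, n_params):
    """Marquardt's percent standard deviation (assumed common form).

    Relative residuals are normalised by (n - n_params), so a model with
    more explanatory variables is penalised for the same residuals.
    """
    dof = len(measured) - n_params
    if dof <= 0:
        raise ValueError("need more data points than explanatory variables")
    rel_sq = sum(((m - p) / m) ** 2 for m, p in zip(measured, predicted))
    return 100.0 * math.sqrt(rel_sq / dof)


def r_squared(measured, predicted):
    """Squared Pearson correlation between measured and predicted values."""
    n = len(measured)
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov ** 2 / (var_m * var_p)


# Hypothetical concentrations in ug/L; every prediction is 20% too low.
measured = [50.0, 100.0, 150.0, 200.0]
predicted = [40.0, 80.0, 120.0, 160.0]

se = standard_error(measured, predicted)
mpsd_pct = mpsd(measured, predicted, n_params=2)
r2 = r_squared(measured, predicted)
print(f"SE = {se:.1f} ug/L, MPSD = {mpsd_pct:.1f}%, R2 = {r2:.3f}")
```

In this made-up case the predictions are perfectly correlated with the measurements, so R² is essentially 1 even though SE and MPSD are clearly nonzero. This is why a model can systematically underpredict and still report a high R², and why the three metrics are read together.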
The summation of models 77–81 can be used to predict HAA5 and gave an SE of 31.2 µg L⁻¹, which was comparable to the high predictive capability HAA models discussed in the previous subsection. Models 86 and 87 had higher SE than models 78 and 79, possibly due to the absence of chlorination conditions as explanatory variables.

5. Discussion

5.1. Summary

Using SE, MPSD, and R² as statistical metrics for predictive capability, the best predictive models for THM4 were 5, 15, 17, and 38. All of these models included a term for Br⁻. The best predictive

Table 3. Descriptive statistics for individual THM species models. The values for SE, MPSD, and R² are the average of 16 data points from measured (Boyer and Singer, 2005) and predicted concentrations.

Fig. 7. Measured versus predicted concentrations of bromoform (CHBr3) for (a) raw waters and (b) treated waters. Model 59 does not appear in the figure because predicted values are
