Ann. Occup. Hyg., 2014, 1–15 doi:10.1093/annhyg/meu061
Evaluating Temporal Trends from Occupational Lead Exposure Data Reported in the Published Literature Using Meta-Regression 1.Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD 20892, USA 2.National Environmental Health Research Center, National Health Research Institutes, Taipei 11503, Taiwan 3.Present address: National Cancer Control Institute, National Cancer Center, Goyang 410-769, Korea. *Author to whom correspondence should be addressed. Occupational and Environmental Epidemiology Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, 9609 Medical Center Drive Room 6E608 MSC 9771, Bethesda, MD 20892-7240, USA. Tel: +1-240-276-7278; fax: +1-240-276-7835; email:
[email protected] Submitted 20 February 2014; revised version accepted 9 July 2014.
Abstract
Objectives: The published literature provides useful exposure measurements that can aid retrospective exposure assessment efforts, but the analysis of these data is challenging because they are usually reported as means, ranges, and measures of variability. We used mixed-effects meta-analysis regression models, which are commonly used to summarize health risks from multiple studies, to predict temporal trends of blood and air lead concentrations in multiple US industries from the published data while accounting for within- and between-study variability in exposure. Methods: We extracted the geometric mean (GM), geometric standard deviation (GSD), and number of measurements from journal articles reporting blood and personal air measurements from US worksites. When not reported, we derived the GM and GSD from other summary measures. Only industries with measurements in ≥2 time points and spanning ≥10 years were included in our analyses. Meta-regression models were developed separately for each industry and sample type. Each model used the log-transformed GM as the dependent variable and calendar year as the independent variable. It also incorporated a random intercept that weighted each study by a combination of the between- and within-study variances. The within-study variances were calculated as the squared log-transformed GSD divided by the number of measurements. Maximum likelihood estimation was used to obtain the regression parameters and between-study variances. Results: The blood measurement models predicted statistically significant declining trends of 2–11% per year in 8 of the 13 industries. The air measurement models predicted a statistically significant declining trend (3% per year) in only one of the seven industries; an increasing trend (7% per year) was also observed for one industry.
Of the five industries that met our inclusion criteria for both air and blood, the exposure declines per year tended to be slightly greater based on blood measurements than on air measurements. Conclusions: Meta-analysis provides a useful tool for synthesizing occupational exposure data to examine exposure trends that can aid future retrospective exposure assessment. Data remained too
Published by Oxford University Press on behalf of the British Occupational Hygiene Society 2014.
Downloaded from http://annhyg.oxfordjournals.org/ at University of Alaska Anchorage on October 16, 2014
Dong-Hee Koh1,3, Jun-Mo Nam1, Barry I. Graubard1, Yu-Cheng Chen2, Sarah J. Locke1 and Melissa C. Friesen1,*
sparse to account for other exposure predictors, such as job category or sampling strategy, but this limitation may be overcome by using additional data sources.
Keywords: exposure; lead; meta-analysis; meta-regression; temporal trend
Introduction
METHODS
Lead exposure database and data treatment
We previously extracted lead measurements collected in US work sites that were reported in the published literature into a lead exposure database. In brief, the published literature was searched for occupational exposure
Retrospective exposure assessment in population-based studies, such as case-control studies, is hampered by the impracticality of obtaining historical exposure measurements from the thousands of employers reported by study participants. Improving the intensity of exposure estimates for these exposure assessments can be aided by the use of publicly available sources of measurement data that cover the wide time span and numerous industries and occupations in a typical study (Lavoue et al., 2013). Several studies have recently shown the utility of the measurements reported in the published literature for retrospective exposure assessments by using the reported data to develop statistical models that identify broad differences in exposure based on available or estimatable ancillary data reported alongside the measurements. For instance, Hein et al. (2008, 2010) modeled potential predictors of solvent concentrations for several solvents using measurements reported in the literature, in NIOSH Health Hazard Evaluations, and in NIOSH Industry-Wide Study reports. The resulting models were used to predict historical exposures for a case-control study of brain cancer (Neta et al., 2012). Similarly, Friesen et al. (2013) developed a model based on air measurements of metalworking fluid concentrations reported in the published literature to predict exposure intensity for three broad classes of metalworking fluids based on industry, machining operation, and decade; these predictions were used in a case-control study of bladder cancer (Colt et al., 2013). One of the challenges of using data from the published literature is that the measurements are nearly always aggregated and reported as summary statistics (e.g. arithmetic and geometric means [GMs], geometric standard deviations [GSDs], range). To model the summary measures, the aforementioned models used weighted linear regression models that weighted each summary statistic by the number of measurements used to calculate the summary measure.
This approach assumes that the within-study variability is identical across studies (i.e. with essentially zero between-study variability); however, a close examination of the
summary statistics and their variance components indicates that this assumption is rarely met when the measurements cover wide time spans, multiple industries, and multiple occupations. To account for differing within-study variability, Lavoue et al. (2007) used Monte Carlo simulation to draw random samples from distributions with the reported GM and GSD. This approach required complex statistical programming that may not be accessible to all researchers who would like to synthesize occupational exposure data. A simpler and less computer-intensive approach is to use mixed-effects meta-analysis models, an approach commonly used to synthesize health risks across multiple epidemiologic studies. Meta-analysis uses an estimate of effect size reported in each study (such as a standardized mean difference, an odds ratio, or a correlation coefficient), and then combines these estimates across studies to produce a single summary measure (Hedges and Vevea, 1998). In the present study, we demonstrate the utility of mixed-effects meta-analysis regression models to predict temporal trends of blood and air lead concentrations in multiple US industries using measurements reported in the published literature. Our focus here was on evaluating the exposure changes over time, since previous models of historical data have shown that many exposures decrease by a median 8% per year (Symanski et al., 1998b). Thus changes over time can often dwarf differences in exposure across occupations and industries (e.g. Peters et al., 2011; Friesen et al., 2012; Koh et al., 2014) and can be a substantial source of exposure misclassification if not captured in the exposure assessment process. The use of these models in the exposure assessment process will be described separately in another paper.
lead-based paint abatement projects (because these were presumed worst-case or non-typical scenarios, 15% of all personal air summary results). From the personal air and blood measurements that met the above inclusion criteria, we identified the industries that had summary results reported in two or more years and that had measurements that spanned 10 or more years. Overall, 31% of the summary results of personal air measurements and 80% of the summary results of the blood lead measurements met our inclusion criteria to evaluate time trends. Several data treatment steps were undertaken to obtain GM and GSD estimates for each unique set of summary statistics. Summary results below the limit of detection (LOD) were rare, but when found, were estimated by dividing the LOD concentration by two. If the set of reported summary statistics included both GM and GSD, we used those values. If a study reported individual measurements, the GM and GSD were calculated directly from the measurements. If GM and GSD were not available, we estimated them using conversion equations previously reported in the literature (Aitchison and Brown, 1963; Lavoue et al., 2007; Hein et al., 2008). If both the arithmetic mean (AM) and standard deviation (SD) were available, equations (1) and (2), shown below, were used. If only minimum (min) and maximum (max) values were available, equations (3) and (4) were used. If the median was available, we used equation (5). If GSD was not available, we assumed a GSD of 2.56, which is reported as the average estimated GSD in non-chemical particulate exposed industries (Kromhout et al., 1993). If the GM and GSD were not extractable or estimatable from other metrics, or if the GSD was 1, the summary results were excluded.
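$$\mathrm{GM} = \exp\left[\ln(\mathrm{AM}) - 0.5 \times \ln\left(1 + \left(\frac{\mathrm{SD}}{\mathrm{AM}}\right)^{2}\right)\right] \tag{1}$$

$$\mathrm{GSD} = \exp\left[\sqrt{\ln\left(1 + \left(\frac{\mathrm{SD}}{\mathrm{AM}}\right)^{2}\right)}\right] \tag{2}$$

$$\mathrm{GM} = \exp\left\{\left[\ln(\mathrm{max}) + \ln(\mathrm{min})\right]/2\right\} \tag{3}$$

$$\mathrm{GSD} = \exp\left\{\left[\ln(\mathrm{max}) - \ln(\mathrm{min})\right]/4\right\} \tag{4}$$

$$\mathrm{GM} = \mathrm{median} \tag{5}$$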
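The conversion rules in equations (1) to (5) can be illustrated with a short Python sketch. It is not the authors' code: the function names, the dictionary keys (`gm`, `gsd`, `am`, `sd`, `min`, `max`, `median`), and the exact precedence among the fallbacks are assumptions made for illustration, based on one reading of the data treatment steps described above.

```python
import math

DEFAULT_GSD = 2.56  # fallback GSD cited in the text (Kromhout et al., 1993)


def gm_gsd_from_am_sd(am, sd):
    """Equations (1) and (2): GM and GSD from the arithmetic mean and SD."""
    log_var = math.log(1.0 + (sd / am) ** 2)  # variance on the log scale
    gm = math.exp(math.log(am) - 0.5 * log_var)
    gsd = math.exp(math.sqrt(log_var))
    return gm, gsd


def gm_gsd_from_range(minimum, maximum):
    """Equations (3) and (4): GM and GSD from the reported range."""
    gm = math.exp((math.log(maximum) + math.log(minimum)) / 2.0)
    gsd = math.exp((math.log(maximum) - math.log(minimum)) / 4.0)
    return gm, gsd


def derive_gm_gsd(stats):
    """Return (GM, GSD) for one summary result, or None if it is excluded.

    `stats` is a hypothetical dict of whatever summary statistics were reported.
    """
    if "gm" in stats and "gsd" in stats:
        gm, gsd = stats["gm"], stats["gsd"]
    elif "am" in stats and "sd" in stats:
        gm, gsd = gm_gsd_from_am_sd(stats["am"], stats["sd"])
    elif "min" in stats and "max" in stats:
        gm, gsd = gm_gsd_from_range(stats["min"], stats["max"])
    elif "gm" in stats or "median" in stats:
        # Equation (5) for the median; the default GSD fills in a missing GSD.
        gm = stats.get("gm", stats.get("median"))
        gsd = stats.get("gsd", DEFAULT_GSD)
    else:
        return None  # not extractable or estimable
    return None if gsd == 1.0 else (gm, gsd)  # a GSD of 1 is excluded
```

For a result reported only as a range of 10 to 160, `gm_gsd_from_range(10.0, 160.0)` gives a GM of about 40 and a GSD of about 2.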
to lead using the web-based bibliographic databases MEDLINE, Web of Science, Scopus, and NIOSHTIC-2, with the search terms ‘lead exposure’, ‘worker’, ‘occupation’ and ‘occupational exposure’, and job and industry-specific keywords such as ‘burner’ or ‘printing’. Papers identified included peer-reviewed journal articles, case reports, and Morbidity and Mortality Weekly Reports from the US Centers for Disease Control and Prevention (CDC). Additional papers were located by reviewing the citations of papers identified in the web-based databases. Papers were selected for inclusion in the database if they contained air, blood, or urine lead measurements or summary statistics and contained, at minimum, information on job and/or industry. If a paper presented exposure measurements from another source (secondary exposure measurements), these measurements were also included. When studies were testing new work practices, control methods, or monitoring devices, both pre- and post-intervention measurement data were included. Duplicate data presented in multiple papers were removed by reviewing results from the same industries collected during similar time periods. Overall, the database consisted of 1163 summary results of air and biological measurements of lead extracted from 175 papers reporting exposure measurements collected between 1930 and 2010 (Koh et al., in preparation). The database contained the extracted summary statistics and ancillary information about measurement year (if missing, the publication year minus 2 was assigned), industry, job, sample type, sampling and analytic methods, particle size, and characteristics of the study. A descriptive analysis of all the extracted summary results will be reported in a separate publication. For the analyses described here, we extracted all personal air measurements (525 summary results) and blood lead measurements (350 summary results) from the database.
We included air measurements if the particle size was inhalable, total suspended particle, or not specifically reported (assumed inhalable or total suspended particle), but excluded respirable, PM10, and gas measurements (13.4% of personal air summary results excluded). We excluded summary results if only a single measurement was reported in a study (19% of all personal air summary results; 12% of all blood lead summary results), if the sampling duration was less than 1 h (4% of all personal air summary results), if samples were collected in containment areas or associated with
RESULTS
We restricted the meta-regression analyses to the 13 industries that met our inclusion criteria for blood lead measurements and the seven industries that met our criteria for personal air lead measurements. Basic descriptive information on the time span and number of unique sets of summary measurements for each industry and sample type is presented alongside the estimated model parameters in Table 1. The lead battery industry had the most summary results (43 blood lead summary results; 47 personal air lead summary results), followed by secondary lead smelters (26 blood; 3 air). Only five industries met the inclusion criteria for both sample types. The blood measurement models predicted statistically significant declining trends of 2–11% per year in 8 of the 13 industries (Table 1). In contrast, the air measurement models predicted a statistically significant declining trend (3% per year) in only 1 of the 7 industries (secondary lead smelters). Additionally, an increasing trend of 7% per year was observed for one industry (bronze foundries). Of the five industries meeting inclusion criteria for both sample types, three industries had statistically significant declining time trends based on blood measurements, but only one industry had a significant declining trend based on personal air measurements. The decline per year was generally similar between blood measurements and air measurements. For example, firing ranges have a predicted decline of 10% per year based on blood and 8.7% per year based on air. Similarly, secondary lead smelters have a predicted decline of 2.7% per year based on blood and 3.3% per year based on air. We show these five industries graphically in Fig. 1. Between-study variances of personal air lead for auto radiator repair, firing range, lead battery, and ship building/repair/demolition were greater than those of blood lead for the same industries, while the opposite direction was observed for secondary lead smelters (Table 1).
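The per-year changes quoted above are tied to the fitted slopes, which are on the natural-log scale. The following Python sketch shows one way to make that conversion. It assumes the percent change per year is (e^βslope − 1) × 100 and that a predicted GM is exp(βintercept + βslope × (year − 1980)); both formulas are assumptions consistent with a log-linear model centered at 1980, not a reproduction of the Table 1 footnotes, and the function names are hypothetical.

```python
import math


def percent_change_per_year(beta_slope):
    """Convert a slope on the ln(GM) scale to a percent change per year (assumed formula)."""
    return (math.exp(beta_slope) - 1.0) * 100.0


def predicted_gm(beta_intercept, beta_slope, year, center=1980):
    """Predicted GM for a calendar year from a log-linear model centered at `center`."""
    return math.exp(beta_intercept + beta_slope * (year - center))
```

For instance, a slope of −0.107 (one of the slope estimates listed in Table 1) corresponds to a change of about −10% per year under this assumed conversion, and a slope of −0.113 to about −11% per year.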
The proportion of the total variance explained by including study year ranged from −1.6% to 81%, with a median of 9.8% for blood lead and 5.5% for personal air lead. Predicted GMs for any given year can be calculated using the equation in footnote ‘c’ of Table 1. For example, the predicted 1980 blood lead GMs and personal air lead GMs for auto radiator repair workers
Statistical analysis
We used mixed-effects meta-analysis models rather than fixed-effects models so that we could account for both within- and between-study variability and because our aim was to provide a population-level exposure estimate. The mixed-effects model treats the set of studies used as if they were a random sample selected from a larger population of studies with varying exposures, whereas fixed-effects meta-analysis treats the set of studies used as if they had a common exposure and thus accounts only for within-study variability (Hedges and Vevea, 1998). In a fixed-effects model, a large study (based on number of measurements) would be given substantial weight and a small study would be largely ignored, because the model weights each result by the inverse of its variance. In contrast, a random-effects model weights the studies by a combination of within- and between-study variances so that the influence of any single study depends on how large its within-study variance is compared to the between-study variance. Mixed-effects regression models were developed separately for each industry and sample type using SAS 9.2 (SAS Institute Inc., Cary, NC, USA) (van Houwelingen et al., 2002; Madden and Paul, 2011). In each model, the dependent variable was the natural log-transformed GM, which made the variable approximately normally distributed. To estimate the time trend of exposure changes in each industry, calendar year was entered as a continuous, linear fixed effect, centered at 1980. Each model also incorporated a random intercept that allowed for study-to-study variability in each study-specific summary measure. The within-study variance of the summary measure was calculated as the squared log-transformed GSD divided by the number of measurements. The regression parameters and between-study variances were obtained using maximum likelihood estimation (Littell et al., 2006).
The models required an initial guess of the between-study variance, which we set to one-half the average within-study variance of the summary measure (Konstantopoulos and Hedges, 2004). The SAS code for an example mixed-effects model used in this study is provided in Appendix 1. Data used in this model, including each summary measurement, year, and reference are reported in online supplementary materials (see supplementary material, available at Annals of Occupational Hygiene online).
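As a rough illustration of the model described above (and not a translation of the SAS code in Appendix 1), the sketch below fits a random-intercept meta-regression of ln(GM) on calendar year in plain Python. It assumes one random effect per summary result, uses the within-study variance ln(GSD)² / n, starts the between-study variance at one-half the average within-study variance, and solves the maximum likelihood score equation by a simple fixed-point iteration. The function names and the returned dictionary keys are hypothetical.

```python
import math


def _wls(x, y, w):
    """Weighted least squares for y = b0 + b1*x; returns b0, b1 and their SEs."""
    sw = sum(w)
    sx = sum(wi * xi for wi, xi in zip(w, x))
    sy = sum(wi * yi for wi, yi in zip(w, y))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    det = sw * sxx - sx * sx  # requires at least two distinct years
    b1 = (sw * sxy - sx * sy) / det
    b0 = (sy - b1 * sx) / sw
    return b0, b1, math.sqrt(sxx / det), math.sqrt(sw / det)


def fit_meta_regression(gm, gsd, n, year, center=1980.0, max_iter=200, tol=1e-10):
    """ML fit of ln(GM) = b0 + b1*(year - center) + u + e, with u ~ N(0, tau2)."""
    y = [math.log(g) for g in gm]
    v = [math.log(s) ** 2 / k for s, k in zip(gsd, n)]  # within-study variances
    x = [t - center for t in year]
    tau2 = 0.5 * sum(v) / len(v)  # starting value described in the text
    for _ in range(max_iter):
        w = [1.0 / (vi + tau2) for vi in v]
        b0, b1, _, _ = _wls(x, y, w)
        r2 = [(yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y)]
        # Fixed-point update from the ML score equation, truncated at zero.
        num = sum(wi ** 2 * (ri - vi) for wi, ri, vi in zip(w, r2, v))
        new_tau2 = max(0.0, num / sum(wi ** 2 for wi in w))
        converged = abs(new_tau2 - tau2) < tol
        tau2 = new_tau2
        if converged:
            break
    b0, b1, se0, se1 = _wls(x, y, [1.0 / (vi + tau2) for vi in v])
    return {"intercept": b0, "slope": b1, "se_intercept": se0,
            "se_slope": se1, "tau2": tau2}
```

Because each weight is 1 / (within-study variance + between-study variance), a large between-study variance pulls the weights toward equality, which is the behavior the text attributes to random-effects weighting. The standard errors here treat the estimated variances as known, so they may differ from what a mixed-model procedure reports.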
Auto radiator repair
Blood
1953–1985 1956–1975 1978–1997
1972–1991 1984–2002
17
36
15
33
37
Lead battery
Police protection 92
28
General construction/ renovation
Polyvinyl chloride
Residential renovation
Secondary lead smelter
Ship building/ repair/ demolition
4
7
4
5
3
8
4
3
6
8
7
3
8
4
3
2
8
4
3
10
9
7
18
26
5
6
6
43
8
5
23
24
12
2.255
4.116
2.314
3.737
3.151
3.466
2.850
3.526
3.677
4.232
3.519
0.179 1.905
0.054 4.010
0.038 2.388
0.287 3.174
0.160 2.838
0.045 3.378
0.234 2.392
0.114 3.302
0.219 3.248
0.363 3.521
0.118 3.287
0.4 −4.4
59 79
5.0 3.6 0.8 81 0.8 3.2
3.750 −0.047 0.013 −0.071 −0.022 0.013 3.308 −0.068 0.011 −0.090 −0.046 0.337
3.554 −0.026 0.007 −0.040 −0.012 0.069 0.013 0.015 0.063 0.348 2.434 −0.045 0.002 −0.049 −0.041 0 4.222 −0.027 0.009 −0.045 −0.009 0.025 0.014 0.156
0.004 0.030 −0.055
2.605 −0.021 0.018 −0.057
4.299
3.465 −0.005 0.009 −0.024
12
4.107 −0.107 0.034 −0.174 −0.040 0.476
−2.1
−2.7
−0.5
−2.6
−6.6
−4.6
−10
−11
30
4.943 −0.113 0.028 −0.167 −0.059 0.401
−2.4
9.8
0.007 0.059
3.750 −0.024 0.016 −0.056
Between-study variance; Proportion variance explained by year (%)e; Exposure change per yearf
1962–1994
1938–2005
1977–1990
28
Fuel additives
1974–1988
1980–2005
1979–1994
92
16
75
SIC; Time span of measurements; No. yeara; No. studyb; No. resultc; Intercept (centered at 1980): βintercept, SE, 95% CI; Slope (year-1980)d: βslope, SE, 95% CI
Firing range
(µg dl−1) Bridge construction/ maintenance/ demolition
Industry
Sample type
Table 1. Estimated intercept, slope and between-study variance from the mixed-effects meta-regression models, by industry and sample type.
1993–2003 1985–2001
(µg m−3) Bronze foundry 33
92
36
49
33
37
Firing range
Lead battery
Metal recycling
Secondary lead smelter
Ship building/ repair/ demolition
3
3
2
7
4
2
8
4
3
3
3
6
6
2
5
3
7
3
10
47
9
9
15
8
5.857
5.135
9.149
3.055
5.574
2.402
4.124
2.302
3.650
1.3 17 5.5 0.01 7.9
0.054 0.767 0.109 0.038 0.142 4.362 0.037 1.192 0.089 1.935
0.067 0.021
0.025
3.457 −0.001 0.020 −0.040
8.276 −0.091 0.119 −0.324
3.286
0.266 5.336
0.059 5.019
6.377 −0.020 0.016 −0.051
0.012 0.098
5.2513 −0.034 0.006 −0.046 −0.023