This article was downloaded by: [New York University] On: 31 May 2015, At: 17:05 Publisher: Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

Journal of Occupational and Environmental Hygiene Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/uoeh20

“Quasi Nonparametric” Upper Tolerance Limits for Occupational Exposure Evaluations

Charles B. Davis (a) & Paul F. Wambach (b)

(a) EnviroStat, Las Vegas, Nevada
(b) Consultant, Rockville, Maryland

Accepted author version posted online: 03 Feb 2015. Published online: 20 Apr 2015.

To cite this article: Charles B. Davis & Paul F. Wambach (2015) “Quasi Nonparametric” Upper Tolerance Limits for Occupational Exposure Evaluations, Journal of Occupational and Environmental Hygiene, 12:5, 342-349, DOI: 10.1080/15459624.2014.995301

To link to this article: http://dx.doi.org/10.1080/15459624.2014.995301


Journal of Occupational and Environmental Hygiene, 12: 342–349 ISSN: 1545-9624 print / 1545-9632 online Copyright © 2015 JOEH, LLC DOI: 10.1080/15459624.2014.995301

“Quasi Nonparametric” Upper Tolerance Limits for Occupational Exposure Evaluations

Charles B. Davis¹ and Paul F. Wambach²

¹ EnviroStat, Las Vegas, Nevada
² Consultant, Rockville, Maryland

Upper tolerance limits (UTLs) are often used in comparing exposure data sets with an occupational exposure limit (OEL) or other regulatory criterion (RC): if the 95%-95% UTL does not exceed the OEL, one is 95% confident that at most 5% of exposures exceed the OEL, and the comparison “passes.” The largest of 59 observations is a nonparametric (distribution-free) 95%-95% UTL (NPUTL); the chance that this largest value equals or exceeds the actual 95th percentile is at least 95%, regardless of the underlying data distribution. That many observations may seem excessive in clean environments or small studies, though, and one would like to “pass” using UTLs based on fewer observations sufficiently far below the OEL or RC. “Quasi-nonparametric” UTLs (QNP UTLs) accomplish this. QNP UTLs assign a “pass” so long as one has “59 [values] less than the RC” (the NPUTL itself), “30 less than 1/2 [of the RC],” “21 less than 1/3,” and on down to “8 less than 1/10,” the last matching a rule-of-thumb given in 2006 American Industrial Hygiene Association (AIHA) guidance. They are derived using the conservative, experience-based assumption that the data distribution is lognormal with log-scale standard deviation σ at most 2.0 (geometric standard deviation at most 7.39). Although based on this assumption, their statistical performance is reasonably unaffected or conservative when data come from other distributions often assumed for contaminant concentrations; moreover, their performance is insensitive to analytical variation. This conservative robustness merits the description “quasi-nonparametric.” QNP UTLs are very easy to use. Reporting Limit (RL) issues do not arise. QNP UTLs reduce the numbers of observations needed to support conservative risk management decisions when sampling from compliant working conditions.

Keywords: censored data, exposure limits, nondetects, reporting limits, uncensored data, upper tolerance limits

Address correspondence to: Charles B. Davis, EnviroStat, 3468 Misty Court, Las Vegas, NV 89120; e-mail: [email protected]

INTRODUCTION

Upper tolerance limits (UTLs) have long been a standard tool for comparing exposure sampling data sets with
an occupational exposure limit (OEL) or other regulatory criterion (RC) to assure enough samples have been collected to demonstrate that working conditions are compliant. A UTL is an upper confidence limit for a percentile of the distribution of all possible measurements; typically the percentile (content) and confidence level are both set at 95%. If the 95%-95% UTL does not exceed the RC, one is 95% sure that no more than 5% of exposures exceed that RC, in which case the evaluation receives a “pass.” Discussions, along with methods for computing UTLs, may be found in National Institute for Occupational Safety and Health (NIOSH) and American Industrial Hygiene Association (AIHA) publications such as those written by Leidel, Busch, and Lynch(1) and Ignacio and Bullock.(2) See also Tuggle(3) and Krishnamoorthy and Matthew(4) for statistical discussions of UTLs. A brief discussion of other decision approaches is presented at the end of the Discussion section. Parametric methods for computing UTLs are generally avoided if measurement results are predominantly below a reporting limit (RL) or if their distribution cannot be assumed to comply with parametric assumptions. (2)Nonparametric UTLs (NPUTLs) do not rely on assumptions about the distribution of measurements; one needs only a representative sampling of the population of all possible measurements. A 95%-95% NPUTL is the largest of N = 59 to 92 observations, the second largest of 93 to 123, and so on. But 59 observations may seem excessive for particularly clean environments or small studies. One would like to be able to use fewer observations and “get credit” for how much lower they are than the RC. The AIHA(2) provides a rule-of-thumb that if one obtains 6 to 10 values which are all less than 10% of the RC, one can feel comfortable that no more than 5% of all measurements would exceed the RC. 
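The sample sizes quoted above follow from a binomial argument: the k-th largest of N observations falls below the 95th percentile only when fewer than k observations exceed that percentile. The short Python sketch below illustrates this; it is an illustration added here rather than the authors' code, the function names are hypothetical, and it assumes independent observations from a continuous distribution.

```python
from math import comb


def nputl_confidence(n, k=1, content=0.95):
    """Confidence that the k-th largest of n observations is at least
    as large as the `content` quantile of the sampled distribution."""
    tail = 1.0 - content
    # The k-th largest value is below the quantile only when fewer than
    # k of the n observations exceed the quantile (a binomial event).
    miss = sum(comb(n, j) * tail**j * content**(n - j) for j in range(k))
    return 1.0 - miss


def smallest_n(k=1, content=0.95, confidence=0.95):
    """Smallest n for which the k-th largest value is a valid NPUTL."""
    n = k
    while nputl_confidence(n, k, content) < confidence:
        n += 1
    return n


for k in (1, 2, 3):
    print(k, smallest_n(k))
```

Under these assumptions the search returns 59 for the largest value and 93 for the second largest, matching the ranges quoted above.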
The “quasi-nonparametric” UTLs (QNP UTLs) presented in this article provide a statistical framework for relating that rule-of-thumb to the NPUTL and filling the gap between N ≈ 8 and N = 59. They are derived using the assumption that the distribution of all possible data values is lognormal (LN) with (natural) log-scale standard deviation (σ) no greater than 2.0, an upper bound based on the authors’
experience and that of others with environmental contaminant data, as discussed in (online) Supplement 2. A convenient way to view QNP UTLs is to think of assigning a “pass” so long as one has “59 [values] less than the RC” (the NPUTL itself), “30 less than 1/2 [of the RC],” “21 less than 1/3,” “16 less than 1/4,” and so on, down to “8 less than 1/10”; the last corresponds to that AIHA rule-of-thumb. This article first describes QNP UTLs and their use. Since QNP UTLs are the largest values observed, RL issues do not arise, so long as the RLs themselves do not exceed the desired fraction of the RC. QNP UTLs are quite simple to use, needing only a multiplication and Table I. The article then evaluates the statistical performance of QNP UTLs with simulated data from a variety of distributions, and finds their performance to be reasonably insensitive or conservative relative to the assumptions used to develop them, hence justifying the “quasi-nonparametric” description. Typical parametric UTL methods based on LN (or perhaps other) assumptions about the distribution of contaminant concentrations give no consideration to the additional random contribution due to analytical variation, which can be non-negligible in relatively clean situations, rendering those distribution models at best somewhat simplistic. The development of QNP UTLs was motivated by a study of uncensored data obtained during clearance surveys of removable surface contamination. (Censored data are a mixture of actual values and “< RL” values; uncensored data report the actual measurements for every analysis.) Typical RLs would be 25% to 10% of the RC in that setting. Due to analytical variation many of the values (up to around half) obtained in relatively clean facilities were negative; in common data reporting practice these would always be hidden beneath RLs. 
Negative values violate the statistical assumptions almost always used in deriving parametric censored-data UTL procedures, such as those evaluated by Davis.(5) In evaluating statistical approaches for those uncensored data, Davis, Field, and Gran(6) found that violating those assumptions in that way can adversely affect the statistical performance of commonly recommended parametric censored-data UTL procedures. The ability of the QNP UTL methodology to take credit for the distance between the RL and the RC leads to other industrial hygiene applications. Whenever an OEL or RC is ten times the RL or less, exposure monitoring results from a compliant workplace are likely to be predominantly below a RL. New, lower OELs that have yet to be accommodated by the development and adoption of improved sampling and analytical methods are candidates for the application of the QNP UTL method. Validating the effectiveness of recommended controls for non-routine construction or maintenance tasks with limited opportunities for exposure monitoring is another situation in which QNP UTLs could prove useful. The principles underlying the QNP UTL also improve the technical basis for the long-standing industrial hygiene practice of using preliminary monitoring results that are consistently less than 10% of the OEL to assign an operation a low priority for additional monitoring.

QNP UTLS

The idea behind QNP UTLs was suggested by Ed Frome of Oak Ridge National Laboratories around 2007. They were first investigated by Davis, Field, and Gran(6) as one possible way of dealing with the actual distributions of uncensored environmental contaminant data including negative values. One starts with the common assumption that the distribution of measurements belongs to the LN family, with an upper bound for the shape parameter σ.

Derivation

Step 1: Relating Content to N at 95% Confidence

When the NPUTL is the largest data value, the relationship between content (percentile) and confidence is the following:

$$(\text{content})^{N} = 1 - \text{confidence}. \tag{1a}$$

This comes from the following: if $x_p$ is the 100pth percentile of a distribution, and one has N independent observations $X_i$ from that distribution,

$$\Pr(\text{all } X_i < x_p) = \Pr(\text{NPUTL} = \max X_i < x_p) = p^{N}. \tag{1b}$$

Therefore the chance that the largest sample value is at least as large as the 100pth percentile is $1 - p^{N}$. With 95% confidence and N = 59, we have content = 0.9505, slightly higher than the target 95th percentile, so the N = 59 NPUTL is slightly conservative. Then, keeping the confidence level at 95%, as N decreases the content decreases. With N = 45, content = 0.9356; with N = 25, content = 0.8871; with N = 8, content = 0.6877; and so on. This last states that the largest of N = 8 observations is a NPUTL for the 68.77th percentile, still with 95% confidence. Step 1 depends on no distribution assumptions. See chapter 8 of Krishnamoorthy and Matthew(4) for more detail on NPUTLs.

Step 2: Ratios of Percentiles

The next step is to devise a way of taking credit for how far the highest observed value is below the RC. This is where the LN assumption comes in. If z is a given percentile for the standard normal distribution, that percentile for a LN distribution is $\exp(\mu + z\sigma)$, where μ and σ are the mean and standard deviation of the normal distribution of logs of the data. For illustration, consider the N = 8 case. The interest is in the 95th percentile, and the largest of 8 values is a NPUTL for the 68.77th percentile. The normal distribution “z-scores” for these percentiles are 1.64485 and 0.48922 respectively. The ratio between these is

$$\text{RATIO} = \frac{\text{95th percentile}}{\text{68.77th percentile}} \tag{2a}$$

$$= \frac{\exp(\mu + 1.64485\,\sigma)}{\exp(\mu + 0.48922\,\sigma)} = \exp\big((1.64485 - 0.48922)\,\sigma\big). \tag{2b}$$

For other N, replace 0.48922 with the z-score for the percentile obtained in Step 1.

Step 3: Using the Upper Bound on σ

Assuming that σ ≤ 2.0, RATIO is no more than 10.09 ≈ 10; the 95th percentile is no more than about 10 times the 68.77th percentile. From Step 1 one is 95% confident that the largest of N = 8 observations is at least as large as the 68.77th percentile. Putting these together, one is 95% confident that RATIO times the largest of N = 8 observations is at least as large as the 95th percentile so long as σ is no greater than 2.0. The QNP UTL is then RATIO times the largest value. Another way to express the resulting decision rule is to say that the evaluation will “pass” so long as the largest value is no larger than RC / RATIO. This agrees nicely with that AIHA rule of thumb (≈ 8 values less than 1/10 of the RC warrants a “pass”). In some applications it is convenient to think of RC / RATIO as a “test critical value” (TCV%). One can repeat the preceding calculation for any value of N; the results are given in Table I for N ≤ 59. (For N ≥ 59 the NPUTL itself is available.) The formulas are as follows:

$$\text{content} = 0.05^{1/N}, \text{ to give the desired 95\% confidence;} \tag{3a}$$

$$z = \text{normal distribution z-score corresponding to that content (the EXCEL function is NORMSINV(content));} \tag{3b}$$

$$\text{RATIO} = \exp\big((1.64485 - z)\,\sigma\big), \text{ where } \sigma = 2.0 \text{ for QNP UTLs;} \tag{3c}$$

$$\text{TCV\%} = \text{test critical value as a percent of the RC for that } N = 1/\text{RATIO;} \tag{3d}$$

$$X_{\max} = \text{larger of (largest data value, largest RL);} \tag{3e}$$

$$\text{QNP UTL} = \text{RATIO} \ast X_{\max}. \tag{3f}$$

The last of these says that to obtain the QNP UTL, multiply the largest data value or largest RL, whichever is larger, by RATIO. The value 1.64485 in (3c) is the z-score for 95% content; when N = 59 we have RATIO ≈ 1.0, so QNP UTL ≈ NPUTL when N = 59.

TABLE I. QNP UTL Factors
Compare RATIO ∗ largest value with RC, or compare largest value with TCV% ∗ RC

  N   RATIO    TCV%   |    N   RATIO    TCV%
  8   10.09    9.9%   |   34    1.71   58.5%
  9    8.52   11.7%   |   35    1.66   60.3%
 10    7.36   13.6%   |   36    1.61   62.0%
 11    6.47   15.5%   |   37    1.57   63.8%
 12    5.76   17.4%   |   38    1.53   65.6%
 13    5.19   19.3%   |   39    1.49   67.3%
 14    4.73   21.2%   |   40    1.45   69.1%
 15    4.34   23.1%   |   41    1.41   70.8%
 16    4.00   25.0%   |   42    1.38   72.6%
 17    3.72   26.9%   |   43    1.35   74.3%
 18    3.47   28.8%   |   44    1.32   76.0%
 19    3.26   30.7%   |   45    1.29   77.7%
 20    3.07   32.6%   |   46    1.26   79.4%
 21    2.90   34.5%   |   47    1.23   81.1%
 22    2.75   36.4%   |   48    1.21   82.8%
 23    2.61   38.3%   |   49    1.18   84.5%
 24    2.49   40.1%   |   50    1.16   86.2%
 25    2.38   42.0%   |   51    1.14   87.8%
 26    2.28   43.9%   |   52    1.12   89.5%
 27    2.19   45.7%   |   53    1.10   91.2%
 28    2.10   47.6%   |   54    1.08   92.8%
 29    2.02   49.4%   |   55    1.06   94.5%
 30    1.95   51.2%   |   56    1.04   96.1%
 31    1.89   53.0%   |   57    1.02   97.7%
 32    1.82   54.9%   |   58    1.01   99.3%
 33    1.76   56.7%   |   59    0.99  101.0%

For 59 ≤ N ≤ 92 the 95%-95% NPUTL is the largest value, and so on.

The TCV% information may be displayed graphically, as in Figure 1. It is convenient to identify various combinations in a manner resembling the AIHA rule-of-thumb, using descriptions such as “21 less than 1/3,” “30 less than 1/2,” and so on.

FIGURE 1. Test critical values as percents of the RC. The notation “24 < 2/5,” for example, indicates that a “pass” is declared if all of N = 24 data values are at most 40% (2/5) of the RC (OEL).

Another useful way of viewing this result is the following:

$$\text{95\% UCL on proportion of measurements exceeding RC or OEL} = 1 - \text{p-value of}\,\{\ln(\mathrm{RC}/X_{\max})/2 + z\}, \tag{4}$$

where z is the value obtained in Eq. (3b) and p-value is the usual left-tail normal probability (the EXCEL function is NORMSDIST). Eq. (4) is derived by setting QNP UTL equal to RC (the borderline case), letting $z_b$ replace 1.64485 in Eq. (3c), solving for $z_b$, and from that obtaining the exceedance proportion that would make the QNP UTL comparison borderline. Using this approach, one would assign a “pass” if the 95% UCL on exceedance proportion is at most 5%.

Using QNP UTLs

The information in Table I can be used in two equivalent ways once data are available: either multiply the largest data value by RATIO to obtain the QNP UTL to compare with the RC; or compare the largest observed data value with RC multiplied by TCV%. The advantage of expressing the result as the QNP UTL itself is its direct comparison with the RC.

Planning Numbers of Measurements

Another use of Table I is in planning a study. If one anticipates that the highest value likely to be seen in a forthcoming data collection effort would be, say, at most 45% of a pertinent RC, one could plan on obtaining N = 28 or so observations (possibly a few more) to attempt to obtain a “pass.” If the anticipated largest value would be 75% of the RC, N = 44 or so would be advisable, and so on.
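Equations (3a) to (3f) and Eq. (4) translate directly into a few lines of code. The following Python sketch is an illustration added here, not the authors' implementation; the function names are hypothetical, `statistics.NormalDist` stands in for the EXCEL functions NORMSINV and NORMSDIST, and allowing a general `sigma` in Eq. (4), where the article divides by 2, is an assumption made for symmetry with Eq. (3c).

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()            # standard normal, mean 0 and sd 1
Z95 = STD_NORMAL.inv_cdf(0.95)       # about 1.64485, the z-score for 95% content


def content_for_n(n, confidence=0.95):
    """Eq. (3a): content covered by the largest of n values."""
    return (1.0 - confidence) ** (1.0 / n)


def ratio(n, sigma=2.0):
    """Eqs. (3b)-(3c): multiplier applied to the largest value."""
    z = STD_NORMAL.inv_cdf(content_for_n(n))
    return math.exp((Z95 - z) * sigma)


def tcv(n, sigma=2.0):
    """Eq. (3d): test critical value as a fraction of the RC."""
    return 1.0 / ratio(n, sigma)


def qnp_utl(values, largest_rl=0.0, sigma=2.0):
    """Eqs. (3e)-(3f): RATIO times the larger of max(values) and the largest RL."""
    x_max = max(max(values), largest_rl)
    return ratio(len(values), sigma) * x_max


def exceedance_ucl(n, x_max, rc, sigma=2.0):
    """Eq. (4): 95% UCL on the proportion of measurements above the RC.
    Requires x_max > 0 because of the logarithm."""
    z = STD_NORMAL.inv_cdf(content_for_n(n))
    return 1.0 - STD_NORMAL.cdf(math.log(rc / x_max) / sigma + z)


for n in (8, 16, 30, 59):
    print(f"N={n:2d}  RATIO={ratio(n):5.2f}  TCV%={100 * tcv(n):5.1f}")
```

The printed rows should agree with the corresponding rows of Table I up to rounding. In the borderline case, where the largest value equals TCV% times the RC, `exceedance_ucl` returns 5%, which is the "pass" threshold described above.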

An Important Subtlety

An advantage of using TCV% is that, in implementation, an initial data set’s highest value may exceed (TCV% × RC) but not RC itself, so one cannot yet declare either a “pass” or a “fail.” Nonetheless, the comparison does suggest how many additional observations might provide enough evidence to reject the null hypothesis and declare the desired “pass.” For example, with RC = 0.2 and an initial data set of N = 30 observations, TCV% = 51.2%, so one compares the maximum value with 51.2% × 0.2 = 0.1024. If the maximum value in the data set is at most 0.1024, the area under investigation (“survey unit” or SU) “passes,” and no further sampling is needed. If the maximum value exceeds the RC itself, the SU “fails” (even with additional data), and corrective action or further investigation is needed. But if the maximum value is between 0.1024 and 0.2, one may obtain additional observations in an effort to “pass.” For example, suppose that the maximum value is 0.12, 60% of the RC; comparing with Table I, one might then obtain 5 (perhaps a few more) additional observations, for a total N of 35 (or a few more), in a next attempt to “pass.” This iterative strategy is consistent with waste characterization strategies given in the U.S. Environmental Protection Agency’s (EPA’s) SW-846(7) regulation regarding testing wastes for toxicity, for example.

Reporting Limit Considerations

Regarding RLs, the only requirement is that the largest RL should be no greater than the desired fraction of the RC. For example, with RC = 0.2 and RL = 0.05, RL / RC = 25.0%, so one would need N ≥ 16 to carry out the QNP UTL procedure. This consideration could enter into negotiations with laboratories regarding setting RLs in some circumstances.

Statistical Hypothesis Testing via UTLs

Comparing any UTL with a RC this way implements the following statistical hypothesis test:

H0 : 95th percentile > RC (Null Hypothesis = “dirty”)
HA : 95th percentile ≤ RC (Alternate Hypothesis = “clean”)

H0 is assumed true until there is sufficient evidence to reject it in favor of HA; the SU is assumed “dirty” until there is enough evidence to declare it “clean” via hypothesis testing. The target error rate for making that “clean” decision incorrectly is the significance level, which is 5% (100% less the confidence level). An acceptable UTL procedure should maintain at most that target significance level. Given that, one would prefer UTL procedures for which the chance of rejecting H0 when the 95th percentile is actually less than the RC is greater than that of competing procedures. This “success” rate is known as statistical power. The objectives of the previous paragraph should be met when distribution assumptions are satisfied, of course. Another desirable property for UTL (or other) statistical procedures is that they perform reasonably well when the distributions differ from those assumed. QNP UTLs are found to be adequately “robust” in this sense in the following section.
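The three possible outcomes of the test described above (“pass,” “fail,” or obtain more observations) can be sketched as a small decision function. This Python sketch is illustrative only: `qnp_decision` is a hypothetical name, the data are made up to mirror the RC = 0.2, N = 30 example, and the factors are computed unrounded, so results at a table boundary can differ by one observation from a reading of the rounded Table I.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()
Z95 = STD_NORMAL.inv_cdf(0.95)


def ratio(n, sigma=2.0, confidence=0.95):
    """Table I RATIO for n observations (Eqs. 3a-3c)."""
    z = STD_NORMAL.inv_cdf((1.0 - confidence) ** (1.0 / n))
    return math.exp((Z95 - z) * sigma)


def qnp_decision(values, rc, largest_rl=0.0):
    """Return (decision, total_n) for one survey unit.

    decision is "pass", "fail", or "collect more"; total_n is the total
    number of observations at which the current maximum would pass.
    """
    n = len(values)
    x_max = max(max(values), largest_rl)
    if x_max > rc:
        return "fail", None              # more data cannot lower the maximum
    total_n = n
    # RATIO shrinks as n grows and is below 1 at n = 59, so this loop
    # ends by n = 59 whenever x_max <= rc.
    while ratio(total_n) * x_max > rc:
        total_n += 1
    return ("pass" if total_n == n else "collect more"), total_n


# Hypothetical survey unit: 30 values, the largest being 60% of RC = 0.2.
values = [0.03] * 29 + [0.12]
print(qnp_decision(values, rc=0.2))
```

For this input the sketch reports that more observations are needed and that a total of 35 would suffice if no later value exceeds 0.12, in line with the worked example above.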


EVALUATING THE PERFORMANCE OF QNP UTLS

Statistical Distributions and Analytical Variation

The statistical distribution most often assumed for environmental contaminant data is lognormal (LN); for example, all of Leidel, Busch, and Crouse,(9) Leidel, Busch, and Lynch,(1) Tuggle,(3) Gilbert,(10) Davis,(5) Davis, Field, and Gran,(6) and numerous EPA guidance documents follow this convention. LN distributions are characterized by a shape parameter σ and a scale parameter $e^{\mu}$, where μ and σ are the mean and standard deviation of the normal distribution of (natural) logs of data values. Other distribution families sometimes used are gamma (GA) and Weibull (WE). Each of these also has a shape parameter and a scale parameter. All three distribution families are designed for nonnegative data, although one can carefully shift the left endpoint if needed. All have shapes which range from symmetric and nearly normal to extremely skewed with very long right tails. The skewness increases with increasing σ for the LN family and with decreasing shape parameter for the GA and WE families. Supplement 1 describes the distributions used in this evaluation in greater detail.

A significant complication arises when one considers analytical variation. Even though the distributions of concentrations themselves may be LN (or GA or WE), analytical variation adds to the variability of measurements. Based on their experience with uncensored data (data reported without reference to RLs, including even negative values) from the Department of Energy Nevada Site Office (DOE/NV) Worker Environment Beryllium Characterization Study,(8) Davis, Field, and Gran(6) suggest that normal distribution models are appropriate for the analytical variation. This also is discussed in further detail in Supplement 1.
Accordingly, in their 2009 study Davis, Field, and Gran(6) used several LN distributions with more or less skewed distributions of concentrations, to which are added varying amounts of normally distributed analytical variation. The QNP UTL concept was introduced in that study as one possible way of dealing with the actual observed data distributions. Of particular interest is the question of the robustness of the UTL with respect to analytical variation, the potential impact of which is typically ignored in both publication and practice. Of the UTL procedures they evaluated, the simplest to implement and least affected by analytical variation is the QNP UTL. The results presented in Supplement 3 and summarized here extend those earlier findings of Davis, Field, and Gran by including additional LN distributions as well as GA and WE models and performing a much more comprehensive evaluation of significance levels and statistical power.

As a quick example, Figure 2 taken from Supplement 3 shows four distributions, being LN 2.00 with 95th percentile equal to 0.05 (a quarter of the RC for the Be surveys), with varying amounts of analytical variation added. Even though the distribution of concentrations is non-negative, analytical variation can add a substantial proportion of negative values, even with a good lab (“B” in Figure 2). In Supplement 2 this is called a LogNormal-Normal Model (LNNM), for concentrations having a lognormal distribution with analytical variation having normal distributions dependent on concentration; some properties of the LNNM are discussed there. In Supplement 2 some 9579 Be measurements from the NV study(8) are pooled into 52 data sets, with the LNNM being fit to each, toward the end of obtaining fitted values of σ . Figure 3 shows the data from data set 11, along with its fitted LNNM. In this data set there are 190 measurements, of which 37 are negative and 10 are above a typical low RL of 0.02 (10% of the RC).

FIGURE 2. Distributions of measurements obtained by adding varying amounts of normally distributed analytical variation to a LN concentration distribution with conservative shape parameter σ = 2.0. In each case the 95th percentile is 0.05.

FIGURE 3. Measurements from Data set 11 of Supplement 2, along with the fitted Lognormal-Normal Model (LNNM).

The Simulation Study

The statistical hypothesis testing objectives presented above were investigated for QNP UTLs through a large-scale simulation study, with 10,000 samples randomly generated for each combination of these factors:

• the shape of the underlying distribution of concentrations (parameterized by σ for the LN family and by the shape parameter for the GA and WE families);
• different amounts of analytical variation (A = none, B = good, C = fair, and D = poor);
• scale parameters ($e^{\mu}$ for LN, the scale parameter for GA and WE) adjusted to make the 95th percentiles equal to RC = 0.20 (to assess significance level stability) as well as 0.15, 0.10, 0.05, and 0.02 (to assess statistical power); and
• numbers of observations N equal to 10, 15, 20, 30, 45, and 59.

Results: Significance Levels

The results detailed in Supplement 3 are summarized here and in the next sub-section. First, for all distributions the significance levels approach the nominal 5% as N approaches 59. This is expected since as N → 59 the QNP UTL becomes the NPUTL, with a significance level that does not depend on the distribution. For smaller N the significance levels do vary by distribution, as anticipated. It is desirable for significance levels not to exceed their nominal value, in this case 5%. For LN 2.0, the distribution used in deriving QNP UTLs, they remain close to 5%. For LN 2.3 (very conservative) they rise to around 8% for the smallest N evaluated (10), and are below around 6% for N at least 30. For LN σ with less conservative (smaller) values of σ the significance level decreases to around 1% in the worst case evaluated. Significance levels with the other distributions are similar, with the exception of the less skewed GA and WE distributions, for which the significance levels become virtually zero for N ≤ 30 and are quite low for N = 45. This raises concerns that the statistical power (chance of correctly rejecting the null hypothesis and finding the situation “clean”) might correspondingly be low. On the other hand, it is very promising that the significance levels found in the simulation study are virtually unaffected by the amount of analytical variation added, so long as it is “A,” “B,” or “C,” and not that much affected even by poor lab performance (“D”).

Results: Statistical Power

Again as expected, statistical power (the chance of obtaining a “pass”) is in the vicinity of 5% when the 95th percentile is actually the RC in all cases; after all, this is the significance level. Also as anticipated, the power increases in every case as N increases. Again, it is very promising that the empirical powers found in the simulation study are virtually unaffected by the amount of analytical variation added, so long as it is “A,” “B,” or “C,” and not that much affected even by poor lab performance (“D”).

Otherwise, the power curves are rather similar for the various LN σ distributions for all N. For N = 45 and N = 59 the power curves for GA and WE distributions are above that for the LN 2.0 distribution for all 95th percentiles. That remains true for smaller N, with the exception of the most skewed GA and WE distributions, for which the power is rather low until the 95th percentile is substantially below the RC. A caveat is in order in interpreting this information. That is the “important subtlety” discussed previously. The statistical power is the chance of rejecting the null hypothesis with the available data. If the available data are insufficient to reject the null hypothesis and declare a “pass,” but the largest value is at most the RC, one may obtain additional observations in an attempt to “pass,” as discussed in the “sample size planning” discussion previously.
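For the simplest case in this evaluation, lognormal concentrations with no analytical variation, the chance of a “pass” can be written in closed form and checked against a small Monte Carlo run. The Python sketch below is an illustration added here and is not the simulation code behind Supplement 3; the function names are hypothetical, the GA and WE families are not covered, and analytical variation is omitted because its model is specified only in Supplement 1.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()
Z95 = STD_NORMAL.inv_cdf(0.95)


def z_for_n(n, confidence=0.95):
    """z-score of the content covered by the largest of n values."""
    return STD_NORMAL.inv_cdf((1.0 - confidence) ** (1.0 / n))


def pass_probability(n, sigma, p95, rc=0.20, sigma_bound=2.0):
    """Closed-form chance that all n lognormal values are <= RC / RATIO."""
    log_limit = math.log(rc) - sigma_bound * (Z95 - z_for_n(n))  # ln(RC / RATIO)
    mu = math.log(p95) - Z95 * sigma      # places the true 95th percentile at p95
    return STD_NORMAL.cdf((log_limit - mu) / sigma) ** n


def simulated_pass_rate(n, sigma, p95, rc=0.20, sigma_bound=2.0,
                        reps=10_000, seed=1):
    """Monte Carlo estimate of the same probability."""
    rng = random.Random(seed)
    limit = rc / math.exp(sigma_bound * (Z95 - z_for_n(n)))
    mu = math.log(p95) - Z95 * sigma
    passes = 0
    for _ in range(reps):
        largest = max(rng.lognormvariate(mu, sigma) for _ in range(n))
        passes += largest <= limit
    return passes / reps


# Significance level: true 95th percentile equal to the RC.
print(pass_probability(10, 2.0, 0.20), simulated_pass_rate(10, 2.0, 0.20))
# Power: true 95th percentile at one quarter of the RC.
print(pass_probability(20, 2.0, 0.05), simulated_pass_rate(20, 2.0, 0.05))
```

With σ = 2.0 and the 95th percentile at the RC, the closed form equals 5% for every N by construction. With σ = 2.3 and N = 10 it gives between 8% and 9%, consistent with the “around 8%” reported above for LN 2.3.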

DISCUSSION

The QNP UTL procedure is extremely easy to use. It conservatively implements the 95%-95% UTL decision rule often used for comparison with OELs, RCs, or other ceiling standards. Although developed assuming a LN distribution with geometric standard deviation (GSD) no greater than 7.39 (σ ≤ 2.0), its statistical performance is adequately robust or conservative when evaluated under other distribution families commonly used for environmental contaminant data. Moreover, its statistical performance is at most mildly affected by the nature of analytical variation, unlike that of other UTL procedures which make greater use of distributional assumptions, particularly those based on censored-data estimation techniques which ignore the possibility of negative measurements; those do occur where low levels of contamination are present, though these are nearly always hidden below RLs. This procedure is developed assuming a LN distribution with σ = 2.0 and found to be conservative with LN distributions with σ < 2.0. Empirical evidence supporting this bound is presented in Supplement 2. If in a particular setting one might be able to suggest a credible smaller upper bound for σ, one could redevelop Table I to provide less conservative QNP UTLs with better statistical power. One should be wary, though; it is difficult to reliably determine σ from empirical datasets for two reasons. The first is that the observable effects of σ on the distribution are seen largely in measurements in the upper tails of the distribution, which are rare. The second is that values near zero can affect estimates of σ, since as data values approach zero, their logs go off to negative infinity. These are most often hidden beneath RLs; if observed, their variability would reflect low-level analytical variation as well. The statistical performance of the QNP UTL procedure is investigated in Supplement 3, which is summarized in the preceding section.
It maintains the target 95% confidence reasonably conservatively, and provides reasonable statistical power, when evaluated using a variety of other distributions.
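To make the remark about redeveloping Table I concrete, the sketch below recomputes the required N under a smaller bound on σ. It is an illustration only: the GSD bound of 4.0 is a hypothetical value chosen for the example and is not recommended by the article, the function names are likewise hypothetical, and the cautions above about estimating σ still apply.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()
Z95 = STD_NORMAL.inv_cdf(0.95)


def sigma_from_gsd(gsd):
    """Log-scale standard deviation for a geometric standard deviation."""
    return math.log(gsd)


def tcv(n, sigma_bound, confidence=0.95):
    """Test critical value (fraction of the RC) under a given sigma bound."""
    z = STD_NORMAL.inv_cdf((1.0 - confidence) ** (1.0 / n))
    return math.exp(-(Z95 - z) * sigma_bound)


def n_required(fraction, sigma_bound, n_max=59):
    """Smallest n (up to n_max) whose TCV is at least `fraction` of the RC."""
    for n in range(1, n_max + 1):
        if tcv(n, sigma_bound) >= fraction:
            return n
    return None


for gsd in (7.39, 4.0):          # 4.0 is a hypothetical tighter bound
    sigma = sigma_from_gsd(gsd)
    print(gsd, round(sigma, 2), n_required(1 / 3, sigma), n_required(1 / 2, sigma))
```

With the article's bound (GSD 7.39, σ ≈ 2.0) this reproduces the “21 less than 1/3” and “30 less than 1/2” combinations; under the hypothetical tighter bound fewer observations are needed for the same fractions.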


Other UTL Approaches

In the application envisioned in this article, a UTL no greater than the OEL gives a “pass.” In addition, typically all data values must not exceed the OEL; an analyst would feel uncomfortable assigning a “pass” if any value exceeds the OEL. This additional requirement is automatically satisfied with the QNP UTL but not with other UTL approaches.


Uncensored Data

If one had uncensored data with no negative values, one could use the straightforward method given by Gilbert(10) in his §11.3 with log data, then exponentiating the log-scale UTL. With uncensored data including negative measurements, see the exploratory discussions in Davis, Field, and Gran.(6)
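The log-scale calculation mentioned above can be sketched as follows, assuming the usual normal-theory form (sample mean plus a tolerance factor k times the sample standard deviation, both on the log scale); whether this matches Gilbert's presentation in every detail is an assumption. The function name and the measurements are hypothetical, and the factor k is taken from a published table of one-sided normal tolerance factors rather than computed.

```python
import math
from statistics import mean, stdev


def lognormal_utl(values, k):
    """Exponentiated log-scale UTL: exp(mean(ln x) + k * sd(ln x)).

    `k` is the one-sided normal tolerance factor for the sample size,
    content, and confidence in use; it is supplied by the caller from a
    published table rather than computed here.
    """
    logs = [math.log(x) for x in values]   # requires all values > 0
    return math.exp(mean(logs) + k * stdev(logs))


# Hypothetical measurements, for illustration only.
sample = [0.012, 0.020, 0.008, 0.031, 0.015, 0.010, 0.024, 0.018, 0.009, 0.014]
K_95_95_N10 = 2.911    # tabulated one-sided 95%/95% factor for n = 10
print(lognormal_utl(sample, K_95_95_N10))
```

Unlike the QNP UTL, this limit depends on the estimated log-scale standard deviation, and it cannot be computed when any value is zero or negative.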

Censored Data But if data are censored, there are other parametric statistical techniques available, as discussed by Davis,(5) Davis, Field, and Gran,(6) and others. If one expects that analytical variation should be relatively negligible and one is willing to assume that the data distribution is LN, Davis(5) recommended using UTLs based on censored-data maximum likelihood estimates (cMLEs). These ignore the possibility of negative measurements. One can of course use the cMLE (or other censored-data parametric) methods anyway. For a quick comparison, Figure 4 shows the same data and fitted LNNM as Figure 3, adding the model implicitly fit by the cMLEs when the data are censored at 0.02. With these data, though, the proportion of negative values (19%) is not negligible, and the fitted LNNM model clearly fits the data better. Table II compares a few summary statistics for the empirical distribution, the fitted LNNM, and the fitted cMLE distribution. How UTLs based on the cMLE methodology perform is a different question, though. Davis, Field, and Gran(6) include a small performance evaluation comparing the cMLE method with the QNP UTL method and another method. Based on their limited simulation results they recommend against using the proportion of “detects” in a particular data set in deciding whether to use the cMLE methodology. They do point out that the statistical properties of the cMLE-based UTLs can be sensitive to the interplay between the RL and the nature of analytical variation when the possibility of negative values hidden below the RL cannot be dismissed. Data set 11 shown in Figure 4 has about 5% “detects” if RL = 0.02. Conventional wisdom might suggest that this is too low to use parametric procedures. 
This conventional wisdom has in the past had the ironic result that one might be able to find that a mildly “dirty” facility could “pass,” using small data sets and censored-data parametric UTLs with “enough” detects, whereas a cleaner facility would need the larger data set (N = 59) required by the NPUTL. This issue does not arise with QNP UTLs, which use only the largest data value or RL.
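The N = 59 figure for the NPUTL follows from a standard order-statistic argument: the largest of N independent results is at or above the 95th percentile with probability 1 − 0.95^N, and N = 59 is the smallest sample size for which this reaches 95%. A short sketch (function names are illustrative):

```python
import math

def max_exceeds_quantile_confidence(n, content=0.95):
    """Confidence that the largest of n independent results is at or above
    the `content` quantile of the sampled distribution."""
    return 1.0 - content ** n

def min_n_for_nputl(content=0.95, confidence=0.95):
    """Smallest n for which the sample maximum is a (content, confidence) UTL."""
    return math.ceil(math.log(1.0 - confidence) / math.log(content))

print(min_n_for_nputl())  # 59
print(max_exceeds_quantile_confidence(58), max_exceeds_quantile_confidence(59))
```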

FIGURE 4. As in Figure 3, with a lognormal distribution fitted by censored-data maximum likelihood added. This would be the fitted distribution if a LN model were assumed and all values less than 0.02 were simply reported “< 0.02.” The cMLE model assumes that the negative values are between 0 and RL.
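To make the last point of the caption concrete: in a censored-data lognormal likelihood, each result reported as "< RL" contributes only the probability of falling between 0 and RL, so any negative value hidden below the RL is treated as a small positive one. The sketch below writes out that log-likelihood for a single RL. It is the textbook form, not necessarily the exact procedure of Davis(5); the function name, data, and candidate parameters are hypothetical.

```python
import math
from statistics import NormalDist

def censored_lognormal_loglik(mu, sigma, detects, n_censored, rl):
    """Log-likelihood of a lognormal(mu, sigma) model with left-censoring at rl.

    detects    : values reported at or above the reporting limit
    n_censored : number of results reported only as "< rl"
    """
    std = NormalDist()
    ll = 0.0
    for x in detects:
        z = (math.log(x) - mu) / sigma
        # Lognormal density of a detected value: phi(z) / (x * sigma).
        ll += math.log(std.pdf(z)) - math.log(x * sigma)
    # Each censored result contributes P(0 < X < rl) = Phi((ln rl - mu) / sigma).
    z_rl = (math.log(rl) - mu) / sigma
    ll += n_censored * math.log(std.cdf(z_rl))
    return ll

# Hypothetical data: 3 detects out of 60 results (5%), RL = 0.02.
detects, n_censored, rl = [0.021, 0.025, 0.034], 57, 0.02
ll_low_median = censored_lognormal_loglik(math.log(0.003), 1.2, detects, n_censored, rl)
ll_median_at_rl = censored_lognormal_loglik(math.log(0.02), 1.0, detects, n_censored, rl)
print(ll_low_median, ll_median_at_rl)
```

A cMLE fit maximizes this function over mu and sigma; with so few detects, the censored term dominates, which is why the treatment of values below the RL matters.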

Horn’s Quasi-Nonparametric Upper Tolerance Region A reviewer has pointed out that Horn(11) used the term “quasi-nonparametric upper tolerance region” in 1992. This is unrelated to the present work, however. Rather, Horn’s approach separates estimating the quantile (providing the desired content) and providing the confidence. His quantile estimator can be parametric or nonparametric; the confidence comes from multiplying that by a scale factor based on bootstrapped order statistics, which is inherently nonparametric. Using a nonparametric 95th percentile estimator requires at least 19 observations; with N = 19 at least 4 uncensored values are needed for the scale factor. The QNP UTL proposed in this article uses only the largest data value or RL. Another difference between the papers is that Horn’s performance evaluation is based on the reliability of the confidence level and the average ratio between the UTL and the actual quantile, whereas in the present article emphasis is on the statistical significance and power when the UTL is used in comparing data sets with an OEL. Finally, Horn explicitly avoids using upper order statistics because of their undeniable instability, whereas in the application envisioned here one typically cannot ignore the largest data values, as discussed previously.

TABLE II. Data Set 11 Descriptive Statistics

             Mean      StDev     95th percentile
Empirical    0.0039    0.0082    0.0200
LNNM         0.0043    0.0147    0.0177
cMLE 0.02    0.0069    0.0079    0.0205

Other Decision Approaches Leidel, Busch, and Crouse(9) used UTL concepts to develop the regulatory action level as a metric compliance officers could use for comparing a single exposure monitoring result to an OEL. They included a meta-analysis of published estimates of inter-day exposure variation to support their conclusion that a lower bound for GSD is 1.25 (σ = 0.223). This led them to conclude that if the largest of a few (presumably representative) results is more than half the OEL, conditions are most likely non-compliant and monitoring strategies should focus on diagnostic sampling to identify sources requiring additional controls. The QNP UTL is supported by similar meta-analyses giving an upper bound of σ = 2 (GSD = 7.39). Together these can help industrial hygienists interpret results from a few preliminary samples. If the largest of a few representative results is less than 10% or so of the OEL, the monitoring strategy would focus on collecting enough representative samples to reach the specified 95% confidence that fewer than 5% exceed the OEL. Results between these suggest conditions require continuing management attention with ongoing monitoring to provide feedback on whether controls are effective; such additional monitoring can also serve to augment the initial data set as discussed in the previous section, “An Important Subtlety.” RLs greater than 10% of the OEL add uncertainty to this decision process and should motivate a review to see if they can be lowered.
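The GSD and σ values quoted above are related by σ = ln(GSD), and the screening logic for a few preliminary samples can be written as a simple three-way rule. The sketch below is illustrative only: the function names are hypothetical, and the 50% and 10% cut points are taken from the text, where the lower one is described as "10% or so."

```python
import math

def gsd_to_sigma(gsd):
    """Log-scale standard deviation corresponding to a geometric SD."""
    return math.log(gsd)

def sigma_to_gsd(sigma):
    """Geometric SD corresponding to a log-scale standard deviation."""
    return math.exp(sigma)

def triage_largest_result(largest, oel, upper_fraction=0.5, lower_fraction=0.1):
    """Classify the largest of a few representative results against the OEL."""
    ratio = largest / oel
    if ratio > upper_fraction:
        return "likely non-compliant: diagnostic sampling to identify sources"
    if ratio < lower_fraction:
        return "collect enough representative samples to demonstrate compliance"
    return "continuing management attention with ongoing monitoring"

print(round(gsd_to_sigma(1.25), 3))  # 0.223
print(round(sigma_to_gsd(2.0), 2))   # 7.39
print(triage_largest_result(largest=0.03, oel=0.2))
```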

CONCLUSION

QNP UTLs can be used to support conclusions that a sufficient number of sufficiently low monitoring results have been obtained to demonstrate compliance with an OEL or RC. They include the AIHA rule-of-thumb “6 to 10 less than a tenth [of the RC]” at the lower end and the N = 59 NPUTL at the upper end. They are reasonably insensitive to distribution assumptions and the effects of (typically unseen) low-end analytical variation. Another use is as a tool to help in planning sampling campaigns. One can estimate the number of samples needed based on the fraction of the RC that controls are intended to achieve; one may need to ensure that the sampling and analytical methods have a RL that matches that fraction, of course. Using QNP UTLs in this fashion has the potential to decrease the number of samples needed to demonstrate compliance. Their insensitivity to distribution assumptions allows for flexibility in evaluating periodic summaries of aggregated monitoring results grouped by factors relevant to workers and managers of an organization operating a health protection program.

SUPPLEMENTAL MATERIAL

Supplemental data for this article can be accessed at tandfonline.com/uoeh. AIHA and ACGIH members may also access supplementary material at http://oeh.tandfonline.com/.

ACKNOWLEDGMENTS

QNP UTLs were originally developed by EnviroStat under Subcontract 58459 with National Security Technologies LLC (NSTec), under Contract DE-AC52–06NA25946 with the US DOE at the Nevada National Security Site (NNSA) (formerly the Nevada Test Site). These efforts were encouraged by Danny Field, now with US DOE/NNSA, and Tom Gran of NSTec. Subsequent refinements were supported under Contract 43346–004 with Mission Support Alliance LLC, a contractor with the US DOE at the Hanford Site. Careful reviews of previous drafts of this article by Karl Agee of Fowler LLC, supporting US DOE at the Hanford Site, have led to a vastly improved exposition of the material and are greatly appreciated.

REFERENCES

1. National Institute of Occupational Safety and Health (NIOSH): Occupational Exposure Sampling Strategy Manual, by N.A. Leidel, K.A. Busch, and J.R. Lynch (NIOSH 77–173). NIOSH, 1977.
2. Mulhausen, J., J. Damiano, and E.L. Pullen: Further Information Gathering. In A Strategy for Assessing and Managing Occupational Exposures, 3rd ed., J.S. Ignacio and W.H. Bullock (eds.). Fairfax, VA: American Industrial Hygiene Association, 2006.
3. Tuggle, R.M.: Assessment of occupational exposure using one-sided tolerance limits. Am. Ind. Hyg. Assoc. J. 43:338–346 (1992).
4. Krishnamoorthy, K., and T. Mathew: Statistical Tolerance Regions: Theory, Applications and Computation. New York: John Wiley and Sons, 2009.
5. Davis, C.B.: “Parametric 95%-95% Upper Tolerance Limits for Left-Censored Lognormal Data.” Presented at the Joint Statistical Meetings, Seattle, Washington, August 10, 2006.
6. Davis, C.B., D. Field, and T.E. Gran: A Model for Measurements of Lognormally Distributed Environmental Contaminants (Report DOE/NV/25946–733). U.S. Department of Energy, July 2009 (revised Jan. 2012).
7. U.S. Environmental Protection Agency (EPA): Test Methods for Evaluating Solid Wastes, Physical/Chemical Methods (Publication SW846). Alexandria, VA: NTIS, Chapter 9, Box 1, pp. NINE-13–17.
8. National Security Technologies: "Worker Environment Beryllium Characterization Study." Available at http://www.osti.gov/scitech/biblio/970811.
9. National Institute of Occupational Safety and Health (NIOSH): Exposure Measurement Action Level and Occupational Environmental Variability, by N.A. Leidel, K.A. Busch, and W.E. Crouse (NIOSH 76–131). NIOSH, 1975.
10. Gilbert, R.O.: Statistical Methods for Environmental Pollution Monitoring. New York: Van Nostrand Reinhold, 1987. Chapter 13.
11. Horn, P.S.: Quasi-nonparametric upper tolerance regions based on the bootstrap. Commun. Stat. - Theory Meth. 21(12):3351–3367 (1992).
