Mutation Research, 240 (1990) 177-194

177

Elsevier MUTGEN 01519

Methods for comparing Salmonella mutagenicity data sets using nonlinear models W . G r e g o r y A l v o r d a, J e f f r e y H . D r i v e r b, L a r r y C l a x t o n c a n d J o h n P. C r e a s o n c a Data Management Services, Inc., National Cancer Institute, Frederick Cancer Research Facility, Frederick, M D 21701, b RiskFocus Division, V E R S A R Inc., Springfield, VA 22151 and c Health Effects Research Laboratory, U.S. Environmental Protection Agency, Research Triangle Park, NC 27711 (U.S.A.)

(Received 13 January 1989) (Revision received 4 October 1989) (Accepted 10 October 1989)

Keywords: Ames/Salmonella assay; Nonlinear regression; Statistical modelling; Data-set comparison

Summary A variety of linear and nonlinear mathematical models have been proposed to characterize Salmonella mutagenicity data sets, but no systematic procedure has been suggested for comparing two or more data sets across experiments, laboratories, occasions, mutagens or treatment conditions. In this paper, a general method for data-set comparison is provided. Nonlinear regression techniques are applied to real data sets. Data-set and parameter equivalence are described in depth. Confidence-band construction for nonlinear models and other graphical techniques are presented as auxiliary tools. Key Statistical Analysis System (SAS) code programs are provided.

The Salmonella test (Ames et al., 1975) is widely used to screen potentially carcinogenic compounds to which humans are exposed. Typically, logarithmically spaced doses of the suspected mutagen(s) are mixed with a specific tester strain of S a l m o n e l l a t y p h i m u r i u m , and the number of colonies reverting to histidine prototrophy is counted. The results of the assay are generally expressed in terms of the number of revertant colonies per plate.

By acceptance of this article, the publisher or recipient acknowledges the right of the U.S. Government to retain a nonexclusive, royalty-free license in and to any copyright covering the article. Correspondence: Dr. W. Gregory Alvord, Data Management Services, Inc., Building 362, P.O. Box B, Frederick, MD 21701 (U.S.A.).

Several investigators have presented mathematical models to describe Ames mutagenicity test results. Bernstein et al. (1982) assumed that the initial part of the curve contains most of the interpretable information about the mutagenesis dose-response. They described a procedure for eliminating points at high doses such that a straight-line linear model can be fit to the data. Margolin et al. (1981) constructed a family of statistical models that consider mutation and toxicity as competing risks and allow hyper-Poisson variability. These models included parameters that quantify mutagenic rates and plate-to-plate variability. Stead et al. (1981) developed a nonlinear mathematical model that takes into account the spontaneous reversion count, the rate of revertant colony formation and the rate at which the system becomes toxic. By treating the counts as Poisson,

0165-1218/90/$03.50 © 1990 Elsevier Science Publishers B.V. (Biomedical Division)

178 these authors defined the likelihood function and described methods for obtaining maximum likelihood estimates of parameters. Statistical tests based on the likelihood ratio are available with their software and can be used to test hypotheses about the toxicity of a given mutagen. Myers et al. (1981) presented a number of nonlinear models that account for mutagenicity, toxicity and saturation. They used a weighted least-squares regression analysis to fit their models and supplied the reader with useful Statistical Analysis System (SAS) code for some of their models. Snee and Irr (1984) adopted a power transformation for revertant count data to produce a measurement scale on which the experimental errors could be described by a normal distribution. They subsequently applied standard analysis of variance, regression analysis and Student's t tests to analyze Salmonella mutagenicity test results. The models presented vary in mathematical complexity. Many of them are described as nonlinear, meaning that the models are nonlinear in the parameters. The estimation of parameters in nonlinear models involves complexities not encountered in applying linear models. These complexities in turn make it more difficult to compare data sets across experiments, laboratories, occasions, mutagens a n d / o r treatment conditions.

Model development and methods In this paper, we present a general method for comparing several sets of Ames dose-response curves to accommodate numerous potential hypotheses. We use the 3-parameter and 4-parameter models of Stead et al. (1981); however, our general method could be used with other nonlinear models (e.g., Myers et al., 1981) or even linear models. We demonstrate our approach with two examples. In the first, we describe the nonlinear regression analysis of individual and combined data sets, the notion of parameter equivalence and the construction of approximate confidence intervals and confidence bands to aid in the analysis. In the second example, a more complex model is used and the more realistic assumption of nonhomoge-

neous variances is taken into account. Unnecessary mathematical complexity has been avoided in the body of the paper. Key SAS code programs are presented in several appendixes to assist the reader in applying the ideas presented here. Example 1 Comparing models for several treatments The 3-parameter model of Stead et al. (1981) is used in this example. It is written as y =/3 0 + exp[fll + 132 I n ( X ) ]

(1)

where flo = spontaneous reversion count and ill, fl2=parameters that jointly describe the amount and rate of revertant-colony formation with increasing dose. 3 sets of data from Stead et al. (1981) are listed in Appendix A. We will identify the strains being compared at a later point in the analysis. To begin, a nonlinear regression analysis is performed on each of the 3 data sets. A requirement of regression models is that the distribution of errors about the model be normal and with constant variance at all points along the dose dimension. This requirement is satisfied with Bartlett's test (Milliken and Johnson, 1984) for homogeneity of variance within each data set across doses; we obtained p values of 0.5789, 0.2919 and 0.1617, respectively, for data sets 1, 2 and 3. Note that in analyses of Ames test data sets one should not generally assume constant variance across all doses, a condition that will be considered in Example 2. However, in the first example we take the homogeneity and normality assumptions to be satisfied and focus on the general concepts of nonlinear model comparison. The models are fit using the SAS N L I N (SAS Institute, Inc.) procedure, which provides leastsquares estimates of the parameters. To use SAS N L I N , the partial derivatives of the functions with respect to the parameters are required (see Appendix B for appropriate SAS code, including derivatives to 3- and 4-parameter models). Table 1 presents the analysis of variance (ANOVA) source tables for each set of data. F tests for the 3-parameter models and R-squared

179 TABLE 1 ANOVA TABLE FOR 3-PARAMETER

STEAD MODEL

FIT TO 3 DATA SETS IN EXAMPLE

1

Source

df

SS

MS

F

P

Set 1 R e g r a Residual LOF b P.E. c

3 12 2 10

42463.52 364.47 17.13 347.34 R 2 = 0.96

14154.51 30.37 8.57 34.73

466.07

0.0001

0.24

0.8668

Set 2 R e g r Residual LOF P.E.

3 15 3 12

30378.47 429.52 84.52 345.00 R 2 = 0.96

10126.16 28.63 28.17 27.75

353.69

0.0001

1.04

0.4033

Set 3 R e g r Residual LOF P.E.

3 12 2 10

26 049.18 792.82 304.14 488.68 R 2 = 0.89

8 683.06 66.07 152.07 48.87

131.42

0.0001

3.l 1

0.0668

R e g r , regression. b L O F , lack o f fit. c P.E., p u r e error.

values of 0.96, 0.96 and 0.89, respectively, indicate good fit of the model to each set. Because replicate measures are available at each dose, we are able to test for lack of fit against pure error sums of squares. Also, tests for homogeneity of variance among the respective residual mean squares ( M S ) are performed. The F tests for these data are nonsignificant. With the knowledge that the 3-parameter model provides good fit to the data sets individually, we are now interested in determining whether there are any differences among them. We want to investigate whether changes in occasions, laboratories, chemicals, tester strains or other possible experimental conditions or treatments are reflected in the parameters. A particular hypothesis of interest is that there are no differences among the data sets. That is, we hypothesize that the data sets are equivalent or invariant. For example, if a particular experiment were repeated over a period of several days under (supposedly) the same laboratory conditions, then day-to-day reliability could be tested with a hypothesis of equivalence. This situation, which we will call the hypothesis of treatment equivalence (or data set equivalence), would be reflected in (statistically) equal parame-

ters across the data sets in question. Formally, the method is based on the "extra sums of squares principle" of Draper and Smith (1981, Section 2.7) or the "principle of conditional error" of Milliken and DeBruin (1978). We proceed with this hypothesis. One way of looking at the hypothesis of data-set equivalence is to consider the following. If all data sets under consideration are exemplars of the same underlying biological reality, then analyses performed on separate data sets will yield statistical results similar to those obtained by an analysis of the combined data (in which all of the data sets are lumped together into one large data set). Let SS(DS~) represent the residual sum of squares (SS) derived from fitting the model to the data of the ith experiment or data set (DS). Then, for our example, SS(separate fits) = SS(DS

) + SS(DS:) + SS(DS,)

The degrees of freedom (df) associated with the SS(separate fits) is the sum of the df associated with each respective SS(DS~).

180 TABLE 2 PARAMETER

E S T I M A T E S A N D A N O V A S U M M A R Y T A B L E F O R 3 D A T A SETS a

Sources of

Parameter estimates

variation

Bo

fll

f12

Set 1

20.23 (3.10)

- 0.2049 (0.57)

0.6962 (0.09)

364.47

12

Set 2

5.50 (2.86)

0.3888 (0.44)

0.6140 (0.07)

429.52

15

Set 3

11.18 (4.58)

- 0.2476 (0.92)

0.6854 (0.14)

792.82

12

S e p a r a t e fits

-

Combined

11.69

Residual

-

-

1 586.81

39

0.0210

0.6607

3 064.25

45

1477.44

6

Experiments F

1477.44/6 1586,81/39

df

F

SS

6.05

6.05 ( p = 0.0002)

a A s y m p t o t i c S t a n d a r d Error in parentheses.

Next, the model is fit against the combined data set. The residual SS from this procedure may be defined as the SS(combined) and has associated with it df equal to the total number of observations less the number of parameters fit in the model. The SS resulting from different experiments, conditions or treatments, known as SS(experiments), is the difference between the SS(combined) and the SS(separate fits), such that SS(experiments) = SS(combined) - SS(separate fits) Also, the df for experiments is df(experiments) = df(combined) - df(separate fits) The test statistic used to test the hypothesis of equivalence of different experiments, conditions or treatments is

F = SS(experiments)/df(experiments) SS(separate fits)/df(separate fits)

Table 2 is a modified A N O V A source table that includes parameter estimates for each set of data analyzed. The separate-fits residual SS is 1586.81 and is based on 39 df. The residual SS for the combined data set is 3064.25 and is based on 45 df. By subtraction, we can determine that the SS associated with experiments is 1477.44 and is based on 6 df. The resulting F statistic is 6.05 ( p = 0.0002), which is significant beyond the a = 0.001 level. This statistic indicates that the experiments (or data sets) are different. If data sets were not found to be statistically different, the analysis would stop. We would conclude that the fitted data sets were equivalent to each other. The next step is to determine, if possible, whether differences in the data sets are due to changes in specific parameters or groups of parameters. This in turn implies hypotheses that place constraints on certain parameters such that they are equivalent across (or common to) data sets while other parameters are free to vary with each data set. We proceed with a strategy similar to the one presented by Ratkowsky (1983, Section 7.3).

181

In the analysis above, we obtained the combined SS by minimizing the following function for SS(combined), here designated as SSB: N

SSB = E { Yt - (/30 + exp[/31 + r2 l n ( X t ) ] ) } 2 t=l

increasing dose. The hypothesis of equivalent /31 and 132 parameters across data sets describes this situation while allowing the spontaneous reversion counts to vary. To fit a c o m m o n set of fll and 132 parameters while permitting the/30 parameters to vary, the following sums of squares must be minimized:

(2) 3

where Yt is the t th data point and N denotes the total number of counts in all 3 data sets. If one replaces the parameters fl0, f l l and^/32 with their least squares estimates/30, fll and f12, the expression within the braces represents a residual. Hence, the entire equation represents the SS residuals about the model under consideration. Note that no subscripts are carried on the fl0, fll and r2 parameters. 3 parameters are being estimated. Thus, the number of df is 3 less than the total number of observations, i.e., 48 - 3 -- 45. To obtain the pooled SS, we originally fit each data set separately and added the residual sums of squares together from each separate fitting. Alternatively, we could have obtained the nine parameter estimates by minimizing the function for SS(separate fits), here designated as SSA: 3

nj

s&=E E

i=1 t=l X (rt-

(130i + exp[/31i + 132i ln(Xt)])) 2

(3) where n i denotes the number of counts in the ith data set. Note that in this case each of the parameters carries the i subscript. Therefore, for 3 data sets, 3 × 3 = 9 parameters are being estimated. Hence, the df associated with this model would be 9 less than the total number of observations, or 48 - 9 = 39. Next we evaluate 4 other models that may be relevant to our hypotheses. Other hypotheses are possible, but the general method can be gleaned from what follows. According to Stead et al. (1981), 131 and /32 jointly describe the number of revertant colonies formed and the rate of colony formation with

ni

ss = E E

i=1 t=l X (Yt-

(/3oi--F exp[13, + f12 ln( X,)])} :

(4) (Pertinent SAS code is provided in Appendix C.) Notice that the r0 parameter carries the i subscript while the t31 and f12 parameters do not carry the subscript. This means that the fll and r2 parameters are common to all 3 data sets while the 13o parameters are free to vary with each data set. Hence, in this model 5 parameters (fl01, 1302, 1303, ill, /32) are being estimated all together. The df associated with this model is therefore 5 less than the total number of observations, or 4 8 - 5 = 43. This hypothesis would be entertained if it were thought that the rates of revertant-colony formation were constant across all 3 data sets but that the spontaneous reversion counts (intercepts) might vary significantly for some biological reason. For example, different tester strains would be expected to differ significantly in the spontaneous reversion counts. Another possible hypothesis of interest is that the 130 and fll parameters vary across data sets but that the/32 parameter does not. This involves minimizing the following sums of squares: 3

ni

SSD= E E

i=1 t=l

× ( Y t - ( f l o ~ + e x p [ f l ~ + f12 l n ( X t ) ] ) } 2

(s) Here, r2 is the only parameter held constant across the 3 data sets. On the other hand, r0 and 13~ are free to vary. Hence, 7 parameters are

182 TABLE 3 C O M P A R I S O N OF FITS F O R 3 D A T A SETS TO D E T E R M I N E W H I C H P A R A M E T E R S R E M A I N E Q U I V A L E N T Description of fit

A: B: C: D: E: F:

Ind. rio, ill,/32 Com. fl0, ill,/32 Ind./30; Com./31, f12 Ind. rio, ill; Com. /32 Ind. /3o, /3z; Com. /31 Ind. /31, /32; Com. /3o

N u m b e r of parameters estimated

df

9 3 5 7 7 7

39 45 43 41 41 41

Description of test

B -

CDEF-

df

A: equivalence of data sets A: equivalence of/31, f12 A: equivalence of/32 A: equivalence of fll A: equivalence of/3 o

6 4 2 2 2

estimated simultaneously, and the df is 4 8 - 7 = 41. Table 3 presents the results of the analysis described above using the models suggested as well as 3 additional models. The top part of the table describes the model that was fit, the number of parameters estimated and the residual SS. The bottom part of the table describes the statistical test that was performed for the hypothesis indicated.

Resid.

Resid.

SS

MS

1 586.81 3064.25 1 729.42 1609.00 1 618.00 1 961.70

40.69

Resid.

Resid.

SS

MS

1 477.44 142.61 22.19 31.19 374.89

246.24 35.65 11.10 15.60 187.45

F

P

6.05 0.88 0.27 0.38 4.61

0.0002 0.4848 0.7648 0.6866 0.0159

The difference in the residual S S at steps B and A ( B - A) describes the test for equivalence of data sets discussed in detail previously. For consistency, residual M S for the remaining equivalence hypotheses are tested against the residual M S (40.69) for the pooled model. The test for equivalence of fll and flz ( C - A) is nonsignificant ( F = 0.88 with 4 and 39 dr, p = 0.4848). Similarly, the tests for equivalence of /~2 ( D - A) and fll ( E - A) are adjudged as nonsig-

TABLE 4 C O M P A R I S O N OF FITS F O R D A T A SETS 2 A N D 3 TO D E T E R M I N E W H I C H P A R A M E T E R S R E M A I N E Q U I V A L E N T Description of fit

A: B: C: D: E: F:

Ind./3o,/31, /32 Com./30,/31, /32 Ind. ill, f12; Com. fl0 Ind. rio, /32; Com. /31 Ind./30, fig Com. f12 Ind./30; Com./31,/32

Description of test B - A: equivalence of data sets C - A: equivalence of/30 D - A: equivalence of fll E - A: equivalence of f12 F -- A: equivalence of/31, /32

N u m b e r of parameters estimated

df

Resid.

Resid.

SS

MS

6 3 5 5 5 4

27 30 28 28 28 29

1 222.34 1 358.11 1280.39 1 247.33 1 236.04 1 354.71

45.27 45.27 45.72 44.54 44.14 46.71

df

Resid.

Resid.

SS

MS

3 1 1 1 2

135.77 58.05 24.99 13.70 132.37

45.26 58.05 24.99 13.70 66.18

F 1.00 1.28 0.55 0.30 1.46

0.4079 0.2678 0.4647 0.5884 0.2499

183 TABLE 5 COMPARISON OF FITS FOR DATA SETS 1 AND 2 TO DETERMINE WHICH PARAMETERS REMAIN EQUIVALENT Description of fit

Number of parameters estimated

df

Resid. SS

A: Ind. ,8o, ill, r2 B: Com. ,8o, fla, r2 C: Ind. ill, f12; Com. flo D: Ind. fl0, f12; Com. ,81 E: Ind. ,80, ill; Com. fiE F: Ind. ,8o; Com. /31, r2

6 3 5 5 5 4

27 30 28 28 28 29

793.99 1 794.44 1 167.75 815.49 811.98 825.59

df

Resid. SS

Resid. MS

F

P

1000.45 373.76 21.50 17.99 31.60

333.48 373.76 21.50 17.99 15.80

11.34 12.71 0.73 0.61 0.54

0.0001 0.0001 0.4004 0.4416 0.5889

Description of test B - A: equivalence of data sets C - A: equivalence of ,8o D - A: equivalence of fll E - A: equivalence of r2 F - A: equivalence of ill, /81

3 1 l l 2

nificant. O n the o t h e r hand, the test for equivalence o f / 3 o is f o u n d to be significant at the 0.05 level ( p = 0 . 0 1 5 9 ) . T h e test for equivalence of i n t e r c e p t s is therefore rejected. W e c o n c l u d e that at least one of the intercepts is significantly different f r o m the o t h e r two. T h e h y p o t h e s e s for equivalence of the/31 and/32 p a r a m e t e r s , t a k e n individually a n d together, are n o t rejected. Therefore, the p r e f e r r e d m o d e l d e s c r i b e s three curves with (statistically) equal, increasing rates b u t different s p o n t a n e o u s r e v e r t a n t counts. A logical next step is to d o pairwise c o m p a r i sons of the d a t a sets. D a t a sets 2 a n d 3 were g e n e r a t e d using strain TA1537. K n o w i n g this, we first selected these two d a t a sets a n d r e p e a t e d the analysis, following the strategy d e s c r i b e d for Table 3. T h e results are p r e s e n t e d in T a b l e 4. The difference ( B - A ) p r o v i d e s a test of the equivalence of the two d a t a sets, a n d this h y p o t h e s i s is n o t rejected. Hence, d a t a sets 2 a n d 3 are statistically equivalent. T a b l e 5 describes the p a i r w i s e c o m p a r i s o n of fits for d a t a sets 1 a n d 2 following the strategy d e s c r i b e d for T a b l e 3. D a t a set 1 was g e n e r a t e d using strain TA98. The tests for equivalence of d a t a sets (B - A) a n d i n v a r i a n c e of/3o (C - A ) are significant ( p = 0.0001), while all o t h e r tests are nonsignificant. W e thus c o n c l u d e that o n l y the s p o n t a n e o u s reversion counts (/3o p a r a m e t e r s ) are

Resid. MS 29.41 59.81 41.71 29.12 28.99 28.47

statistically different for these two d a t a sets. This result agrees with p r e v i o u s scientific findings that s p o n t a n e o u s reversion c o u n t s differ b e t w e e n strains T A 9 8 a n d TA1537. Constructing confidence intervals and bands about nonlinear models The process of c o n s t r u c t i n g c o n f i d e n c e intervals a b o u t p r e s p e c i f i e d values o f the regressor v a r i a b l e in a linear m o d e l is well k n o w n ( D r a p e r a n d Smith, 1981, C h a p t e r s 1 a n d 2; Bates a n d Watts, 1988, C h a p t e r 1). Similarly, the p r o c e d u r e s for c o n s t r u c t i n g s i m u l t a n e o u s c o n f i d e n c e intervals or c o n f i d e n c e b a n d s for linear m o d e l s have also been d e s c r i b e d (Bates a n d W a t t s , 1988, Section

1.1.2). U n f o r t u n a t e l y , the p r o c e d u r e s for c o n s t r u c t i n g exact c o n f i d e n c e intervals a n d b a n d s a b o u t n o n linear m o d e l s are c o m p l e x ( K h o r a s o n i a n d Milliken, 1982). However, u n d e r the r e g u l a r i t y c o n d i tions (Bates a n d W a t t s , 1988, C h a p t e r 2; Seber a n d Wild, 1989, C h a p t e r 5) t h a t are often a s s u m e d when w o r k i n g with n o n - l i n e a r m o d e l s ( a n d are a s s u m e d here), it is p o s s i b l e to c o n s t r u c t app r o x i m a t e c o n f i d e n c e intervals a n d b a n d s a b o u t n o n l i n e a r models. C o n f i d e n c e intervals a n d b a n d s can p r o v i d e useful g r a p h i c a l aids to the p r e c e d i n g analyses. The m e t h o d involves a f i r s t - o r d e r T a y i o r ' s series

184 expansion about the " t r u e " vector of parameters within a small neighborhood about the vector of least-squares parameter estimates. The mathematics of this procedure are developed, for example, in Milliken (1982, Section 5.1.2), Bates and Watts (1988, Section 2.2.1) and Seber and Wild (1989, Section 2.1.2). In practice, this procedure is relatively easy to carry out with the output from SAS P R O C N L I N . For the 3-parameter model f ( x , rio, ill, f12), the partial derivatives of f ( . ) with respect to each parameter are written as

i 7O

40 30 2O 10

40

Of DB°=~o'

Of Of D i n = 3 f l , ' DB2= 3fl2

(6)

S E ( f ) [ 2 2DBoSBo q- D21821 -1-D B 22 S B2 2 + 2DBoDmRmSBoS m 2DBoDBzRozSBoSB2

-4- 2DBIDB2R12SBISB2] 1/2

1~0

160

200

240

280

3,20 160

400

440

480

520

~

600

X (dose)

From the computer printout, let SBO, S m and SB2 denote the standard errors of the parameters and R m, R02 and Ra2 denote the correlations among the parameters. The estimated standard error of the estimate of the model, Of(x,~), ^2 at a given dose x is

+

80

(7)

where the derivatives are evaluated at the values of the least-squares estimates of the parameters. In Fig. 1, the predicted 3-parameter nonlinear model was fit to data set 1 and parameter estimates were obtained. These estimates (20.23, - 0 . 2 0 4 9 and 0.6962 for rio, /~1 and /~2, respectively) were substituted in eqn. (1), and predicted count values were generated as X was varied from 0.01 to 600 in a DO loop. [SAS code is provided in Appendix D. Note that we have replaced X = 0 with a small number close to 0 in the generation of the predicted model, since l n ( X ) is not defined at X = 0.] The 95% confidence bands about the predicted model are included on the plot. They have been constructed by adding to (or subtracting from) the predicted model the product of the S E ( f ) by the quantity ( P - F [ P , N - P ; , x ] ) 1/2, where P is the number of parameters in the non-

Fig. 1. Plot of predicted 3-parameter nonlinear model for data set 1, 95% confidence bands about the predicted model.

linear model, N is the sample size and a is the type I error rate for the F statistic with P and ( N - P ) degrees of freedom (see Bates and Watts, 1988, Section 2.3.2). The quantity ( P . F[ P, N - P; a]) 1/2 has been used because we are interested here in simultaneously constructing infinitely many confidence intervals about the model. If a researcher were interested in a confidence interval about the curve at a prespecified dose, say X 0, then the quantity ( P . F [ P , N - P;a]) 1/2 could be replaced by the t~/2 statistic for ( N - P ) degrees of freedom. Similarly, for m confidence intervals at m prechosen doses, for example, a Bonferroni type strategy could be used by using ta/2m values (Seber and Wild, 1989). When differences between the predicted values for two data sets are found, they can be compared by constructing confidence intervals about their difference at a prespecified number of points. The standard error of the difference between two predicted values is S E ( D i f ) = [ S E ( f l ) z + S E ( f 2 ) z] ,/2

(8)

The approximate confidence interval about the differences between the predicted values for two data sets at a given dose is Dif = (f~ - f 2 ) 4- t~/2(v ) • S E ( D i f )

(9)

where o represents the sum of the degrees of freedom (i.e., associated with the residual S S )

185

from the two data sets under consideration. To obtain protection from a family-wise type I error rate of a at m preselected doses, a Bonferroni type strategy can be used. That is, the t~/2(v) value can be replaced with t~/Zm in eqn. (9). Fig. 2 shows the result when this strategy was applied to the difference function between data sets 2 and 3 for the 3-parameter model. To gain protection at the a = 0.05 level for m = 5 confidence intervals, the value for to.o5/2 m = t0.02s was used for 5 prespecified dosages at 0, 60, 100, 20(I and 600. Appendix D shows the appropriate SAS code. Note that in the generation of the expected response for each data set, the parameter estimates originally obtained from the separate fits of the data sets were used. For this comparison, the confidence intervals include zero at each dose. This is consistent with the finding that data sets 2 and 3 were (are) statistically equivalent (invariant) to each other. Fig. 3 shows a plot of the difference function between data sets 1 and 2 using the same Bonferroni strategy. For these data, the confidence intervals excludes zero at doses 0, 60, 100 and 200 but (barely) includes zero at dose 600. This outcome is reasonably consistent with our finding that data sets 1 and 2 differed with respect to their t3o parameters (intercepts) but did not differ in their fll and t32 parameters (rates of increase). We emphasize that plots of confidence bands and confidence intervals are approximate and

8

IT!

1~

]

T I

I[

1

-18 -21 -24

50

100

150

200

250

300

3.50

400

450

500

30. 15 10

-10



l

:

l

-15 -2O

0

50

100

150

200

2_50

3(X3

350

400

450

500

550

600

x Fig. 2. Plot of difference between predicted values fit to data sets 2 and 3, 99% Bonferroni confidence intervals about the predicted difference at 5 prespecified doses.

600

should be used as follow-up procedures to the regression analyses, not as inferential tools. In the next example, we will observe a difference plot in which confidence intervals include zero for some values of X but exclude 0 for others.

Example 2 In this example, we will compare 2 sets of data that indicate significant toxicity. Hence, the 4parameter model of Stead et al. (1981) will be used in this example. It is written as Y = (rio + exp[fl~ + f12 ln(x)] )exp[-f13 X ]

P.5

550

x Fig. 3. Plot of difference between predicted values fit to data sets 1 and 2, 99% Bonferroni confidence intervals about the predicted difference at 5 prespecified doses.

(10)

where rio, fll and /~2 represent the same quantities as in Eqn. (1). The fourth parameter, f13, describes the rate at which the system becomes toxic. The data, listed in Appendix A, were generated at the U.S. Environmental Protection Agency, Genetic Bioassay Branch, in Research Triangle Park, North Carolina using TA98, a metabolically activated strain of S. typhimurium. 1-Nitropyrene was used in the standard plate assay, SOP No. MB-015/0, and was diluted with dimethyl sulfoxide to yield the doses indicated. The experiment was repeated 2 days later. As before, we check the data for homogeneity both within and across data sets. Bartlett's test was performed on data set 1 (day 1) and found to be nonsignificant ( p = 0.0621). However, the same

186 TABLE 6

U s u a l l y they increase with increasing m e a n response. Or they show a general increasing trend with some exceptions as in our data. T a b l e 6 displays the sample variances at each dose for day-1 a n d day-2 data sets. W h e n the a s s u m p t i o n of homoscedasticity (hom o g e n e o u s variances) is n o t justified, it is necessary to alter the p r o c e d u r e s for o b t a i n i n g least-squares estimates ( D r a p e r a n d Smith, 1981; Myers, 1986; Seber a n d Wild, 1989). In our example, it may be assumed that errors are i n d e p e n d e n t b u t do not have c o n s t a n t variance across doses. The suggested correction in this case is to o b t a i n weights that are inversely p r o p o r t i o n a l to the variances at each value of dose X ( D r a p e r a n d Smith, 1981; Myers, 1986). Often, it is possible to find a relatively simple f u n c t i o n that accomplishes this, especially if m o n o t o n i c i t y is required of the weights (see Myers et al., 1981; D r a p e r a n d Smith, 1981). However, with multiple p o i n t s at each dose, we can o b t a i n sample variances directly at each dose a n d then use their inverses as weights (see Myers, 1986). The weighted least-squares criterion is based on m i n i m i z i n g the expression

SAMPLE VARIANCES AND WEIGHTS RELEVANT TO WEIGHTED LEAST-SQUARES ANALYSIS IN EXAMPLE 2 Data set

Dose (X)

N

S2

Wgt (1/S 2)

1 1 1

0.0 0.1 0.2

3 3 3

19.0 56.3 874.3

0.0526316 0.0177515 0.0011437

1 1 1 1

0.4 0.8 1.6 3.2

3 3 3 3

382.3 2156.3 3627.0 2362.3

0.0026155 0.0004638 0.0002757 0.0004233

2 2 2

0.0 0.1 0.2

3 3 3

21.0 787.0 457.0

0.0476190 0.0012706 0.0021882

2 2 2 2

0.4 0.8 1.6 3.2

3 3 3 3

2120.3 372.0 21796.0 15992.3

0.0004716 0.0026882 0.0000459 0.0000625

test performed on data set 2 (day 2) was signific a n t ( p = 0.0031), i n d i c a t i n g that the variances a b o u t the m e a n c o u n t s were not homogeneous. As stated previously, one should not assume that A m e s data sets exhibit h o m o g e n e o u s variances.

n

SS=t~=l

0"2 Var(-Yt) [ Yt - E ( Yt )12

(11)

TABLE 7 PARAMETER ESTIMATES AND ANOVA SUMMARY TABLE FOR TWO EXPERIMENTS USING STRAIN TA98 ON SUCCESSIVE DAYS, ACTIVATION, WEIGHTED LEAST-SQUARES ANALYSIS a Source of variation

Parameter estimates flo fll

Residual

f12

f13

SS

df

MS

R2

Day 1

23.16 (2.94)

7.22 (0.04)

1.00 (0.03)

0.26 (0.03)

23.43

17

1.38

0.99

Day 2

21.19 (2.53)

7.26 (0.05)

1.20 (0.05)

0.31 (0.04)

15.56

17

0.92

0.99

-

-

38.99

34

1.15

7.18 (0.04)

1.01 (0.03)

89.71

38

50.72

4

Pooled Combined

22.48 (2.79)

Occasion (Day) F

a

5O.72/4 38.99/34

11.06 (p = 0.0001)

Asymptotic Standard Error in parentheses.

0.25 (0.02)

187 TABLE 8 COMPARISON OF FITS FOR 2 Expts. USING STRAIN TA98 ON SUCCESSIVE DAYS TO DETERMINE WHICH PARAMETERS REMAIN INVARIANT - - EXAMPLE 2 Description of fit

Number of parameters estimated

df

A: Ind. ,8o,/31,/32,/33 B: Com./30, ill,/32, f13

8 4

34 38

38.99 89.71

C: Com./31,/33 Ind./3o,/32

6

36

40.89

D: Com. rio, ill,/33 Ind. f12

5

37

41.75

E: Com. ill, /32, /33 Ind. /3o

5

37

83.12

F: Com. /30, f12,/33 Ind. /31

5

37

83.84

H: Com./31, f12 Ind. rio, f13

6

36

76.04

df

Resid.

Resid.

SS

MS

50.72 1.90 2.76 44.13 44.85 47.88 37.05 0.86

12.68 0.95 0.92 14.71 14.95 15.96 18.53 0.86

Description of test B - A: equivalence of data sets C - A: equivalence of/31, f13 D - A: equivalence of rio, ill, f13 E - A : equivalence of/31, f12, /33 F - A : equivalence of rio, ill, 132 G - A: equivalence of rio, /32,/33 H - A: equivalence of/31, f12 D - C: conditional equivalence of rio, 131, /33

4 2 3 3 3 3 2 1

w h e r e E ( Y t ) is the e x p e c t e d or m e a n v a l u e o f the m o d e l u n d e r c o n s i d e r a t i o n at dose X~ a n d Var(Yt) is the v a r i a n c e of the t th o b s e r v a t i o n . If Var(Yt) is c o n s t a n t (e.g., o 2) t h r o u g h o u t the dose r a n g e , E q n . (11) r e d u c e s to the u s u a l u n w e i g h t e d l e a s t - s q u a r e s criterion, w h i c h was used in E x a m p l e 1. SAS P R O C N L I N p r o v i d e s the user w i t h a w e i g h t f u n c t i o n that p e r m i t s a w e i g h t e d leasts q u a r e s a n a l y s i s to b e p e r f o r m e d with relative ease. I n this e x a m p l e , we c a l c u l a t e d weights as the r e c i p r o c a l s (inverses) of the s a m p l e v a r i a n c e s ( T a ble 6) a n d u s e d these for analysis. (SAS c o d e is p r o v i d e d in A p p e n d i x E.) T h e 4 - p a r a m e t e r m o d e l was fit to each d a t a set s e p a r a t e l y ( T a b l e 7). T h e R 2 v a l u e s i n excess of

Resid.

Resid.

SS

MS

1.15

F

P

11.06 0.83 0.80 4.90 13.00 13.88 16.11 0.75

0.0001 0.4447 0.4576 0.0062 0.0001 0.0001 0.0001 0.3925

0.99 i n d i c a t e e x t r e m e l y g o o d fit. T h e r e s i d u a l M S error for d a t a set 1 is 1.38. Similarly, the r e s i d u a l M S for d a t a set 2 is 0.92. A n F ratio of these m e a n s q u a r e s p r o v i d e s a test for h o m o g e n e i t y of error v a r i a n c e across d a t a sets. T h e r e s u l t i n g F r a t i o of 1.50 (with 17 a n d 17 d f p = 0.2058) is n o n s i g n i f i c a n t , i n d i c a t i n g t h a t the e r r o r v a r i a n c e s are h o m o g e n e o u s across d a t a sets. T a b l e 8 shows the p r e s c r i p t i o n that was followed to p r o v i d e tests for v a r i o u s p o t e n t i a l h y p o t h e s e s of i n t e r e s t for these d a t a sets. T h e first h y p o t h e s i s of i n t e r e s t was to d e t e r m i n e w h e t h e r the d a t a sets were e q u i v a l e n t , here i n t e r p r e t e d as a m i n i m a l level of d a y - t o - d a y i n t r a l a b o r a t o r y v a r i a tion. A t step A, /30, /31, f12 a n d f13 were fit to each i n d i v i d u a l set. T h e r e s i d u a l S S f r o m each d a t a set

188

were added together to produce a pooled residual SS. The 8 parameter estimates were also obtained by minimizing 2

ni

s & = E Ewt i=l

t=l

x {Yt - [ ( ' 8 o ~ + exp[/3u +/32~ In(Xt)]) x e x p ( - f l 3 i X t ) ] }2

(12)

where w, is the weight. Note the i subscripts for each parameter. At step B, common rio, ,81, '82 and '83 were fit to the data. This involved minimizing N

SSu = ~'_.,wt{ Y t - [(flo + exp[/31+ /321n( Xt)]) t=l

× exp(-flsX,)] }2

(13)

Note that the i subscripts are missing in this equation. Four parameters were estimated. At step C, common fil and /33 parameters were fit while allowing parameters '8o and '82 to vary individually with each data set. This involved minimizing the following quantity 2

ni

s & = Z Zw, i=1 t=l

× { Y , - [ ( / 3 0 , + e x p [ f l l + f12, l n ( X , ) ] ) X exp(-/33Xt)] }2

(14)

Note the i subscripts. Six parameters were estimated: /8m, '8o2, '81, flzp fl22 and flsSteps D through H were performed in a similar fashion. Of particular interest for these data was the hypothesis of data-set equivalence, since the two sets represented the same experiment performed on different days (occasions). The residual MS of 12.68, obtained by determining SS B - S S A , was divided by the residual MS obtained at step A; the resulting F ratio of 11.06 ( p

Methods for comparing Salmonella mutagenicity data sets using nonlinear models.

A variety of linear and nonlinear mathematical models have been proposed to characterize Salmonella mutagenicity data sets, but no systematic procedur...
1MB Sizes 0 Downloads 0 Views