This article was downloaded by: [Imperial College London Library] On: 10 October 2014, At: 07:04 Publisher: Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

Journal of Biopharmaceutical Statistics Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/lbps20

The estimation of parameters from bulked samples

Shein-Chung Chow (a) & Erik V. Nordheim (b)

(a) Biostatistics Department, Bristol-Myers Squibb Company, U.S. Pharmaceutical Group, Evansville, Indiana, 47721
(b) Statistics Department, University of Wisconsin, Madison, Wisconsin, 53706

Published online: 29 Mar 2007.

To cite this article: Shein-Chung Chow & Erik V. Nordheim (1991) The estimation of parameters from bulked samples, Journal of Biopharmaceutical Statistics, 1:1, 1-15, DOI: 10.1080/10543409108835002 To link to this article: http://dx.doi.org/10.1080/10543409108835002


Journal of Biopharmaceutical Statistics, 1(1), 1-15 (1991)


THE ESTIMATION OF PARAMETERS FROM BULKED SAMPLES

Shein-Chung Chow (1) and Erik V. Nordheim (2)

(1) Biostatistics Department, Bristol-Myers Squibb Company, U.S. Pharmaceutical Group, Evansville, Indiana 47721
(2) Statistics Department, University of Wisconsin, Madison, Wisconsin 53706

Key words. Parametric bootstrap; Density estimation; One-step MLE; Bulked samples; Lognormal distribution

Abstract

An estimation procedure has been developed for estimating parameters from bulked samples using the parametric bootstrap and density estimation in conjunction with the one-step maximum-likelihood estimator. It is shown that the proposed procedure provides an asymptotically efficient estimator for the parameters of interest when the density for the mean of the bulked samples has a certain form. The lognormal density (with $\sigma^2$ assumed known) is an important distribution with the proper form. The finite-sample performance for bulked samples based on underlying lognormal observations was examined by a Monte Carlo study. The results indicate that the proposed procedure leads to a reduction in mean squared error compared to known procedures.

Copyright © 1991 by Marcel Dekker, Inc.



Introduction

In some practical estimation problems, the available data are sums or means of groups of (unobserved) individual observations. Two examples are a study of ambient air lead in human blood based on grouped blood data (1) and the estimation of bacterial populations on leaf surfaces from bulked leaf samples (2). There are many other examples in a wide range of applied fields [e.g., Aitchison and Brown (3); Mitchell (4)]. We refer to such data as coming from bulked samples. The primary emphasis of this paper is on estimation with bulked samples when the distribution of the sum (or mean) of the unobserved individual observations cannot be expressed in closed form. Our principal application will be to the lognormal distribution.

Let $Y_{ij}$ be independently and identically distributed (i.i.d.) random variables with probability density $p(y, \theta)$, where $\theta$ is a $q \times 1$ vector of parameters, $i = 1, \ldots, k$, and $j = 1, \ldots, n$. Suppose we observe only the bulk means $\bar{Y}_{i\cdot} = \frac{1}{n}\sum_{j=1}^{n} Y_{ij}$ (or the bulk sums $Y_{i\cdot} = \sum_{j=1}^{n} Y_{ij}$). The objective is to estimate $\theta$ from these bulked samples. The probability density function of $W_i = \bar{Y}_{i\cdot}$, denoted by $f(w, \theta)$, may not have a closed-form expression, e.g., when the $Y_{ij}$ are i.i.d. lognormal or Weibull random variables. In these cases, a maximum-likelihood estimator (MLE) cannot be obtained directly. A common approach in such situations is the method of moments. However, the moment estimator is not always efficient and may have poor finite-sample performance in terms of mean squared error (MSE) when $p(y, \theta)$ is asymmetric. If, on the other hand, the unknown density $f(w, \theta)$ and its derivatives with respect to $\theta$ can be estimated, then maximum-likelihood estimation becomes possible. In this paper we propose an estimation procedure using the parametric bootstrap idea (5, 6) and a density estimation technique (7, 8) in conjunction with the one-step MLE (9).

In the next section, we describe the proposed estimation procedure. Some asymptotic properties are established in the following section. Next we illustrate application of the procedure for the case where $p(y, \theta)$ is lognormal. Also included is a brief simulation study with two starting estimators for use in the procedure. An example from plant pathology is discussed next, followed by a short discussion.
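To make the setup concrete, the following Python sketch simulates bulked lognormal data and computes a method-of-moments starting value for $\mu$ when $\sigma$ is treated as known. The function names, the seed, and the parameter values are illustrative assumptions and are not taken from the paper; the estimator relies only on the identity $E[W_i] = \exp(\mu + \sigma^2/2)$, which holds for any bulk size $n$.

```python
import numpy as np

def simulate_bulked_means(mu, sigma, k, n, rng):
    """Return k bulk means, each the average of n unobserved lognormal draws."""
    # Individual observations Y_ij; in practice these are never seen.
    y = rng.lognormal(mean=mu, sigma=sigma, size=(k, n))
    # Only the bulk means W_i are observed.
    return y.mean(axis=1)

def moment_estimate_mu(w, sigma):
    """Method-of-moments estimate of mu with sigma known.

    Solves mean(w) = exp(mu + sigma^2 / 2) for mu.
    """
    return np.log(np.mean(w)) - 0.5 * sigma**2

rng = np.random.default_rng(0)
w = simulate_bulked_means(mu=1.0, sigma=1.0, k=20, n=5, rng=rng)
mu_mom = moment_estimate_mu(w, sigma=1.0)
print(f"moment estimate of mu from {w.size} bulk means: {mu_mom:.3f}")
```

With only a handful of bulks and a skewed parent distribution, repeated runs of this sketch with different seeds give a sense of how variable the moment estimator can be.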

The Proposed Procedure

In this section, we develop a general estimation procedure for $\theta$ by combining three ideas: the parametric bootstrap, the kernel estimation technique, and the one-step MLE. The parametric bootstrap is used to generate bootstrap samples based on the observed bulked data. The kernel estimation technique is applied to the generated samples to obtain estimates of the unknown density and its derivatives with respect to $\theta$. Finally, the one-step MLE is employed to approach the MLE in one iteration.

For notational convenience, we denote the probability density function of $W_i = \bar{Y}_{i\cdot}$, $f(w, \theta)$, by $f$, and the density with $\theta$ replaced by $\hat\theta$ by $f^*$, i.e., $f^* = f(w, \hat\theta)$. We also denote $\partial^p f(w, \theta)/\partial w^p$ and $\partial^p f(w, \theta)/\partial\theta^p$ by $f^{(p)}$ and $f^{[p]}$, respectively. Note that $f^{[1]}$ is a $q \times 1$ vector.

A key step in our procedure is the estimation of $f^{[p]}$. This does not appear possible in general. However, when $f(w, \theta)$ is of the form specified in Theorem 1, the estimation is straightforward. Several important distributions for which the density function of $W_i$ does not have a closed-form expression result in $f(w, \theta)$ of the appropriate form. These include the one-parameter lognormal ($\sigma^2$ assumed known) and the one-parameter Weibull ($c$ assumed known).

Theorem 1. Assume that $f(w, \theta) = g(w h_1(\theta))\, h_2(\theta)$, where $g$ is a probability density function for $X = W h_1(\theta)$ and $h_1(\theta)$ and $h_2(\theta)$ are twice continuously differentiable functions from $R^q$ to $R$. Then the first and second derivatives of $f$ with respect to $\theta$, i.e., $f^{[1]}$ and $f^{[2]}$, can be expressed in terms of $g$ and its derivatives with respect to $x$.

Proof: This follows by direct application of the chain rule. (The expressions for these derivatives are presented explicitly below.)

We describe the proposed procedure by enumerating the steps to be performed.

Step 1. For a given set of data $\{\bar{Y}_{i\cdot},\ i = 1, \ldots, k\}$, calculate a starting estimator $\hat\theta$ for $\theta$ (e.g., the moment estimator).

Step 2. Draw a parametric bootstrap sample of size $m$, i.e., $\{\bar{Y}_{i\cdot}^*,\ i = 1, \ldots, m\}$. This is done by generating $Y_{ij}^*$, $i = 1, \ldots, m$, $j = 1, \ldots, n$, from $p(y, \hat\theta)$ and averaging over $j$ to form the $m$ bootstrap means $\bar{Y}_{i\cdot}^*$. Note that $m$ need not equal $k$.

Step 3. Estimate $f$ and its first and second derivatives with respect to $\theta$ using the generated samples. We utilize kernel estimation for this step.
To describe this procedure, we present the relevant formulae here. Motivation and added detail are provided in the next section. Assume the conditions on $f(w, \theta)$ specified in Theorem 1. Let $X_i^* = \bar{Y}_{i\cdot}^* h_1(\hat\theta)$. Then, using a kernel function representation for $g$:

$$\hat g_m^{(p)}(x) = \frac{1}{m\, b_p^{\,p+1}} \sum_{i=1}^{m} K^{(p)}\!\left(\frac{x - X_i^*}{b_p}\right), \qquad p = 0, 1, 2,$$

where:

$$b_0 = b_1 = \left(\frac{\log m}{m}\right)^{1/5}.$$

The expressions for the bandwidths are justified in the next section. Consequently, $f^{[p]}$ can be estimated by $\hat f_m^{[p]}$ for $p = 0, 1, 2$, as follows:


$$\hat f_m^{[0]} = \hat g_m\!\left(\bar{Y}_{i\cdot}\, h_1(\hat\theta)\right) h_2(\hat\theta).$$

$\hat f_m^{[2]}$ is algebraically complicated but can be expressed similarly.
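As a worked illustration of the chain-rule expressions referred to in Theorem 1, the derivation below treats the scalar case $q = 1$. The shorthand $x = w h_1(\theta)$ and the prime notation for derivatives of $g$, $h_1$, and $h_2$ are introduced here for convenience. The lognormal instance at the end assumes $h_1(\mu) = h_2(\mu) = e^{-\mu}$, which follows because $Y_{ij} e^{-\mu}$ is lognormal with parameters $(0, \sigma^2)$; it is offered as a sketch and not as a reproduction of the paper's own displays.

```latex
% Scalar case q = 1, with x = w h_1(\theta):
\begin{align*}
f^{[0]} &= g(x)\, h_2(\theta), \\
f^{[1]} &= g'(x)\, w\, h_1'(\theta)\, h_2(\theta) + g(x)\, h_2'(\theta), \\
f^{[2]} &= g''(x)\, w^2\, h_1'(\theta)^2\, h_2(\theta)
          + g'(x)\, w\, h_1''(\theta)\, h_2(\theta)
          + 2\, g'(x)\, w\, h_1'(\theta)\, h_2'(\theta)
          + g(x)\, h_2''(\theta).
\end{align*}

% Lognormal with \sigma^2 known: f(w, \mu) = g(w e^{-\mu})\, e^{-\mu}, so x = w e^{-\mu} and
\begin{align*}
f^{[1]} &= -e^{-\mu} \left[ x\, g'(x) + g(x) \right], \\
f^{[2]} &= e^{-\mu} \left[ x^2 g''(x) + 3x\, g'(x) + g(x) \right].
\end{align*}
```

Replacing $g$, $g'$, and $g''$ by their kernel estimates and $\theta$ by $\hat\theta$ gives plug-in estimates of $f^{[1]}$ and $f^{[2]}$ in the same way as for $\hat f_m^{[0]}$.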

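Step 4. Apply the one-step MLE:

$$\hat\theta^{(1)} = \hat\theta - \left[L''(f, \hat\theta)\right]^{-1} L'(f, \hat\theta), \tag{1}$$

where $L'(f, \theta)$ is a $q \times 1$ vector:

$$L'(f, \theta) = \sum_{i=1}^{k} \frac{f^{[1]}(\bar{Y}_{i\cdot}, \theta)}{f(\bar{Y}_{i\cdot}, \theta)},$$

and $L''(f, \theta)$ is a $q \times q$ matrix:

$$L''(f, \theta) = \sum_{i=1}^{k} \left[ \frac{f^{[2]}(\bar{Y}_{i\cdot}, \theta)}{f(\bar{Y}_{i\cdot}, \theta)} - \frac{f^{[1]}(\bar{Y}_{i\cdot}, \theta)\, f^{[1]}(\bar{Y}_{i\cdot}, \theta)^{\top}}{f(\bar{Y}_{i\cdot}, \theta)^2} \right].$$

Note that $L'(f, \hat\theta)$ and $L''(f, \hat\theta)$ are the first and second derivatives of the log-likelihood, evaluated at $\hat\theta$ and expressed as functions of $f^{[p]}$ for $p = 0, 1, 2$. The resultant estimator is obtained by replacing $f^{[p]}$ in Eq. (1) with $\hat f_m^{[p]}$. Thus

$$\hat\theta_m^{(1)} = \hat\theta - \left[L''(\hat f_m, \hat\theta)\right]^{-1} L'(\hat f_m, \hat\theta). \tag{2}$$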
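The four steps can be sketched end to end for the lognormal case with $\sigma$ known, where $f(w, \mu) = g(w e^{-\mu})\, e^{-\mu}$. The Python code below is a minimal illustration under several assumptions that are not taken from the paper: the function names are hypothetical, the kernel is the standard normal density, the same bandwidth $(\log m / m)^{1/5}$ is used for all three derivative orders, and the update is the usual Newton-type one-step form $\hat\mu - L'/L''$.

```python
import numpy as np

def kernel_deriv_estimate(x, sample, b, p):
    """Gaussian-kernel estimate of the p-th derivative (p = 0, 1, 2) of a density at x."""
    u = (x[:, None] - sample[None, :]) / b
    phi = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)
    # Derivatives of the standard normal density: phi, -u*phi, (u^2 - 1)*phi.
    poly = [np.ones_like(u), -u, u**2 - 1.0][p]
    return (poly * phi).mean(axis=1) / b ** (p + 1)

def one_step_mle_lognormal(w, sigma, n, m=4000, rng=None):
    """One-step estimate of mu from bulk means w (bulk size n, sigma known)."""
    rng = np.random.default_rng() if rng is None else rng
    # Step 1: moment starting value, from E[W] = exp(mu + sigma^2 / 2).
    mu0 = np.log(np.mean(w)) - 0.5 * sigma**2
    # Step 2: parametric bootstrap of m bulk means generated at mu0.
    w_star = rng.lognormal(mu0, sigma, size=(m, n)).mean(axis=1)
    x_star = w_star * np.exp(-mu0)            # X* = W* h1(mu0)
    # Step 3: kernel estimates of g, g', g'' at x_i = w_i h1(mu0).
    b = (np.log(m) / m) ** 0.2                # assumed common bandwidth
    x = w * np.exp(-mu0)
    g0 = kernel_deriv_estimate(x, x_star, b, 0)
    g1 = kernel_deriv_estimate(x, x_star, b, 1)
    g2 = kernel_deriv_estimate(x, x_star, b, 2)
    # Chain rule with h1 = h2 = exp(-mu).
    f0 = np.exp(-mu0) * g0
    f1 = -np.exp(-mu0) * (x * g1 + g0)
    f2 = np.exp(-mu0) * (x**2 * g2 + 3.0 * x * g1 + g0)
    # Step 4: one Newton-type step on the estimated log-likelihood.
    score = np.sum(f1 / f0)
    hess = np.sum(f2 / f0 - (f1 / f0) ** 2)
    return mu0 - score / hess

data_rng = np.random.default_rng(42)
w_obs = data_rng.lognormal(1.0, 1.0, size=(200, 5)).mean(axis=1)
mu_hat = one_step_mle_lognormal(w_obs, sigma=1.0, n=5, rng=np.random.default_rng(7))
print(f"one-step estimate of mu: {mu_hat:.3f}")
```

The bandwidth here is not rescaled to the spread of the bootstrap sample, so the amount of smoothing depends on the units of $X$; a practical implementation would likely need to address this, along with observed values that fall far outside the range of the bootstrap sample.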

Asymptotic Properties

In this section we show that the estimator in Eq. (2) is asymptotically efficient for $\theta$ as $m$ tends to infinity. To prove this we need several lemmas. Let $X_1, X_2, \ldots, X_m$ be i.i.d. random variables with probability density function $f(x)$.


The kernel density estimates of $f$ and its derivatives with respect to $x$ are defined by

$$\hat f_m^{(p)}(x) = \frac{1}{m\, b_p^{\,p+1}} \sum_{i=1}^{m} K^{(p)}\!\left(\frac{x - X_i}{b_p}\right),$$

where $K(x)$ is the kernel function and $b_p$ is the bandwidth for the estimation of the $p$-th derivative (7, 10, 11). Yang and Cox (8) proposed an optimal rate of convergence of $\hat f_m^{(p)}$ to $f^{(p)}$ when $K(x)$ has compact support and satisfies a set of orthogonality conditions. It can be verified that the standard normal function has compact support and satisfies those orthogonality conditions (12). With $K(x) = \phi(x)$, the following lemma is obtained using the results in Yang and Cox (8, theorem 3).

Lemma 1. Suppose that $f$ is $r + 1$ times differentiable and $f^{(r+1)}$ is bounded. Then sup -x
