J. R. Statist. Soc. B (2015) 77, Part 1, pp. 107–130

Quantile regression adjusting for dependent censoring from semicompeting risks Ruosha Li University of Pittsburgh, USA

and Limin Peng Emory University, Atlanta, USA [Received January 2012. Final revision November 2013] Summary. We study quantile regression when the response is an event time subject to potentially dependent censoring. We consider the semicompeting risks setting, where the time to censoring remains observable after the occurrence of the event of interest. Although such a scenario frequently arises in biomedical studies, most of current quantile regression methods for censored data are not applicable because they generally require the censoring time and the event time to be independent. By imposing quite mild assumptions on the association structure between the time-to-event response and the censoring time variable, we propose quantile regression procedures, which allow us to garner a comprehensive view of the covariate effects on the event time outcome as well as to examine the informativeness of censoring. An efficient and stable algorithm is provided for implementing the new method. We establish the asymptotic properties of the resulting estimators including uniform consistency and weak convergence.The theoretical development may serve as a useful template for addressing estimating settings that involve stochastic integrals. Extensive simulation studies suggest that the method proposed performs well with moderate sample sizes. We illustrate the practical utility of our proposals through an application to a bone marrow transplant trial. Keywords: Copula; Dependent censoring; Quantile regression; Semicompeting risks; Stochastic integral equation

1.

Introduction

Quantile regression (Koenker and Bassett, 1978) is gaining increasing attention in survival analysis as a valuable alternative to traditional regression models. It targets a spectrum of conditional quantiles to achieve a comprehensive and robust analysis. Specifically, define the τ th conditional quantile of an event time T1 as QT1 .τ |Z/ = inf {t : Pr.T1  t|Z/  τ }, where τ ∈ .0, 1/, ˜ T and Z˜ is a p × 1 covariate vector. A quantile regression model may assume Z = .1, Z/ QT1 .τ |Z/ = exp{ZT β0 .τ /},

0 < τ < 1,

.1/

where β0 .τ / is a vector of unknown regression coefficients representing covariate effects at the τ th quantile of the event time. By allowing β0 .τ / to vary with τ , model (1) may accommodate inhomogeneous covariate effects that, for example, differ between subjects of high susceptibility to the event versus those with low risk of the event. Also note that model (1) is a strict extension of the accelerated failure time model, which corresponds to model (1) with each component of Address for correspondence: Limin Peng, Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA 30322, USA. E-mail: [email protected] © 2014 Royal Statistical Society

1369–7412/15/77107

108

R. Li and L. Peng

β0 .τ / being constant except for the intercept. Like with the accelerated failure time model, quantile regression postulates a direct relationship between the event time and covariates, thereby rendering straightforward physical interpretations. Many research efforts have been devoted to developing methods for quantile regression with survival data subject to independent censoring (Powell (1986), Ying et al. (1995), Portnoy (2003), Peng and Huang (2008), Wang and Wang (2009) and Huang (2010), among others). However, the assumption of independent censoring is problematic in many practical situations. For example, in a multicentre trial of allogeneic bone marrow transplants (BMTs) in patients with acute leukaemia (Copelan et al., 1991; Klein and Moeschberger, 2005), one intermediate end point of interest is the development of chronic graft-versus-host disease (GVHD). However, some patients died without experiencing GVHD. The time to GVHD for these subjects was then censored by death, which may correlate with GVHD. In contrast, the occurrence of GVHD did not prevent the subsequent patient follow-up in this study. As a result, time to death remained observable after the GVHD end point. Such a dependent censoring scenario falls into the paradigm of semicompeting risks (Fine et al., 2001), which permits the observation of censoring after the occurrence of the end point of interest and is the focus of this work. Naively conducting competing risks analysis in the BMT example would ignore the observed data on time to death for patients who experienced GVHD during the study and hence incur unnecessary loss of information. In the presence of dependent censoring to T1 , it would be problematic to estimate model (1) by using existing quantile regression methods that assume that the time to censoring is independent of T1 . For example, we show via simulation studies that applying the method of Peng and Huang (2008) by treating dependent censoring as independent can incur large biases in the estimation of β0 .τ / (Fig. 1). To date, very limited work has been done for quantile regression with dependently censored data. The competing risks quantile regression method by Peng and Fine (2009), like its precursor in the one-sample setting (Peng and Fine, 2007a), targets the crude conditional quantiles, which are conditional quantiles defined on the basis of the cumulative incidence function, not the marginal distribution function. In practice, quantities that are based on the marginal distribution of T1 can also be of interest (Jiang et al., 2003). This consideration motivates the study of quantile regression based on model (1), which formulates covariate effects on QT1 .τ |Z/, and cannot be addressed by the method of Peng and Fine (2009). In this paper, we develop a quantile regression method which accommodates dependent censoring and renders inference on net conditional quantiles that refer to conditional quantiles defined on the marginal distribution of the event time of interest. We focus on the semicompeting risks setting, where the censoring event remains observable after the end point of interest, as in the BMT example. The new method provides a useful alternative to existing regression approaches for dependently censored data of semicompeting risks structure. For example, it allows for non-constant covariate effects, which were not permitted by Lin et al. (1996), Peng and Fine (2006), Hsieh et al. (2008), Ding et al. (2009) and Chen (2011). Although other varyingcoefficient models, such as the multiplicative hazards model and additive risks model, have been studied for survival data (we refer to Martinussen and Scheike (2006) for a comprehensive coverage), approaches that are tailored to the semicompeting risks setting are quite limited. One available method is the functional regression model that was studied by Peng and Fine (2007b), which generalizes the Cox proportional hazards model with varying coefficient incorporated. In contrast, model (1) formulates covariate effects on the quantiles of T1 , offering direct physical interpretations, and thus may be preferred in many practical situations. In our proposals, we first tackle the issue that is related to the lack of non-parametric identifiability of the marginal distribution of T1 (Tsiatis, 1975) by assuming that the association between

0.1

0.3

0.5

0.1

0.3

(a)

0.5

0.3

0.5

0.5 (c)

Regression Coefficient −0.3 0.0 0.2

Regression Coefficient −0.8 −0.6 −0.4 −0.2 0.3

0.1

(b)

Regression Coefficient −0.3 0.1 0.1 0.3 0.1

109

Regression Coefficient −0.3 0.0 0.2

Regression Coefficient −0.3 −0.1 0.1 0.3

Regression Coefficient −0.8 −0.6 −0.4 −0.2

Quantile Regression

0.1

0.3

0.5

0.1

0.3

0.5

τ

τ

τ

(d)

(e)

(f)

ˆ / with data generated from set-up S2.C in (a), (b) and (c) and Fig. 1. Assessment of the robustness of β.τ from set-up S2.F in (d), (e) and) (f) ( , true regression quantiles; , empirical averages of naive estimates; , empirical averages of the estimates proposed on the basis of Clayton’s copula; , empirical averages of the estimates proposed on the basis of Frank’s copula): (a), (d) β .1/ .τ /, intercept; (b), .2/ .3/ (e) β .τ /, coefficient for Z1 ; (c), (f) β .τ /, coefficient for Z2

T1 and its censoring time follows a copula model with unspecified copula parameters. Note that this is a weaker assumption compared with that usually adopted in the competing risks setting, where the copula model is fully specified (Zheng and Klein (1995) and Chen (2010), among others). We construct unbiased estimating equations based on the assumed models, effectively utilizing the semicompeting risks feature of the censoring scheme. Estimation and inference procedures are developed to offer not only analyses of net conditional quantiles but also insights about how censoring is associated with the event of interest. A stable and efficient algorithm is developed for the implementation of the procedures proposed. Moreover, we establish the asymptotic properties of the proposed estimators despite considerable technical challenges. Our theoretical development provides a useful template for addressing estimating equations that involve the use of stochastic integrals. Via extensive simulations, we show that the method proposed performs well with moderate sample sizes and is robust to misspecification of the association model assumed. An application to the BMT example illustrates the utility of our proposals, uncovering findings that are unattainable through traditional survival regression models. 2.

Quantile regression procedure

2.1. Data and model We begin with a formal introduction of the data and notation. Let T1 be the time to the end point

110

R. Li and L. Peng

of interest and T2 be the time to dependent censoring. Define Z = .1, Z˜ T /T as a .p + 1/ × 1 vector ˜ An administrative censoring time C is often involved extended from covariates recorded in Z. which is conditionally independent of .T1 , T2 / given Z. Define X = T1 ∧ T2 ∧ C, Y = T2 ∧ C, δ = I.T1  T2 ∧ C/ and η = I.T2  C/, where ‘∧’ is the minimum operator and I.·/ is the indicator function. Observed data include n identically and independently distributed (IID) replicates of {X, Y , δ, η, Z}, which are denoted by {Xi , Yi , δi , ηi , Zi }ni=1 . To address the lack of non-parametric identifiability of the net conditional quantiles QT1 .τ |Z/ in the presence of dependent censoring by T2 (Tsiatis, 1975), we impose additional assumptions on the association structure between T1 and T2 . Such a strategy has been widely adopted in previous work on dependent censoring. Specifically, we adopt a copula model that assumes that Pr.T1 > s, T2 > t|Z/ = Ψ{1 − F1 .s|Z/, 1 − F2 .t|Z/, g.Z¯ T r0 /},

.2/

T Z˜ T 0/

˜ or Z˜ where Fi .t|Z/ = Pr.Ti  t|Z/ (i = 1, 2), Z¯ = .1, with Z˜ 0 being a subvector of Z, itself, and Ψ is a known function. For a given parameter θ, Ψ.·, · , θ/ is a copula function, e.g. Clayton’s copula (Clayton, 1978) or Frank’s copula (Genest, 1987). In model (2), r0 is an unknown parameter, characterizing the dependent censoring mechanism that may vary ¯ When Z¯ = 1, the association between T1 and T2 is assumed to be homogeaccording to Z. neous for all subjects. The copula parameter specified as g.Z¯ T r0 / is often closely connected to a measure that characterizes the strength of the association between T1 and T2 , where g.·/ is a known function. For example, under Clayton’s copula model, where Ψ.u, v, a/ = .u−a + v−a − 1/−1=a and a  0, a=.a + 2/ equals Kendall’s τ -coefficient (Kendall and Gibbons, 1962). In this case, one may select g.x/ = exp.x/, which accounts for the fact that a needs to be nonnegative. Although model (2) allows identification of the net conditional quantiles of T1 , the estimation of model (1) is further facilitated by the fact that T2 is subject to independent censoring by C only, and therefore standard censored regression techniques can be applied to estimate F2 .t|Z/. Here, we assume that the regression model for T2 takes the same form as that for T1 : QT2 .τ |Z/ = exp{ZT α0 .τ /},

τ ∈ .0, 1/,

.3/

where α0 .τ / is a .p + 1/ × 1 vector of regression coefficients, which may be estimated by using the approach of Peng and Huang (2008). Denote the resulting estimator as α.·/. ˆ Note that the modelling in expressions (1)–(3) accommodates situations where T1 and T2 are influenced by different sets of covariates. This can be achieved by letting Z include all covariates that impact T1 or T2 and then appropriately setting zero coefficients in β0 .τ / and α0 .τ /. In practice, we may consider a different model for T2 when there is strong evidence for the inadequacy of model (3). In principle, model (3) may be replaced by any model that can generate ‘good’ estimates for QT2 .τ |Z/ with randomly censored data. The resulting change in the estimating equations would be substituting exp{ZT i α.τ /} with the estimate for QT2 .τ |Z/. In what follows, models (1)–(3) are assumed without further mention. It is interesting to note that the model studied here reduces to a bivariate location–shift model, when Z¯ = 1 and both α0 .τ / and β0 .τ / are constant except in the intercept terms. The method that is proposed below will provide a valuable complement with enhanced flexibility and interpretability to the regression methods by Lin et al. (1996) and Peng and Fine (2006). 2.2. Estimating equations Our proposal for estimating β0 .τ / and r0 is based on the two equalities

Quantile Regression

Pr.X > t|Y > t, Z/ =

Pr.T1 > t, T2 > t|Z/ T = KA {F1 .t|Z/, F2 .t|Z/, g.Z¯ r0 /}, Pr.T2 > t|Z/

111

.4/

and, for s  t, Pr.T1  s, T2 > t|Z/ T = KB {F1 .s|Z/, F2 .t|Z/, g.Z¯ r0 /}, .5/ Pr.T2 > t|Z/ where KA .u, v, θ/ = Ψ.1 − u, 1 − v, θ/=.1 − v/ and KB .u, v, θ/ = {1 − v − Ψ.1 − u, 1 − v, θ/}=.1 − v/. It is suggested by equations (4) and (5) that the independent censoring by C may be handled through conditioning X > t and X  s on Y > t with s  t. Note that equation (4) is derived on the basis of the assumed joint distribution of .T1 , T2 / on the diagonal line, whereas equation (5) utilizes the information on the upper wedge of .T1 , T2 /. Therefore, these two equalities well capture the semicompeting risks feature of the observed data. To construct estimating equations for β0 .τ / and r0 based on equations (4) and (5), we need further to bridge α0 .·/ and β0 .·/ with the distribution functions of T1 and T2 . This may be done on the basis of the facts that F1 [exp{ZT β0 .τ /}|Z] = τ for any τ ∈ .0, 1/, and  1  1 I{F2 .t|Z/  u} du = F2 .t|Z/ = I[t  exp{ZT α0 .u/}] du: .6/ Pr.X  s|Y > t, Z/ =

0

0

Motivated by equations (4)–(6), we consider the two quantities T Pi0 .β, α, r, τ / = I{log.Xi / > ZT i β.τ /} − I{log.Yi / > Zi β.τ /}   1  T T T ¯ × KA τ , I{Zi β.τ /  Zi α.u/} du, g.Zi r/ ,

.7/

0

 Q0i .β, α, r, τ / =

0



 I{ZT i β.τ /  log.t/ < log.Yi /}

  − KB τ , 0

1

I{log.Xi /  ZT i β.τ /}  dt:

¯ I{log.t/  ZT i α.u/} du, g.Zi r/ T

.8/

In these definitions, the β and α in Pi0 .β, α, r, τ / and Q0i .β, α, r, τ / stand for functions defined from [0, 1] to Rp+1 . The same rule applies to other similar notation without further mention. Under the assumed models (1)–(3), we have E{Pi0 .β0 , α0 , r0 , τ /|Zi } = 0. This can be seen from the equality E{I.X > t/ − I.Y  > t/ Pr.X > t|Y > t, Z/|Z} = 0 with equation (4) plugged in. Moreover, we can show that E{ τ Q0i .β0 , α0 , r0 , τ /dτ |Zi } = 0 by noting that    E I.s  t, t < Y/{I.X  s/ − Pr.X  s|Y > t, Z/} dt ds|Z = 0, s t

and applying the monotone variable transformation s = exp{ZT β0 .τ /}. The double integration with respect to s and t reflects a good use of the data in the whole upper wedge of .T1 , T2 /. On the basis of the above results, we may consider the estimating equations n  Zi Pi0 .β, α, ˆ r, τ / = 0, n−1=2 i=1  τb .9/ n  n−1=2 Z¯ i Q0i .β, α, ˆ r, τ / dτ = 0, i=1 τa

where τa and τb are some prespecified constants in .0, 1/. In the absence of censoring (i.e. Yi = ∞

112

R. Li and L. Peng

for all i), under an independent copula model (i.e. Ψ.x, y, θ/ = xy), the first estimating equation in expression (9) would reduce to n−1=2 Σni=1 Zi [I{log.Xi / > ZT i β.τ /} − τ ] = 0, which is the same as that presented in Ying et √ al. (1995) for the case without censoring. The estimating function in this special case equals .−1= n/ times the ‘pseudo’-derivative of the classical objective function for defining sample regression quantiles (Koenker, 2005), Σni=1 ρτ {log.Xi / − ZT i b}, where ρτ .x/ = x{τ − I.x < 0/}. Such a connection may help to understand the inclusion of Zi in expression (9). There are some potential issues with equations (9). More specifically, with T2 subject to independent censoring from C, α0 .τ / may not be identifiable for some τ close to 1 (Peng and Huang, 2008). In this case, problems would arise because Pi0 .β, α, ˆ r, τ / and Q0i .β, α, ˆ r, τ / requires estimating α0 .τ / for all τ ∈ .0, 1/, which may not be feasible owing to the lack of identifiability of α0 .τ / in the upper tail of τ . To circumvent this issue, we modify equations (9) on the basis of the idea of ‘truncating’ Pi0 .β0 , α0 , r0 , τ / and Q0i .β0 , α0 , r0 , τ / conditioning on T T ZT i β 0 .τ /  Zi α0 .τU,2 / and t  exp{Zi α0 .τU,2 /} respectively, where τU, 2 is an upper bound of a τ -range, in which α0 .τ / is identifiable. Some empirical rules for selecting τU,2 were provided in Peng and Huang (2008). Applying this truncation idea leads to the consideration of T Pi .β, α, r, τ / = I{log.Xi / > ZT i β.τ /} − I{log.Yi / > Zi β.τ /}   τU, 2  T T ¯ × KA τ , I{ZT β.τ /  Z α.u/} du, g. Z r/ , i i i 0



 Qi .β, α, r, τ / =

t∈.0,∞/

T I{ZT i β.τ /  log.t/  Zi α.τU,2 / ∧ log.Yi /}

  −KB τ ,

0

τU, 2

I{log.Xi /  ZT i β.τ /}

 dt,

¯T I{log.t/  ZT i α.u/} du, g.Zi r/

which involves α.u/ for u up to τU,2 instead of 1. We have T E[I{ZT i β 0 .τ /  Zi α0 .τU,2 /}Pi .β 0 , α0 , r0 , τ /]  T T β .τ /  Z α .τ /} I{log.Xi / > ZT = E I{ZT 0 U,2 i 0 i i β 0 .τ /} − I{log.Yi / > Zi β 0 .τ /}

  × KA τ , 0

1



T ¯ I{ZT i β 0 .τ /  Zi α0 .u/}du, g.Zi r0 / T

T T T = E[I{ZT i β 0 .τ /  Zi α0 .τU,2 /}{I{log.Xi / > Zi β 0 .τ /} − I{log.Yi / > Zi β 0 .τ /} T T ¯ × KA .F1 [exp{ZT i β 0 .τ /}|Z], F2 [exp{Zi β 0 .τ /}|Z], g.Zi r0 //}]

= 0,

 τU, 2

.10/

1 T T I{ZT i β 0 .τ /  Zi α0 .u/} du = 0 I{Zi β 0 .τ /

for τ ∈ .0, 1/. The first equality holds because 0 T T  ZT i α0 .u/} du when Zi β 0 .τ /  Zi α0 .τU,2 /, and the second and the third equalities follow directly from equations (4) and (6). With similar arguments, we can also show that the integrand in Qi .β0 , α0 , r0 , τ / has expectation 0, and hence E{Qi .β0 , α0 , r0 , τ /} = 0:

.11/

By equations (10) and (11), we propose the estimating equations ˆ r, τ / = 0, n1=2 Sn .β, α,  τb Wn .β, α, ˆ r, τ / dτ = 0, n1=2 τa

.12/

Quantile Regression

113

where Sn .β, α, r, τ / = n−1

n  i=1

T Zi I{ZT i β.τ /  Zi α.τU,2 /} Pi .β, α, r, τ /,

Wn .β, α, r, τ / = n−1

n 

Z¯ i Qi .β, α, r, τ /:

i=1

Taking into consideration the identifiability of β0 .τ / due to censoring, we restrict our attention to β0 .τ / with τ ∈ .0, τU,1 ], where τU,1 is a constant less than 1 subject to certain regularity conditions given in Appendix A. In practice, τU,1 and τU,2 may be selected in an adaptive manner as suggested by Peng and Huang (2008) for randomly censored data. We suggest choosing [τa , τb / such that the interval covers most of .0, τU,1 ] but stays away from the boundaries. The proposed estimators of β0 .τ / and r0 can be obtained as the solutions to the equations (12) and are denoted ˆ / and rˆ respectively. by β.τ Remark 1. To compute Qi .β, α, ˆ r, τ /, we need to evaluate only the integration over t ∈ .0, maxni=1 Yi ∧ exp{ZT α.τ ˆ /}/. When β.·/ is a cadlag function, the integrand in Qi .β, α, ˆ r, τ / U,2 i is a piecewise constant function of t, and hence Qi .β, α, ˆ r, τ / can be calculated exactly. 2.3. Computational algorithm We developed an efficient iterative algorithm to estimate β0 .·/ and r0 jointly on the basis of equations (12). In what follows, we may use β as a shorthand for β.τ / and α for α.τ /. Similarly, τ we may write τab Wn .β, α, ˆ r, τ / dτ as Wn .β, α, ˆ r/ for brevity. First, let GL.n/ = {0 < τ0 < τ1 ZT i b} − Ai .τ /] = 0

[k+1,m]

.τ /  ZT ˆ U,2 /} and i α.τ

.13/

114

R. Li and L. Peng [m] ˆ [k+1,m] .τ /} Aˆ i .τ / = I{log.Yi / > ZT i β   τU, 2  T [k] ˆ [k+1,m] .τ /  ZT α.u/} ¯ ˆ × KA τ , β r I{ZT ˆ du, g. Z / : i i i 0

Step 3(c): increase m by 1 and go to step 3(b) until certain convergence criteria are satisfied. It can be shown that the estimating function (13) is a monotone random field in b (Fygenson and Ritov, 1994), and equals the derivative of the following: L1 -type function: n

n n   n−1=2  [m] [m] [m] [m] T T ˆ ˆ ˆ − Bˆ i .τ /| log.Xi / − ZT B B b| + |M − .τ /Z b| + |M + 2 .τ / A .τ /Z b| , i i i i i i 2 i=1 i=1 i=1 .14/ where M is an extremely large number that can bound n  Bˆ [m] .τ /ZT b i i i=1

and

n  2 Bˆ [m] .τ / Aˆ [m] .τ /ZT b i i i i=1

from above. Such an L1 -type minimization problem can be readily implemented in the rq function in R or the l1fit function in SPLUS. In Section 3, we show that the algorithm proposed can produce fast and stable implementation of the proposed estimation method. More details of the algorithm, including convergence criteria that are adopted for iterations, are provided in Appendix D. 2.4. Asymptotic results In this section we outline the asymptotic properties of the estimators proposed. Below, theorem ˆ /, and theorem 2 gives the 1 states the consistency of rˆ and the uniform consistency of β.τ ˆ / − β0 .τ /}. To establish the two results on the limiting distribution of n1=2 .rˆ − r0 / and n1=2 {β.τ theorems, we need regularity conditions 1–5, the details of which are relegated to Appendix A. ˆ /, r} ˆ in a Theorem 1. Under conditions 1–5, if limn→∞ GL.n/  = 0, then there exists {β.τ ˆ / − β0 .τ / →p 0, for any 0 < ν1 < neighbourhood of {β0 .τ /, r0 }, such that supτ ∈[ν1 ,τU, 1 ] β.τ τU,1 and rˆ →p r0 . Theorem 2. Under conditions 1–5, if limn→∞ n1=2 GL.n/  = 0, then n1=2 .ˆr − r0 / converges in ˆ / − β0 .τ /} converges weakly distribution to a normal distribution with mean 0, and n1=2 {β.τ to a mean 0 Gaussian process for τ ∈ [ν1 , τU,1 ], where 0 < ν1 < τU,1 : The proofs for theorems 1 and 2 involve extensive use of empirical process theory and stochastic integrals. In what follows we sketch the major steps taken to prove theorems 1 and 2, which can provide a useful template for asymptotic studies in similar settings. For this, we need to introduce additional notation. Let s.β, α, r, τ / = E{Sn .β, α, r, τ /}, w.β, α, r, τ / = E{Wn .β, α, r, τ /} τ and w.β, α, r/ = τab w.β, α, r, τ / dτ . τ To prove the consistency in theorem 1, we first show that 0 U, 2 I[t  exp{zT α.u/}] ˆ du provides a uniformly consistent estimate for F2 .t|z/ ∧ τU, 2 . By this result, applications of empirical process ˆ theory lead to supτ ∈.0,τU, 1 ] s{β.r/, α0 , r, τ } = op .1/ for any fixed r ∈ R.dR /, which implies the ˜ ; r/, the solution of s.β, α0 , r, τ / = 0. Similarly, we can show ˆ ; r/ to β.τ uniform consistency of β.τ ˜ ˜ ˜ 0 , rˆ/ = op .1/, which ˜ 0 , r/ →p 0, where w.α, r/ = w{β.r/, α, r}, and then w.α that ˜ w.α, ˆ r/ −w.α

Quantile Regression

115

˜ ; r/ has a bounded derivative with respect to implies the consistency of rˆ. Next, we show that β.τ ˜ ; r/ ˜ ; r0 / = op .1/, ˆ − β.τ r at r = r0 . This, coupled with the consistency of rˆ, gives supτ ∈.ν,τU, 1 ] β.τ ˆ ˆ and hence the uniform consistency of β.τ / ≡ β.τ ; rˆ/. Establishing the asymptotic normality of rˆ and the weak convergence of β.τ / requires considerable efforts to address the difficulties due to the non-smoothness of the estimating functions ˆ / ≡ β.τ ˆ ; r/ ˆ and r. ˆ A brief outline of our proof and also the complicated entanglement between β.τ ˆ 0 /, α0 , r0 , τ } − s.β0 , α0 , r0 , τ /] follows. First, we establish the weak convergence of n1=2 [s{β.r ˆ 0 /, α, ˆ r0 }. The former result entails a uniform IID representation of and that of n1=2 Wn {β.r 1=2 ˆ ˆ ; r/ ˆ ; r0 /} ˆ − β.τ n {β.τ ; r0 / − β0 .τ /}. Secondly, we show the asymptotic linearity of n1=2 {β.τ with respect to n1=2 .rˆ − r0 /. This result facilitates deriving an IID representation of n1=2 .rˆ − r0 / ˆ 0 /, α, ˆ r0 }. Lastly, noting that from the arguments for the weak convergence of n1=2 Wn {β.r ˆ / − β0 .τ /} = n1=2 {β.τ ˆ ; r/ ˆ ; r0 / + β.τ ˆ ; r0 / − β0 .τ /}, the asymptotic linearization ˆ − β.τ n1=2 {β.τ ˆ ; r/ ˆ ; r0 /}, coupled with the IID representations of n1=2 {β.τ ˆ ; r0 / − β0 .τ /} and ˆ − β.τ of n1=2 {β.τ ˆ / − β0 .τ /}. The detailed proofs of theon1=2 .rˆ − r0 /, renders the weak convergence of n1=2 {β.τ rems 1 and 2 are provided in Appendices B and C. 2.5. Other inferences ˆ / and rˆ, we propose to Owing to the complexity that is involved in the limit distribution of β.τ use a bootstrap method for the inference on β0 .τ / and r0 . Specifically, we employ the paired bootstrap scheme (Efron, 1979) and apply the algorithm that was presented in Section 2.3 to each Å of the B bootstrapped data sets to acquire {αˆ Åb , βˆ b , rˆÅb }B b=1 . For a fixed τ ∈ .0, τU,1 ], we may use Å ˆ /. Similarly, the the sample variance of {βˆ b .τ /}B to estimate the asymptotic variance of β.τ b=1 Å B variance of rˆ can be approximated by the sample variance of {ˆrb }b=1 . The confidence intervals for β0 .τ / and r0 can be constructed by using the normal distribution approximation, or with the empirical percentiles of the bootstrap estimates. We can also perform second-stage inferences for exploring the varying pattern of covariate effects over τ ∈ [l, u], where [l, u] ⊂ .0, τU,1 ] denotes a prespecified interval of interest. For example, u .q/ we may let [l, u] = [τa , τb ]. First, we define a trimmed mean effect Φ1,q = l β0 .τ / dτ =.u − l/ to use u.q/ to denote the qth component of a summarize the effect of Z.q/ . Here and hereafter,  u we .q/ ˆ ˆ vector u. As a natural estimate for Φ1,q , Φ1,q = l β .τ / dτ =.u − l/ can be shown to be consistent and asymptotically normal under mild assumption on the functional form of β0 .·/. The limit ˆ estimated by the empirical distribution of the bootstrap realizations, distribution  u Å.q/ of Φ1,q can be B ˆ { l β b .τ / dτ =.u − l/ b=1 . This result can be easily extended to a Wald-type or percentile-based test based on Φˆ 1,q for evaluating whether a covariate Z.q/ (2  q  p + 1) has a significant effect on a range of quantiles with τ ∈ [l, u], namely H01 : β .q/ .τ / = 0, τ ∈ [l, u]. In practice, one may be interested in assessing whether the effect of a covariate Z.q/ is constant for τ ∈ [l, u]. The null hy.q/ pothesis may be formulated as H02 : β0 .τ / = c0 , τ ∈ [l, u], where c0 is an  uunspecified constant. With a predetermined non-constantweight function Ξ.τ / satisfying l Ξ.τ / dτ = 1, the test .q/ u ˆ , the distribution of which can be statistic can be constructed as T2,q = l Ξ.τ / βˆ  .τ / dτ − Φ  u Å.q/ Å.q/1,q u ˆ approximated by the empirical distribution of { l Ξ.τ /β b .τ / dτ − l βˆ b .τ / dτ =.u − l/}B b=1 . Wald-type or percentile-based hypothesis testing can be performed accordingly. Rejection of H02 indicates that the effect of Z.q/ may not be constant across the quantiles. Justifications of the second-stage inferences presented would follow similar lines to those of Peng and Huang (2008) and thus have been omitted from the paper. 3.

Simulation studies

We evaluated the finite sample performance of the proposed estimators via extensive Monte

116

R. Li and L. Peng

Carlo simulations. Let Z = .1, Z1 , Z2 /T , where Z1 ∼ Unif.0, 1/ and Z2 ∼ Bernoulli.0:5/. The model that is used for generating T1 is log.T1 / = b1 Z1 + b2 Z2 + "1 , where the error term "1 is N.0, 0:252 / when Z2 = 0, and N.0, 0:52 / when Z2 = 1. With this heteroscedastic error struc.1/ .2/ .3/ .1/ ture, the underlying regression quantile β0 .τ / = {β0 .τ /, β0 .τ /, β0 .τ /}T , where β0 .τ / = .2/ .3/ Qnorm.τ , 0, 0:252 /, β0 .τ / = b1 and β0 .τ / = Qnorm.τ , 0, 0:52 / − Qnorm.τ , 0, 0:252 / + b2 . .1/ Here Qnorm.τ , μ, σ 2 / denotes the τ th quantile of an N.μ, σ 2 / variable. Note that both β0 .τ / .3/ .2/ and β0 .τ / vary with τ whereas β0 .τ / is constant. We generated the dependent censoring time T2 from a log-linear model with IID errors, log.T2 / = a1 Z1 + a2 Z2 + "2 , where "2 ∼ N.μ2 , 0:52 /. For the association structure between T1 and T2 , we considered two types of copula: Clayton’s copula and Frank’s copula. The bivariate survival function Pr.T1 > s, T2 > t|Z/ is given by {Pr.T1 > s|Z/− exp.rc / + Pr.T2 > t|Z/− exp.rc / − 1}−1= exp.rc / when Clayton’s copula is adopted, and by   [exp{−rf Pr.T1 > s|Z/} − 1][exp{−rf Pr.T2 > t|Z/} − 1] 1 − log 1 + rf exp.−rf / − 1 when Frank’s copula is adopted. For a fixed Z, Kendall’s τ -coefficient between T1 and T2 equals exp.rc /={exp.rc / + 2} under Clayton’s model, and equals 1 + 4{D1 .rf / − 1}=rf under Frank’s r model, where D1 .r/ = 0 t={exp.t/ − 1} dt=r. We set rc = log.2/ and rf = 5:75 so that the corresponding Kendall’s τ -coefficients equal 0:5 under both copula models. The independent censoring time C was set to follow Unif.0, UC /. We considered various combinations of μ2 , .a1 , a2 /, .b1 , b2 / and UC , which led to the four set-ups that are presented in Table 1. Under each set-up, we generated 1000 simulated data sets and implemented the proposed nuˆ / and ˆr. We chose sample size n = 200, grid size GL.n/  = 0:01 merical algorithm to obtain β.τ and bootstrap sample size B = 150. We set .τa , τb / = .0:1, 0:75/ under scenarios S1.C and S2.F, and let .τa , τb / = .0:1, 0:65/ under scenarios S2.C and S2.F, which involve heavier censoring. ˆ / at different τ s are summarized in Table 2. We report the empirical The simulation results on β.τ biases EBias, the empirical standard errors ESD and the average resampling-based standard erˆ /, as well as the empirical coverage probabilities of the 95% Wald-type, ECPW , rors ASD for β.τ ˆ / has small biand percentile-based, ECPP , confidence intervals of β0 .τ /. It is shown that β.τ ases under all set-ups, and the bootstrap standard errors closely match the empirical standard errors. Both the Wald-type and the percentile-based confidence intervals achieve empirical coverage probabilities that are close to the nominal level. It is observed that the percentile-based confidence intervals may perform better than the Wald-type intervals when τ is small. Table 3 summarizes the simulation results on ˆr. We present the same set of summary statistics including EBias, ESD, ASD, ECPW and ECPP as is presented in Table 2, along with the true Table 1. Summary of the simulation set-ups, with P00 D P.δ D 0, η D 0/, P01 D P.δ D 0, η D 1/, P10 D P.δ D 1, η D 0/ and P11 D P.δ D 1, η D 1/ Set-up

Copula

μ2

(a1 , a2 )

(b1 , b2 )

UC

P00

P01

P10

P11

S1.C S1.F S2.C S2.F

Clayton Frank Clayton Frank

0.1 0.1 0.0 0.0

.0:4, 0:2/ .0:4, 0:2/ .0:3, −0:25/ .0:3, −0:25/

.0, 0/ .0, 0/ (−0.4, 0) (−0.4, 0)

18 18 3.5 3.5

0.06 0.06 0.24 0.23

0.15 0.15 0.24 0.24

0.04 0.04 0.09 0.10

0.76 0.76 0.43 0.43

Quantile Regression

117

ˆ /, which includes the empirical biases, empirical standard Table 2. Summary of simulation results on β.τ errors, averages of resampling-based standard error estimates and empirical coverages of 95% Wald-type and percentile-based confidence intervals τ

EBias (×103 ) .0/ βˆ

Set-up S1.C 0.1 4 0.2 5 0.3 4 0.4 4 0.5 4 0.6 4 0.7 5

.1/ βˆ

ESD (×103 )

.2/ βˆ

ASD (×103 )

ECP W (%)

ECP P (%)

.0/ βˆ

.1/ βˆ

.2/ βˆ

.0/ βˆ

.1/ βˆ

.2/ βˆ

.0/ βˆ

.1/ βˆ

.2/ βˆ

.0/ βˆ

.1/ βˆ

.2/ βˆ

2 −4 −2 −3 −2 −1 −4

2 1 2 2 3 3 4

100 89 86 83 81 81 82

162 143 138 133 131 131 134

99 86 84 82 83 86 87

102 94 90 87 86 86 87

167 151 144 140 138 139 142

105 92 88 87 87 88 92

93.9 93.9 94.3 95.1 94.7 94.3 94.6

94.6 94.5 94.5 95.7 94.7 94.9 95.3

94.5 95.0 94.9 95.5 95.9 95.4 94.5

93.8 94.5 95.9 95.5 95.2 95.0 95.1

95.9 95.9 96.2 95.9 96.2 95.6 94.8

95.5 95.7 96.2 96.2 95.4 95.0 95.5

Set-up S1.F 0.1 13 −10 0.2 11 −8 0.3 11 −9 0.4 10 −7 0.5 11 −9 0.6 11 −9 0.7 12 −11

4 3 4 1 2 2 2

101 86 75 71 69 66 70

159 136 121 115 111 111 117

99 86 78 75 74 76 79

102 89 81 77 74 74 75

164 142 131 124 121 120 124

101 87 81 78 78 78 81

93.4 94.5 95.1 96.0 95.3 94.8 94.3

95.0 96.3 96.5 96.2 96.4 96.5 94.6

93.9 94.5 94.2 95.2 95.7 95.5 95.4

94.1 94.3 95.2 96.5 95.7 95.1 94.7

96.5 96.6 96.2 97.2 97.1 96.9 95.1

95.3 96.0 95.2 95.0 95.3 95.6 95.7

Set-up S2.C 0.1 9 −8 0.2 6 −1 0.3 10 −8 0.4 10 −9 0.5 13 −10 0.6 15 −14

9 5 5 6 6 13

106 97 97 100 95 97

171 157 152 156 151 154

113 106 108 110 116 126

112 110 109 107 107 107

180 171 168 167 168 170

113 110 112 117 124 131

94.4 96.3 95.7 96.3 95.4 95.7

95.6 95.9 95.8 95.7 95.8 95.5

92.7 94.9 95.3 95.8 95.6 94.7

94.6 96.3 96.8 95.9 95.8 95.4

95.7 97.3 97.2 97.3 96.5 96.2

93.3 96.2 95.9 96.0 95.4 95.6

Set-up S2.F 0.1 26 0.2 18 0.3 21 0.4 22 0.5 16 0.6 19

13 18 20 20 14 20

125 108 98 91 84 84

191 164 148 137 129 133

118 111 106 102 101 107

114 108 101 95 93 93

179 167 157 149 147 149

115 111 108 106 108 112

88.6 92.6 93.4 93.8 95.2 95.5

91.8 94.8 95.3 95.2 95.5 96.1

91.0 93.4 94.0 94.2 95.3 93.7

88.8 92.3 93.7 92.3 92.7 94.8

93.4 94.8 96.2 95.5 95.3 95.6

93.2 94.0 94.0 93.3 93.6 93.9

−30 −21 −26 −27 −15 −21

values of r0 and Kendall’s τ -coefficient, which is expressed as a known function of r0 , K.r0 /; see the column labelled ‘True’. We can see that rˆ is virtually unbiased, and the estimated standard errors are close to their empirical counterparts. For the estimation of r0 and K.r0 /, we note that the coverage rates of the Wald-type confidence intervals tend to be higher than those of the percentile-based intervals. In Table 3, we also present the empirical biases and empirical standard errors of the corresponding estimates for Kendall’s τ , denoted by K.ˆr/. It is shown that, as an established measure of association, Kendall’s τ -coefficient can be accurately estimated with small standard errors. The results of second-stage inferences are summarizedin Table 4. We present EBias, ESD u ˆ .q/ ˆ .q/ and ASD for the trimmed mean effect estimates Φ 1 = l β .τ / dτ =.u − l/, where q = 2, 3, and .l, u/ = .τa , τb /. We also summarize the performance of the proposed hypothesis tests for .q/ .q/ H01 : β0 .τ / = 0 and H02 : β0 .τ / ≡ c0 , following the procedures in Section 2.5. To construct the test statistics for H02 , we chose the weight function Ξ.τ / as 2 I{τ  .τL + τU /=2/}=.τU − τL /.

118

R. Li and L. Peng ˆ which include empirical biases, Table 3. Summary of simulation results on r, empirical standard errors, averages of resampling-based standard error estimates, empirical coverages of 95% Wald-type and percentile-based confidence intervals† Set-up

S1.C S1.F S2.C S2.F

r K.r/ r K.r/ r K.r/ r K.r/

True value

EBias (×103 )

ESD (×103 )

ASD (×103 )

ECPW (%)

ECPP (%)

0.693 0.500 5.750 0.500 0.693 0.500 5.750 0.500

−14 −4 −215 −15 −30 −7 −356 −28

206 51 852 48 305 74 1317 77

226 56 885 51 370 90 1371 81

96.7 96.1 94.5 95.2 98.6 97.7 94.8 96.1

94.8 94.8 92.5 92.5 96.0 96.0 92.3 92.3

†The same sets of summary statistics are also presented for the corresponding Kendall’s τ -coefficient estimates, K.ˆr/. Table 4. Summary of simulation results on second-stage inferences, which include empirical bias, empirical standard error and averages of resamˆ .q/ , q D 2, 3, as well as the empirical pling-based standard error estimates of Φ 1 rejection rates of H01 and H02 with Wald-type (ERRW ) and percentile-based (ERRP ) methods Set-up

S1.C S1.F S2.C S2.F

ˆ .q/ Φ 1

q

2 3 2 3 2 3 2 3

H01

H02

EBias

ESD

ASD

ERRW

ERRP

ERRW

ERRP

−2 −1 −9 −1 −9 4 −22 15

108 68 93 61 123 88 115 83

109 68 96 61 128 87 116 80

0.056 0.129 0.046 0.162 0.886 0.190 0.950 0.167

0.062 0.129 0.056 0.178 0.930 0.145 0.973 0.136

0.049 0.936 0.043 0.950 0.026 0.561 0.040 0.614

0.038 0.946 0.033 0.967 0.017 0.736 0.020 0.750

For each test, we report the empirical rejection rates based on the normal distribution approximation, ERRW , and those based on percentiles, ERRP . We find that the trimmed mean effect ˆ .q/ estimates Φ 1 , q = 2, 3, are well performed in terms of biases and standard error estimates. For H01 , we observe that both the Wald-type test and the percentile-based test achieve empirical sizes that are close to the nominal level 0:05 and have satisfactory power. Regarding the constancy test, the percentile-based test, as compared with the Wald-based test, may be more conservative in terms of size but have higher power in the presence of non-constant effects. In on-line supplementary material, we present additional simulation results with n = 400, which indicate better performance of the proposed method with a larger sample size while delivering similar observations to those from Tables 2–4. We also perform simulation studies to evaluate the robustness of the estimators proposed. Specifically, we examine the proposed estimators of β0 .τ / and r0 when the association model (2) is misspecified. We generated data from the set-ups S2.C and S2.F listed in Table 1, where

Quantile Regression

119

Clayton’s copula and Frank’s copula hold respectively. For each set-up, we implemented the proposed method separately by assuming Clayton’s copula and Frank’s copula. Fig. 1 presents the ˆ / under a correctly specified copula model and an incorrectly specified empirical averages of β.τ copula model. For comparison, we also plot the empirical averages of the estimators of Peng and Huang (2008) obtained by naively treating T2 as independent censoring. Not surprisingly, the naive estimator yields large biases for τ ∈ [0:1, 0:65] in all scenarios. When the association ˆ / are close to the true regression model (2) is correctly specified, the empirical averages of β.τ quantiles. This agrees with the results in Table 2. With a misspecified association structure, the proposed estimator of β0 .τ / exhibits quite small deviations from the true coefficients. For exˆ /, when we incorrectly adopt Frank’s copula in set-up S2.C, are nearly ample, the biases of β.τ negligible. We have very similar observations for the estimation of the association parameter. The empirical average of the estimated Kendall’s τ -coefficient, when Frank’s copula is incorrectly assumed in set-up S2.C, equals 0.475, and that when we assume Clayton’s copula in set-up S2.F equals 0.472. Both empirical averages are not far from the true Kendall’s τ -coefficient 0.5. Overall, our simulations suggest that the proposed estimation of β0 .τ / and r0 is quite robust to the misspecification of the parametric form of the copula model. The empirical results from our robustness study agree with previous reports in the competing risks literature. For example, the marginal regression analysis that was proposed by Chen (2010) was found to be robust to the specification of the functional form of the copula when the level of association is correctly assumed. Compared with competing risk methods such as those by Zheng and Klein (1995) and Chen (2010), the method proposed requires only the assumption on the parametric form of the copula while leaving the copula parameter unspecified. Consequently, our robustness result entails a better utility of the method proposed than the sensitive analysis that is permitted by competing risk data. 4.

A real example

We illustrate the proposed method via an application to the BMT example, where data were collected on patients with acute leukaemia following allogeneic bone marrow transplantation (Copelan et al., 1991). One intermediate end point of interest is the development of chronic GVHD, which is a common complication following transplantation. The GVHD end point was subject to potentially dependent censoring due to death. With the origin at transplantation, we let T1 denote the time to the GVHD end point and T2 denote the time to death, whereas C represents the time to the end of follow-up, which is not fixed but random. For the 137 subjects in this data set, 61 (44.5%) were observed to experience chronic GVHD, 81 (59%) died during the follow-up period, among whom 52 patients (38%) died before the onset of chronic GVHD. The values of P00 , P10 , P01 and P11 defined in Table 1 are 17.5%, 38.0%, 23.4% and 21.2% respectively. On the basis of disease type, patients were classified into three groups, referred to as acute lymphoblastic leukaemia (ALL), acute myelocytic leukaemia (AML) low risk and AML high risk. The data recorded also include patients’ age at the time of study enrolment. In this analysis, we are interested in assessing the effects of disease type and age on the development of GVHD. The covariate vector Z = .1, Z1 , Z2 , Z3 /, where Z1 and Z2 are indicator variables of the AML low risk group and AML high risk group respectively, and Z3 represents patients’ age. We fitted models (1)–(3) to the BMT data, adopting Frank’s copula and setting .τa , τb / = .0:05, 0:55/. Standard error estimates and confidence intervals were obtained based on 400 bootstrap resamples. The estimated association parameter rˆ equals 4:65. The corresponding Kendall’s τ equals 0.43, with a percentile-based 95% interval of .−0:04, 0:60/ and a 95% Waldtype interval of .0:05, 0:63/. This result provides some evidence for a moderate positive asso-

Regression Coefficient −0.5 1.0 2.0 3.0

R. Li and L. Peng Regression Coefficient 1.0 1.5 2.0 2.5

120

0.1

0.3

0.5

0.1

(a)

0.3

0.5

Regression Coefficient −0.04 0.00 0.04

Regression Coefficient −0.2 0.2 0.6

(b)

0.1

0.3

0.5

0.1

0.3

τ

τ

(c)

(d)

0.5

Fig. 2. Estimated regression coefficients for the BMT data set ( , proposed regression coefficients as a function of τ ; , estimator based on the method of Peng and Huang (2008); , percentile-based 95% intervals): (a) intercept; (b) Z1 , AML low versus ALL; (c) Z2 , AML high versus ALL; (d) Z3 , age

ciation between GVHD and death, conditionally on the type of disease and age. In Fig. 2, we present the fitted regression coefficients, coupled with percentile-based 95% confidence intervals. Fig. 2 suggests that patients in the AML low risk group may have overall significantly slower progression to GVHD compared with patients in the ALL group. The benefit in the timing of GVHD associated with being in the AML high risk group versus ALL group is much less evident, with the 95% confidence intervals excluding 0 for τ < 0:15 only. In addition, the estimation of the coefficients for Z3 suggests little age effect on the timing of GVHD development. For comparison, we also present in Fig. 2 Peng and Huang’s (2008) estimator βˆ N .τ / which treats death as independent censoring. We observe that the intercept term of βˆ N .τ / tends to be higher ˆ /, which agrees with our observation in Fig. 1 from the simulation studies that than that of β.τ βˆ N .τ / may tend to overestimate the intercept term when T1 and T2 are positively associated. There are rather large discrepancies in the estimated coefficients for Z2 and Z3 . For example, naively using βˆ N .τ / may lead to an impression that younger patients have more rapid progression to GVHD compared with older patients, which seems counterintuitive. Compared with the naive analysis based on βˆ N .τ /, the method proposed may provide us with higher confidence on the results by appropriately accounting for the dependence between the time to GVHD and the time to death. We next conducted second-stage inferences on covariate effects, setting .l, u/ = .0:05, 0:55/. The estimated trimmed mean effect of Z2 (AML low risk versus ALL) equals 0:65, with an estimated standard error of 0:32. The percentile-based 95% confidence interval is .0:26, 1:44/. These results confirm the beneficial effect of being in the AML low risk group (versus the ALL group) that was observed in Fig. 2. For Z2 (AML high risk versus ALL), the trimmed mean

0.1

0.3 τ (a)

0.5

10 8 6

Estimated Quantiles of T1

12

14

121

2

4

14 12 10 8 6 2

4

Estimated Quantiles of T1

12 10 8 6 4 2

Estimated Quantiles of T1

14

Quantile Regression

0.1

0.3 τ (b)

0.5

0.1

0.3 τ

0.5

(c)

Fig. 3. Estimated quantiles of time (months) to the GVHD end point ( , proposed estimator; , estimator based on the method of Peng and Huang (2008)): (a) ALL group; (b) AML low risk group; (c) AML high risk group

effect estimate is 0:17, with an estimated standard error of 0:14 and a percentile-based 95% confidence interval of .−0:03, 0:50/. This indicates little evidence to support the difference in time to GVHD between the AML high risk group and the ALL group. The trimmed mean estimate for Z3 (age) equals −0:003, with an estimated standard error of 0:009 and a percentilebased 95% confidence interval of .−0:020, 0:016/. This suggests a non-significant age effect on the GVHD end point. Regarding the constancy of covariate effects, we obtained T2,2 = −0:22 for Z1 , coupled with a percentile-based 95% confidence interval of .−0:75, −0:04/. This confirms our visual impression from Fig. 2 and implies that the difference between the AML low risk group and the ALL group may be more pronounced in patients who are less prone to GVHD. For Z2 and Z3 , the test statistics T2,3 and T2,4 equal 0.03 and −0:002 respectively, both with percentile-based confidence intervals covering 0. Therefore, constant effects may be adequate for AML high risk (versus ALL) and age. Finally, we present the estimated quantiles of the time to the GVHD end point for patients with different disease types in Fig. 3, with age set at its mean of 28:4 years. These curves provide useful alternative survival summaries by using quantiles of the time to the end point of interest, which have direct physical interpretations. We observe that the differences in estimated quantiles ˆ / and βˆ N .τ / seem to increase with τ . For example, for the ALL, AML low and based on β.τ AML high risk groups, the differences between the estimated 40th quantiles based on these two estimators are 0.9, 2.3 and 1.7 months respectively. Further unreported analyses suggest that the discrepancies in estimated quantiles increase with patients’ age. It is also observed from Fig. 3 that estimated quantiles of the time to GVHD for the AML low risk group are much larger than those for the ALL group and AML high risk group, and this beneficial effect of being in the AML low risk group appears more apparent when τ increases. By comparison, the estimated quantiles for AML high risk patients are quite similar to those for ALL patients. These observations are consistent with the findings from our second-stage inferences.

122

5.

R. Li and L. Peng

Remarks

In this paper, we propose a quantile regression method that can properly accommodate dependent censoring situations that fall into the paradigm of semicompeting risks. Our method offers a useful tool for investigating non-terminating end points that often arise in clinical follow-up studies and their relationship with subsequent competing end points. The net quantile inference that was pursued in this paper is useful when covariate effects with the removal of dependent censoring are of substantive relevance. It is worth noting that the inequalities (4) and (5) that were adopted for estimating model (1) entail quite a comprehensive use of semicompeting risks information, which is not limited to the diagonal line of the joint distribution of .T1 , T2 / as in Peng and Fine (2007b). Although formal efficiency comparisons between these two approaches are tricky because of their distinct modelling strategies, some simulation studies (unreported) show 25–35% more efficient coefficient estimation from the new method compared with that from Peng and Fine (2007b). We impose assumptions on the dependent censoring scheme through a general class of copula models. As noted in the paper, this is necessary for addressing the identifiability issue that is inherited with the dependent censoring problem. Simulations have shown that the estimators proposed are quite robust to misspecification of the parametric form of the copula model. This robustness feature is expected to enhance the practical utility of the method proposed. Acknowledgements The authors are grateful to the Joint Editor, the Associate Editor and the two referees for many helpful comments. This research has been supported by the National Science Foundation under grant DMS-1007660 and the National Institutes of Health under award R01HL 113548. Appendix A: Regularity conditions For a vector x, let x⊗2 denote xxT and x denote the Euclidean norm of x. For a random variable T , let fT .·|z/ denote its conditional density function given  τ Z = z. We define s.β, α, r, τ / = E{Sn .β, α, r, τ /}, w.β, α, r, τ / = E{Wn .β, α, r, τ /} and w.β, α, r/ = τab w.β, α, r, τ / dτ . Let Bb .β, α, r, τ / = @s.β, α, r, τ /=@β.τ /, Br .β, α, r, τ / = @s.β, α, r, τ /=@r, Lb .β, α, r, τ / = @w.β, α, r, τ /=@β.τ / and Lr .β, α, r/ = @w.β, α, r/=@r. Next, define μ.a/ = E[Z I{Y  exp.ZT a/, η = 1}] as in Peng and Huang (2008). For d > 0, let ¯ = A.d/ = {a ∈ Rp+1 : inf τ ∈.0, τU, 2 ] μ.a/ − μ{α0 .τ /}  d} and R.d/ = {r ∈ Rq+1 : r − r0   d}. Define B.d/ {β ∈ Rp+1 : maxτ ∈.ν1 , τU, 1 ] β.τ / − β0 .τ /  d}. Let dA , dB and dR be positive constants that determine the spans of the neighbourhoods of α0 , β0 and r0 respectively. For presentation simplicity, we exclude the indicator functions I{ZTi β.τ /  ZTi α.τU, 2 /} in Sn and I{log.t/ < ZTi α.τU, 2 /} in Wn , corresponding to the situation with τU, 2 close to 1. This incurs little essential difference in asymptotic arguments. We require the following regularity conditions. Condition 1. The covariate space Z is bounded, i.e. supi Zi   ∞. Condition 2. (a) The regularity conditions in Peng and Huang (2008) hold for .Y , η, Z/ and α0 .τ /, τ ∈ .0, τU, 2 ]. (b) For Bα .a/ = @μ.a/=@a, there is a constant CF , such that each component of fT2 {exp.zT a/|z} exp.zT a/ × Bα .a/−1  is bounded by CF uniformly in z ∈ Z and a ∈ A.dA /, where dA is a positive constant. (c) Define T T T T T VA .a, τ / = −Z[I{log.Y/   τb  > Z β0 .τ /}qA {τ , F2 [exp{Z β0 .τ /}|Z], g.Z r0 /} × I{Z β0 .τ /  Z a}] ¯ I{ZT β0 .τ /  log.t/ < log.Y/} × qB {τ , F2 .t|Z/, g.ZT r0 /} I{log.t/ VB .a/ = −Z τa t∈.0, ∞/  T  exp.Z a/} dτ dt ,

Quantile Regression

123

where qA .u, v, θ/ = @KA .u, v, θ/=@v and qB .u, v, θ/ = @KB .u, v, θ/=@v. For Z containing continuous covariates, VA .a, τ / and VB .a/ are differentiable with respect to a, and every component of [@E{VA .a, τ /}=@a]Bα .a/−1 and [@E{VB .a/}=@a]Bα .a/−1 are bounded uniformly for a ∈ A.dA / and τ ∈ .0, τU, 1 ]. Condition 3. (a) Each component of s.β0 , α0 , r0 , τ / and w.β0 , α0 , r0 , τ / is a Lipschitz function of τ when τ ∈ .0, τU, 1 ]. (b) Let dθ .u, v, θ/ = @Ψ.u, v, θ/=@θ, dθ {u, v, g.zT r0 /}g .zT r0 / is uniformly bounded for z ∈ Z, u  τU, 1 and v  τU, 2 . (c) The conditional density functions fX .t|z/, fY .t|z/, fT1 .t|z/ and fT2 .t|z/ are bounded above uniformly in t and z ∈ Z. Condition 4. (a) kb ≡ inf τ ∈.ν1 , τU, 1 ] eigmin{Bb .β0 , α0 , r0 , τ /} > 0, where eigmin.·/ denotes the minimum eigenvalue of a matrix. ˜ ; r/ is the unique solution to s{β, α0 , r, τ } = 0 for β ∈ B.d ¯ B /, and (b) For any fixed r ∈ R.dR /, β.τ ˜ α0 , r}=@r] > 0. kr ≡ inf r∈R.dR / eigmin[@w{β.r/, ¯ B /. (c) Every component of Lb .β, α0 , r0 , τ / × Bb .β, α0 , r0 , τ /−1 is uniformly bounded for β ∈ B.d Condition 5. (a) For any z ∈ Z, we have zT β0 .τU, 1 /  zT α0 .τU, 2 /. (b) Both inf z∈Z Pr[C > exp{ZT β0 .τU, 1 /}|Z] and inf z∈Z Pr[C > exp{ZT α0 .τU, 2 /}|Z] are bounded away from 0. Condition 1 requires bounded covariates and is often met in practice. Condition 2 imposes mild assumpτ ˆ du serves tions on the dependent censoring time T2 . Specifically, it ensures that 0 U, 2 I[t  exp{ZT α.u/}] an appropriate estimator of F2 .t|Z/ ∧ τU, 2 , and that s.β, α, ˆ r, τ / preserves the nice asymptotic properties of α.·/ ˆ shown in Peng and Huang (2008). By condition 3, s{β0 .τ /, α0 , r, τ } and w{β0 .τ /, α0 , r, τ } are smooth in both τ and r, and the conditional density functions of X, Y , T1 and T2 are uniformly bounded. Condition 4, coupled with the boundary constraints in condition 5, entails the identifiability of the proposed estimator in a neighbourhood of β0 and r0 . Specifically, conditions 4, parts (a) and (b), require that ˜ α, r, τ / over r are bounded from below at the true the derivative of s.β, α, r, τ / over β.τ / and that of w.β, ˜ ; r/ over r is bounded at r = r0 . Note parameters, and condition 4, part (c), ensures that the derivative of β.τ that condition 5, part (a), can be removed when equation (8) is adopted. Our following proofs utilize the asymptotic properties of the estimator α.·/ ˆ of Peng and Huang (2008). ˆ remain valid when other modelling strategies It is worth pointing out that the asymptotic results on β.·/ are adopted for T2 , provided that the strategy entails reasonable estimates of F2 .t|Z/. The proof can be carried out following the lines of Appendix B and C as follows, and the detailed arguments are omitted here.

Appendix B: Proof of theorem 1 Lemma 1. For μ.a/ = E[Z I{Y  exp.ZT a/, η = 1}], we have  τU, 2   τU, 2   sup  I[t  exp{zT α.u/}] ˆ du − I[t  exp{zT α0 .u/}] du  2CF

z∈Z, t

0

0

sup μ{α.τ ˆ /} − μ{α0 .τ /}:

τ ∈.0, τU,2 ]

B.1. Proof of lemma 1 By regularity condition 2, part (b), and Taylor series expansion, we have sup |F2 [exp{zT α.τ ˆ /}|z] − F2 [exp{zT α0 .τ /}|z]|  CF μ{α.τ ˆ /} − μ{α0 .τ /}, z∈Z

τ ∈ .0, τU, 2 ],

where F2 [exp{zT α0 .τ /}|z] = τ . Let "F = supz∈Z, τ ∈.0, τU, 2 ] |F2 [exp{zT α.τ ˆ /}|z] − τ |; we then have "F  CF supτ ∈.0, τU, 2 ] μ{α.τ ˆ /} − μ{α0 .τ /}. Note that

124

R. Li and L. Peng  τU, 2   I[t  exp{zT α.u/}] ˆ du − sup 

z∈Z, t

0



 sup

z∈Z, t



I{F2 .t|z/ ∈ [F2 [exp{zT α.u/}|z] ˆ ∧ u, F2 [exp{zT α.u/}|z] ˆ ∨ u]} du



τU, 2

I{F2 .t|z/ ∈ [u − "F , u + "F ]} du

0

= sup

z∈Z, t

0

τU, 2

  I[t  exp{zT α0 .u/}] du

0

 sup

z∈Z, t

τU, 2

τU, 2

I{u ∈ [F2 .t|z/ − "F , F2 .t|Bz/ + "F ]} du  2"F ,

0

where ‘∨’ is the maximum operator. This immediately completes the proof of lemma 1.

B.2. Proof of theorem 1

For any fixed θ, the copula function Ψ.u, v, θ/ satisfies the Lipschitz condition in u and v (Nelsen, 2006). Hence KA .u, v, θ/ and KB .u, v, θ/ are both Lipschitz continuous in v when v  τU, 2 . By lemma 1 and the fact that supτ ∈.0, τU, 2 ] μ{α.τ ˆ /} − μ{α0 .τ /} →p 0 (Peng and Huang, 2008), we can show with a Taylor series expansion that    τU, 2     τU, 2   p I{zT b  zT α.u/} ˆ du, g.zT r/ − KA τ , I{zT b  zT α0 .u/} du, g.zT r/  → 0: sup KA τ , z∈Z, r, b, τ 0

0

It follows that p

ˆ r, τ / − s.β, α0 , r, τ / → 0: sup s.β, α,

.15/

β, r, τ

Next, we claim that G1 = {Zi Pi .β, α, r, τ / : Zi ∈ Z, β, α ∈ Rp+1 , r ∈ Rq+1 , τ ∈ .0, 1/} is a Donsker and thus Glivenko–Cantelli class. This follows by noting that the class of indicator functions is a Vapnik– ˇ Chervonenkis class, and by using the permanence properties of the Donsker class (van der Vaart and Wellner, 1996). Therefore, the Glivenko–Cantelli theorem gives sup Sn .β, α, r, τ / − s.β, α, r, τ / = op .1/:

.16/

β, α, r, τ

˜ ; r/ is the solution to s.β, α0 , r, τ / = 0, we can use simple manipulations and obtain Since β.τ ˜ ˆ ˆ s{β.r/, α0 , r, τ } − s{β.r/, α0 , r, τ } = s{β.r/, α0 , r, τ } ˆ ˆ ˆ  s{β.r/, α, ˆ r, τ } + s{β.r/, α0 , r, τ } − s{β.r/, α, ˆ r, τ }:

.17/

ˆ α, ˆ r, τ } = o.1/, we obtain Combining equation (16) and the fact that supr, τ ∈[ν1 , τU, 1 ] Sn {β.r/, sup

r, τ ∈[ν1 , τU,1 ]

p

ˆ s{β.r/, α, ˆ r, τ } → 0:

This, coupled with expressions (15) and (17), implies sup

r, τ ∈[ν1 , τU, 1 ]

p ˜ ˆ s{β.r/, α0 , r, τ } − s{β.r/, α0 , r, τ } → 0:

.18/

ˆ , r/ ∈ B.d ¯ B /. Moreover, By condition 4, part (a), the inverse function theorem implies that there exists β.τ ˜ ; r/ gives ˆ Taylor series expansion of s{β.r/, α0 , r, τ } around β.τ sup

r∈R.dR /, τ ∈[ν1 , τU,1 ]

˜ ; r/  ˆ ; r/ − β.τ β.τ

sup

r∈R.dR /, τ ∈[ν1 , τU,1 ]

1 p ˜ ˆ s{β.r/, α0 , r, τ } − s{β.r/, α0 , r, τ } → 0: kb

.19/

This fact, combined with the Glivenko–Cantelli theorem on Wn .β, α, r/, implies that supr∈R.dR / Wn .α, ˆ r/ − ˜ ˆ ˜ α, ˜ w. ˆ r/ →p 0, where Wn .α, r/ = Wn {β.r/, α, r} and w.α, r/ = w{β.r/, α, r}. Similarly, we can use lemma ˜ α, ˜ 0 , r/ →p 0. 1 and the uniform consistency of μ{α.τ ˆ /}, τ ∈ .0, τU, 2 ], to show that supr∈R.dR / w. ˆ r/ − w.α It follows that p

˜ 0 , r/  sup [Wn .α, ˜ α, ˜ α, ˜ 0 , r/] → 0: ˆ r/ − w.α ˆ r/ − w. ˆ r/ + w. ˆ r/ − w.α sup Wn .α,

r∈R.dR /

r∈R.dR /

.20/

Quantile Regression

125

˜ 0 , ˆr/ = op .1/ from Wn .α, ˆ rˆ/ = op .1/. By regularity condition 4, part (b), Therefore, we can that see w.α ˜ 0 , r0 / = 0 and ˆr − r0   kr−1 w.α ˜ 0 , ˆr/ − w.α ˜ 0 , r0 /. The consistency of rˆ follows. w.α ˜ ; r/ with ˆ To show the uniform consistency of β.·/, we first need to show that the partial derivative of β.τ ˜ α0 , r, τ } = 0, we respect to r is bounded at r = r0 . Taking partial derivative @=@r on both sides of s{β.r/, obtain ˜ ; r/  @β.τ = −Bb .β0 , α0 , r0 , τ /−1 Br .β0 , α0 , r0 , τ /: .21/  r=r0 @r ˜ ; r/=@r is uniformly bounded for τ ∈ [ν1 , τU, 1 ], and thus By conditions 3, part (b), and 4, part (a), @β.τ sup

τ ∈[ν1 , τU, 1 ]

˜ ; r/ ˜ ; r0 / = op .1/ ˆ − β.τ β.τ

.22/

follows from the consistency of ˆr. On the basis of expressions (19) and (22), the inequality ˜ ; r0 /  β.τ ˜ ; r/ ˜ ; r/ ˜ ; r0 / ˆ ; r/ ˆ ; r/ ˆ / − β0 .τ / = β.τ ˆ − β.τ ˆ − β.τ ˆ + β.τ ˆ − β.τ β.τ ˆ / − β0 .τ / →p 0. This completes the proof of theorem 1. implies that supτ ∈[ν1 , τU, 1 ] β.τ

Appendix C: Proof of theorem 2 ¯ ¯ Lemma 2. For any sequence {β¯ n .τ /, τ ∈ [ν1 , τU, 1 ]}∞ n=1 ∈ B.dB / satisfying supτ ∈[ν1 , τU, 1 ] s.β n , α0 , r0 , τ / − s.β0 , α0 , r0 , τ / →p 0, we have sup

τ ∈[ν1 , τU, 1 ]

n1=2 {Sn .β¯ n , α0 , r0 , τ / − Sn .β0 , α0 , r0 , τ /} − {s.β¯ n , α0 , r0 , τ / − s.β0 , α0 , r0 , τ /} → 0: p

Similarly, for any sequence{α¯ n .τ /, τ ∈ .0, τU, 2 ]}∞ ¯ n .τ /} − μ{α0 .τ /} →p 0, n=1 satisfying supτ ∈.0, τU, 2 ] μ{α sup

τ ∈[ν1 , τU, 1 ]

p

n1=2 {Sn .β0 , α¯ n , r0 , τ / − Sn .β0 , α0 , r0 , τ /} − {s.β0 , α¯ n , r0 , τ / − s.β0 , α0 , r0 , τ /} → 0:

C.1. Proof of lemma 2

Define σB2 .β/ = supτ ∈[ν1 , τU, 1 ] var{Pi .β, α0 , r0 , τ / − Pi .β0 , α0 , r0 , τ /}. Following the arguments of Lai and Ying (1988) and Peng and Huang (2008), it is sufficient for the first part to hold if σB2 .β¯ n / →p 0. By condition 4, we can use Taylor series expansion on s.β¯ n , α0 , r0 , τ / around β0 .τ / to show that supτ ∈[ν1 , τU, 1 ] β¯ n .τ / − β0 .τ / →p 0. Furthermore, the conditional density functions fX .t|z/, fY .t|z/ and fT2 .t|z/ are bounded above uniformly in t and z, and KA .u, v, θ/ is Lipschitz continuous in v. These facts allow us to mimic lemma B.1 in Peng and Huang (2008) and to obtain σB2 .β¯ n / →p 0. Similarly, define σA2 .α/ = supτ ∈[ν1 , τU, 1 ] var{Pi .β0 , α, r0 , τ / − Pi .β0 , α0 , r0 , τ /}; a sufficient condition for the second part is σA2 .α¯ n / →p 0, which follows directly from lemma 1 and the Lipschitz continuity of KA .u, v, θ/ in v. This completes the proof of lemma 2.

ˆ 0 ), α0 , r0 , τ}  s(β0 , α0 , r0 , τ )] C.2. Weak convergence of n1=2 [s {β(r By simple manipulations, we can show that

ˆ 0 /, α, ˆ r0 , τ } − Sn .β0 , α0 , r0 , τ /] = n1=2 {Sn .β0 , α, ˆ r0 , τ / − Sn .β0 , α0 , r0 , τ /} n1=2 [Sn {β.r ˆ 0 /, α0 , r0 , τ } − Sn .β0 , α0 , r0 , τ /] + "S + n1=2 [Sn {β.r

.23/

ˆ 0 /, α, ˆ 0 /, α0 , r0 , τ } − Sn .β0 , α0 , r0 , τ /] can be where "S = n1=2 [Sn {β.r ˆ r0 , τ } − Sn .β0 , α, ˆ r0 , τ /] − n1=2 [Sn {β.r τ ∈[ν , τ ] shown to be op 1 U, 1 .1/ and asymptotically negligible. On the basis of the uniform convergence of ˆ 0 /, α0 , r0 , τ } to s{β.r0 /, α0 , r0 , τ } for τ ∈ [ν1 , τU, 1 ] in expression (18), and the uniform convergence s{β.r of μ{α.τ ˆ /} to μ{α0 .τ /} for τ ∈ .0, τU, 2 ] in Peng and Huang (2008), we can use lemma 2 to show that τ ∈[ν1 , τU, 1 ]

ˆ r0 , τ / − Sn .β0 , α0 , r0 , τ /} = n1=2 [s.β0 , α, ˆ r0 , τ / − s.β0 , α0 , r0 , τ /] + op n1=2 {Sn .β0 , α,

.1/,

τ ∈[ν1 , τU, 1 ]

ˆ 0 /, α0 , r0 , τ } − Sn .β0 , α0 , r0 , τ /] = n1=2 [s{β.r ˆ 0 /, α0 , r0 , τ } − s.β0 , α0 , r0 , τ /] + op n1=2 [Sn {β.r

.1/: .24/

126

R. Li and L. Peng

Here and in what follows, oSp .1/ means convergence to 0 in probability uniformly on set S. Similarly, we use oSp .n−1=2 / to denote uniform root n convergence to 0 in probability on set S. According to Peng and Huang (2008), n u∈.0, τU, 2 ] n1=2 [μ{α.u/} ˆ − μ{α0 .u/}] = −n1=2 φ{Zi RYi .α0 , u/} + op .1/, .25/ i=1

u

where RYi .α, u/ = I[Yi  exp{ZTi α.u/}, η = 1] − 0 I[Yi  exp{ZTi α.t/}] dH.t/, H.t/ = − log.1 − t/ for 0  t < 1, and φ is a continuous linear map that involves product integration. The detailed form of φ can be found in Appendix B of Peng and Huang (2008). By lemma 1 and the Lipschitz continuity of KA .u, v, θ/ in v, we can use Taylor series expansion to see that n1=2 {s.β0 , α, ˆ r0 , τ / − s.β0 , α0 , r0 , τ /}

= −n1=2 E Z I{log.Y/ > ZT β0 .τ /} × qA .τ , F2 [exp{ZT β0 .τ /}|Z], g.ZT r0 //  τU, 2   τU, 2 τ ∈[ν , τ ] × I{ZT β0 .τ /  ZT α.u/} ˆ du − I{ZT β0 .τ /  ZT α0 .u/} du + op 1 U, 1 .1/, 0

.26/

0

where qA .u, v, θ/ = @KA .u, v, θ/=@v. In what follows we shall show that the left-hand side of equation (26) has a uniform IID representation. We first consider the one-sample case, where {exp{α.u/}, ˆ u ∈ .0, τU, 2 ]} is asymptotically equivalent to the quantile function of the Nelson–Aalen estimator (Peng and Huang, 2008) of T2 ’s distribution, denoted NA NA u∈.0, τU, 2 ] −1=2 by Fˆ 2 .·/. Specifically, it can be shown that Fˆ 2 [exp{α.u/}] ˆ = u + op .n /. By this result and condition 5, part (a), we obtain  τU, 2  τU, 2 NA I{β0 .τ /  α.u/} ˆ du = I.Fˆ 2 [exp{β0 .τ /}]  u/ du + op .n−1=2 / 0

0 NA

= Fˆ 2 [exp{β0 .τ /}] + op .n−1=2 /: NA

Since n1=2 {Fˆ 2 .t/ − F2 .t/} is asymptotically Gaussian and can be written as n−1=2 Σni=1 πi .t/ + o.1/, where {πi .t/}ni=1 are IID and form a Donsker class with mean 0 (Kosorok, 2008), we can see that the right-hand side of equation (26) equals n τ ∈[ν , τ ] −n−1=2 qA {τ , F2 [exp{β0 .τ /}], g.r0 /}πi [exp{β0 .τ /}] Pr{log.Y/ > β0 .τ /} + op 1 U, 1 .1/: .27/ i=1

These arguments for the one-sample case can be easily extended to the K-sample case, which covers situations where all covariates are discrete. Specifically, suppose that Z = {z1 , z2 , : : : , zK } and Zi = zk if and only if observation i belongs to the kth group. We can show that the right-hand side of equation (26) equals − n−1=2

n K i=1 k=1

{I.Zi = zk /zk qA .τ , F2 [exp{zkT β0 .τ /}|zk ], g.zkT r0 //πi [exp{zkT β0 .τ /}] τ ∈[ν1 , τU, 1 ]

× Pr{log.Y/ > zkT β0 .τ /|zk }} + op

.1/:

.28/

By the boundedness of qA .·/, Z and Pr{log.Y/ > zkT β0 .τ /|zk }, we can show that −

K k=1

zk I.Zi = zk / qA .τ , F2 [exp{zkT β0 .τ /}|zk ], g.zkT r0 // Pr{log.Y/ > zkT β0 .τ /|zk }πk, i [exp{zkT β0 .τ /}]

is mean 0 and forms a Donsker class, as the Donsker property is preserved under Lipschitz transformations. Therefore, expression (28) converges weakly to a mean 0 Gaussian process. Next we consider the case when Z involves some continuous components. Define vA .a, τ / = E{VA .a, τ /}. By condition 2, part (c), JA .a, τ / ≡ @vA .a, τ /=@a exists, and each component of JA {α0 .u/, τ } Ba−1 {α0 .u/} is uniformly bounded for any u ∈ .0, τU, 2 ] and τ ∈ [ν1 , τU, 1 ]. Therefore, a Taylor series expansion, coupled with expressions (25) and (26), shows that  τU, 2 τ ∈[ν , τ ] ˆ r0 , τ / − s.β0 , α0 , r0 , τ /} = n1=2 [vA {α.u/, ˆ τ } − vA {α0 .u/, τ }] du + op 1 U, 1 .1/ n1=2 {s.β0 , α, 0

Quantile Regression = −n−1=2

n i=1



τU, 2

0

τ ∈[ν1 , τU, 1 ]

JA {α0 .u/, τ }Ba−1 {α0 .u/} φ{Zi RYi .α0 , u/} du + op

.1/:

127 .29/

ˆ r0 , τ / − s.β0 , α0 , r0 , τ /} is asymptotically It is seen from expressions (28) and (29) that n1=2 {s.β0 , α, equivalent to a sum of IID influence functions, no matter whether Z includes continuous covariates or not. In what follows we unify expressions (28) and (29) by writing n1=2 {s.β0 , α, ˆ r0 , τ / − s.β0 , α0 , r0 , τ /} = n−1=2

n i=1

τ ∈[ν1 , τU, 1 ]

ρi .β0 , α0 , r0 , τ / + op

.1/:

.30/

τ ∈[ν , τ ] ˆ 0 /, α, ˆ r0 , τ } = op 1 U, 1 .n−1=2 /, equations (23) and (24) imply that Noting that Sn {β.r

ˆ 0 /, α0 , r0 , τ } − s.β0 , α0 , r0 , τ /] n1=2 [s{β.r τ ∈[ν , τ ] = −n1=2 Sn .β0 , α0 , r0 , τ / − n1=2 {s.β0 , α, ˆ r0 , τ / − s.β0 , α0 , r0 , τ /} + op 1 U, 1 .1/: By equation (30), we then obtain ˆ 0 /, α0 , r0 , τ } − s.β0 , α0 , r0 , τ /] = n−1=2 n1=2 [s{β.r

n i=1

[ν , τU, 1 ]

χi .β0 , α0 , r0 , τ / + op 1

.1/,

.31/

where χi .β0 , α0 , r0 , τ / = −Zi Pi .β0 , α0 , r0 , τ / − ρi .β0 , α0 , r0 , τ /. This uniform IID representation implies ˆ 0 /, α0 , r0 , τ } − s.β0 , α0 , r0 , τ /] to a mean 0 Gaussian process by an the weak convergence of n1=2 [s{β.r application of the Donsker theorem.

ˆ 0 ), α, C.3. Weak convergence of n1=2 Wn {β(r ˆ r0}

Using similar arguments to those in equation (23) and lemma 2, we have ˆ 0 /, α, n1=2 [Wn {β.r ˆ r0 } − Wn .β0 , α0 , r0 /] 1=2 ˆ 0 /, α0 , r0 } − Wn .β0 , α0 , r0 /] + "W = n {Wn .β0 , α, ˆ r0 / − Wn .β0 , α0 , r0 /} + n1=2 [Wn {β.r 1=2 1=2 ˆ = n {w.β0 , α, ˆ r0 / − w.β0 , α0 , r0 /} + n [w{β.r0 /, α0 , r0 } − w.β0 , α0 , r0 /] + op .1/ + "W ,

.32/

ˆ 0 /, α, ˆ 0 /, α0 , r0 } − Wn .β0 , α0 , r0 /] can also be ˆ r0 } − Wn .β0 , α, ˆ r0 /] − n1=2 [Wn {β.r where "W = n1=2 [Wn {β.r shown to be op .1/. It is easy to see that n1=2 Wn .β0 , α0 , r0 / converges weakly to a Gaussian process by the Donsker property. Moreover, we can follow the arguments for expressions (28) and (29) to show that n n1=2 {w.β0 , α, ˆ r0 / − w.β0 , α0 , r0 /} = n−1=2 κi .β0 , α0 , r0 / + op .1/: .33/ i=1

The IID influence functions κi .β0 , α0 , r0 / take the form  τb  K [z¯k I.Zi = zk / I{zkT β0 .τ / < log.t/}Pr.Y > t|zk / qB {τ , F2 .t|zk /, g.zkT r0 /} × πk, i .t/ dτ dt], − k=1

τa

t∈.0, ∞/

.34/ in the K-sample case, and equals  τU, 2 JB {α0 .u/}Ba−1 {α0 .u/} φ{Zi RYi .α0 , u/} du −

.35/

0

when Z¯ contains continuous components, where JB .a/ ≡ dvB .a/=da, and vB .a/ = E{VB .a/}. It then follows from the central limit theory and condition 2 that n1=2 {w.β0 , α, ˆ r0 / − w.β0 , α0 , r0 /} converges to a mean 0 normal distribution. We also show by condition 4, part (c), and a Taylor series expansion that  τb ˆ 0 /, α0 , r0 } − w.β0 , α0 , r0 /] = .Lb {β0 .τ /, α0 , r0 , τ }Bb {β0 .τ /, α0 , r0 , τ }−1 n1=2 [w{β.r τa

ˆ 0 /, α0 , r0 , τ } − s.β0 , α0 , r0 , τ /]/ dτ + op .1/: × n1=2 [s{β.r Combining equations (31), (32), (33) and (36), we then have

.36/

128

R. Li and L. Peng ˆ 0 /, α, n1=2 Wn {β.r ˆ r0 } = n−1=2

n

ιi .β0 , α0 , r0 / + op .1/,

.37/

i=1

τ where ιi .β0 , α0 , r0 / = Z¯ i Qi .β0 , α0 , r0 / + κi .β0 , α0 , r0 / + τab Lb {β0 .τ /, α0 , r0 , τ }Bb {β0 .τ /, α0 , r0 , τ }−1 × ˆ 0 /, α, χi .β0 , α0 , r0 , τ / dτ . The weak convergence of n1=2 Wn {β.r ˆ r0 } follows from the Donsker theorem.

ˆ )  β0 (τ )} C.4. Asymptotic normality of n1=2 (rˆ  r0 ) and weak convergence of n1=2 {β(τ ˜ r/, α0 , ˆr, τ } = s.β0 , α0 , r0 , τ / = 0, we can obtain the inequality Using s{β.ˆ

ˆ 0 /, α0 , r0 , τ }  [s{β. ˆ r/, ˆ r/, ˆ r/, ˆ α0 , r0 , τ } − s{β. ˆ α0 , r0 , τ } − s{β.r ˆ α0 , r, ˆ τ } s{β. ˜ r/, ˆ ˆ α0 , r, ˆ τ } − s{β. ˆ α0 , r, ˆ τ } + s{β.r/, ˆ 0 /, α0 , r0 , τ } − s.β0 , α0 , r0 , τ /]: + s{β.r ˆ r/, ˆ r/, ˆ α0 , r0 , τ } − s{β. ˆ α0 , r, ˆ τ } →p 0. By condition 3, part (b), rˆ →p r0 implies that supτ ∈[ν1 , τU, 1 ] s{β. ˆ r/, ˆ α0 , r0 , τ } − This, coupled with the inequality above and expression (18), implies that supτ ∈[ν1 , τU, 1 ] s{β. ˆ ; r/ ˆ ; r0 / →p 0 after using condition 4, ˆ 0 /, α0 , r0 , τ } →p 0. Hence, we have supz∈Z, τ ∈[ν , τ ] β.τ ˆ − β.τ s{β.r 1 U,1 part (a), and Taylor series expansions. Mimicking the arguments in lemma 2, we can show with condition 3, part (b), that sup

τ ∈[ν1 , τU,1 ]

ˆ r/, ˆ 0 /, α0 , r0 , τ } − s{β. ˆ r/, ˆ 0 /, α0 , r0 , τ } = op .n−1=2 /, ˆ α0 , r, ˆ τ } − Sn {β.r ˆ α0 , r, ˆ τ } + s{β.r Sn {β. .38/

and, similarly, sup

τ ∈[ν1 , τU, 1 ]

ˆ r/, ˆ 0 /, α, ˆ r/, ˆ 0 /, α0 , r0 , τ } = op .n−1=2 /: ˆ α, ˆ τ } − Sn {β.r ˆ α0 , r, ˆ τ } + Sn {β.r Sn {β. ˆ r, ˆ r0 , τ } − Sn {β.

ˆ ˆ r/, ˆ 0 /, ˆ α0 , r, ˆ τ } − Sn {β.r α, ˆ r, τ } = o.n−1=2 /, and thus supτ ∈[ν1 , τU, 1 ] Sn {β. Note that supτ ∈[ν1 , τU, 1 ] Sn {β.r/, α0 , r0 , τ } = op .n−1=2 /. This, combined with equation (38), further implies that sup

τ ∈[ν1 , τU, 1 ]

ˆ r/, ˆ 0 /, α0 , r0 , τ } = op .n−1=2 /: ˆ α0 , r, ˆ τ } − s{β.r s{β.

ˆ ; r0 / to β0 .τ / in τ ∈ [ν1 , τU, 1 ], we can use Taylor series expansion to By the uniform convergence of β.τ obtain sup

τ ∈[ν1 , τU, 1 ]

ˆ ; ˆr/ − β.τ ˆ ; r0 / + Bb .β0 , α0 , r0 , τ /−1 Br .β0 , α0 , r0 , τ /.rˆ − r0 / = op .n−1=2 ∨ ˆr − r0 /: β.τ

.39/

ˆ ; r0 / and equation (39), we see after a Taylor series By the consistency of ˆr, uniform convergence of β.τ expansion that ˆ 0 /, α0 , r0 } ˆ r/, α0 , ˆr} − w{β.r w{β.ˆ  = Lr .β0 , α0 , r0 /.ˆr − r0 / +

τb

τa

ˆ ; ˆr/ − β.τ ˆ ; r0 /} dτ + op .n−1=2 ∨ ˆr − r0 / Lb .β0 , α0 , r0 , τ /{β.τ

.40/ = Dr .β0 , α0 , r0 / × .ˆr − r0 / + op .n−1=2 ∨ ˆr − r0 /,  τb where Dr .β0 , α0 , r0 / = Lr .β0 , α0 , r0 / − τa Lb .β0 , α0 , r0 , τ / Bb .β0 , α0 , r0 , τ /−1 × Br .β0 , α0 , r0 , τ / dτ . On the ˜ basis of equation (21), Dr .β0 , α0 , r0 / equals @w{β.r/, α0 , r}=@r|r=r0 . ˆ r/, ˆ 0 /, α, ˆ α, ˆ − Wn {β.r Following the arguments for lemma 2, we can also show that n1=2 [Wn {β. ˆ r} ˆ r0 }] ˆ r/, α, ˆ 0 /, α, ˆ r/, α0 , ˆr} − can be approximated by n1=2 [w{β.ˆ ˆ ˆr} − w{β.r ˆ r0 }], and furthermore by n1=2 [w{β.ˆ ˆ 0 /, α0 , r0 }]. This fact, coupled with Wn {β.ˆ ˆ r/, α, w{β.r ˆ ˆr} = o.n−1=2 / and equation (40), gives ˆ 0 /, α, ˆ r0 } = Dr × n1=2 .ˆr − r0 / + op .1 ∨ n1=2 rˆ − r0 /, −n1=2 Wn {β.r which, combined with equation (37), further implies that

Quantile Regression n1=2 .ˆr − r0 / = −n−1=2

n i=1

D−1 r ιi .β 0 , α0 , r0 / + op .1/:

129 .41/

−1 1=2 .rˆ − r0 / converges to a normal Here, D−1 r is a shorthand notation for Dr .β 0 , α0 , r0 /. It then follows that n distribution with mean 0. Finally, by combining equations (31) and (39), we can see that

ˆ / − β0 .τ /} = n1=2 {β.τ ˆ ; r/ ˆ ; r0 /} + n1=2 {β.τ ˆ ; r0 / − β0 .τ /} ˆ − β.τ n1=2 {β.τ n [Bb .β0 , α0 , r0 , τ /−1 Br .β0 , α0 , r0 , τ /D−1 = n−1=2 r ιi .β 0 , α0 , r0 / i=1

+ Bb .β0 , α0 , r0 , τ /−1 χi .β0 , α0 , r0 , τ /] + op .1/: ˆ / − β0 .τ /} converges weakly to a Gaussian process with mean 0 for τ ∈ [ν1 , τU, 1 ]. This implies that n1=2 {β.τ This completes the proof of theorem 2.

Appendix D: Convergence criteria for the algorithm proposed The algorithm that was proposed in Section 2.3 involves iterative procedures which are stopped by certain prespecified convergence criteria. Below, we elaborate the convergence criteria that were adopted in our numerical studies and are also expected to function well in other typical settings. First, to assess the convergence status, the distance between two estimates for β0 .·/ is defined as τ Db .β1 , β2 / = maxi=1,:::, p+1 ν1U, 1 β2.i/ .τ / − β1.i/ .τ / dτ . The distance between two estimates for r0 is defined as Dr .r1 , r2 / = |K.r1 / − K.r2 /|, where K.r/ denotes Kendall’s τ -coefficient under the assumed model (2) with r0 = r. We define the distance measure for r on the Kendall’s τ -scale to have a unified criterion for different choices of copula functions. [k, m] For the iterations within steps 3(a)–3(c), the convergence of the sequence {βˆ .τ / : m = 0, 1, : : :} is determined via the following procedure. (a) Set a maximum number of iterations as max.iter0 , and choose convergence tolerance levels as tolb, 1 and tolb, 2 . [k, q] [k, q−1] (b) In the qth iteration (q < max.iter0 ), we regard the sequence as converged if Db .βˆ , βˆ /  tolb, 1 [k, q] [k, q−2] [k, q] [k] ˆ ˆ ˆ ˆ or D . β , β /  tol . The final estimate β.τ ; r / is set as β .τ / in the former case and as b b, 1 [k, q] [k, q−1] {βˆ .τ / + βˆ .τ /}=2 in the latter case. [k, q] [k, q−1] [k, q] (c) When q = max.iter0 , the sequence is concluded as convergent if Db .βˆ , βˆ /  tolb, 2 or Db .βˆ , [k, q−2] [k] ˆ ; r / is set in the same βˆ /  tolb, 2 , and as non-convergent otherwise. In the convergent case, β.τ way as in the previous step. [k] The convergence criteria for the iterations within steps 1–4 need to concern both {βˆ .τ / : k = 0, 2, : : :} [k] [k] ˆ and[k,{m]rˆ .τ / : k = 1, 2, : : :}. We assess the convergence of {β .τ / : k = 1, 2, : : :} in the same way as that for {βˆ .τ / : m = 1, 2, : : :}, except that we may adopt a different maximum number of iterations, max.iter1 . [k] We stop the algorithm at the qth iteration (q  max.iter1 ) if the sequence {βˆ .τ / : k = 1, 2, : : :} reaches convergence and Dr .r[q] , r[q−1] /  tolk , where tolk is a prespecified tolerance level. In the numerical studies that were reported in Section 3 and 4, we choose tolb, 1 = 5 × 10−4 and tolb:2 = tolk = 0:005, and we set max.iter0 = 10 and max.iter1 = 20.

References Chen, Y.-H. (2010) Semiparametric marginal regression analysis for dependent competing risks under an assumed copula. J. R. Statist. Soc. B, 72, 235–251. Chen, Y. (2011) Maximum likelihood analysis of semicompeting risks data with semiparametric regression models. Liftim. Data Anal., 1–22. Clayton, D. (1978) A model for association in bivariate life tables and its application in epidemiological studies of familial tendency in chronic disease incidence. Biometrika, 65, 141–151. Copelan, E., Biggs, J., Thompson, J., Crilley, P., Szer, J., Klein, J., Kapoor, N., Avalos, B., Cunningham, I. and Atkinson, K. (1991) Treatment for acute myelocytic leukemia with allogeneic bone marrow transplantation following preparation with bucy2. Blood, 78, 838–843. Ding, A., Shi, G., Wang, W. and Hsieh, J. (2009) Marginal regression analysis for semi-competing risks data under dependent censoring. Scand. J. Statist., 36, 481–500.

130

R. Li and L. Peng

Efron, B. (1979) Bootstrap methods: another look at the jackknife. Ann. Statist., 7, 1–26. Fine, J., Jiang, H. and Chappell, R. (2001) On semi-competing risks data. Biometrika, 88, 907–919. Fygenson, M. and Ritov, Y. (1994) Monotone estimating equations for censored data. Ann. Statist., 22, 732–746. Genest, C. (1987) Frank’s family of bivariate distributions. Biometrika, 74, 549–555. Hsieh, J.-J., Wang, W. and Ding, A. A. (2008) Regression analysis based on semicompeting risks data. J. R. Statist. Soc. B, 70, 3–20. Huang, Y. (2010) Quantile calculus and censored regression. Ann. Statist., 38, 1607–1637. Jiang, H., Chappell, R. and Fine, J. (2003) Estimating the distribution of nonterminal event time in the presence of mortality or informative dropout. Contr. Clin. Trials, 24, 135–146. Kendall, M. and Gibbons, J. (1962) Rank Correlation Methods. London: Griffin. Klein, J. and Moeschberger, M. (2005) Survival Analysis: Techniques for Censored and Truncated Data. New York: Springer. Koenker, R. (2005) Quantile Regression. Cambridge: Cambridge University Press. Koenker, R. and Bassett, G. (1978) Regression quantiles. Econometrica, 46, 33–50. Kosorok, M. (2008) Introduction to Empirical Processes and Semiparametric Inference. New York: Springer. Lai, T. and Ying, Z. (1988) Stochastic integrals of empirical-type processes with applications to censored regression. J. Multiv. Anal., 27, 334–358. Lin, D., Robins, J. and Wei, L. (1996) Comparing two failure time distributions in the presence of dependent censoring. Biometrika, 83, 381–393. Martinussen, T. and Scheike, T. H. (2006) Dynamic Regression Models for Survival Data. New York: Springer. Nelsen, R. (2006) An Introduction to Copulas. New York: Springer US. Peng, L. and Fine, J. (2006) Rank estimation of accelerated lifetime models with dependent censoring. J. Am. Statist. Ass., 101, 1085–1093. Peng, L. and Fine, J. (2007a) Nonparametric quantile inference with competing risks data. Biometrika, 94, 735– 744. Peng, L. and Fine, J. P. (2007b) Regression modeling of semi-competing risks data. Biometrics, 63, 96–108. Peng, L. and Fine, J. (2009) Competing risks quantile regression. J. Am. Statist. Ass., 104, 1440–1453. Peng, L. and Huang, Y. (2008) Survival analysis with quantile regression models. J. Am. Statist. Ass., 103, 637–649. Portnoy, S. (2003) Censored regression quantiles. J. Am. Statist. Ass., 98, 1001–1013. Powell, J. (1986) Censored regression quantiles. J. Econmetr., 32, 143–155. Tsiatis, A. (1975) A nonidentifiability aspect of the problem of competing risks. Proc. Natn. Acad. Sci. USA, 72, 20–22. van der Vaart, A. and Wellner, J. (1996) Weak Convergence and Empirical Processes: with Applications to Statistics. New York: Springer. Wang, H. and Wang, L. (2009) Locally weighted censored quantile regression. J. Am. Statist. Ass., 104, 1117–1128. Ying, Z., Jung, S. and Wei, L. (1995) Survival analysis with median regression models. J. Am. Statist. Ass., 90, 178–184. Zheng, M. and Klein, J. (1995) Estimates of marginal survival for dependent competing risks based on an assumed copula. Biometrika, 82, 127–138.

Supporting information Additional ‘supporting information’ is available in the on-line version of this article: ‘Supplement to “Quantile regression adjusting for dependent censoring from semi-competing risks”’.

Quantile Regression Adjusting for Dependent Censoring from Semi-Competing Risks.

In this work, we study quantile regression when the response is an event time subject to potentially dependent censoring. We consider the semi-competi...
1MB Sizes 2 Downloads 6 Views