Lifetime Data Anal DOI 10.1007/s10985-013-9289-x
Conditional quantile residual lifetime models for right censored data Cunjie Lin · Li Zhang · Yong Zhou
Received: 24 April 2013 / Accepted: 30 December 2013 © Springer Science+Business Media New York 2014
Abstract The quantile residual lifetime function is a more comprehensive quantitative measure of residual lifetimes than the mean residual lifetime function. It also includes the median residual life function as a special case, which is less restrictive than the model based on the mean residual lifetime. In this study, we propose a semiparametric estimator of the conditional quantile residual lifetime under different covariate effects at a specified time point, with the help of auxiliary models. Two kinds of test statistics are proposed to compare two quantile residual lifetimes at fixed time points. Asymptotic properties are established and a revised bootstrap method is proposed to estimate the asymptotic variance of the estimator. Simulation studies are reported to assess the finite sample properties of the proposed estimator and the performance of the test statistics in terms of type I error probabilities and powers at fixed time points. We also compare the proposed method with the method of Jung et al. (Biometrics 65:1203–1212, 2009) through simulation studies. The proposed methods are applied to HIV data and some interesting results are presented.
Keywords Estimating equation · Proportional hazards model · Quantile residual lifetime · Right censoring · Two-sample test statistic
C. Lin (B) · Y. Zhou
Academy of Mathematics and Systems Science, Chinese Academy of Sciences, No. 55 Zhongguancun East Road, Haidian District, Beijing 100190, China
e-mail: [email protected]
Y. Zhou
e-mail: [email protected]
L. Zhang · Y. Zhou
School of Statistics and Management, Shanghai University of Finance and Economics, Shanghai 200433, China
e-mail: [email protected]
C. Lin et al.
1 Introduction
In survival analysis, medical research, actuarial science and reliability analysis, the residual lifetime is often regarded as a crucial index for investigators making decisions. For example, in clinical trials, the remaining lifetime of patients is of great interest to both doctors and patients, and the concept of the remaining lifetime may provide them with a straightforward interpretation of the treatment effect. In reliability analysis, one often wishes to know the residual lifetime of an individual or a machine given the associated factors or environmental background, such as when treating a failed item or replacing it with a new one. Many existing methods, such as an adjusted version of the Kaplan–Meier estimator or the Cox proportional hazards model (Cox 1972, 1975), may be adopted to make indirect statistical inference about the residual lifetime. The Cox proportional hazards model and the Kaplan–Meier estimator have long been studied and used to make inferences based on time-to-event data. However, they are often cumbersome and not straightforward, especially when the residual lifetime needs to be evaluated in the middle of an observation period. Moreover, one drawback of the hazard function is its interpretation as an "instantaneous rate of failure", which is conceptually difficult to understand and may not be a relevant metric for long-term reliability. Furthermore, it is difficult to estimate the lifetime by the Kaplan–Meier method when the residual lifetime of an individual depends on covariates or the environmental background. Compared to the hazard function and the survival function, the mean residual life function (Chiang 1960) is a quantitative measure that describes the characteristics of the residual lifetime more directly. The mean residual life function is of interest in many fields, such as reliability research and medical research.
For example, in a chronic disease such as breast cancer, it is more informative than the hazard function to tell the patient how long she can expect to survive or live without disease recurrence, given her current situation. There is a large statistical literature on modeling and estimating the mean residual lifetime (Chen et al. 2005; Chen and Cheng 2006). However, the mean residual lifetime is not preferred when the underlying distribution is highly skewed or heavy tailed. One alternative is the median residual life function (Schmittlein and Morrison 1981), which is less restrictive than the model based on the mean residual lifetime: many survival functions correspond to the same median residual life function, whereas the mean residual life function has a one-to-one correspondence with the survival function. There are also many references on the properties of the median residual lifetime (Gupta and Langford 1984; Jeong et al. 2008; Gelfand and Kottas 2003). In this paper, we consider the quantile residual lifetime, which includes the median residual life function as a special case and is developed as an alternative to the mean residual life function. The quantile residual lifetime provides more complete and effective information, especially when the distribution of the residual lifetime is asymmetric and skewed, when it is heavy tailed or when the data contain outliers. Many authors have studied the quantile residual life function, including Jeong et al. (2008), Jung et al. (2009), Ma and Yin (2010) and Jeong and Fine (2009). Most of the quantile residual life models considered in the current literature focus on modeling
and estimation at a single fixed $t_0$. Intuitively, the residual lifetime can differ for different covariates or backgrounds, and it can be extended by effective measures. This means that the quantile residual lifetime depends on treatment effect covariates. Our research is initially motivated by a clinical study on AIDS patients. In this study, a large HMO wishes to evaluate the survival time of its HIV+ members, and the patients are also interested in knowing how long they can still survive. In this case, the residual lifetime becomes the center of interest. Moreover, the patients would be more interested in knowing the remaining lifetime that they will reach with 90 % probability than in knowing the average residual time. Thus, the quantile residual lifetime is a more informative choice. Further, some studies show that the survival time after a confirmed diagnosis of HIV is associated with age and prior drug use. The survival time and the censoring time may therefore depend on common covariates, so that they are not independent, and in such cases the Kaplan–Meier estimator cannot be used to construct the estimating equation for the quantile residual lifetime function. Jung et al. (2009) proposed a time-specific log-linear regression method for the quantile residual lifetime, which is associated with selected covariates under right censoring. However, the linear regression model does not always hold and can be restrictive in statistical analysis; most popular survival models do not satisfy this model assumption. Moreover, Jung et al. (2009) focus on estimating the regression coefficients of the linear regression model for quantile residual lifetimes and on testing null hypotheses about these coefficients. The interpretation of regression coefficients is not as straightforward as that of the quantile residual lifetime itself, and the method of Jung et al. (2009) does not directly yield an estimate of the quantile residual lifetime.
In this paper, we propose the conditional residual lifetime model to solve such problems and take the patient's characteristics into account. The remainder of the paper is organized as follows. In Sect. 2, we propose the conditional residual lifetime model and consider the well-known proportional hazards model (Cox 1972) as the auxiliary model. The estimate of the residual lifetime function, which depends on the covariates, is given, and the large sample properties are also discussed in this section. In Sect. 3, we propose two kinds of test statistics to compare two quantile residual lifetimes for two given samples. An adjusted method to estimate the variance of the estimators is provided in Sect. 4. The simulation studies to assess the performance of the estimators and test statistics, and a real data analysis, are presented in Sect. 5. In this section, we also conduct simulations to compare our method with the method of Jung et al. (2009). The proofs of the theorems are provided in the Appendix.

2 Estimating procedure and main results

2.1 Model and estimating method
We are interested in the conditional $\alpha$th quantile residual life function at a specific time point $t_0$ given the covariate $z_0$, which is defined as
$$\theta_\alpha(t_0; z_0) = \operatorname{quantile}(T - t_0 \mid T \ge t_0, z_0), \tag{2.1}$$
which is the conditional $\alpha$th quantile of the remaining lifetimes among survivors beyond time $t_0$. The function (2.1) satisfies
$$P\{T - t_0 \ge \theta_\alpha(t_0; z_0) \mid T \ge t_0, z_0\} = \alpha,$$
which implies that
$$P\{T - t_0 \ge \theta_\alpha(t_0; z_0) \mid z_0\} = \alpha P(T \ge t_0 \mid z_0).$$
Note that $\theta_\alpha(t_0; z_0)$ does not uniquely determine $S(t \mid z_0) = P(T \ge t \mid z_0)$ (Gupta and Langford 1984), but in practice we can model $S(t \mid z_0)$ first and then infer $\theta_\alpha$ at a fixed time point $t_0$ given $z_0$ (Gelfand and Kottas 2003). The conditional survival function of the residual lifetime for a patient who has survived beyond time $t_0$, i.e., of $(T - t_0 \mid T \ge t_0; z_0)$, is given by $S(t \mid t_0; z_0) = S(t + t_0 \mid z_0)/S(t_0 \mid z_0)$ for $t_0 > a_0$, where $a_0$ is the lower bound of the support of $S(t \mid z_0)$. Then we have
$$S(t + \theta_\alpha(t; z_0) \mid z_0) = \alpha S(t \mid z_0) \tag{2.2}$$
for any $t > a_0$. Suppose that $\theta_\alpha(t; z_0)$ is the unique solution of Eq. (2.2). Then we can estimate the conditional $\alpha$th quantile residual life function at $t_0$ by solving the estimating equation
$$\hat U(\theta_\alpha(t_0; z_0)) = \hat S(t_0 + \theta_\alpha(t_0; z_0) \mid z_0) - \alpha \hat S(t_0 \mid z_0) = 0, \tag{2.3}$$
where $\hat S(\cdot \mid z_0)$ is a consistent estimator of $S(\cdot \mid z_0)$. Note that if there are several solutions to Eq. (2.2), we can define $\theta_\alpha(t_0; z_0) = \inf\{\theta : S(t_0 + \theta \mid z_0) \le \alpha S(t_0 \mid z_0)\}$, that is, $\theta_\alpha(t_0; z_0) = S^{-1}(\alpha S(t_0 \mid z_0) \mid z_0) - t_0$, where $S^{-1}(u \mid z_0) = \inf\{t : S(t \mid z_0) \le u\}$. Accordingly, we can define the solution of (2.3) as $\hat\theta_\alpha(t_0; z_0) = \hat S^{-1}(\alpha \hat S(t_0 \mid z_0) \mid z_0) - t_0$.
The important problem is how to estimate $S(t \mid z_0)$. Without covariates, a natural estimator of $S(t)$ is the Kaplan–Meier estimator, but it is less desirable when extra covariate information is available. In this paper, we give another feasible estimator which can take full advantage of the covariates. Assume that the conditional $\alpha$th quantile residual life depends on a vector of treatment effect covariates $Z$, which may include time-dependent categorical covariates. We consider the Cox proportional hazards model (Cox 1972), which specifies that the cumulative hazard function of $T$ conditional on $Z$ takes the form
$$\Lambda(t \mid Z) = \int_0^t \exp\{\beta^T Z(s)\}\, d\Lambda_0(s), \tag{2.4}$$
where $\beta$ is a $p$-vector of unknown regression parameters, $\Lambda_0(t)$ is an arbitrary cumulative baseline hazard function and $Z(\cdot)$ is a $p$-vector of possibly time-varying covariates. Denote $X_i = \min(T_i, C_i)$, where $C_i$ is the censoring time, let $\Delta_i = I(T_i \le C_i)$ be the censoring indicator and let $Y_i(t) = I(X_i \ge t)$. Assume that $\{T_i, C_i, Z_i(\cdot)\}$, $i = 1, \ldots, n$, are i.i.d. and that $T_i$ and $C_i$ are independent conditional on $Z_i(\cdot)$, which is supposed to be bounded. Hence, given a specified $z_0$, the conditional survival function $S(t \mid z_0)$ can be estimated by
$$\hat S(t \mid z_0) = \exp\{-\hat\Lambda(t \mid z_0)\} = \exp\left\{-\int_0^t \exp\{\hat\beta^T z_0(s)\}\, d\hat\Lambda_0(s)\right\}, \tag{2.5}$$
where $\hat\beta$ is the partial likelihood estimator (Cox 1972) and $\hat\Lambda_0(t)$ is the Breslow (1972) estimator of $\Lambda_0(t)$. Specifically, $\hat\beta$ is the maximizer of
$$\prod_{i=1}^{n} \left\{ \frac{e^{\beta^T Z_i(X_i)}}{\sum_{j=1}^{n} Y_j(X_i)\, e^{\beta^T Z_j(X_i)}} \right\}^{\Delta_i}$$
and
$$\hat\Lambda_0(t) = \sum_{i=1}^{n} \frac{I(X_i \le t)\, \Delta_i}{\sum_{j=1}^{n} Y_j(X_i)\, e^{\hat\beta^T Z_j(X_i)}}.$$
Therefore, the estimator $\hat\theta_\alpha(t_0; z_0)$ can be obtained by solving the equation
$$\hat U(\theta_\alpha(t_0; z_0)) = \hat S(t_0 + \theta_\alpha(t_0; z_0) \mid z_0) - \alpha \hat S(t_0 \mid z_0) = 0. \tag{2.6}$$
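The estimation steps of Sect. 2.1 can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes time-fixed covariates, no tied observation times, and a coefficient vector `beta` that has already been obtained from some partial-likelihood routine. The function names `breslow_cumhaz`, `surv` and `quantile_residual_life` are hypothetical.

```python
import numpy as np

def breslow_cumhaz(x, delta, z, beta):
    """Breslow-type estimate of the cumulative baseline hazard.

    x: (n,) observed times, delta: (n,) event indicators, z: (n, p) covariates,
    beta: (p,) coefficient estimate (assumed given). Assumes no tied times.
    Returns the sorted times and the estimate evaluated at those times.
    """
    order = np.argsort(x)
    x, delta, z = x[order], delta[order], z[order]
    risk = np.exp(z @ beta)
    # Risk-set sum at X_i: sum of exp(beta' Z_j) over subjects with X_j >= X_i.
    denom = np.cumsum(risk[::-1])[::-1]
    return x, np.cumsum(delta / denom)

def surv(t, times, cumhaz, z0, beta):
    """Estimated S(t | z0) = exp(-Lambda0_hat(t) * exp(beta' z0))."""
    k = np.searchsorted(times, t, side="right")  # number of observed times <= t
    lam0 = cumhaz[k - 1] if k > 0 else 0.0
    return np.exp(-lam0 * np.exp(z0 @ beta))

def quantile_residual_life(t0, alpha, times, cumhaz, z0, beta):
    """Generalized-inverse solution: inf{t : S_hat(t|z0) <= alpha * S_hat(t0|z0)} - t0."""
    target = alpha * surv(t0, times, cumhaz, z0, beta)
    s = np.exp(-cumhaz * np.exp(z0 @ beta))  # S_hat at each observed time
    hit = np.nonzero(s <= target)[0]
    if hit.size == 0:
        return np.nan  # S_hat never falls to the target, so the quantile is not estimable
    return times[hit[0]] - t0
```

Because the estimated survival function is a step function, the sketch returns the first observed time at which it drops to the target level, which corresponds to the generalized inverse used to define the solution when the estimating equation has no exact root.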
2.2 Asymptotic properties
By the strong consistency of $\hat\beta$ and $\hat\Lambda_0(\cdot)$, it is easy to get the consistency of $\hat\theta_\alpha(t_0; z_0)$, i.e., $\hat\theta_\alpha(t_0; z_0) \to \theta_\alpha(t_0; z_0)$ in probability. Besides, applying the asymptotic results of Andersen and Gill (1982), we can see that the process $\sqrt{n}\{\hat\Lambda(\cdot) - \Lambda(\cdot)\}$ converges weakly to a zero-mean Gaussian process. By the functional delta-method and some standard counting process techniques, we can get the asymptotic normality of $\hat\theta_\alpha(t_0; z_0)$. For the sake of simplicity, we denote
$$S^{(r)}(\beta, t) = \frac{1}{n} \sum_{i=1}^{n} Y_i(t) \exp(\beta^T Z_i(t))\, Z_i(t)^{\otimes r}, \qquad s^{(r)}(\beta, t) = E\{S^{(r)}(\beta, t)\},$$
$$\bar Z(\beta, t) = \frac{S^{(1)}(\beta, t)}{S^{(0)}(\beta, t)}, \qquad \bar z(\beta, t) = \frac{s^{(1)}(\beta, t)}{s^{(0)}(\beta, t)},$$
for $r = 0, 1, 2$ (with $a^{\otimes 0} = 1$, $a^{\otimes 1} = a$ and $a^{\otimes 2} = a a^T$), and
$$h(t) = \int_0^t \exp(\beta_0^T z_0(u)) \{z_0(u) - \bar z(\beta_0, u)\}\, d\Lambda_0(u).$$
Then we have the following theorems of consistency and asymptotic normality.
Theorem 1 Assume the conditions in the Appendix hold and $\theta_\alpha(t_0; z_0)$ is the unique solution of Eq. (2.2). Then, as $n \to \infty$, we have
$$\hat\theta_\alpha(t_0; z_0) \xrightarrow{P} \theta_\alpha(t_0; z_0)$$
for given covariates $z_0$ and a specified point $t_0$.
Theorem 2 Assume the conditions in the Appendix hold. Then, for given covariates $z_0$ and a specified point $t_0$, as $n \to \infty$, we have
$$\sqrt{n}\,(\hat\theta_\alpha(t_0; z_0) - \theta_\alpha(t_0; z_0)) \xrightarrow{L} N(0, \sigma^2),$$
where $\xrightarrow{L}$ denotes convergence in distribution and
$$\sigma^2 = \frac{\alpha^2 S(t_0 \mid z_0)^2\, \Gamma}{f(t_0 + \theta_\alpha \mid z_0)^2},$$
in which $f(\cdot \mid z_0)$ denotes the density function of $T$ conditional on the covariates $z_0$ and
$$\Sigma = \int_0^\infty \left\{ \frac{s^{(2)}(\beta_0, t)}{s^{(0)}(\beta_0, t)} - \bar z(\beta_0, t)^{\otimes 2} \right\} s^{(0)}(\beta_0, t)\, d\Lambda_0(t),$$
$$\Gamma = \int_{t_0}^{t_0 + \theta_\alpha} \frac{\exp(2\beta_0^T z_0(u))}{s^{(0)}(\beta_0, u)}\, d\Lambda_0(u) + (h(t_0 + \theta_\alpha) - h(t_0))^T\, \Sigma^{-1}\, (h(t_0 + \theta_\alpha) - h(t_0)).$$
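The shape of the asymptotic variance in Theorem 2 can be motivated by a first-order expansion. The lines below are only a heuristic sketch, assuming that remainder terms are negligible and using $\partial S(t \mid z_0)/\partial t = -f(t \mid z_0)$, $\hat S - S \approx -S(\hat\Lambda - \Lambda)$ and $S(t_0 + \theta_\alpha \mid z_0) = \alpha S(t_0 \mid z_0)$; the rigorous argument is the one given in the Appendix.

```latex
\begin{align*}
0 = \hat U(\hat\theta_\alpha)
  &\approx \hat U(\theta_\alpha) - f(t_0+\theta_\alpha \mid z_0)\,(\hat\theta_\alpha-\theta_\alpha),\\
\sqrt{n}\,(\hat\theta_\alpha-\theta_\alpha)
  &\approx \frac{\sqrt{n}\,\hat U(\theta_\alpha)}{f(t_0+\theta_\alpha \mid z_0)},\\
\sqrt{n}\,\hat U(\theta_\alpha)
  &\approx -\alpha S(t_0\mid z_0)\,\sqrt{n}\,
     \bigl[\{\hat\Lambda(t_0+\theta_\alpha\mid z_0)-\hat\Lambda(t_0\mid z_0)\}
          -\{\Lambda(t_0+\theta_\alpha\mid z_0)-\Lambda(t_0\mid z_0)\}\bigr].
\end{align*}
```

Under these approximations, the variance of $\sqrt{n}\,\hat U(\theta_\alpha)$ is $\alpha^2 S(t_0 \mid z_0)^2$ times the asymptotic variance of the scaled increment of $\hat\Lambda(\cdot \mid z_0)$ over $[t_0, t_0 + \theta_\alpha]$, and dividing by $f(t_0 + \theta_\alpha \mid z_0)^2$ gives a variance of the form stated in the theorem.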
3 Two-sample test problem
In practice, we are often interested in comparing the quantile residual lifetimes of two groups at a specified time $t_0$. For example, we may want to examine whether there is any difference between the residual lifetimes under two different treatments with the same covariates. For simplicity, we suppose that the two observed samples are independent, with sample size $n_k$ for group $k$ ($k = 1, 2$), and let $\theta_{k,\alpha}(t_0; z_0)$ be the $\alpha$th quantile residual lifetime at time $t_0$ given the same covariates $z_0$ for group $k$. Our statistical hypothesis is
$$H_0: \theta_{1,\alpha}(t_0; z_0) = \theta_{2,\alpha}(t_0; z_0) \quad \text{versus} \quad H_1: \theta_{1,\alpha}(t_0; z_0) \ne \theta_{2,\alpha}(t_0; z_0).$$
Assume that $n_1/n \to \rho$, where $n = n_1 + n_2$ and $0 < \rho < 1$ is a constant. For group $k$, let the estimating function be
$$\hat U_k(\theta_{k,\alpha}(t_0; z_0)) = \hat S_k(t_0 + \theta_{k,\alpha}(t_0; z_0) \mid z_0) - \alpha \hat S_k(t_0 \mid z_0).$$
It is natural to construct the following two test statistics.

3.1 Difference of quantile residual lifetimes
First, we consider the difference of two quantile residual lifetimes
$$d(t_0; z_0) = \theta_{1,\alpha}(t_0; z_0) - \theta_{2,\alpha}(t_0; z_0).$$
A natural estimator of $d(t_0; z_0)$ is
$$\hat d(t_0; z_0) = \hat\theta_{1,\alpha}(t_0; z_0) - \hat\theta_{2,\alpha}(t_0; z_0),$$
and the consistency of $\hat d(t_0; z_0)$ follows easily from the consistency of $\hat\theta_{1,\alpha}(t_0; z_0)$ and $\hat\theta_{2,\alpha}(t_0; z_0)$. Combining Theorem 2 with the fact that
$$\sqrt{n}\{\hat d(t_0; z_0) - d(t_0; z_0)\} = \sqrt{n}\{\hat\theta_{1,\alpha}(t_0; z_0) - \theta_{1,\alpha}(t_0; z_0)\} - \sqrt{n}\{\hat\theta_{2,\alpha}(t_0; z_0) - \theta_{2,\alpha}(t_0; z_0)\},$$
the following theorem holds naturally.
Theorem 3 Under the same conditions as in Theorem 1, for given covariates $z_0$ and a specified point $t_0$, as $n \to \infty$, we have
$$\sqrt{n}\,(\hat d(t_0; z_0) - d(t_0; z_0)) \xrightarrow{L} N(0, \sigma_d^2),$$
where $\xrightarrow{L}$ denotes convergence in distribution and
$$\sigma_d^2 = \frac{1}{\rho}\sigma_1^2 + \frac{1}{1 - \rho}\sigma_2^2,$$
in which $\sigma_k^2$ is the asymptotic variance of $\sqrt{n_k}\,(\hat\theta_{k,\alpha}(t_0; z_0) - \theta_{k,\alpha}(t_0; z_0))$ based on the sample of group $k$, $k = 1, 2$.
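3.2 Ratio of quantile residual lifetimes
Let $r(t_0; z_0) = \theta_{1,\alpha}(t_0; z_0)/\theta_{2,\alpha}(t_0; z_0)$ denote the ratio of two quantile residual lifetimes. A consistent estimator of $r(t_0; z_0)$ is
$$\hat r(t_0; z_0) = \frac{\hat\theta_{1,\alpha}(t_0; z_0)}{\hat\theta_{2,\alpha}(t_0; z_0)}.$$
By some simple calculations, we have
$$\begin{aligned}
\sqrt{n}\{\hat r(t_0; z_0) - r(t_0; z_0)\}
&= \sqrt{n}\left\{ \frac{\hat\theta_{1,\alpha}(t_0; z_0)}{\hat\theta_{2,\alpha}(t_0; z_0)} - \frac{\theta_{1,\alpha}(t_0; z_0)}{\theta_{2,\alpha}(t_0; z_0)} \right\} \\
&= \sqrt{n}\left\{ \frac{\hat\theta_{1,\alpha}(t_0; z_0)}{\hat\theta_{2,\alpha}(t_0; z_0)} - \frac{\theta_{1,\alpha}(t_0; z_0)}{\hat\theta_{2,\alpha}(t_0; z_0)} + \frac{\theta_{1,\alpha}(t_0; z_0)}{\hat\theta_{2,\alpha}(t_0; z_0)} - \frac{\theta_{1,\alpha}(t_0; z_0)}{\theta_{2,\alpha}(t_0; z_0)} \right\} \\
&= \frac{1}{\hat\theta_{2,\alpha}(t_0; z_0)} \left\{ \sqrt{n}(\hat\theta_{1,\alpha}(t_0; z_0) - \theta_{1,\alpha}(t_0; z_0)) - r(t_0; z_0)\sqrt{n}(\hat\theta_{2,\alpha}(t_0; z_0) - \theta_{2,\alpha}(t_0; z_0)) \right\},
\end{aligned}$$
and the following theorem can be easily derived from Slutsky's theorem together with Theorems 1 and 2.
Theorem 4 Under the same conditions as in Theorem 1, for given covariates $z_0$ and a specified point $t_0$, as $n \to \infty$, we have
$$\sqrt{n}\,(\hat r(t_0; z_0) - r(t_0; z_0)) \xrightarrow{L} N(0, \sigma_r^2),$$
where $\xrightarrow{L}$ denotes convergence in distribution and
$$\sigma_r^2 = \frac{1}{\theta_{2,\alpha}(t_0; z_0)^2} \left\{ \frac{1}{\rho}\sigma_1^2 + \frac{\theta_{1,\alpha}(t_0; z_0)^2}{(1 - \rho)\,\theta_{2,\alpha}(t_0; z_0)^2}\sigma_2^2 \right\},$$
in which $\sigma_k^2$ is the asymptotic variance of $\sqrt{n_k}\,(\hat\theta_{k,\alpha}(t_0; z_0) - \theta_{k,\alpha}(t_0; z_0))$ based on the sample of group $k$, $k = 1, 2$.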
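Theorems 3 and 4 suggest Wald-type tests of $H_0$. The following Python sketch shows one natural construction, assumed here rather than taken from the paper: the estimates $\hat\theta_{k,\alpha}$ and the variance estimates of $\sqrt{n_k}(\hat\theta_{k,\alpha} - \theta_{k,\alpha})$ are plugged into the formulas for $\sigma_d^2$ and $\sigma_r^2$, with $\rho$ replaced by $n_1/n$. The function name `two_sample_tests` and its arguments are hypothetical.

```python
import math

def two_sample_tests(theta1, theta2, var1, var2, n1, n2):
    """Wald-type z statistics for the difference (H0: d = 0) and the ratio (H0: r = 1).

    var1, var2: estimates of the asymptotic variances of
    sqrt(n_k) * (theta_hat_k - theta_k), k = 1, 2 (assumed supplied by the caller).
    """
    n = n1 + n2
    rho = n1 / n  # plug-in for the limiting proportion of group 1
    d_hat = theta1 - theta2
    r_hat = theta1 / theta2
    # Plug-in versions of the variance formulas in Theorems 3 and 4.
    var_d = var1 / rho + var2 / (1.0 - rho)
    var_r = (var1 / rho + theta1**2 / ((1.0 - rho) * theta2**2) * var2) / theta2**2
    z_d = math.sqrt(n) * d_hat / math.sqrt(var_d)
    z_r = math.sqrt(n) * (r_hat - 1.0) / math.sqrt(var_r)

    def two_sided_p(z):
        # 2 * (1 - Phi(|z|)) written with the complementary error function.
        return math.erfc(abs(z) / math.sqrt(2.0))

    return {"d": d_hat, "z_d": z_d, "p_d": two_sided_p(z_d),
            "r": r_hat, "z_r": z_r, "p_r": two_sided_p(z_r)}
```

Rejecting $H_0$ when a p-value falls below 0.05 corresponds to a two-sided test at the 5 % level.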
4 Asymptotic variance estimation
In this section, we develop a method to estimate the variance of $\hat\theta_\alpha(t_0; z_0)$. To keep the notation simple, we omit $t_0$ and $z_0$ where no confusion arises. Estimating $\sigma^2$ directly by the plug-in method is complicated and inaccurate because the procedure involves estimating a density function. Instead, we use the adjusted re-sampling method proposed by Zeng and Lin (2008). Specifically, by the proof of Theorem 2, we have
$$\sqrt{n}\,\hat U(\hat\theta_\alpha) = \sqrt{n}\,\hat U(\theta_\alpha) - f(t_0 + \theta_\alpha \mid z_0)\sqrt{n}\,(\hat\theta_\alpha - \theta_\alpha) + o_p(1).$$
Here $\theta_\alpha$ denotes the true value. Denote $\tilde\theta_\alpha = \hat\theta_\alpha + \frac{1}{\sqrt{n}} G$, where $G$ is a zero-mean normal random variable independent of the data. Then we have
$$\sqrt{n}\,\hat U(\tilde\theta_\alpha) - \sqrt{n}\,\hat U(\hat\theta_\alpha) = -f(t_0 + \theta_\alpha \mid z_0)\sqrt{n}\,(\tilde\theta_\alpha - \hat\theta_\alpha) + o_p(1).$$
Since $\sqrt{n}\,\hat U(\hat\theta_\alpha) = 0$ and $\sqrt{n}\,(\tilde\theta_\alpha - \hat\theta_\alpha) = G$, we have
$$\sqrt{n}\,\hat U(\tilde\theta_\alpha) = -f(t_0 + \theta_\alpha \mid z_0)\, G + o_p(1) \equiv A(\theta_\alpha)\, G + o_p(1).$$
Then we can use the following least squares (LS) resampling procedure to estimate $A(\theta_\alpha)$:
Step 1. Generate $B$ realizations of $G$, denoted by $G_1, \ldots, G_B$.
Step 2. Calculate $\sqrt{n}\,\hat U(\hat\theta_\alpha + G_b/\sqrt{n})$, $b = 1, \ldots, B$, and denote them by $y_b$. The least squares estimate of $A(\theta_\alpha)$ is then $\hat A = (x^T x)^{-1} x^T Y$, where $x = (G_1, \ldots, G_B)^T$ and $Y = (y_1, \ldots, y_B)^T$.
Step 3. Estimate the variance of $\sqrt{n}\,(\hat\theta_\alpha - \theta_\alpha)$ by $\hat A^{-2} \hat V$.
In the above procedure, $\hat V$ is the estimate of the variance of $\sqrt{n}\,\hat U(\theta_\alpha)$, which is
$$\hat V = \alpha^2 \hat S(t_0 \mid z_0)^2\, \hat\Gamma,$$
where
$$\hat\Gamma = \frac{1}{n} \sum_{i=1}^{n} \frac{I(t_0 \le X_i \le t_0 + \hat\theta_\alpha(t_0))\, \Delta_i \exp(2\hat\beta^T z_0(X_i))}{S^{(0)}(\hat\beta, X_i)^2} + (\hat h(t_0 + \hat\theta_\alpha(t_0)) - \hat h(t_0))^T\, \hat\Sigma^{-1}\, (\hat h(t_0 + \hat\theta_\alpha(t_0)) - \hat h(t_0)),$$
$$\hat\Sigma = \frac{1}{n} \sum_{i=1}^{n} \Delta_i \left\{ \frac{S^{(2)}(\hat\beta, X_i)}{S^{(0)}(\hat\beta, X_i)} - \bar Z(\hat\beta, X_i)^{\otimes 2} \right\},$$
$$\hat h(t) = \frac{1}{n} \sum_{i=1}^{n} I(X_i \le t)\, \Delta_i \exp(\hat\beta^T z_0(X_i))\, \frac{z_0(X_i) - \bar Z(\hat\beta, X_i)}{S^{(0)}(\hat\beta, X_i)}.$$
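The three resampling steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the estimating function is passed in as a callable `U_hat` (hypothetical), the variance estimate `V_hat` of $\sqrt{n}\,\hat U(\theta_\alpha)$ is assumed to have been computed separately, and $G$ is drawn from the standard normal distribution.

```python
import numpy as np

def ls_resampling_variance(U_hat, theta_hat, V_hat, n, B=500, rng=None):
    """Estimate Var{sqrt(n) (theta_hat - theta)} by V_hat / A_hat**2.

    U_hat: callable theta -> value of the estimating function (supplied by the caller).
    V_hat: estimate of the variance of sqrt(n) * U_hat(theta_alpha).
    Returns the variance estimate and the least squares slope A_hat.
    """
    rng = np.random.default_rng(rng)
    # Step 1: B realizations of the perturbation G.
    G = rng.standard_normal(B)
    # Step 2: perturbed estimating function values and the no-intercept LS slope.
    y = np.array([np.sqrt(n) * U_hat(theta_hat + g / np.sqrt(n)) for g in G])
    A_hat = (G @ y) / (G @ G)  # scalar version of (x'x)^{-1} x'Y
    # Step 3: sandwich-type variance estimate.
    return V_hat / A_hat**2, A_hat
```

The slope $\hat A$ plays the role of $-f(t_0 + \theta_\alpha \mid z_0)$, so no explicit density estimate is required.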
To estimate $\sigma_d^2$ and $\sigma_r^2$, we just need to estimate $\sigma_1^2$ and $\sigma_2^2$ by a similar technique and replace the unknown quantities $\rho$ and $\theta_{k,\alpha}$ with $\hat\rho = n_1/n$ and $\hat\theta_{k,\alpha}$.

5 Numerical studies

5.1 Simulations
In this section, we conduct simulation studies to assess the performance of the proposed estimators and test statistics. We also compare the performance of our method with the method proposed by Jung et al. (2009) through simulations.
Simulations for estimators. First, a simulation study is carried out to evaluate the performance of the proposed estimator and the method of variance estimation. Under the proportional hazards model, we generate the survival times $T_i$ from the model
$$\lambda(t) = \lambda_0(t) \exp(\beta_1 Z_1 + \beta_2 Z_2).$$
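As a hedged illustration of this data-generating model, the Python sketch below draws survival times by inverting the cumulative hazard and computes the true quantile residual lifetime in closed form. It uses the settings described in this section ($\beta = (0.5, 1)^T$, $Z_1 \sim N(1, 1)$, $Z_2 \sim$ Bernoulli(0.5), uniform censoring on $(0, c_0)$) and writes the cumulative baseline hazard as $\Lambda_0(t) = c\,t^k$, so that $\lambda_0(t) = t$ corresponds to $(c, k) = (1/2, 2)$ and $\lambda_0(t) = 0.5t^2$ to $(c, k) = (1/6, 3)$. The function names and the parameterization by `c` and `k` are assumptions made for the sketch; the value of $c_0$ is not given in the text and must be chosen by the user.

```python
import numpy as np

BETA = np.array([0.5, 1.0])

def true_theta(alpha, t0, z0, c, k, beta=BETA):
    """Solve S(t0 + theta | z0) = alpha * S(t0 | z0) when Lambda(t | z0) = c t^k exp(beta' z0)."""
    rate = c * np.exp(z0 @ beta)
    return (t0**k - np.log(alpha) / rate) ** (1.0 / k) - t0

def simulate(n, c, k, c0, rng=None, beta=BETA):
    """Draw right-censored data (X_i, Delta_i, Z_i) from the proportional hazards model."""
    rng = np.random.default_rng(rng)
    z = np.column_stack([rng.normal(1.0, 1.0, n), rng.binomial(1, 0.5, n)])
    e = rng.exponential(1.0, n)
    # Inverse transform: Lambda(T | Z) is unit exponential, so T = (E / (c exp(beta'Z)))^(1/k).
    t = (e / (c * np.exp(z @ beta))) ** (1.0 / k)
    cens = rng.uniform(0.0, c0, n)  # c0 controls the censoring rate
    return np.minimum(t, cens), (t <= cens).astype(int), z
```

For example, `true_theta(0.1, 0.0, np.array([1.5, 1.0]), 0.5, 2)` gives the true value for $\alpha = 0.1$ and $t_0 = 0$ under $\lambda_0(t) = t$, which can be compared with the corresponding entry of Table 1.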
Here, we set $\beta_1 = 0.5$ and $\beta_2 = 1$ and consider two baseline hazard functions, $\lambda_0(t) = t$ and $\lambda_0(t) = 0.5t^2$. The covariate $Z_1$ is generated from the normal distribution $N(1, 1)$ and $Z_2$ from the Bernoulli distribution with success probability 0.5, and the given covariates are $z_0 = [1.5, 1]^T$. The censoring times $C_i$ are simulated from $U(0, c_0)$, where $c_0$ is used to control the censoring rate. We denote the observed data by $(X_i, \Delta_i, Z_i)$, where $X_i = \min(T_i, C_i)$ and $\Delta_i = I(T_i \le C_i)$. For evaluating the method of variance estimation, we generate $G$ from the standard normal distribution. In our simulations, the censoring rates are CR = 15 % and 30 %, with the specified points $t_0 = 0$ and 0.5 for the first hazard function $\lambda(t) = t \exp(\beta_1 Z_1 + \beta_2 Z_2)$ and $t_0 = 0.5$ and 1 for the second hazard function $\lambda(t) = 0.5t^2 \exp(\beta_1 Z_1 + \beta_2 Z_2)$. We consider the quantiles $\alpha = 0.1, 0.3, 0.5, 0.7, 0.9$ and carry out 1,000 replicates with sample size $n = 200$ and $B = 500$ resamples. The results are summarized in Tables 1 and 2. In the tables, 'Bias' and 'SE' are the bias and the empirical standard error of the estimators, respectively; 'SD' is the mean of the estimated standard errors and 'CP' is the coverage probability of the nominal 95 % confidence intervals. From the simulation results given in Tables 1 and 2, we find that the estimators perform reasonably well. Overall, the bias of the estimators is very small, which agrees with the consistency of our estimator for the true parameter. Our method to estimate the variance is also reliable, as SD is generally close to SE. Also, the confidence intervals have coverage probabilities that do not deviate too much from the 0.95 nominal level.

Table 1 Simulation results for λ0(t) = t
CR     t0    α     θα(t0)   θ̂α(t0)   Bias      SE       SD       CP %
15 %   0     0.1   0.8946   0.8953    0.0008   0.0577   0.0597   93.9
             0.3   0.6469   0.6466   −0.0002   0.0419   0.0428   93.7
             0.5   0.4908   0.4915    0.0006   0.0361   0.0346   94.8
             0.7   0.3521   0.3521    0.0001   0.0322   0.0289   95.9
             0.9   0.1914   0.1893   −0.0020   0.0293   0.0246   95.5
30 %   0     0.1   0.8946   0.8988    0.0042   0.0638   0.0610   93.8
             0.3   0.6469   0.6494    0.0026   0.0454   0.0429   94.3
             0.5   0.4908   0.4933    0.0025   0.0377   0.0366   94.6
             0.7   0.3521   0.3543    0.0022   0.0332   0.0296   94.9
             0.9   0.1914   0.1904   −0.0010   0.0300   0.0245   93.1
15 %   0.5   0.1   0.5248   0.5236   −0.0012   0.0531   0.0589   94.1
             0.3   0.3176   0.3182    0.0006   0.0350   0.0432   96.9
             0.5   0.2006   0.2016    0.0009   0.0214   0.0228   95.0
             0.7   0.1115   0.1108   −0.0007   0.0209   0.0247   96.1
             0.9   0.0354   0.0353   −0.0001   0.0143   0.0155   97.0
30 %   0.5   0.1   0.5248   0.5217   −0.0031   0.0696   0.0512   94.8
             0.3   0.3176   0.3172   −0.0004   0.0482   0.0326   96.6
             0.5   0.2006   0.1995   −0.0012   0.0367   0.0308   97.5
             0.7   0.1115   0.1103   −0.0012   0.0268   0.0296   96.3
             0.9   0.0354   0.0353   −0.0001   0.0149   0.0151   95.7
CR censoring rate, Bias bias, SE standard error, SD standard deviation, CP coverage probability of the 95 % confidence interval

Table 2 Simulation results for λ0(t) = 0.5t²
CR     t0    α     θα(t0)   θ̂α(t0)   Bias      SE       SD       CP %
15 %   0.5   0.1   0.8619   0.8623    0.0004   0.0578   0.0582   93.8
             0.3   0.6134   0.6142    0.0008   0.0471   0.0458   94.3
             0.5   0.4464   0.4465    0.0001   0.0422   0.0384   95.6
             0.7   0.2920   0.2918   −0.0002   0.0385   0.0305   96.5
             0.9   0.1170   0.1161   −0.0009   0.0311   0.0313   93.8
30 %   0.5   0.1   0.8619   0.8617   −0.0002   0.0642   0.0616   93.4
             0.3   0.6134   0.6149    0.0015   0.0506   0.0495   94.6
             0.5   0.4464   0.4474    0.0010   0.0449   0.0410   95.2
             0.7   0.2920   0.2906   −0.0014   0.0401   0.0483   96.2
             0.9   0.1170   0.1155   −0.0015   0.0325   0.0377   96.8
15 %   1.0   0.1   0.5038   0.5022   −0.0016   0.0597   0.0532   94.4
             0.3   0.3114   0.3113   −0.0001   0.0450   0.0348   95.9
             0.5   0.1988   0.1984   −0.0004   0.0362   0.0419   96.3
             0.7   0.1111   0.1107   −0.0004   0.0275   0.0273   94.3
             0.9   0.0354   0.0353   −0.0001   0.0157   0.0155   95.5
30 %   1.0   0.1   0.5038   0.5025   −0.0013   0.0716   0.0711   94.5
             0.3   0.3114   0.3109   −0.0005   0.0513   0.0496   96.5
             0.5   0.1988   0.1975   −0.0013   0.0399   0.0308   93.8
             0.7   0.1111   0.1103   −0.0009   0.0298   0.0186   96.4
             0.9   0.0354   0.0353   −0.0001   0.0167   0.0110   97.0
See the note to Table 1

Simulations for comparison. Our second simulation study compares the performance of the proposed method with the method of Jung et al. (2009). For this, we generate data in a similar way to Jung et al. (2009). We consider a linear regression model for the $\alpha$-quantile residual lifetime at time $t_0$:
$$\operatorname{quantile}(T_i - t_0 \mid T_i \ge t_0, Z_i) = \exp(\gamma_{\alpha|t_0}^T Z_i), \tag{5.1}$$
which means that $P(T_i \ge t_0 + \exp(\gamma_{\alpha|t_0}^T Z_i) \mid Z_i) = \alpha P(T_i \ge t_0 \mid Z_i)$. Here, $Z_i$ is a two-dimensional covariate $Z_i = (Z_{1i}, Z_{2i})^T$, where the $Z_{1i}$ are generated from the normal distribution $N(1, 1)$ and the $Z_{2i}$ are generated from the Bernoulli distribution with success probability 0.5. For different quantiles and $t_0$, we choose different values for $\gamma_{\alpha|t_0}$, which are all between −1 and 1.5. The failure times $T_i$ are generated from the Weibull distribution with survival functions $S(t \mid Z_i) = \exp(-(t/a_i)^b)$, and we take $b = 1$ and $b = 2.5$ in this simulation. Then Eq. (5.1) yields
$$a_i = \left\{ \frac{t_0^b - \left(t_0 + \exp(\gamma_{\alpha|t_0}^T Z_i)\right)^b}{\ln \alpha} \right\}^{1/b}.$$
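The expression for $a_i$ follows from substituting the Weibull survival function into $S(t_0 + q_i \mid Z_i) = \alpha S(t_0 \mid Z_i)$ with $q_i = \exp(\gamma_{\alpha|t_0}^T Z_i)$. The Python sketch below computes the scale and draws failure times; the function names are hypothetical, and any particular value of $\gamma_{\alpha|t_0}$ used with it is an illustrative assumption, since the text only states that the values lie between −1 and 1.5.

```python
import numpy as np

def weibull_scale(gamma, z, t0, alpha, b):
    """Scale a_i such that the alpha-quantile residual life at t0 equals exp(gamma' Z_i)."""
    q = np.exp(z @ gamma)  # target quantile residual lifetime for each subject
    # Both numerator and ln(alpha) are negative for 0 < alpha < 1, so the ratio is positive.
    return ((t0**b - (t0 + q) ** b) / np.log(alpha)) ** (1.0 / b)

def draw_failure_times(a, b, rng=None):
    """Draw T_i with survival function exp(-(t / a_i)^b)."""
    rng = np.random.default_rng(rng)
    # Generator.weibull draws from the standard Weibull with shape b; multiply by the scale.
    return a * rng.weibull(b, size=np.shape(a))
```

A quick check of the construction is to confirm numerically that $S(t_0 + q_i \mid Z_i)/S(t_0 \mid Z_i) = \alpha$ for the returned scales.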
Censoring times $C_i$ are generated from the uniform distribution $U(0, c_0)$, where $c_0$ is used to control the censoring proportion.
Firstly, we estimate the regression parameters $\gamma_{\alpha|t_0} = (\gamma_{\alpha|t_0}^{(1)}, \gamma_{\alpha|t_0}^{(2)})^T$ using the method proposed by Jung et al. (2009) and denote the estimator by $\tilde\gamma_{\alpha|t_0}$. Then, given the covariates $z_0 = [1.5, 1]^T$, the $\alpha$-quantile residual lifetime at time $t_0$ can be estimated by $\tilde\theta_\alpha(t_0; z_0) = \exp(\tilde\gamma_{\alpha|t_0}^T z_0)$. We can also use the re-sampling method proposed by Zeng and Lin (2008) to estimate the asymptotic variance of $\tilde\gamma_{\alpha|t_0}$ and use the delta-method to obtain the asymptotic distribution of $\tilde\theta_\alpha(t_0; z_0)$. Secondly, we use the proposed method to estimate the $\alpha$-quantile residual lifetime directly with the assistance of Cox's proportional hazards model. Note that when $b = 1$, the assumption for the conditional survival distribution is also satisfied under the linear regression model (5.1), but when $b = 2.5$, the assumption is not satisfied. If $t_0 = 0$, the conditional survival function reduces to $S(t \mid Z_i) = \exp\{t^b \ln\alpha\, \exp(-b\gamma_{\alpha|0}^T Z_i)\}$, which is also a Cox proportional hazards model. We again carry out 1,000 simulations with sample size $n = 200$ and $B = 500$ resamples. Tables 3 and 4 summarize the simulation results. "JJB" represents the results for Jung et al. (2009).

Table 3 Simulation results for the comparison when b = 1
                              JJB                                   Proposed
t0    CR     α     θα(t0)   Bias      SE       SD       CP      Bias      SE       SD       CP
0     15 %   0.3   1.5683    0.0361   0.2222   0.2480   94.3   −0.0074   0.1447   0.1691   94.9
             0.5   0.8187    0.0114   0.1163   0.1373   95.2    0.0008   0.1074   0.1160   94.1
             0.7   0.6376    0.0088   0.1100   0.1303   95.7    0.0085   0.0933   0.1099   93.8
             0.9   0.6065    0.0016   0.1328   0.1995   98.1    0.0025   0.0703   0.1132   97.1
0     30 %   0.3   1.5683    0.0454   0.2278   0.2482   95.7    0.0106   0.1763   0.1967   94.2
             0.5   0.8187    0.0348   0.1333   0.1443   94.5    0.0084   0.1053   0.1246   95.2
             0.7   0.6376    0.0296   0.1165   0.1373   95.3    0.0090   0.0924   0.1058   93.6
             0.9   0.6065   −0.0016   0.1317   0.2163   97.3    0.0038   0.0792   0.1105   98.0
0.5   15 %   0.3   1.5683    0.0284   0.2215   0.3355   97.7    0.0028   0.1471   0.1838   95.4
             0.5   0.8187    0.0111   0.1452   0.1533   94.0   −0.0014   0.1256   0.1473   95.4
             0.7   0.6376    0.0088   0.1186   0.1531   96.6   −0.0031   0.0759   0.0985   95.3
             0.9   0.6065   −0.0057   0.1206   0.2033   97.1    0.0014   0.0760   0.0971   95.8
0.5   30 %   0.3   1.5683    0.0120   0.1883   0.2388   95.3   −0.0100   0.0982   0.1087   96.0
             0.5   0.8187    0.0229   0.1622   0.2126   96.1   −0.0031   0.1776   0.1925   96.7
             0.7   0.6376    0.0241   0.1368   0.1822   96.1   −0.0141   0.1296   0.1642   96.2
             0.9   0.6065   −0.0117   0.1255   0.2278   98.0   −0.0033   0.0795   0.1091   97.3
CR censoring rate, Bias bias, SE standard error, SD standard deviation, CP coverage probability of the 95 % confidence interval

Table 4 Simulation results for the comparison when b = 2.5
                              JJB                                   Proposed
t0    CR     α     θα(t0)   Bias      SE       SD       CP      Bias      SE       SD       CP
0     15 %   0.3   2.1170    0.0175   0.1385   0.1472   96.2    0.0059   0.1214   0.1331   93.4
             0.5   1.2840    0.0069   0.0863   0.0939   96.1    0.0046   0.0837   0.0869   94.0
             0.7   1.1052    0.0099   0.0930   0.0947   95.3    0.0068   0.0788   0.0820   95.6
             0.9   0.9512    0.0031   0.1157   0.1217   94.0    0.0065   0.0839   0.0876   94.4
0     30 %   0.3   2.1170    0.0198   0.1590   0.1674   95.3    0.0092   0.1257   0.1558   93.9
             0.5   1.2840    0.0146   0.0972   0.1076   94.8    0.0097   0.0879   0.0953   94.4
             0.7   1.1052    0.0153   0.1021   0.1082   94.6    0.0082   0.0842   0.0879   94.3
             0.9   0.9512    0.0137   0.1258   0.1539   96.0    0.0087   0.0908   0.0954   94.4
1     15 %   0.3   1.6487    0.0137   0.1551   0.1779   96.5    0.0217   0.1255   0.1552   95.2
             0.5   1.0000    0.0058   0.1192   0.1248   95.1    0.0269   0.0960   0.1085   94.2
             0.7   0.8607    0.0103   0.1206   0.1291   94.0    0.0369   0.0939   0.1077   94.1
             0.9   0.7408   −0.0094   0.1270   0.1667   94.9    0.0253   0.0887   0.1034   96.2
1     30 %   0.3   1.6487    0.0219   0.1768   0.2056   96.4    0.0244   0.1279   0.2088   96.4
             0.5   1.0000    0.0136   0.1212   0.1438   96.0    0.0234   0.0980   0.1189   95.4
             0.7   0.8607    0.0103   0.1290   0.1471   94.8    0.0297   0.0974   0.1160   95.3
             0.9   0.7408   −0.0186   0.1389   0.1974   95.7    0.0324   0.0910   0.1281   95.0
CR censoring rate, Bias bias, SE standard error, SD standard deviation, CP coverage probability of the 95 % confidence interval

From Table 3, we can see that for all the cases the proposed method has better performance in terms of biases and variances than that of Jung et al. (2009). Table 4 shows that when the underlying assumption for the conditional survival distribution is not satisfied, the proposed method still has performance comparable to that of JJB, which means that the proposed method is not sensitive to the assumption for the conditional survival distribution.
Simulations for type I error. In the third part of the simulation, we validate the proposed test statistics in terms of type I error probabilities at the significance level of 0.05. To calculate the type I error, we need to generate data under the null hypothesis $H_0: \theta_{1,\alpha}(t_0) = \theta_{2,\alpha}(t_0)$. For simplicity, we generate the failure times in the same way as in the first part, from the proportional hazards model $\lambda(t) = 0.5t^2 \exp(\beta_1 Z_1 + \beta_2 Z_2)$ with the true parameter $\beta = [0.5, 1]^T$ and $Z_1 \sim N(1, 1)$, $Z_2 \sim \text{Bernoulli}(0.5)$. The censoring distribution is assumed to be the uniform distribution $U(0, c_0)$, where $c_0$ determines the censoring percentage. The given covariate is still $z_0 = [1.5, 1]^T$ and we consider the quantiles $\alpha = 0.3, 0.5, 0.7, 0.9$. The sample size is 100 for both groups. For the difference of residual lifetimes $d(t_0; z_0) = \theta_{1,\alpha}(t_0; z_0) - \theta_{2,\alpha}(t_0; z_0)$, the true value is 0 under $H_0$, while the true value of the ratio of residual lifetimes is $r(t_0; z_0) = \theta_{1,\alpha}(t_0; z_0)/\theta_{2,\alpha}(t_0; z_0) = 1$. Table 5 summarizes the biases of the estimates of $d(t_0; z_0)$ and $r(t_0; z_0)$ and the empirical 95 % confidence intervals at different time points ($t_0 = 0.1$ and 0.5). We also report the empirical coverage probabilities (CP = 1 − empirical type I error probability) over 1,000 simulations.
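The relation between coverage probability and type I error used in this part can be made concrete with a short Python sketch. It assumes Wald-type intervals of the form estimate ± 1.96 × standard error, which is an assumption of the sketch and not a statement of how the intervals in the paper were formed; the function name `coverage_and_type1` is hypothetical.

```python
import numpy as np

def coverage_and_type1(estimates, std_errors, null_value, z_crit=1.959964):
    """Empirical coverage of Wald-type intervals at the null value and 1 - coverage.

    estimates, std_errors: one entry per simulation replicate.
    Under H0, a replicate whose interval misses null_value counts as a rejection.
    """
    est = np.asarray(estimates, dtype=float)
    se = np.asarray(std_errors, dtype=float)
    covered = np.abs(est - null_value) <= z_crit * se
    cp = covered.mean()
    return cp, 1.0 - cp
```

With replicates generated under $H_0$, `null_value` would be 0 for the difference and 1 for the ratio.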
Table 5 Simulation results for type I error
                   Difference                              Ratio
t0    CR     α     Bias      CP     CI                     Bias     CP     CI
0.1   15 %   0.3   −0.0036   94.0   (−0.1855, 0.1782)      0.0011   93.8   (0.8152, 1.1870)
             0.5   −0.0012   94.7   (−0.1689, 0.1665)      0.0044   95.0   (0.7931, 1.2157)
             0.7   −0.0025   95.2   (−0.1690, 0.1639)      0.0045   95.1   (0.7331, 1.2759)
             0.9    0.0008   94.4   (−0.1775, 0.1792)      0.0323   95.5   (0.5222, 1.5424)
0.1   30 %   0.3    0.0033   94.2   (−0.1855, 0.1920)      0.0085   94.3   (0.8147, 1.2024)
             0.5    0.0027   94.1   (−0.1685, 0.1738)      0.0094   94.1   (0.7925, 1.2263)
             0.7   −0.0013   95.3   (−0.1684, 0.1658)      0.0074   95.8   (0.7336, 1.2811)
             0.9   −0.0011   94.7   (−0.1786, 0.1763)      0.0303   94.4   (0.5208, 1.5398)
0.5   15 %   0.3    0.0023   95.2   (−0.1775, 0.1821)      0.0145   93.7   (0.7178, 1.3112)
             0.5   −0.0013   94.8   (−0.1618, 0.1591)      0.0124   94.6   (0.6465, 1.3783)
             0.7   −0.0007   95.4   (−0.1459, 0.1444)      0.0300   95.1   (0.5093, 1.5507)
             0.9   −0.0010   95.0   (−0.1187, 0.1168)      0.1205   94.2   (−0.1454, 2.3864)
0.5   30 %   0.3   −0.0027   95.5   (−0.1887, 0.1834)      0.0066   94.8   (0.7022, 1.3110)
             0.5   −0.0017   95.3   (−0.1639, 0.1605)      0.0112   96.0   (0.6409, 1.3815)
             0.7   −0.0018   94.8   (−0.1467, 0.1431)      0.0246   94.7   (0.5049, 1.5443)
             0.9   −0.0002   95.8   (−0.1175, 0.1170)      0.1397   94.6   (−0.1972, 2.4767)
CR, censoring rate; Bias is the bias of $\hat d(t_0; z_0)$ and $\hat r(t_0; z_0)$; CP is the empirical coverage probability of the 95 % confidence interval and 1 − CP is the empirical type I error; CI is the empirical 95 % confidence interval
From Table 5 we find that the bias is very close to zero and that the type I error probabilities are close to the nominal level of 0.05, because all the coverage probabilities are around 0.95 at each fixed time point. We note that the ratio of residual lifetimes is somewhat more sensitive to the values of $\theta_{1,\alpha}(t_0; z_0)$ and $\theta_{2,\alpha}(t_0; z_0)$ than the difference of residual lifetimes. For example, when $\alpha = 0.9$ and $t_0 = 0.5$, $\theta_{1,\alpha}(t_0; z_0) = 0.1170$, which is small compared with the other cases; $\hat r(t_0; z_0)$ then behaves poorly and the confidence intervals contain negative values. Overall, the difference $d(t_0; z_0)$ performs much better than the ratio $r(t_0; z_0)$.
Simulations for powers. In this part, the powers of our proposed test statistics are analyzed. For group one we generate the failure times from the proportional hazards model $\lambda(t) = 0.5t^2 \exp(\beta_1 Z_1 + \beta_2 Z_2)$, and the failure times for the second group are generated from the model $\lambda(t) = t \exp(\beta_1 Z_1 + \beta_2 Z_2)$. The other variables are generated in the same way as in the first part. The proportions of rejections of the null hypothesis $H_0: \theta_{1,\alpha}(t_0; z_0) = \theta_{2,\alpha}(t_0; z_0)$ are evaluated at the significance level of 0.05. We summarize our results in Table 6. From Table 6, we note that the power of the tests based on $d(t_0; z_0)$ and $r(t_0; z_0)$ tends to increase notably as $\theta_{1,\alpha}(t_0; z_0) - \theta_{2,\alpha}(t_0; z_0)$ increases, which is reasonable. Larger values of the fixed time point are associated with lower powers, which also reflects the smaller difference between the two $\alpha$th quantile residual lifetimes at the larger time point. Besides, the power of the test based on $d(t_0; z_0)$ is clearly higher than that of the test based on $r(t_0; z_0)$. Hence, combining the type I error and power results, we suggest using the difference of residual lifetimes in practice.
Table 6 Simulation results for powers

                        Difference                            Ratio
t0    C%     Row      α=0.3    α=0.5    α=0.7    α=0.9     α=0.3    α=0.5    α=0.7    α=0.9
0.1   15 %   T(t0)    0.4245   0.3969   0.3538   0.2645    1.7654   1.9901   2.3299   3.2816
             power    0.9980   0.9970   0.9960   0.9230    0.9890   0.9910   0.9810   0.3720
0.1   30 %   T(t0)    0.4245   0.3969   0.3538   0.2645    1.7654   1.9901   2.3299   3.2816
             power    0.9900   0.9950   0.9940   0.8960    0.9720   0.9880   0.9660   0.3530
0.5   15 %   T(t0)    0.2958   0.2458   0.1805   0.0816    1.9315   2.2249   2.6187   3.3074
             power    0.9140   0.9070   0.8190   0.4350    0.7080   0.6610   0.1860   0.0050
0.5   30 %   T(t0)    0.2958   0.2458   0.1805   0.0816    1.9315   2.2249   2.6187   3.3074
             power    0.8620   0.8810   0.8140   0.4020    0.6470   0.5750   0.1600   0.0080

CR, censoring rate; T(t0) is the true value of $d(t_0; z_0)$ and $r(t_0; z_0)$
5.2 Real data example

As an illustration, we applied the proposed method to the well-known HIV survival data (Hosmer and Lemeshow 1998), which were collected in a follow-up study to evaluate the survival time of HIV+ members. Subjects were enrolled in the study from January 1, 1989 to December 31, 1991. The study ended on December 31, 1995. After a confirmed diagnosis of HIV, members were followed until death due to AIDS or AIDS-related complications, until the end of the study, or until the subject was lost to follow-up. It was assumed that there were no deaths due to other causes (e.g., auto accident). Participants entered the study at different times over a 3-year period. The following information was collected for 100 participants: TIME, the follow-up time, i.e., the number of months between the entry date and the end date; AGE, the age of the subject at the start of follow-up (in years); DRUG, history of prior IV drug use (1 = Yes, 0 = No); and CENSOR, vital status at the end of the study (1 = Death due to AIDS, 0 = Lost to follow-up or alive).

We are interested in the α-th quantile residual lifetime at a specific time point t0 after the confirmed diagnosis of HIV. We took two levels of age, 30 and 45, to compare with the average age 36, while varying the value of α from 0.3 to 0.9. The results are summarized in Table 7. It is evident that, among patients of the same age, those with a prior history of drug use tended to have smaller estimated residual lifetimes than those without such a history. Across ages, older patients have smaller estimated residual lifetimes. Results for different time points t0 are presented in Table 8. This table shows that for patients both with and without a drug history, the residual lifetimes decrease as t0 increases.
Thus, we can conclude that young patients who did not have a history of drug use have a high survival probability, which agrees with many previous works. The following test makes this difference clear. The patients who have a history of drug use form the first group and the patients without a history of drug use form the second group. The sample sizes of the two groups are n1 = 49 and n2 = 51. The censoring rate of the drug group is CR1 = 22.45 %
Table 7 HIV survival data analysis with CR = 20 %

            A = 30, D = 0    A = 30, D = 1    A = 36, D = 0    A = 36, D = 1    A = 45, D = 0    A = 45, D = 1
t0   α      Est    SE        Est    SE        Est    SE        Est    SE        Est    SE        Est    SE
5    0.3    48.0   0.5267    10.0   0.4621    27.0   0.5685    6.0    0.3254    7.0    0.4830    3.0    0.2002
     0.5    27.0   0.6783    6.0    0.3672    9.0    0.5991    4.0    0.2295    4.0    0.3182    2.0    0.1117
     0.7    8.0    0.7475    4.0    0.2377    5.0    0.2339    3.0    0.1582    3.0    0.1503    2.0    0.0634
     0.9    3.0    0.1450    2.0    0.0576    2.0    0.1141    2.0    0.0651    2.0    0.0595    1.0    0.1832

Est estimate, SE standard error, A Age, D drug use, CR censoring rate

Table 8 HIV survival data analysis with CR = 20 %

            A = 36, D = 0    A = 36, D = 1
t0   α      Est    SE        Est    SE
5    0.3    27.0   0.5685    6.0    0.3254
     0.5    9.0    0.5991    4.0    0.2295
     0.7    5.0    0.2339    3.0    0.1582
     0.9    2.0    0.1141    2.0    0.0651
1    0.3    29.0   0.6250    8.0    0.3044
     0.5    10.0   0.3718    6.0    0.1460
     0.7    6.0    0.2112    3.0    0.2353
     0.9    2.0    0.1156    2.0    0.0504

See the note to Table 7

and the censoring rate of the no-drug group is CR2 = 17.65 %. Our null hypothesis is $H_0: \theta_{1,\alpha}(t_0; Age_0) = \theta_{2,\alpha}(t_0; Age_0)$ for a given age. The results of the test are summarized in Table 9. The table clearly shows that the residual lifetime for patients without a drug history is much longer than that for patients with a drug history, but the difference becomes smaller as the quantile increases. All the differences are significant at the 0.05 level except when Age = 45 and α = 0.9, owing to a lack of data. This conclusion agrees with the one above.

6 Discussion

In this article, we have proposed a conditional quantile residual lifetime model and an estimation procedure that uses Cox's proportional hazards model as an auxiliary model. The most appealing merit of the proposed method is that the quantile residual lifetime is obtained by solving only one equation. Covariate effects are also taken into account with the aid of the auxiliary model. Moreover, when more information about the true survival model $S(t|z_0)$ is available, the proposed method can still be applied by adjusting the method of estimating $S(t|z_0)$ to improve the estimate of $\theta_\alpha(t_0; z_0)$. In fact, the proportional hazards model is not the only option; other auxiliary models can also be considered, such as transformation models and accelerated failure time models. As we can see from the simulation of comparison
Table 9 Test for HIV survival data analysis Age α
t0 = 1 0.3
30
θˆ1,α (t0 ) θˆ2,α (t0 ) ˆ 0) d(t
36
45
10.0
7.0
42.0
30.0
0.0000
0.0000
0.9 4.0
10.0
0.3 2.0 4.0
−7.9940
−7.4265
0.0000
0.0000
0.5
0.7
10.0
5.0
48.0
29.0
−52.4525 −22.3731 0.0000
0.0000
0.9 3.0
2.0
9.0
4.0
−6.5701
−6.0051
0.0000
0.0000
0.0000
0.0000
0.0000
0.0000
0.0000
0.0000
0.0000
0.0000
7.0
6.0
3.0
2.0
5.0
3.0
2.0
2.0
30.0
11.0
7.0
3.0
29.0
10.0
6.0
3.0
−27.2663
−9.5695
−6.2905
−3.2087
−24.5609
−7.9307
−9.4756
0.0000
0.0000
0.0000
0.0017
0.0000
0.0000
0.0000
0.0001
−67.6005 −15.6063
−8.3260
−4.7034
−39.7519 −17.6504 −19.9140
−6.0112
p-value rˆ (t0 )
0.7
−41.8883 −50.9254 −13.9134 −13.8473 −163.5965 −35.7153 −16.0433 −11.4062
p-value θˆ1,α (t0 ) θˆ2,α (t0 ) ˆ 0) d(t
0.5
−31.5097 −21.7592
p-value rˆ (t0 )
t0 = 5
p-value θˆ1,α (t0 ) θˆ2,α (t0 ) ˆ 0) d(t p-value rˆ (t0 ) p-value
−4.2191
0.0000
0.0000
0.0000
0.0000
0.0000
0.0000
0.0000
0.0000
5.0
3.0
2.0
2.0
3.0
2.0
2.0
1.0
4.0
1.0
9.0 −3.0895
6.0 −2.6381
−6.6492
4.6387
0.0012
0.0052
0.0000
0.0000
−3.4117
−3.9665
−9.4136
2.4351
0.0003
0.0000
0.0000
0.0105
7.0
5.0
4.0
2.0
−6.5650
−5.9700
−5.3451
0.0000
0.0000
0.0000
0.0560
−10.5016 −11.3076 −10.0331
−2.4472
0.0000
0.0000
−1.9700
0.0000
0.0168
$\hat\theta_{1,\alpha}(t_0; Age_0)$ and $\hat\theta_{2,\alpha}(t_0; Age_0)$ are the estimates of the αth quantile residual lifetimes for the drug group and the no-drug group. $\hat d(t_0; Age_0)$ and $\hat r(t_0; Age_0)$ are the estimates of the test statistics under the null hypothesis
results in the paper, the proposed method is not sensitive to the underlying assumption of the Cox model. Another practical problem is hypothesis testing. For comparing two quantile residual lifetimes, we have proposed two test statistics which can be easily implemented. Further, we have compared the proposed method with the method of Jung et al. (2009), and concluded that the proposed method performs better than that of Jung et al. (2009), both in motivation and computation. The real data analysis supports our conclusion.

7 Appendix

In order to prove the results, we need the following assumptions and regularity conditions:

(A1) The covariate $Z(\cdot)$ has uniformly bounded total variation.
(A2) $\beta_0 \in B \subset R^p$ and $B$ is open, convex and bounded.
(A3) $\tau = \sup\{t : Y(t) > 0\}$, $\Lambda_0(t)$ is continuous and $\Lambda_0(\tau) < \infty$.
(A4) $\Sigma$, the asymptotic covariance matrix of $\sqrt{n}(\hat\beta - \beta_0)$, is positive definite.
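For simplicity, we denote $\theta_\alpha = \theta_\alpha(t_0; z_0)$ and $\hat\theta_\alpha = \hat\theta_\alpha(t_0; z_0)$.

Proof of Theorem 1 By the strong consistency of $\hat\beta$ and $\hat\Lambda_0(\cdot)$, we have

$$\sup_{t \in [0,\tau]} |\hat S(t|z_0) - S(t|z_0)| \xrightarrow{P} 0,$$

which implies that

$$\sup_{\theta_\alpha \in [0,\tau]} |\hat U(\theta_\alpha) - U(\theta_\alpha)| \xrightarrow{P} 0,$$

where $U(\theta_\alpha) = S(t_0 + \theta_\alpha|z_0) - \alpha S(t_0|z_0)$ and $\hat U(\theta_\alpha) = \hat S(t_0 + \theta_\alpha|z_0) - \alpha \hat S(t_0|z_0)$. By assumption, $U(\theta_\alpha) = 0$ has a unique solution $\theta_\alpha$; then, as $n \to \infty$, $\hat U(\theta_\alpha) = 0$ also has a unique solution $\hat\theta_\alpha$. For any $\epsilon > 0$, we have

$$\sup_{|\tilde\theta_\alpha - \theta_\alpha| > \epsilon} |\hat U(\tilde\theta_\alpha) - U(\tilde\theta_\alpha)| = o_p(1).$$

It follows that

$$\inf_{|\tilde\theta_\alpha - \theta_\alpha| > \epsilon} |\hat U(\tilde\theta_\alpha)| \ge \inf_{|\tilde\theta_\alpha - \theta_\alpha| \ge \epsilon} |U(\tilde\theta_\alpha)| - o_p(1) > M,$$

where $M$ is some positive constant. Hence, $|\hat\theta_\alpha - \theta_\alpha| \le \epsilon$ and $\hat\theta_\alpha$ is consistent.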
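The estimating equation in the proof above, S(t0 + θ | z0) − α S(t0 | z0) = 0 with S replaced by its estimate, can be sketched numerically. Because an estimated survival curve is a step function, an exact root need not exist; the sketch below therefore returns the smallest θ at which the estimated curve has dropped to α times its value at t0, which is an assumption about how the root is defined. The function names and the input format (jump times with the curve value just after each jump) are hypothetical.

```python
import bisect


def step_survival(times, surv):
    """Return S_hat as a right-continuous step function.

    times: increasing jump times; surv[i] is the value on [times[i], times[i+1]).
    Before the first jump the curve equals 1.
    """
    def s_hat(t):
        k = bisect.bisect_right(times, t)
        return 1.0 if k == 0 else surv[k - 1]
    return s_hat


def quantile_residual_life(times, surv, t0, alpha):
    """Smallest theta >= 0 with S_hat(t0 + theta) <= alpha * S_hat(t0).

    Returns None when the curve never drops that far, for example
    when censoring leaves the right tail unobserved.
    """
    s_hat = step_survival(times, surv)
    target = alpha * s_hat(t0)
    for t, s in zip(times, surv):
        if t >= t0 and s <= target:
            return t - t0
    return None
```

For instance, with jump times `[1, 2, 3, 4]` and curve values `[0.8, 0.6, 0.3, 0.1]`, the call `quantile_residual_life(times, surv, 1, 0.5)` looks for the first time at or after 1 where the curve is at most 0.4. Any survival estimate can be supplied, including one conditional on a covariate value.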
To prove Theorem 2, we need the following technical lemmas:

Lemma 1 Let $B_n \in D[a, b]$ and $A_n \in l^\infty([a, b])$ be either cadlag or caglad, and assume that $\sup_{t \in (a,b]} |A_n(t)| \xrightarrow{P} 0$, $A_n$ has uniformly bounded total variation, and $B_n$ converges weakly to a tight, mean zero process with sample paths in $D[a, b]$. Then $\int_a^b A_n(s)\, dB_n(s) \xrightarrow{P} 0$.

Proof of Lemma 1 See Lemma 4.2 of Kosorok (2006).
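Lemma 2 Assume that the above conditions hold and the true conditional αth quantile residual lifetime function at $t_0$ given $z_0$ is $\theta_\alpha$ and its estimator $\hat\theta_\alpha$ is consistent. Then as $n \to \infty$, we have

$$\left| \sqrt{n}\big(\hat U(\hat\theta_\alpha) - U(\hat\theta_\alpha)\big) - \sqrt{n}\big(\hat U(\theta_\alpha) - U(\theta_\alpha)\big) \right| \xrightarrow{P} 0.$$

Proof of Lemma 2 We just need to prove that

$$\sqrt{n}\, \big| \hat S(t_0 + \hat\theta_\alpha) - S(t_0 + \hat\theta_\alpha) - \hat S(t_0 + \theta_\alpha) + S(t_0 + \theta_\alpha) \big| \xrightarrow{P} 0. \qquad (7.1)$$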
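In fact, by the arguments of Andersen and Gill (1982), the asymptotic covariance matrix of $\sqrt{n}(\hat\beta - \beta_0)$ is $\Sigma$, which is

$$\Sigma = \int_0^\infty \left\{ \frac{s^{(2)}(\beta_0, t)}{s^{(0)}(\beta_0, t)} - \bar z(\beta_0, t)^{\otimes 2} \right\} s^{(0)}(\beta_0, t)\, d\Lambda_0(t).$$

Let $W(t) = \sqrt{n}\{\hat\Lambda(t) - \Lambda(t)\}$, and it is asymptotically equivalent to

$$\tilde W(t) = \frac{1}{\sqrt{n}} \sum_{i=1}^n \left[ \int_0^t \frac{\exp(\beta_0^T z_0(u))}{S^{(0)}(\beta_0, u)}\, dM_i(u) + h^T(t) \Sigma^{-1} \int_0^\infty \{Z_i(u) - \bar Z(\beta_0, u)\}\, dM_i(u) \right],$$

where $M_i(t) = N_i(t) - \int_0^t Y_i(u) \exp(\beta_0^T Z_i(u))\, d\Lambda_0(u)$ is a martingale and

$$h(t) = \int_0^t \exp(\beta_0^T z_0(u)) \{z_0(u) - \bar z(\beta, u)\}\, d\Lambda_0(u).$$

On the other hand, $\hat S(t) = \exp\{-\hat\Lambda(t)\}$; by the functional delta method, the process $\sqrt{n}(\hat S(t) - S(t))$ is asymptotically equivalent to $-S(t)\tilde W(t)$. Then by the consistency of $\hat\theta_\alpha$ and the continuity of $S(\cdot)$,

$$\begin{aligned}
&\sqrt{n}\, \big| \hat S(t_0 + \hat\theta_\alpha) - S(t_0 + \hat\theta_\alpha) - \hat S(t_0 + \theta_\alpha) + S(t_0 + \theta_\alpha) \big| \\
&\quad = \big| -S(t_0 + \hat\theta_\alpha) \tilde W(t_0 + \hat\theta_\alpha) + S(t_0 + \theta_\alpha) \tilde W(t_0 + \theta_\alpha) \big| + o_p(1) \\
&\quad = S(t_0 + \theta_\alpha)\, \big| \tilde W(t_0 + \hat\theta_\alpha) - \tilde W(t_0 + \theta_\alpha) \big| + o_p(1) \\
&\quad \le \big| \tilde W(t_0 + \hat\theta_\alpha) - \tilde W(t_0 + \theta_\alpha) \big| + o_p(1).
\end{aligned}$$

By the expression of $\tilde W(t)$, we have

$$\begin{aligned}
&\big| \tilde W(t_0 + \hat\theta_\alpha) - \tilde W(t_0 + \theta_\alpha) \big| \\
&\quad = \Bigg| \frac{1}{\sqrt{n}} \sum_{i=1}^n \int_0^{t_0 + \hat\theta_\alpha} \frac{\exp(\beta_0^T z_0(u))}{S^{(0)}(\beta_0, u)}\, dM_i(u) - \frac{1}{\sqrt{n}} \sum_{i=1}^n \int_0^{t_0 + \theta_\alpha} \frac{\exp(\beta_0^T z_0(u))}{S^{(0)}(\beta_0, u)}\, dM_i(u) \\
&\qquad + \big(h(t_0 + \hat\theta_\alpha) - h(t_0 + \theta_\alpha)\big)^T \Sigma^{-1} \frac{1}{\sqrt{n}} \sum_{i=1}^n \int_0^\infty \{Z_i(u) - \bar Z(\beta_0, u)\}\, dM_i(u) \Bigg| \\
&\quad \le \Bigg| \frac{1}{\sqrt{n}} \sum_{i=1}^n \int_0^\tau \big(I\{u \le t_0 + \hat\theta_\alpha\} - I\{u \le t_0 + \theta_\alpha\}\big) \frac{\exp(\beta_0^T z_0(u))}{S^{(0)}(\beta_0, u)}\, dM_i(u) \Bigg| \\
&\qquad + \Bigg| \left[ \int_0^\tau \big(I\{u \le t_0 + \hat\theta_\alpha\} - I\{u \le t_0 + \theta_\alpha\}\big) \exp(\beta_0^T z_0(u)) \{z_0(u) - \bar z(\beta, u)\}\, d\Lambda_0(u) \right]^T \Sigma^{-1} \frac{1}{\sqrt{n}} \sum_{i=1}^n \int_0^\infty \{Z_i(u) - \bar Z(\beta_0, u)\}\, dM_i(u) \Bigg| \\
&\quad =: I_1 + I_2.
\end{aligned}$$

For $I_1$, we denote

$$A_n(t) = \big(I\{t \le t_0 + \hat\theta_\alpha\} - I\{t \le t_0 + \theta_\alpha\}\big) \frac{\exp(\beta_0^T z_0(t))}{S^{(0)}(\beta_0, t)}$$

and $B_n(t) = \frac{1}{\sqrt{n}} \sum_{i=1}^n M_i(t)$. Then $B_n$ converges weakly to a mean zero Gaussian process because $M_i(t)$ is a martingale with mean zero, and by the fact that the covariate $Z(\cdot)$ is uniformly bounded with bounded variation and the consistency of $\hat\theta_\alpha$, we have $\sup_{t \in (0,\tau]} |A_n(t)| \xrightarrow{P} 0$. Then using Lemma 1, we have $\int_0^\tau A_n(u)\, dB_n(u) \xrightarrow{P} 0$, that is, $I_1 \xrightarrow{P} 0$. For $I_2$, because

$$\frac{1}{\sqrt{n}} \sum_{i=1}^n \int_0^\infty \{Z_i(u) - \bar Z(\beta_0, u)\}\, dM_i(u) = O_p(1)$$

and $\hat\theta_\alpha \xrightarrow{P} \theta_\alpha$, the convergence $I_2 \xrightarrow{P} 0$ is immediate. Then we have the conclusion of (7.1).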
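The relation between the cumulative hazard estimate and the survival estimate used in the proof above, namely that the estimated survival function is the exponential of minus the estimated cumulative hazard, can be sketched with the Breslow estimator of the baseline cumulative hazard. This is a minimal sketch under stated assumptions: covariates are time-fixed, the coefficient vector `beta` has already been obtained from a partial likelihood fit that is not shown, and the function names are hypothetical.

```python
import math


def breslow_cumhaz(times, events, covs, beta):
    """Breslow estimate of the baseline cumulative hazard at each event time.

    times: observed times; events: 1 = failure, 0 = censored;
    covs: list of covariate vectors (time-fixed, an assumption of this sketch);
    beta: coefficient vector, assumed already fitted elsewhere.
    Returns (event_times, cumulative_hazard) as parallel lists.
    """
    risk = [math.exp(sum(b * z for b, z in zip(beta, zi))) for zi in covs]
    event_times = sorted({t for t, d in zip(times, events) if d == 1})
    cumhaz, total = [], 0.0
    for t in event_times:
        # number of failures at t divided by the weighted size of the risk set
        d_t = sum(1 for ti, di in zip(times, events) if di == 1 and ti == t)
        at_risk = sum(r for ti, r in zip(times, risk) if ti >= t)
        total += d_t / at_risk
        cumhaz.append(total)
    return event_times, cumhaz


def conditional_survival(cumhaz, beta, z0):
    """S_hat(t | z0) = exp(-Lambda0_hat(t) * exp(beta' z0)) at each event time."""
    scale = math.exp(sum(b * z for b, z in zip(beta, z0)))
    return [math.exp(-h * scale) for h in cumhaz]
```

With `beta` equal to zero the estimate reduces to the Nelson-Aalen estimator, which gives a quick sanity check. The resulting step function can then be inverted at the level α times its value at t0 to obtain the quantile residual lifetime.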
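Proof of Theorem 2 By simple calculation,

$$\begin{aligned}
\sqrt{n}\, \hat U(\hat\theta_\alpha) &= \sqrt{n}\, \hat U(\theta_\alpha) + \sqrt{n}\big(U(\hat\theta_\alpha) - U(\theta_\alpha)\big) + o_p(1) \\
&= \sqrt{n}\, \hat U(\theta_\alpha) - f(t_0 + \theta_\alpha|z_0) \sqrt{n}(\hat\theta_\alpha - \theta_\alpha) + o_p(1),
\end{aligned}$$

which means that

$$\sqrt{n}(\hat\theta_\alpha - \theta_\alpha) = f(t_0 + \theta_\alpha|z_0)^{-1} \sqrt{n}\, \hat U(\theta_\alpha) + o_p(1),$$

where

$$\begin{aligned}
\sqrt{n}\, \hat U(\theta_\alpha) &= \sqrt{n}\big[\hat S(t_0 + \theta_\alpha|z_0) - \alpha \hat S(t_0|z_0)\big] \\
&= \sqrt{n}\big[\hat S(t_0 + \theta_\alpha|z_0) - S(t_0 + \theta_\alpha|z_0) - \alpha\big(\hat S(t_0|z_0) - S(t_0|z_0)\big)\big].
\end{aligned}$$

By the fact $S(t_0 + \theta_\alpha|z_0) = \alpha S(t_0|z_0)$, the distribution of $\sqrt{n}\, \hat U(\theta_\alpha)$ can be approximated by

$$\begin{aligned}
&-\alpha S(t_0|z_0) \{\tilde W(t_0 + \theta_\alpha) - \tilde W(t_0)\} \\
&\quad = -\alpha S(t_0|z_0) \frac{1}{\sqrt{n}} \sum_{i=1}^n \left[ \int_{t_0}^{t_0 + \theta_\alpha} \frac{\exp(\beta_0^T z_0(u))}{S^{(0)}(\beta_0, u)}\, dM_i(u) + \big(h(t_0 + \theta_\alpha) - h(t_0)\big)^T \Sigma^{-1} \int_0^\infty \{Z_i(u) - \bar Z(\beta_0, u)\}\, dM_i(u) \right].
\end{aligned}$$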
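By the martingale central limit theorem, it is easy to show that $\tilde W(t_0 + \theta_\alpha) - \tilde W(t_0)$ converges weakly to a zero-mean Gaussian limit with covariance function

$$\Gamma = \int_{t_0}^{t_0 + \theta_\alpha} \frac{\exp(2\beta_0^T z_0(u))}{S^{(0)}(\beta_0, u)}\, d\Lambda_0(u) + \big(h(t_0 + \theta_\alpha) - h(t_0)\big)^T \Sigma^{-1} \big(h(t_0 + \theta_\alpha) - h(t_0)\big),$$

and thereby

$$\sqrt{n}\, \hat U(\theta_\alpha) \xrightarrow{L} N\big(0, \alpha^2 S(t_0)^2 \Gamma\big).$$

Therefore the asymptotic variance of $\sqrt{n}(\hat\theta_\alpha(t_0) - \theta_\alpha(t_0))$ is

$$\sigma^2 = \frac{\alpha^2 S(t_0|z_0)^2\, \Gamma}{f(t_0 + \theta_\alpha|z_0)^2}. \qquad (7.2)$$

Then the conclusion of Theorem 2 can be proved.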
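The linearization at the start of the proof of Theorem 2 can be spelled out as follows. This is a sketch under assumptions that the proof leaves implicit: $S(\cdot|z_0)$ is differentiable near $t_0 + \theta_\alpha$ with density $f = -\partial S / \partial t$ continuous and positive there, and $\hat\theta_\alpha$ solves $\hat U(\hat\theta_\alpha) = 0$ up to a term of order $o_p(n^{-1/2})$. The intermediate point $\tilde\theta_\alpha$ is notation introduced only for this sketch.

```latex
\begin{align*}
\hat U(\hat\theta_\alpha)
  &= \hat U(\theta_\alpha)
   + \bigl[U(\hat\theta_\alpha) - U(\theta_\alpha)\bigr]
   + \bigl[\{\hat U(\hat\theta_\alpha) - U(\hat\theta_\alpha)\}
         - \{\hat U(\theta_\alpha) - U(\theta_\alpha)\}\bigr], \\
\sqrt{n}\,\bigl[\{\hat U(\hat\theta_\alpha) - U(\hat\theta_\alpha)\}
         - \{\hat U(\theta_\alpha) - U(\theta_\alpha)\}\bigr]
  &= o_p(1) \quad \text{(Lemma 2)}, \\
U(\hat\theta_\alpha) - U(\theta_\alpha)
  &= S(t_0 + \hat\theta_\alpha \mid z_0) - S(t_0 + \theta_\alpha \mid z_0)
   = -f(t_0 + \tilde\theta_\alpha \mid z_0)\,(\hat\theta_\alpha - \theta_\alpha),
\end{align*}
```

where $\tilde\theta_\alpha$ lies between $\hat\theta_\alpha$ and $\theta_\alpha$ (mean value theorem). Consistency of $\hat\theta_\alpha$ and continuity of $f$ allow $f(t_0 + \tilde\theta_\alpha|z_0)$ to be replaced by $f(t_0 + \theta_\alpha|z_0)$, and setting $\sqrt{n}\,\hat U(\hat\theta_\alpha) = o_p(1)$ then gives the stated representation of $\sqrt{n}(\hat\theta_\alpha - \theta_\alpha)$.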
References

Andersen PK, Gill RD (1982) Cox's regression model for counting processes: a large sample study. Ann Stat 10:1100–1120
Breslow NE (1972) Discussion of paper of D. R. Cox. J R Stat Soc B 34:216–217
Chen YQ, Cheng S (2006) Linear life expectancy regression with censored data. Biometrika 93:303–313
Chen YQ, Jewell NP, Lei X, Cheng SC (2005) Semiparametric estimation of proportional mean residual life model in presence of censoring. Biometrics 61:170–178
Chiang CL (1960) A stochastic study of the life table and its applications: I. Probability distributions of the biometric functions. Biometrics 16:618–635
Cox DR (1972) Regression models and life-tables (with discussion). J R Stat Soc B 34:187–220
Cox DR (1975) Partial likelihood. Biometrika 62:269–276
Gelfand AE, Kottas A (2003) Bayesian semiparametric regression for median residual life. Scand J Stat 30:651–665
Gupta RC, Langford ES (1984) On the determination of a distribution by its median residual life function: a functional equation. J Appl Probab 21:120–128
Hosmer D, Lemeshow S (1998) Applied survival analysis: regression modeling of time to event data. Wiley, New York
Jeong JH, Fine JP (2009) A note on cause-specific residual life. Biometrika 96:237–242
Jeong JH, Jung SH, Costantino JP (2008) Nonparametric inference on median residual life function. Biometrics 64:157–163
Jung SH, Jeong JH, Bandos H (2009) Regression on quantile residual life. Biometrics 65:1203–1212
Kosorok MR (2006) Introduction to empirical processes and semiparametric inference. Springer, New York
Ma Y, Yin G (2010) Semiparametric median residual life model and inference. Can J Stat 34:665–679
Schmittlein D, Morrison D (1981) On individual-level inference in job duration research: a reexamination of the Wisconsin School Superintendents Study. Adm Sci Q 26:84–89
Zeng D, Lin DY (2008) Efficient resampling methods for nonsmooth estimating functions. Biostatistics 9:355–363