Lifetime Data Anal DOI 10.1007/s10985-015-9325-0
Estimating the survival function based on the semi-Markov model for dependent censoring Ziqiang Zhao1 · Ming Zheng1 · Zhezhen Jin2
Received: 10 March 2014 / Accepted: 7 March 2015 © Springer Science+Business Media New York 2015
Abstract In this paper, we study a nonparametric maximum likelihood estimator (NPMLE) of the survival function based on a semi-Markov model under dependent censoring. We show that the NPMLE is asymptotically normal and achieves asymptotic nonparametric efficiency. We also provide a uniformly consistent estimator of the corresponding asymptotic covariance function based on an information operator. The finite-sample performance of the proposed NPMLE is examined with simulation studies, which show that the NPMLE has smaller mean squared error than the existing estimators and its corresponding pointwise confidence intervals have reasonable coverages. A real example is also presented. Keywords Semi-Markov model · Dependent censoring · NPMLE · Survival function
Electronic supplementary material The online version of this article (doi:10.1007/s10985-015-9325-0) contains supplementary material, which is available to authorized users.
B
Zhezhen Jin
[email protected] Ziqiang Zhao
[email protected] Ming Zheng
[email protected] 1
Department of Statistics, School of Management, Fudan University, 670 Guoshun Road, Shanghai, China
2
Department of Biostatistics, Mailman School of Public Health, Columbia University, 722 West 168th Street, New York, NY, USA
123
Z. Zhao et al.
1 Introduction In survival analysis, the survival function of the failure time is commonly estimated by the Kaplan-Meier estimator and the Nelson-Aalen estimator. For these two estimators, a key assumption is that the censoring time and the survival time are independent. It is challenging to estimate the survival function under dependent censorship (Tsiatis 1975). In oncology studies, independent dropout yields independent censoring. In addition, subjects are often censored due to the ending of the study or to the onset of progressive disease (PD). The study-ending censoring is usually independent of the survival time while the PD-related censoring might be dependent on the survival time since the PD could be a precursor of both death and lost of follow-up. In other words, in the presence of PD, the patients have a much higher risk of death and are more likely to leave the study. Ignoring such dependence would yield biased and inconsistent estimation of the survival function. On the other hand, it is possible to improve the estimation if the dependency information is properly used. Datta et al. (2000) considered nonparametric estimation using a three-stage irreversible illness–death model. In the case of dependent censoring, Lee and Tsai (2005) proposed a semi-Markov model and developed an empirical-type estimator of the survival function. The asymptotic variance of their proposed estimator, however, is too complicated to compute. In this paper, we present a nonparametric maximum likelihood estimator (NPMLE) of the survival function based on the semi-Markov model. We show that our proposed NPMLE converges weakly to a Gaussian process and achieves asymptotic efficiency. In addition, we develop a consistent estimator of its asymptotic covariance function based on an information operator, which can be easily calculated. The remainder of the paper is organized as follows. In Sect. 2, we will introduce the semi-Markov model. In Sect. 3, we will derive the NPMLE of the survival function. In Sect. 4, we will establish the asymptotic properties of the NPMLE, and construct a consistent estimator of its asymptotic covariance function based on an information operator. In Sect. 5, we will present simulation studies and re-analysis of the example in Lee and Tsai (2005). We will conclude with a short discussion in Sect. 6 and provide sketches of the proofs of theorems in the Appendix.
2 The semi-Markov model Let T be the survival time and U be the PD censoring time. The semi-Markov model proposed by Lee and Tsai (2005) assumes that: (0,2)
λT|U (t|u) = λ0
(1,2)
(t) I {t u} + λ0
(t − u) I {t > u} , for t, u 0,
(1)
where λT|U is the conditional hazard function of T given U, I {·} is the indicator function (0,2) (1,2) taking the value 1 if the condition is satisfied and the value 0 otherwise, λ0 and λ0 are unknown hazard functions for death without PD and death with PD respectively.
123
Semi-Markov model for dependent censoring Fig. 1 Transition
Note that the cumulative form of Model (1) is: (0,2)
ΛT|U (t|u) = Λ0
(1,2)
(min {t, u}) + Λ0
and that for any t, u 0, ST|U (t|u) = (0,1)
(0,2)
(t − u) I {t > u} , for t, u 0,
⎧ ⎨ S(0,2) (t)
t u,
⎩ S(0,2) (u) S(1,2) (t − u) 0 0
t > u,
0
(1,2)
(0,1)
(0,2)
(2)
(1,2)
and S0 , S0 , S0 are the corresponding cause where Λ0 , Λ0 , Λ0 specific cumulative hazard functions and the corresponding cause specific survival (0,1) (0,2) (1,2) functions for λ0 , λ0 and λ0 , respectively. The superscript values 0, 1, 2 indicate three different states, with 0 being the state of alive without PD, 1 being the state of alive with PD and 2 being the state of death. More precisely, Model (1) corresponds to a non-homogeneous semi-Markov process J with the state space {0, 1, 2}: ⎧ ⎨ 0 T > t, U > t (Alive without PD at time t) J (t) = 1 T > t, U t (Alive with PD at time t), for t 0 . ⎩ 2Tt (Died at time t) (0,1)
denotes the hazard function of U, i.e. the hazard With this specification, if λ0 (0,1) (0,2) (1,2) function for PD, it can be shown that λ0 , λ0 and λ0 are respectively the cause specific hazard functions for the transition from state 0 to state 1, state 0 to state 2, and state 1 to state 2. Figure 1 provides a plot of transition for illustration. In addition, we use C to denote independent censoring, which is independent of (T, U). Let X = min {T, C} and Δ = I {T C}. When X U, the subject is dead or censored before PD, i.e., (X, Δ) is observed while U is not. When X > U, PD occurs before both death and the independent censoring C and so the subject would leave the study at time U with a probability θ0 , i.e., U is observed and (X, Δ) is observed with probability θ0 , where θ0 is an unknown parameter.
123
Z. Zhao et al.
Let V = min {X, U} , R = I {X U} and ξ be the indicator of observing (X, Δ). The observed data for a subject is (V, R, ξ, ξ X, ξ Δ). Throughout the paper, it is assumed that ξ is conditionally independent of (T, U, C) given R with P (ξ = 1|R = 1) = 1 and P (ξ = 1|R = 0) = θ0 . For a sample of size n, we observe (Vi , Ri , ξi , ξi Xi , ξi Δi ) , i = 1, . . . , n.
3 Nonparametric maximum likelihood estimation In general, without additional information, likelihood based estimation is not available for dependent censoring problems. Using the semi-Markov model specification, it is possible to develop likelihood based estimation. Next, we present the likelihood for the unknown parameters:
Λ(0,1) , Λ(0,2) , Λ(1,2) , Λ(c) , θ , (c)
where Λ(c) is the parameter for Λ0 which is the true cumulative hazard function of C. Note that Λ(c) and θ are nuisance parameters. Suppose that Λ(0,1) , Λ(0,2) , Λ(1,2) and Λ(c) are differentiable with corresponding derivatives: λ(0,1) , λ(0,2) , λ(1,2) and λ(c) . Let S(0,1) , S(0,2) , S(1,2) and S(c) be the corresponding survival functions of Λ(0,1) , Λ(0,2) , Λ(1,2) and Λ(c) . With these notations, (0,1) , Λ(0,2) , Λ(1,2) , Λ(c) , θ can be derived. the likelihood of Λ For subjects with ξi = 1, (Vi , Xi , Δi , Ri , ξi ) is observed and its corresponding likelihood can be obtained based on the joint distribution of (V, X, Δ, R, ξ ). Specifically, when ξi = 1, Ri = 1 and Δi = 1, the likelihood is: λ(0,2) (Vi ) S(0,2) (Vi ) S(c) (Vi ) S(0,1) (Vi ) ; When ξi = 1, Ri = 1 and Δi = 0, the likelihood is: λ(c) (Vi ) S(c) (Vi ) S(0,2) (Vi ) S(0,1) (Vi ) ; When ξi = 1, Ri = 0 and Δi = 1, the likelihood is: λ(1,2) (Xi − Vi ) S(1,2) (Xi − Vi ) S(c) (Xi ) λ(0,1) (Vi ) S(0,1) (Vi ) S(0,2) (Vi ) θ ; When ξi = 1, Ri = 0 and Δi = 0, the likelihood is: λ(c) (Xi ) S(c) (Xi ) S(1,2) (Xi − Vi ) λ(0,1) (Vi ) S(0,1) (Vi ) S(0,2) (Vi ) θ. For subjects with ξi = 0, (Vi , Ri , ξi ) is observed and its corresponding likelihood is: λ(0,1) (Vi ) S(0,1) (Vi ) S(0,2) (Vi ) S(c) (Vi ) (1 − θ ) .
123
Semi-Markov model for dependent censoring
Hence, the likelihood based on the observed data (Vi , Ri , ξi , ξi Xi , ξi Δi ) , i = 1, . . . , n, is proportional to ⎡ n
(1−Ri ) ⎣ λ(0,1) (Vi ) i=1
⎤ 1 − Λ(0,1) (dy) ⎦
y∈[0,Vi )
⎡
n
Δi Ri ⎣ λ(0,2) (Vi ) ×
⎤ 1 − Λ(0,2) (dy) ⎦
y∈[0,Vi )
i=1
⎡
n
Δi ⎣ λ(1,2) (Xi − Vi ) ×
⎤ξi (1−Ri ) 1 − Λ(1,2) (dy) ⎦ .
y∈[0,Xi −Vi )
i=1
Unfortunately, this function is unbounded from above and the usual maximum like(0,1) (0,2) (1,2) lihood estimator (MLE) does not exist when Λ0 , Λ0 and Λ0 are restricted (0,1) (0,2) and to continuous functions. With discretized extensions by allowing Λ0 , Λ0 (1,2) to be discontinuous, the NPMLE is well defined in the sense of Kiefer and Λ0 (0,1) (0,2) and Wolfowitz (1956) and Scholz (1980). Specifically, we assume that Λ0 , Λ0 (1,2) are cadlag, piecewise constant and right continuous with left limits. Λ0 To obtain the NPMLE, we rewrite the likelihood function with the discretized Λ(0,1) , Λ(0,2) and Λ(1,2) : (0,2) (1,2) (0,1) Λ(0,1) Ln Λ(0,2) Ln Λ(1,2) , Ln Λ(0,1) , Λ(0,2) , Λ(1,2) = Ln where for any cumulative hazard function Λ, (0,1) Ln (Λ)
⎡ n ⎣ Λ {Vi } (1−Ri ) = i=1
(0,2) Ln (Λ)
⎡
⎤ (1 − Λ (dy))⎦ ,
y∈[0,Vi )
n ⎣ Λ {Vi } Δi Ri (1 − Λ {Vi })1−Δi Ri = i=1
i=1
(1 − Λ (dy))⎦ ,
y∈[0,Vi )
(1,2) Ln (Λ)
⎡ n ⎣ Λ {Xi −Vi } Δi (1−Λ {Xi − Vi })1−Δi =
⎤
⎤ξi (1−Ri ) (1 − Λ (dy))⎦
,
y∈[0,Xi−Vi )
with Λ {t} = Λ (t) − Λ (t−) for all t 0. (0,1) (0,2) (c) does not share jump points with Λ0 and Λ0 , It is also assumed that Λ0 which means that the time of PD censoring occurrence is different from the time of death or the time of independent censoring. Then, for any cumulative hazard function Λ,
123
Z. Zhao et al.
(0,1)
log Ln
+∞
(Λ) 0
0
= (0,2)
log Ln
0
(1,2)
log Ln
+∞
0
(dt) +
(0)
Yn (t+) log [1 − Λ {t}]
(5)
(0,2)
(0) (0,2) ˆ {t} {t} + log 1 − Λ log Λˆ (0,2) N Y (dt) (t+) n n n n
(6)
Λˆ (0,2) , n (1,2)
log [Λ {t}] Nn
(dt) +
(1)
Yn (t+) log [1 − Λ {t}]
(7)
t 0
+∞
0
=
(0,2)
t 0
+∞
(4)
Λˆ (0,1) , n
log [Λ {t}] Nn
(0,2) log Ln
(Λ)
(3)
t 0
0
=
(0)
Yn (t+) log [1 − Λ {t}]
t 0
+∞
(0,1)
(0) {t} Nn (dt) + {t} log Λˆ (0,1) Yn (t+) log 1 − Λˆ (0,1) n n
(0,1) log Ln
(Λ)
(dt) +
t 0
+∞
(0,1)
log [Λ {t}] Nn
(1,2)
(1) {t} Nn (dt) + {t} log Λˆ (1,2) Yn (t+) log 1 − Λˆ (1,2) n n
(1,2) log Ln
t 0
Λˆ (1,2) , n
(8)
where for any t 0, (0)
n
(0,1)
i=1 n
Yn (t) = Nn
(1,2)
Nn
(t) = (t) =
i=1 n
(1)
I {Vi t} , Yn (t) =
(0,2)
(1 − Ri ) I {Vi t} , Nn
Λˆ (0,2) (t) = n Λˆ n(1,2) (t) =
(t) =
n
Δi Ri I {Vi t} ,
i=1
ξi Δi (1 − Ri ) I {Xi − Vi t} ,
ξi (1 − Ri ) I {Xi − Vi t} ,
i=1
i=1
Λˆ n(0,1) (t) =
n
[0,t]
[0,t]
[0,t]
(0)
−1 (0) (0,1) I Yn (y) > 0 Nn (dy) ,
(0)
−1 (0) (0,2) I Yn (y) > 0 Nn (dy) ,
(1)
−1 (1) (1,2) I Yn (y) > 0 Nn (dy) .
Yn (y) Yn (y) Yn (y)
The equalities for (3), (5), and (7) hold if and only if Λ is a pure jump function. The inequalities in (4), (6), and (8) follow from the fact that for any λ ∈ (0, 1) and any α, β > 0,
123
Semi-Markov model for dependent censoring
α log λ + β log (1 − λ) α log (α/(α + β)) + β log (β/(α + β)) . (0,1) (0,2) (1,2) is the NPMLE of According to Criteria (B) of Scholz (1980), Λˆ n , Λˆ n , Λˆ n (0,1) (0,2) (1,2) (0,1) (0,2) (1,2) Λ , since Ln Λˆ n , Λˆ n , Λˆ n > 0. ,Λ ,Λ 0
0
0
Let S0 be the marginal survival function of T. Note that for any t 0,
(0,1)
S0 (t) = P (T > t) = −
ST|U (t|u) S0 (du) R+ (0,1) (0,2) (0,2) (1,2) (0,1) = S0 (t) S0 (t) − S0 (u) S0 (t − u) S0 (du) . [0,t]
Correspondingly, for any t 0, define Sˆ n (t) = Sˆ n(0,1) (t) Sˆ (0,2) (t) − n
[0,t]
Sˆ (0,2) (u) Sˆ n(1,2) (t − u) Sˆ n(0,1) (du) , n
where for any t 0, Sˆ n(0,1) (t) =
1 − Λˆ n(0,1) (dy) , Sˆ (0,2) 1 − Λˆ (0,2) (t) = (dy) , n n y∈[0,t]
y∈[0,t]
1 − Λˆ n(1,2) (dy) . Sˆ n(1,2) (t) = y∈[0,t]
(0,1) (0,2) (1,2) is the NPMLE of By the invariance property, it follows that Sˆ n , Sˆ n , Sˆ n (0,1) (0,2) (1,2) S0 , S0 , S0 and Sˆ n is the NPMLE of S0 .
4 Asymptotic results In this section, we present the asymptotic properties of the NPMLE. 4.1 Regularity conditions We list regularity conditions in this subsection. (0) (0,1) (0,2) (c) For any t 0, define L0 (t) = S0 (t) S0 (t) S0 (t) and (1)
(1,2)
L0 (t) = −θ0 S0
(t)
R+
(0,2)
S0
(c)
(0,1)
(u) S0 (u + t) S0
(du) .
The regularity conditions are as follows: A.1 For a sample of size n, (Ti , Ui , Ci , Xi , Δi , Vi , Ri , ξi ) , i = 1, . . . , n are n independent copies of (T, U, C, X, Δ, V, R, ξ ).
123
Z. Zhao et al. (1,2)
A.2 Model (2) holds with Λ0 (0) = 0. A.3 The censoring time C is independent of (T, U). A.4 The indicator ξ is conditionally independent of (T, U, C) given R, with P (ξ = 1|R = 1) = 1 and P (ξ = 1|R = 0) = θ0 ∈ (0, 1) . A.5 There exists τ > 0 such that L0(0) (τ −) > 0 and L0(1) (τ −) > 0. (0,1) (0,2) (0,1) (c) A.6 For any t ∈ [0, τ ] , Λ0 {t} Λ0 {t} = Λ0 {t} Λ0 {t} = 0. Assumption A.1 is commonly satisfied. Assumption A.2 is a technical condition for model specification. Assumption A.3 indicates that C is independent censoring. Assumption A.4 assumes that there is a subgroup of the potentially dependent censored (0) subjects whose (X, Δ) are observed. Assumption A.5 is equivalent to L0 (0) > 0 and (1) L0 (0) > 0. The required τ is generally not unique and could be chosen as large as possible. Assumption A.6 is a mild technical condition which is weaker than continuity. 4.2 Asymptotic normality In this subsection, we establish the asymptotic normality and the asymptotic efficiency of the NPMLE estimators. The asymptotic normality is established by convergence to a tight Gaussian process in the space of uniformly bounded funcˆ element in ∞ ([0, τ ]), and tions on [0, τ ]. Specifically, Sn is viewed as a random (0,1) (0,2) (1,2) (0,1) (0,2) (1,2) and Sˆ n , Sˆ n , Sˆ n are viewed as random elements in Λˆ n , Λˆ n , Λˆ n ∞ ∞ ∞ ∞
∞ 3 ([0, τ ]) = ([0, τ ]) × ([0, τ ]) × ([0, τ ]), where ([0, τ ]) denotes the space of all uniformly bounded real valued functions defined on [0, τ ] equipped with the uniform norm. The asymptotic efficiency is shown by convolution theorem. (0,1) (0,2) (1,2) . The first theorem gives the asymptotic normality for Λˆ n , Λˆ n , Λˆ n
Theorem 1 Under assumptions A.1–A.6, (0,1) (0,2) ˆ (1,2) (1,2) n 1/2 Λˆ n(0,1) − Λ0 , Λˆ (0,2) − Λ , Λ − Λ n n 0 0 weakly converges to a tight zero-mean Gaussian process in ∞ 3 ([0, τ ]) with covariance function U0 , as n → ∞, where for any t, s ∈ [0, τ ], (0,1) (0,2) (1,2) U0 (t, s) = diag U0 (t, s) , U0 (t, s) , U0 (t, s) , in which, for any (i, j) ∈ {(0, 1) , (0, 2) , (1, 2)} and any t, s ∈ [0, τ ], −1 (i, j) (i, j) (i, j) (i) L0 (y−) 1 − Λ0 {y} Λ0 (dy) . U0 (t, s) = [0,min{t,s}]
(0,1) (0,2) (1,2) . The second theorem gives the asymptotic normality for Sˆ n , Sˆ n , Sˆ n
123
Semi-Markov model for dependent censoring
Theorem 2 Under assumptions A.1–A.6, (0,1) (0,2) (1,2) − S0 , Sˆ n(1,2) − S0 n 1/2 Sˆ n(0,1) − S0 , Sˆ (0,2) n weakly converges to a tight zero-mean Gaussian process in ∞ 3 ([0, τ ]) with covariance function V0 , as n → ∞, where for any t, s ∈ [0, τ ], (0,1) (0,2) (1,2) V0 (t, s) = diag V0 (t, s) , V0 (t, s) , V0 (t, s) , in which, for any (i, j) ∈ {(0, 1) , (0, 2) , (1, 2)} and any t, s ∈ [0, τ ], (i, j)
(i, j)
V0
(t, s) = S0 (i, j) W0 (t, s) =
(i, j)
(i, j)
(t) S0 (s) W0 (t, s), −1 −1 (i) (i, j) (i, j) L0 (y−) 1 − Λ0 {y} Λ0 (dy) .
[0,min{t,s}]
The next theorem establishes the asymptotic normality for Sˆn . Theorem 3 Under assumptions A.1–A.6, n 1/2 (Sˆ n − S0 ) weakly converges to a tight zero-mean Gaussian process in ∞ ([0, τ ]) with covariance function Ω0 , as n → ∞, where for any t, s ∈ [0, τ ], (0,1)
Ω0 (t, s) = Ω0 (0,1) Ω0 (t, s) =
(0,2)
(1,2)
(t, s) + Ω0 (t, s) + Ω0 (t, s) , (0,1) (0,2) (0,2) V0 (x−, y−) S0 (dy) S0 (d x)
(0,t] (0,s] (0,2)
− S0
(0)
(0,2)
(0,1)
(0,t] (0,s]
V0
(1,2)
(x−, s − y) S0
(0,1)
(0,2)
(dy) S0
(0,2)
(d x)
(1,2)
− S0 V0 (0) (t − x, y−) S0 (dy) S0 (d x) (0,t] (0,s]
2 (0,2) (0,1) (1,2) (1,2) + S0 V0 (0) (t − x, s − y) S0 (dy) S0 (d x), (0,2)
Ω0
(0,1)
(t, s) = S0
(0,1)
− S0
(0,1)
− S0
(1,2)
Ω0
(t, s) =
(t)
(0,t] (0,s] (0,2)
(s) V0
[0,s]
(s)
+
(0,1)
(t) S0
[0,t]
[0,t] [0,s]
(0,2)
(t, y) S0
(0,2)
(x, s) S0
V0
V0
(0,2)
[0,t] [0,s]
V0
(1,2)
V0
(t, s) (1,2)
(s − y) S0
(1,2)
(t − x) S0
(1,2)
(x, y) S0
(0,1)
(dy)
(0,1)
(d x)
(1,2)
(t − x) S0 (0,2)
(t − x, s − y) S0
(0,2)
(x) S0
(0,1)
(s − y) S0
(0,1)
(y) S0
(0,1)
(dy) S0 (0,1)
(dy) S0
(d x) ,
(d x) .
(0,1) (0,1) (0,2) (0,2) (1,2) (1,2) It is worth noting that, although Λˆ n − Λ0 , Λˆ n − Λ0 and Λˆ n − Λ0 themselves are martingales, they do not form a joint martingale, since no common σ field is available. Hence, the martingale limit theory cannot be directly used here. The proof of the above theorems are based on empirical process theory and are provided in the Appendix.
123
Z. Zhao et al.
4.3 Asymptotic efficiency In this subsection, we turn to the asymptotic nonparametric efficiency, which is formally defined by the convolution theorem (Theorem VIII.3.1 of Andersen et al. 1993). We explicitly specify the Hilbert space required by the convolution theorem. Define H = H(0,1) × H(0,2) × H(1,2) , where for (i, j) ∈ {(0, 1) , (0, 2) , (1, 2)}, H(i, j) = h : [0,τ ]
−1 (i, j) (i, j) (i) Λ0 (dy) < +∞ . [h (y)]2 L0 (y−) 1 − Λ0 {y}
For any g = g (0,1) , g (0,2) , g (1,2) , h = h (0,1) , h (0,2) , h (1,2) ∈ H, define g, hH =
(i, j) [0,τ ]
−1 (i, j) (i, j) (i) g (i, j) (y) h (i, j) (y) L0 (y−) 1 − Λ0 {y} Λ0 (dy) ,
where the summation is taken over {(0, 1) , (0, 2) , (1, 2)}. For any h ∈ H, define 1/2 hH = h, hH . It can be shown that H is a Hilbert space with the inner product ·, ·H and the norm ·H . For any (i, j) ∈ {(0, 1) , (0, 2) , (1, 2)} and any h = h (0,1) , h (0,2) , h (1,2) ∈ H, define (i, j) (i, j) 1 + n −1/2 h (i, j) (y) Λ0 (dy) , Λn,h (t) = [0,t]
(0,1) (0,2) (1,2) (0,1) (0,2) (1,2) is: the likelihood ratio of Λn,h , Λn,h , Λn,h to Λ0 , Λ0 , Λ0 Rn (h) = Rn(0,1) (h) R(0,2) (h) Rn(1,2) (h) , n where for any h = h (0,1) , h (0,2) , h (1,2) ∈ H, log Rn(0,1) (h) ⎡
=
n
(1 − Ri ) log 1 + n −1/2 h (0,1) (Vi )
i=1
⎤ (0,1) (0,1) ⎣log 1 − Λn,h (dy) − log 1 − Λ0 (dy) ⎦ , + i=1 y∈[0,Vi ) y∈[0,Vi ) n
Δi Ri log 1 + n −1/2 h (0,2) (Vi ) log R(0,2) (h) = n n
i=1
+
n i=1
(0,2) (0,2) (1 − Δi Ri ) log 1 − Λn,h {Vi } − log 1 − Λ0 {Vi }
⎤ (0,2) (0,2) ⎣log 1 − Λn,h (dy) − log 1 − Λ0 (dy) ⎦ , + i=1 y∈[0,Vi ) y∈[0,Vi ) n
123
⎡
Semi-Markov model for dependent censoring
log Rn(1,2) (h) =
n
ξi (1 − Ri ) Δi log 1 + n −1/2 h (1,2) (Xi − Vi )
i=1
+ −
n i=1 n
(1,2) ξi (1 − Ri ) (1 − Δi ) log 1 − Λn,h {Xi − Vi } (1,2) ξi (1 − Ri ) (1 − Δi ) log 1 − Λ0 {Xi − Vi }
i=1
+
n
−
(1,2) 1 − Λn,h (dy)
y∈[0,Xi −Vi )
i=1 n
ξi (1 − Ri ) log
ξi (1 − Ri ) log
(1,2) 1 − Λ0 (dy) .
y∈[0,Xi −Vi )
i=1
Theorem 4 Under assumptions A.1–A.6, for any m > 0, h 1 , . . . , h m ∈ H, (log Rn (h 1 ) , . . . , log Rn (h m )) weakly converges to a Gaussian random vector in Rm with mean −
1 h 1 2H , . . . , h m 2H 2
and covariance matrix ⎛
h 1 , h 1 H ⎜ .. ⎝ .
h m , h 1 H
... .. . ···
⎞ h 1 , h m H ⎟ .. ⎠. .
h m , h m H
Hence, the information operator I0 : H × H → R is: for any g, h ∈ H, I0 (g, h) = g, hH . For any h = h (0,1) , h (0,2) , h (1,2) ∈ H, define
κˆ n (h) = [0,τ ]
(0,2) (0,2) (1,2) (1,2) ˆ ˆ Λ Λ . h (0,1) (y) Λˆ (0,1) (dy)+h (y) (dy)+h (y) (dy) n n n
Note that for any g, h ∈ H, the asymptotic covariance of κˆ n (g) and κˆ n (h) is (i, j) [0,τ ]
−1 (i, j) (i, j) (i) g (i, j) (y) h (i, j) (y) L0 (y−) 1 − Λ0 {y} Λ0 (dy)
(9)
123
Z. Zhao et al.
and can be interpreted as the “inverse” of the information operator I0 , where the over {(0, 1) , (0, 2) , (1, 2)}. The efficiency result for summation is taken (0,1) ˆ (0,2) ˆ (1,2) ˆ and Sˆ n is stated in the following theorem. Λn , Λn , Λn (0,1) (0,2) (1,2) and Sˆ n are efficient. Theorem 5 Under assumptions A.1–A.6, Λˆ n , Λˆ n , Λˆ n The proof of the above theorems are given in the Appendix. 4.4 Asymptotic covariance function estimation To carry out statistical inference, it is necessary to estimate the asymptotic covariance function. In this subsection, we provide uniform consistent estimators for the asymptotic covariance functions. For any g = g (0,1) , g (0,2) , g (1,2) , h = h (0,1) , h (0,2) , h (1,2) ∈ H, define Iˆ n (g, h) =
(i, j) [0,τ ]
−1 (i, j) (i, j) g (i, j) (y) h (i, j) (y) 1 − Λˆ n {y} Nn (dy)
where the summation is taken over {(0, 1) , (0,2) , (1, 2)}. For any g = g (0,1) , g (0,2) , g (1,2) , h = h (0,1) , h (0,2) , h (1,2) ∈ H, it can be shown that Iˆ n (g, h) is the negative second order directional derivative of log Ln at (0,1) (0,2) (1,2) along the direction Gn , Hn , where Λˆ n , Λˆ n , Λˆ n (1,2) ˆ , Gn = g g d Λn g Hn = , h (1,2) d Λˆ n(1,2) . h (0,1) d Λˆ n(0,1) , h (0,2) d Λˆ (0,2) n
(0,1)
d Λˆ n(0,1) ,
(0,2)
d Λˆ n(0,2) ,
(1,2)
Since the function Ln plays the role of the likelihood, Iˆ n should be a reasonable estimator for I0 . The “inverse” operation in (9) leads to an estimator for the asymptotic covariance function of κˆ n : (i, j) [0,τ ]
−1 (0) (i, j) (i, j) g (i, j) (y) h (i, j) (y) 1 − Λˆ n {y} Yn (y) Λˆ n (dy)
for all g = g (0,1) , g (0,2) , g (1,2) , h = h (0,1) , h (0,2) , h (1,2) ∈ H, where the summation is taken over {(0, 1) , (0, 2) , (1, 2)}. Hence, the estimators for U0 and V0 are given as follows: for any t, s ∈ [0, τ ], Uˆn (t, s) = diag Uˆn(0,1) (t, s) , Uˆn(0,2) (t, s) , Uˆn(1,2) (t, s) , Vˆ n (t, s) = diag Vˆ n(0,1) (t, s) , Vˆ n(0,2) (t, s) , Vˆ n(1,2) (t, s) ,
123
Semi-Markov model for dependent censoring
where for any (i, j) ∈ {(0, 1) , (0, 2) , (1, 2)} and any t, s ∈ [0, τ ], (i, j) (i, j) (i, j) ˆ n(i, j) (t, s) , Vˆ n (t, s) = Sˆ n (t) Sˆ n (s) W −1 (i) (i, j) (i, j) (i, j) Λˆ n (dy) , 1 − Λˆ n {y} Yn (y) Uˆn (t, s) = n [0,min{t,s}] −1 −1 (i) (i, j) (i, j) ˆ n(i, j) (t, s) = n Λˆ n (dy) . 1 − Λˆ n {y} W Yn (y) [0,min{t,s}]
The estimator for Ω0 is given as Ωˆ n : for any t, s ∈ [0, τ ], Ωˆ n (t, s) = Ωˆ n(0,1) (t, s) + Ωˆ n(0,2) (t, s) + Ωˆ n(1,2) (t, s) , where for any (i, j) ∈ {(0, 1) , (0, 2) , (1, 2)} and any t, s ∈ [0, τ ], Ωˆ n(0,1) (t, s) =
Vˆ n(0,1) (x−, y−) Sˆ (0,2) (dy) Sˆ (0,2) (d x) n n − Sˆ (0,2) Vˆ n(0,1) (x−, s − y) Sˆ (1,2) (0) (dy) Sˆ (0,2) (d x) n n n (0,t] (0,s] ˆ (1,2) (d x) − Sˆ (0,2) Vˆ n(0,1) (t − x, y−) Sˆ (0,2) (0) (dy) S n n n (0,t] (0,s]
2 + Sˆ (0,2) Vˆ n(0,1) (t − x, s − y) Sˆ (1,2) (0) (dy) Sˆ (1,2) (d x) , n n n (0,t] (0,s]
(0,t] (0,s]
Ωˆ n(0,2) (t, s) = Sˆ (0,1) (t) Sˆ (0,1) (s) Vˆ n(0,2) (t, s) n n Vˆ n(0,2) (t, y) Sˆ (1,2) − Sˆ (0,1) (t) (s − y) Sˆ (0,1) (dy) n n n [0,s] ˆ (0,1) (d x) − Sˆ (0,1) Vˆ n(0,2) (x, s) Sˆ (1,2) (s) (t − x) S n n n [0,t] + Vˆ n(0,2) (x, y) Sˆ (1,2) (t − x) Sˆ (1,2) (s − y) Sˆ (0,1) (dy) Sˆ (0,1) (d x) , n n n n [0,t] [0,s]
Ωˆ n(1,2) (t, s) =
[0,t] [0,s]
ˆ (0,2) (x) Sˆ (0,2) (y) Sˆ (0,1) (dy) Vˆ n(1,2) (t − x, s − y) S n n n
× Sˆ n(0,1) (d x) . The following theorem states the uniform consistency for Uˆn , Vˆ n and Ωˆ n . Theorem 6 Under assumptions A.1–A.6, as n → ∞, sup
Uˆn (t, s) − U0 (t, s) , sup
t,s∈[0,τ ]
sup
Vˆ n (t, s) − V0 (t, s) ,
t,s∈[0,τ ]
Ωˆ n (t, s) − Ω0 (t, s)
t,s∈[0,τ ]
converge to zero in probability. The proof of the theorem is given in the Appendix.
123
Z. Zhao et al.
Based on the asymptotic theory and the estimation of the asymptotic covariance function, various types of inference can be conducted. For example, for any fixed t0 ∈ [0, τ ], a 95 % pointwise confidence interval can be constructed as: 1/2 Sˆ n (t0 ) ± 1.96Ωˆ n (t0 , t0 ) .
5 Numerical studies 5.1 Simulation studies Four sets of simulation studies were carried out to examine the finite-sample performance of the proposed NPMLE. In the first and the third sets of the simulation studies, the independent censoring time C was specified as +∞. In the second and the fourth sets, the independent censoring time C was generated from a uniform distribution such that P (T C) = 0.15. In the first and the second sets of the simulation studies, the PD censoring time U was generated from Uniform (0, η), while in the third and the fourth sets, the PD censoring time U was generated from the exponential distribution with hazard rate η. In all simulation studies, the survival time T was generated as (1) if T(1) U T , T= (2) U+T if T(1) > U where T(1) was generated from the exponential distribution with hazard rate 1 and T(2) was generated from the exponential distribution with hazard rate λ. The parameters were set to be λ ∈ {1, 2} , P (U < T) ∈ {0.3, 0.7} and θ0 ∈ {0.3, 0.6, 1}. For each scenario, we generated 1000 samples of size n = 100. The simulation results were summarized in Tables 1, 2, 3, 4, 5, 6, 7 and 8, which report the empirical bias (Bias), the empirical standard deviation (Std), the square root of the empirical mean squared error (sqrt[MSE]) and the average estimated standard deviation (AES) of the proposed estimator, as well as the empirical coverage rate (CR) of the corresponding 95 % pointwise confidence interval, at the 0.3th, 0.5th and 0.7th quantiles of T. The values of Bias, Std, sqrt(MSE) and AES were multiplied by 10,000. For comparison, the results from the Kaplan–Meier approach were also included in the tables. In addition, the empirical bias, the empirical standard deviation and the square root of the empirical mean squared error of Lee and Tsai’s estimator were included in Tables 1, 2, 5 and 6. Because there is no valid estimator of the variance of Lee and Tsai’s estimator, the AES and the CR are not reported for the Lee and Tsai’s estimator. The Lee and Tsai’s estimator was not included in Tables 3, 4, 7 and 8, because it does not incorporate the additional independent censoring. For the cases with C = +∞, in which the Lee and Tsai’s approach is applicable, the proposed NPMLE and the Lee and Tsai’s estimator were nearly unbiased and consistent. The NPMLE had smaller sqrt(MSE) than the Lee and Tsai’s estimator for all three quantiles, the improvement ranged from 0.6 to 7.9 %. The coverage of the 95 % pointwise confidence interval for the proposed NPMLE were close to the nominal level.
123
1.0
1.0
2
2
0.7
0.5
0.3
0.7
0.5
0.3
0.7
0.5
0.3
0.7
0.5
0.3
0.7
0.5
0.3
0.7
0.5
0.3
τ
473.924
−11.000
449.504
440.306
−9.700
9.900
523.249
457.866
−7.299
478.814
480.614
1.588
507.639
−3.479
−11.359
−16.654
467.485
507.468
457.686
−8.600
−9.100
9.379
456.914
472.164
505.172
477.190
493.339
514.706
449.049
19.700
29.273
11.651
11.145
15.460
20.922
16.322
449.613
474.051
440.413
479.104
523.251
457.924
480.749
507.651
467.579
457.777
507.541
457.339
473.071
505.306
477.320
493.581
515.131
449.346
9.900
−11.000
−9.700
133.208
103.364
36.817
282.399
183.352
89.851
−9.100
−8.600
19.700
30.350
9.105
9.971
4.205
19.302
17.504
449.504
473.924
440.306
504.190
533.389
463.476
502.682
510.943
471.065
457.686
507.468
456.914
486.644
517.079
487.063
483.332
519.672
452.963
Std
449.613
474.051
440.413
521.490
543.312
464.936
576.575
542.845
479.558
457.777
507.541
457.339
487.589
517.160
487.165
483.350
520.030
453.301
sqrt(MSE)
Bias
sqrt(MSE)
Bias
Std
Kaplan–Meier estimator
Lee and Tsai’s estimator
456.035
497.740
456.149
481.332
508.829
459.002
505.049
518.393
460.533
455.098
497.407
454.647
479.243
510.235
460.309
497.773
519.932
464.366
AES
0.960
0.949
0.956
0.935
0.931
0.940
0.927
0.932
0.935
0.948
0.947
0.949
0.935
0.945
0.932
0.954
0.952
0.946
CR
9.806
−8.285
−10.120
−15.314
8.289
−5.320
−12.246
−2.961
12.260
−10.352
−9.648
17.716
25.349
10.361
9.911
20.245
18.855
15.095
Bias
NPMLE
430.754
460.149
426.256
464.329
507.991
448.094
477.239
502.958
464.082
437.898
481.701
444.797
461.293
497.356
473.493
487.241
507.437
444.900
Std
430.865
460.224
426.376
464.582
508.059
448.126
477.396
502.966
464.244
438.020
481.798
445.150
461.989
497.464
473.596
487.662
507.788
445.156
sqrt(MSE)
476.470
485.975
443.491
488.198
493.961
446.429
518.828
512.860
453.541
446.506
480.689
444.825
467.520
489.322
447.447
511.333
508.858
453.465
AES
0.970
0.957
0.952
0.961
0.943
0.943
0.965
0.946
0.944
0.951
0.953
0.940
0.946
0.942
0.928
0.966
0.950
0.948
CR
The AES and the CR are not reported for the Lee and Tsai’s estimator, since the corresponding variance estimator has not been developed. The values of Bias, Std, sqrt(MSE) and AES have been multiplied by 10,000 Bias the empirical bias, Std the empirical standard deviation, sqrt(MSE) the square root of the empirical mean squared error, AES the average estimated standard deviation, CR the empirical coverage rate of the 95 % confidence interval
1.0
2
0.6
2
0.6
0.3
2
0.6
0.3
2
2
0.3
2
2
1.0
1.0
1
1
0.6
1.0
1
1
0.6
0.6
1
1
0.3
0.3
1
0.3
1
1
θ
λ
Table 1 Simulation results when C = +∞ and U is generated from the uniform distribution with P (T > U) = 0.3
Semi-Markov model for dependent censoring
123
123
1.0
1.0
2
2
0.7
0.5
0.3
0.7
0.5
0.3
0.7
0.5
0.3
0.7
0.5
0.3
0.7
0.5
0.3
0.7
0.5
0.3
τ
531.379
−5.362
511.517
457.627
483.329
445.847
−6.385
−8.200
−23.800
−14.800
527.470
460.885
645.529
11.827
7.979
1.222
602.643
495.129
−6.200
12.071
489.127
448.980
−2.100
467.809
−5.090
1.900
525.029
5.565
442.553
−2.828
4.242
658.791
716.866
−31.311
446.093
483.915
457.700
511.557
527.603
460.954
645.530
602.763
495.158
449.023
489.131
467.813
531.404
525.059
442.574
716.871
659.534
499.565
−14.800
−23.800
−8.200
193.050
247.354
138.658
466.958
513.093
245.676
−6.200
−2.100
1.900
−6.285
3.813
1.272
0.973
−33.695
−0.496
499.437
445.847
483.329
457.627
566.862
563.289
474.635
739.829
595.947
499.341
448.980
489.127
467.809
560.956
558.605
469.836
760.752
675.060
520.377
446.093
483.915
457.700
598.833
615.206
494.474
874.869
786.395
556.506
449.023
489.131
467.813
560.991
558.618
469.838
760.753
675.901
520.377
sqrt(MSE)
−11.337
Std
Bias
Std
Bias
sqrt(MSE)
Kaplan–Meier estimator
Lee and Tsai’s estimator
454.984
497.645
455.865
559.222
546.354
470.687
721.418
593.462
483.087
455.336
497.592
455.301
556.616
562.140
480.906
728.592
644.222
503.210
AES
0.960
0.956
0.953
0.938
0.914
0.931
0.881
0.851
0.895
0.950
0.945
0.941
0.946
0.950
0.946
0.918
0.925
0.930
CR
−18.720
−24.953
−15.557
−1.412
11.581
16.016
6.178
8.185
−5.022
−6.642
−5.547
392.816
436.142
409.854
485.327
494.984
438.841
634.232
589.272
479.782
414.769
447.852
510.060 430.119
−5.814
496.985
418.693
710.683
647.774
492.276
Std
−9.390
4.559
2.315
−3.802
−30.850
−14.576
Bias
NPMLE
393.261
436.856
410.149
485.329
495.120
439.133
634.262
589.329
479.808
414.822
447.886
430.221
510.093
497.006
418.699
710.694
648.508
492.492
sqrt(MSE)
406.233
449.685
413.332
479.425
491.471
428.777
623.757
581.017
467.545
427.758
454.722
420.918
521.131
511.491
437.047
697.288
631.552
474.698
AES
0.951
0.958
0.956
0.944
0.950
0.943
0.932
0.935
0.945
0.956
0.950
0.932
0.951
0.952
0.959
0.920
0.939
0.932
CR
The AES and the CR are not reported for the Lee and Tsai’s estimator, since the corresponding variance estimator has not been developed. The values of Bias, Std, sqrt(MSE) and AES have been multiplied by 10,000 Bias the empirical bias, Std the empirical standard deviation, sqrt(MSE) the square root of the empirical mean squared error, AES the average estimated standard deviation, CR the empirical coverage rate of the 95 % confidence interval
0.6
1.0
2
2
0.6
0.6
2
2
0.3
0.3
2
0.3
1.0
1
2
1.0
1
2
0.6
1.0
1
1
0.6
0.3
1
0.6
0.3
1
1
0.3
1
1
θ
λ
Table 2 Simulation results when C = +∞ and U is generated from the uniform distribution with P (T > U) = 0.7
Z. Zhao et al.
0.6
1.0
1.0
1.0
2
2
2
2
0.7
0.5
0.3
0.7
0.5
0.3
0.7
0.5
0.3
0.7
0.5
0.3
0.7
0.5
0.3
0.7
0.5
0.3
τ AES
CR
8.600
8.000
444.391
498.583
481.221
473.568
−4.200
533.052
466.143
520.929
519.638
466.208
168.701
124.852
54.006
285.552
186.877
73.204
517.326
457.368
−1.200
−9.700
477.638
456.393
−4.198
5.800
457.719
521.312
−55.071
−27.605
518.809
480.116
−8.948
−24.535
475.857
3.157
444.474
498.647
473.587
509.935
547.478
469.262
594.060
552.220
471.920
457.470
517.327
456.430
477.657
522.042
461.020
480.742
518.886
475.868
456.052
497.498
455.499
483.245
509.020
458.288
504.759
518.041
461.281
455.068
497.304
455.269
477.467
509.740
463.386
497.576
520.695
465.098
0.961
0.948
0.945
0.939
0.923
0.947
0.911
0.932
0.932
0.949
0.938
0.957
0.952
0.944
0.952
0.949
0.949
0.943
5.087
4.254
−4.680
16.348
18.940
11.512
9.750
423.137
478.715
459.618
447.174
508.319
451.738
494.252
508.992
461.198
−3.787 5.280
439.057
504.409
450.296
444.951
496.274
444.683
488.030
515.592
464.227
Std
−4.476
−0.957
2.850
−6.434
−23.875
−53.657
−26.005
−12.346
1.603
Bias
sqrt(MSE)
Bias
Std
NPMLE
Kaplan–Meier estimator
423.167
478.734
459.642
447.473
508.671
451.884
494.348
509.020
461.213
439.080
504.410
450.305
444.998
496.848
447.908
488.722
515.740
464.229
sqrt(MSE)
476.635
485.907
442.950
489.082
494.004
445.745
519.513
513.616
454.278
446.865
480.853
445.537
466.468
489.593
450.794
512.812
510.107
454.039
AES
0.974
0.955
0.934
0.963
0.936
0.937
0.964
0.955
0.936
0.947
0.945
0.951
0.960
0.939
0.943
0.961
0.952
0.944
CR
The Lee and Tsai’s estimator is not included, since the additional independent censoring is not incorporated in this approach. The values of Bias, Std, sqrt(MSE) and AES have been multiplied by 10,000 Bias the empirical bias, Std the empirical standard deviation, sqrt(MSE) the square root of the empirical mean squared error, AES the average estimated standard deviation, CR the empirical coverage rate of the 95 % confidence interval
0.6
0.6
2
2
0.3
0.3
2
0.3
2
2
1.0
1.0
1
1
0.6
1.0
1
1
0.6
0.6
1
1
0.3
0.3
1
0.3
1
1
θ
λ
Table 3 Simulation results when P (T > C) = 0.15 and U is generated from the uniform distribution with P (T > U) = 0.3
Semi-Markov model for dependent censoring
123
123
1.0
2
0.7
0.5
0.3
0.7
0.5
0.3
0.7
0.5
0.3
0.7
0.5
0.3
0.7
0.5
0.3
0.7
0.5
0.3
τ
500.114
593.135
453.780
514.198
465.653
0.100
−20.600
561.836
555.667
453.296
739.105
−18.300
190.462
217.812
97.707
420.009
468.726
510.455
468.413
−15.700
−12.200
268.558
556.991
475.711
21.931
−5.000
492.202
557.711
−1.934
−1.948
674.780
466.108
514.523
453.780
593.242
596.831
463.707
850.108
775.837
540.211
468.572
510.696
475.738
557.423
557.714
492.205
749.633
642.816
454.485
497.333
455.556
559.105
546.970
473.168
722.839
594.952
482.080
454.838
497.374
455.495
556.910
561.540
480.297
731.977
0.943
0.936
0.957
0.942
0.916
0.945
0.900
0.851
0.889
0.946
0.938
0.943
0.947
0.950
0.941
0.936
0.938
0.952
483.365 482.644
−7.811
−19.324
−7.882
415.083
450.739
407.523
413.581
−27.117 −17.538 1.700
587.409 641.460
−1.386 −13.806
466.171
433.071
461.219
432.805
510.031
501.484
440.665
714.376
649.068
491.383
Std
9.309
−21.023
−12.767
−5.471
21.264
11.016
−1.255
21.350
4.681
−7.032
674.774
748.811
35.088
504.172
−2.953
509.053
508.927
−11.321
CR
Bias
AES
Std
Bias
sqrt(MSE)
NPMLE
Kaplan–Meier estimator
415.533
450.807
407.527
482.708
483.683
414.469
641.608
587.410
466.264
433.581
461.395
432.840
510.474
501.605
440.666
714.695
649.085
491.433
sqrt(MSE)
405.605
449.654
412.883
479.955
492.237
430.981
621.044
581.482
466.745
427.045
454.286
420.981
521.362
511.521
437.101
698.724
629.126
473.164
AES
0.937
0.938
0.948
0.937
0.950
0.960
0.924
0.945
0.940
0.938
0.945
0.937
0.935
0.945
0.948
0.933
0.930
0.938
CR
The Lee and Tsai’s estimator is not included, since the additional independent censoring is not incorporated in this approach. The values of Bias, Std, sqrt(MSE) and AES have been multiplied by 10,000 Bias the empirical bias, Std the empirical standard deviation, sqrt(MSE) the square root of the empirical mean squared error, AES the average estimated standard deviation, CR the empirical coverage rate of the 95 % confidence interval
1.0
0.6
2
1.0
0.6
2
0.6
2
2
2
0.3
0.3
2
0.3
2
2
1.0
1.0
1
1
0.6
1.0
1
1
0.6
0.6
1
1
0.3
0.3
1
0.3
1
1
θ
λ
Table 4 Simulation results when P (T > C) = 0.15 and U is generated from the uniform distribution with P (T > U) = 0.7
Z. Zhao et al.
1.0
1.0
2
2
0.7
0.5
0.3
0.7
0.5
0.3
0.7
0.5
0.3
0.7
0.5
0.3
0.7
0.5
0.3
0.7
0.5
0.3
τ
5.600
485.494
452.751
517.350
−22.500
1.400
472.414
457.667
503.876
−8.363
−0.668
454.920
−12.466
−22.600
514.297
496.163
2.862
440.773
453.594
−1.447
18.762
6.200
456.154
475.155
19.800
−0.230
464.160
495.721
14.251
21.891
522.233
513.833
4.645
456.820
−2.288
10.296
452.753
517.839
458.224
472.414
503.945
455.091
496.165
514.305
441.172
453.636
485.526
456.583
475.155
495.925
464.676
513.838
522.254
456.936
1.400
−22.500
−22.600
167.725
113.093
43.213
318.350
222.344
116.418
6.200
5.600
19.800
0.286
14.596
19.899
−2.456
10.906
11.247
452.751
517.350
457.667
498.620
513.398
460.479
506.289
511.420
447.944
453.594
485.494
456.154
482.683
503.023
470.593
503.665
527.338
469.847
Std
452.753
517.839
458.224
526.074
525.707
462.502
598.059
557.662
462.825
453.636
485.526
456.583
482.683
503.235
471.014
503.671
527.451
469.981
sqrt(MSE)
Bias
sqrt(MSE)
Bias
Std
Kaplan–Meier estimator
Lee and Tsai’s estimator
455.626
497.298
456.493
486.202
512.080
460.518
512.735
523.404
462.360
455.838
497.628
454.665
481.632
513.288
461.643
505.226
526.662
468.018
AES
0.951
0.941
0.948
0.943
0.943
0.934
0.915
0.934
0.938
0.956
0.947
0.958
0.944
0.951
0.940
0.943
0.955
0.941
CR
489.006 435.387
−2.116
440.063
458.369
488.754
438.648
489.412
508.523
433.321
429.566
469.216
446.730
464.456
489.975
457.762
510.711
517.105
450.538
Std
−16.422
−18.267
−1.017
−10.435
−15.996
−3.142
2.159
19.215
7.415
4.393
19.156
1.168
12.513
20.407
−3.613
3.852
10.442
Bias
NPMLE
435.392
489.282
440.442
458.371
488.865
438.939
489.422
508.527
433.747
429.630
469.236
447.141
464.457
490.135
458.216
510.724
517.120
450.659
sqrt(MSE)
476.523
484.956
441.278
491.470
496.818
446.454
526.006
522.428
456.426
448.339
479.710
442.640
473.894
492.027
446.718
529.396
521.405
456.574
AES
0.955
0.944
0.939
0.971
0.955
0.944
0.961
0.950
0.953
0.963
0.944
0.952
0.954
0.947
0.942
0.952
0.953
0.944
CR
The AES and the CR are not reported for the Lee and Tsai’s estimator, since the corresponding variance estimator has not been developed. The values of Bias, Std, sqrt(MSE) and AES have been multiplied by 10,000 Bias the empirical bias, Std the empirical standard deviation, sqrt(MSE) the square root of the empirical mean squared error, AES the average estimated standard deviation, CR the empirical coverage rate of the 95 % confidence interval
0.6
1.0
0.6
2
2
0.6
2
2
0.3
0.3
2
0.3
2
2
1.0
1.0
1
1
0.6
1.0
1
1
0.6
0.6
1
1
0.3
0.3
1
0.3
1
1
θ
λ
Table 5 Simulation results when C = +∞ and U is generated from the exponential distribution with P (T > U) = 0.3
Semi-Markov model for dependent censoring
123
123
1.0
0.3
0.3
0.3
1
2
2
2
1.0
1.0
2
2
0.7
0.5
0.3
0.7
0.5
0.3
0.7
0.5
0.3
0.7
0.5
0.3
0.7
0.5
0.3
0.7
0.5
0.3
τ
24.438
685.975
443.302
523.877
618.911
622.933
447.090
−13.400
−24.212
−24.450
−14.467
−5.620
6.100
31.400
21.600
457.320
501.483
469.163
498.036
484.336
−11.100
−3.514
450.740
−17.500
511.504
526.254
−17.088
0.120
458.252
536.533
0.793
710.008
−17.919
20.986
457.361
502.465
469.660
498.048
511.504
447.125
623.101
619.393
524.437
443.505
484.463
451.080
526.532
536.832
458.253
710.318
686.410
538.657
6.100
31.400
21.600
266.248
259.435
153.853
598.470
503.502
289.561
−13.400
−11.100
−17.500
−13.209
457.320
501.483
469.163
542.172
540.600
465.542
678.103
612.361
503.138
443.302
484.336
450.740
553.737
560.278
479.782
−0.823 −19.713
701.711
672.759
550.361
35.836
29.837
−3.818
538.632
457.361
502.465
469.660
604.018
599.629
490.307
904.428
792.780
580.511
443.505
484.463
451.080
553.894
560.625
479.783
702.626
673.420
550.374
sqrt(MSE)
−5.225
Std
Bias
Std
Bias
sqrt(MSE)
Kaplan–Meier estimator
Lee and Tsai’s estimator
455.782
497.459
454.390
549.819
554.792
478.634
666.906
612.976
497.892
455.083
497.638
456.367
545.974
566.159
490.888
683.782
649.392
526.103
AES
0.954
0.938
0.942
0.930
0.926
0.929
0.856
0.854
0.881
0.957
0.945
0.956
0.929
0.939
0.956
0.933
0.932
0.935
CR
15.701
23.507
21.138
−10.181
1.101
4.358
−12.684
−18.987
−23.854
−10.796
−16.014
−12.883
−15.800
−18.499
0.680
14.405
22.835
−4.202
Bias
NPMLE
421.121
455.782
417.597
469.893
489.888
422.673
614.458
606.850
509.650
411.697
445.981
413.698
505.719
514.169
435.084
703.195
677.260
524.809
Std
421.414
456.388
418.132
470.004
489.889
422.695
614.588
607.147
510.208
411.839
446.268
413.899
505.966
514.501
435.085
703.343
677.645
524.825
sqrt(MSE)
424.771
451.001
406.461
488.709
503.871
435.396
619.218
613.808
498.870
430.601
457.365
417.697
516.219
521.047
445.071
685.009
655.729
507.769
AES
0.955
0.943
0.933
0.955
0.964
0.946
0.939
0.945
0.934
0.955
0.959
0.960
0.946
0.946
0.956
0.928
0.931
0.936
CR
The AES and the CR are not reported for the Lee and Tsai’s estimator, since the corresponding variance estimator has not been developed. The values of Bias, Std, sqrt(MSE) and AES have been multiplied by 10,000 Bias the empirical bias, Std the empirical standard deviation, sqrt(MSE) the square root of the empirical mean squared error, AES the average estimated standard deviation, CR the empirical coverage rate of the 95 % confidence interval
0.6
1.0
2
2
0.6
1.0
1
0.6
1.0
1
2
0.6
1
2
0.6
0.6
1
1
0.3
0.3
1
0.3
1
1
θ
λ
Table 6 Simulation results when C = +∞ and U is generated from the exponential distribution with P (T > U) = 0.7
Z. Zhao et al.
1.0
1.0
2
2
0.7
0.5
0.3
0.7
0.5
0.3
0.7
0.5
0.3
0.7
0.5
0.3
0.7
0.5
0.3
0.7
0.5
0.3
τ
481.920
474.120
491.289
457.478
192.218
−12.100
−23.100
524.993
475.318
500.009
499.220
444.514
−24.800
148.054
57.458
320.370
218.342
106.272
517.978
451.950
2.600
−8.700
464.171
473.206
8.971
−5.500
487.732
440.079
16.974
−1.646
527.098
458.060
491.438
474.768
518.840
545.470
478.778
593.840
544.879
457.041
452.033
517.985
473.238
464.258
488.027
440.082
508.614
526.192
454.476
497.570
456.404
487.073
511.501
459.332
512.668
523.112
462.601
455.201
497.297
455.534
482.019
513.509
463.025
505.214
0.961
0.943
0.944
0.941
0.931
0.936
0.919
0.941
0.942
0.950
0.930
0.948
0.957
0.961
0.960
0.940
0.944
0.948
−25.766
−21.879
−24.034
22.617
27.558
6.531
1.890
2.863
4.028
−7.242
7.841
−2.166
9.506
19.143
−0.956
14.140
−7.799
−32.751
527.004
508.382
15.360
469.845
−9.926
487.632
486.215
−37.142
CR
Bias
AES
Std
Bias
sqrt(MSE)
NPMLE
Kaplan–Meier estimator
429.097
480.151
449.961
447.304
500.385
458.627
481.911
501.590
445.565
431.096
491.710
459.658
438.781
461.533
421.928
511.564
521.589
468.945
Std
429.870
480.650
450.602
447.875
501.143
458.673
481.914
501.598
445.583
431.157
491.773
459.663
438.884
461.930
421.929
511.759
521.647
470.088
sqrt(MSE)
476.058
485.176
441.301
492.088
496.141
444.893
528.818
522.948
457.066
448.312
479.562
443.548
473.189
492.530
448.100
528.501
520.417
458.227
AES
0.970
0.948
0.947
0.973
0.949
0.933
0.972
0.955
0.954
0.950
0.935
0.939
0.960
0.963
0.965
0.956
0.948
0.946
CR
The Lee and Tsai’s estimator is not included, since the additional independent censoring is not incorporated in this approach. The values of Bias, Std, sqrt(MSE) and AES have been multiplied by 10,000 Bias the empirical bias, Std the empirical standard deviation, sqrt(MSE) the square root of the empirical mean squared error, AES the average estimated standard deviation, CR the empirical coverage rate of the 95 % confidence interval
0.6
1.0
2
2
0.6
0.6
2
2
0.3
0.3
2
0.3
2
2
1.0
1.0
1
1
0.6
1.0
1
1
1
0.6
0.6
1
0.3
0.3
1
0.3
1
1
θ
λ
Table 7 Simulation results when P (T > C) = 0.15 and U is generated from the exponential distribution with P (T > U) = 0.3
Semi-Markov model for dependent censoring
123
123
0.6
1.0
1.0
1.0
2
2
2
2
0.7
0.5
0.3
0.7
0.5
0.3
0.7
0.5
0.3
0.7
0.5
0.3
0.7
0.5
0.3
0.7
0.5
0.3
τ AES
CR
559.884
32.400
45.700
36.000
286.414
288.481
163.339
658.430
458.891
495.830
460.734
549.552
550.581
486.410
663.681
620.296
496.285
500.169
465.061
−13.100
−11.600
317.395
547.570
448.635
−19.481
−20.200
501.497
577.838
−7.869
−7.043
674.713
636.189
530.033
54.060
15.475
11.840
460.033
497.932
462.139
619.710
621.579
513.103
934.881
835.606
589.100
465.206
500.340
449.089
547.916
577.881
501.559
676.876
636.378
530.165
456.942
497.505
453.857
550.980
555.227
478.177
670.392
612.199
496.722
454.898
497.480
456.502
546.076
565.730
490.966
684.240
650.307
525.895
0.951
0.954
0.951
0.927
0.919
0.917
0.849
0.826
0.873
0.944
0.938
0.965
0.943
0.944
0.931
0.942
0.947
0.928
621.620
−10.036 5.585
31.639
35.270
27.417
16.611
14.533
410.367
434.118
398.346
472.103
495.363
437.106
621.533
501.787
−10.674 4.927
428.946
457.657
400.871
510.310
525.635
451.606
662.199
625.898
504.538
Std
−4.168
−20.281
−22.370
−21.699
−15.676
−10.250
34.744
15.571
15.547
Bias
sqrt(MSE)
Bias
Std
NPMLE
Kaplan–Meier estimator
411.585
435.548
399.288
472.395
495.576
437.142
621.552
621.701
501.900
428.966
458.107
401.494
510.771
525.869
451.723
663.110
626.092
504.778
sqrt(MSE)
425.044
451.011
406.952
490.128
504.191
435.284
618.646
613.743
497.296
430.929
457.652
418.539
515.834
521.541
446.259
685.833
654.501
506.972
AES
0.959
0.956
0.945
0.952
0.954
0.949
0.932
0.943
0.945
0.941
0.957
0.960
0.938
0.944
0.937
0.949
0.955
0.947
CR
The Lee and Tsai’s estimator is not included, since the additional independent censoring is not incorporated in this approach. The values of Bias, Std, sqrt(MSE) and AES have been multiplied by 10,000 Bias the empirical bias, Std the empirical standard deviation, sqrt(MSE) the square root of the empirical mean squared error, AES the average estimated standard deviation, CR the empirical coverage rate of the 95 % confidence interval
0.6
0.6
2
2
0.3
0.3
2
0.3
2
2
1.0
1.0
1
1
0.6
1.0
1
1
0.6
0.6
1
1
0.3
0.3
1
0.3
1
1
θ
λ
Table 8 Simulation results when P (T > C) = 0.15 and U is generated from the exponential distribution with P (T > U) = 0.7
Z. Zhao et al.
Semi-Markov model for dependent censoring
1.0
Survival Function Estimate
0.6 0.4 0.0
0.2
Survival Rate
0.8
NPMLE Kaplan−Meier Esimator Empirical Distribution Pointwise 95% Confidence Interval Based on the NPMLE Approach Pointwise 95% Confidence Interval Based on the Kaplan−Meier Approach
0
20
40
60
80
100
t
Fig. 2 Kaplan–Meier estimate and NPMLE
For the cases with λ = 1 when T was independent of U and the cases with θ = 1 when (X, Δ) was fully observed, the Kaplan–Meier estimator was nearly unbiased and consistent and the corresponding 95 % confidence intervals provided reasonable coverage rates; the proposed NPMLE and the Lee and Tsai’s estimator were also nearly unbiased and consistent, and the coverage rate of the proposed 95 % confidence intervals were close to the nominal level. Nevertheless, the proposed NPMLE had smaller sqrt(MSE) than the Kaplan–Meier estimator. For the cases with λ = 2 and θ < 1, the Kaplan–Meier estimator had larger bias and appeared inconsistent, while the other two estimators performed well. 5.2 An example We illustrate our method with the data from a clinical trial of lung cancer conducted by the Eastern Cooperative Oncology Group, which was initially analyzed by Lagakos and Williams (1978). The data were also used by Lee and Wolfe (1998) and Lee and Tsai (2005) for the illustration of their methods. The complete data can be found in Lee and Tsai (2005). The study consists of 61 patients with inoperable carcinoma of the lung who were treated with the drug cyclophosphamide. Among the 61 patients, 28 patients experienced metastatic disease or a significant increase in the size of their
123
Z. Zhao et al.
1.0
Survival Function Estimate
0.6 0.4 0.0
0.2
Survival Rate
0.8
NPMLE Lee and Tsai’s Estimator Empirical Distribution Pointwise 95% Confidence Interval Based on the NPMLE Approach
0
20
40
60
80
100
t
Fig. 3 Lee and Tsai’s estimator and NPMLE
primary lesion, which are certain types of PD. In addition, Lee and Tsai (2005) did a pseudo second stage sampling, in which they selected 10 of these 28 patients as a random sample and pretended that the death time of the rest of the 18 patients were unknown. For illustration, we also pretended that these 18 patients’ death time were unknown. We treated the 10 selected patients as if they did not leave the study while the rest of the 18 did. Figure 2 shows the NPMLE and its pointwise 95 % confidence intervals along with the Kaplan–Meier estimator and its pointwise 95 % confidence intervals and the empirical distribution based on the complete data. Figure 3 presents the NPMLE and its pointwise 95 % confidence intervals along with the Lee and Tsai’s estimator and the empirical distribution based on the complete data. Figure 2 clearly shows that the Kaplan–Meier estimator is very different from the empirical distribution, and far away from the other estimators. Figures 2 and 3 show that both NPMLE and the Lee and Tsai’s estimator are very close to the empirical distribution. The proposed pointwise 95 % confidence interval contains the Lee and Tsai’s estimator and the empirical distribution.
6 Discussion In oncology studies, in addition to the usual independent censoring, the censoring caused by the PD is a certain type of dependent censoring. In this paper, we have
123
Semi-Markov model for dependent censoring
adopted the semi-Markov model provided by Lee and Tsai (2005) with an additional independent censoring and studied the NPMLE of the cause specific cumulative hazard functions and the marginal survival function, where we have shown the asymptotic normality of the NPMLE in the space of uniformly bounded functions and established its asymptotic efficiency. We have also developed uniformly consistent estimators for the covariance function based on the information operator. A limitation of the model (1) is that it assumes that the risk of death after disease progression only depends on the time since progression, but not the duration of the progression. The limitation could be addressed by regression type models with the inclusion of the time to progression as a covariate. It is of interest to develop diagnostic methods for the model and to investigate regression type models in future studies. Acknowledgments The research is supported by the National Natural Science Foundation of China (11271081) and the Student Growth Fund Scholarship of School of Management, Fudan University.
Appendix Preliminary Lemmas 1 and 2 provide related Donsker properties. Lemma 1 The classes {(x, δ) → δI {x t} : t 0} and {(x, δ) → δI {x t} : t 0} are Donsker classes of functions on R+ × {0, 1}. Proof Combining Lemma 2.6.15 and Lemma 2.6.18(iii, vi) of Vaart and Wellner (1996), it can be shown that {(x, δ) → δI {x t} : t 0} and {(x, δ) → δI {x t} : t 0} are VC-classes on R+ × {0, 1}. By Theorem 2.6.8 of Vaart and Wellner (1996), they are Donsker classes, where the measurability conditions could be verified via the denseness of the rational numbers.
Lemma 2 Let η be a positive real number and Λ be a nondecreasing cadlag function !t on [0, η]. The class (x, δ) → 0 δI {x y} Λ (dy) : t ∈ [0, η] is a Donsker class of functions on R+ × {0, 1}. Proof The result is easy to see by combining Example 2.6.21 and Example 2.10.8 of Vaart and Wellner (1996).
Lemma 3 Let η be a positive real number and BV ([0, η]) be the space of all cadlag functions defined on [0, η] whose total variation are bounded by 2. For any (F, G, H) ∈ BV ([0, η]) × BV ([0, η]) × BV ([0, η]) and any t ∈ [0, η], let φ (F, G, H) [t] = F (t) G (t) −
G (u) H (t − u) F (du) . [0,t]
123
Z. Zhao et al.
For any F0 , G0 , H0 ∈ BV ([0, η]) , φ is Hadamard differentiable at (F0 , G0 , H0 ) with derivative φ (F0 , G0 , H0 ), where for any t ∈ [0, η], φ (β0 ) [β] [t] = F0 (t) G (t) + F (t) G0 (t) − G0 (x) H0 (t − x) F (d x) [0,t] G0 (x) H (t − x) F0 (d x) − G (x) H0 (t − x) F0 (d x) , − [0,t]
[0,t]
where β = (F, G, H) and β0 = (F0 , G0 , H0 ). Proof Let F0 , G0 , H0 , F, G, H ∈ BV ([0, η]) , {(Fm , Gm , Hm ) : m ∈ N} be a sequence converging to (F, G, H) in BV ([0, η]) × BV ([0, η]) × BV ([0, η]) and {h m : m ∈ N} be a sequence of real numbers converging to 0. Denote β0 = (F0 , G0 , H0 ) and βm = β0 + h m (F, G, H). Note that for any t ∈ [0, η] , φ(βm )[t] = φ(β0 )[t] + h m Γm (t) + h 2m Wm (t), where Γm (t) = F0 (t) Gm (t) + Fm (t) G0 (t) − G0 (x) H0 (t − x) Fm (d x) [0,t] G0 (x) Hm (t − x) F0 (d x) − Gm (x) H0 (t − x) F0 (d x) , − [0,t] [0,t] H0 (t − x) Gm (x) Fm (d x) Wm (t) = Fm (t) Gm (t) − [0,t] G0 (x) Hm (t − x) Fm (d x) − Gm (x) Hm (t − x) F0 (d x) − [0,t] [0,t] Gm (x) Hm (t − x) Fm (d x) . − hm [0,t]
For any t ∈ [0, η], let G0 (x) H0 (t − x) F (d x) Γ0 (t) = F0 (t) G (t) + F (t) G0 (t) − [0,t] G0 (x) H (t − x) F0 (d x) − G (x) H0 (t − x) F0 (d x) . − [0,t]
[0,t]
Note that for any t ∈ [0, η] , |Wm (t)| 28 + 8h m and |Γm (t) − Γ0 (t)| 6 sup |Fm (s) − F (s)| + 6 sup |Gm (s) − G (s)| s∈[0,η]
s∈[0,η]
+ 4 sup |Hm (s) − H (s)| . s∈[0,η]
Hence, as m → ∞, supt∈[0,η] |Wm (t)| = O(1) and supt∈[0,η] |Γm (t) − Γ0 (t)| = o(1).
123
Semi-Markov model for dependent censoring
Therefore, φ is Hadamard differentiable at β0 and for any t ∈ [0, η], G0 (x) H0 (t − x) F (d x) φ (β0 ) [β] [t] = F0 (t) G (t) + F (t) G0 (t) − [0,t] G0 (x) H (t − x) F0 (d x) − G (x) H0 (t − x) F0 (d x) , − [0,t]
[0,t]
where β = (F, G, H).
Proof of Theorems 1 to 3 For any (i, j) ∈ {(0, 1) , (0, 2) , (1, 2)} and any t 0, define (i, j)
Mn
(i, j)
(t) = Nn
(t) − [0,t]
(i)
(i, j)
Yn (y) Λ0
(dy) .
Proof Combining Theorem 2.10.6 of Vaart and Wellner (1996) with Lemmas 1 and 2, it can be shown that, as n → ∞, (0,1) (0,2) (1,2) (0) (0) (1) (1) n −1/2 Mn , Mn , Mn , Yn − EYn , Yn − EYn weakly converges to a tight zero mean Gaussian process in ∞ ∞ ∞ ∞ ∞
∞ 5 ([0, τ ]) = ([0, τ ]) × ([0, τ ]) × ([0, τ ]) × ([0, τ ]) × ([0, τ ]) .
Combining this with Lemma 3.9.17, Lemma 3.9.25, and Theorem 3.9.4 of Vaart and Wellner (1996), for any (i, j) ∈ {(0, 1) , (0, 2) , (1, 2)}, −1 (i, j) (i, j) (i, j) (i) ˆ L0 (y−) sup Λn (t) − Λ0 (t) − n −1 Mn (dy) = o p n −1/2 , t∈[0,τ ]
[0,t]
(10) By the continuity and the linearity of the integral operator, as n → ∞, (0,1) (0,2) (1,2) − Λ0 , Λˆ n(1,2) − Λ0 n 1/2 Λˆ n(0,1) − Λ0 , Λˆ (0,2) n weakly converges to a tight Gaussian process in ∞ 3 ([0, τ ]) with covariance function U0 . By Lemma 3.9.30 and Theorem 3.9.4 of Vaart and Wellner (1996), as n → ∞, (0,1) (0,2) ˆ (1,2) (1,2) − S , S − S n 1/2 Sˆ n(0,1) − S0 , Sˆ (0,2) n n 0 0 weakly converges to a tight Gaussian process in ∞ 3 ([0, τ ]) with covariance function V0 .
123
Z. Zhao et al.
Combining Lemma 3 with Theorem 3.9.4 of Vaart and Wellner (1996),
sup Sˆ n (t) − S0 (t) − φ (β0 ) βˆn − β0 [t] = o p n −1/2 , t∈[0,τ ]
(0,1) (0,2) (1,2) (0,1) (0,2) (1,2) where β0 = S0 , S0 , S0 and βˆn = Sˆ n , Sˆ n , Sˆ n . 1/2 ˆ Therefore, as n → ∞, n (Sn − S0 ) weakly converges to a tight zero mean
Gaussian process in ∞ ([0, τ ]) with covariance function Ω0 . Proof of Theorems 4 and 5 Proof To obtain the efficiency, we will verify the conditions of Theorem VIII.3.2 and Theorem VIII.3.3 of Andersen et al. (1993). First, we verify the local asymptotic normality (LAN) Assumption (Assumption VIII.3.1 of Andersen et al. 1993). By the Taylor expansion, it is easy to show that for any h = (h (0,1) , h (0,2) , h (1,2) ) ∈ H, log Rn (h) = Sn(0,1) (h) + Sn(0,2) (h) + Sn(1,2) (h) −
1 h2H + o p (1) , 2
(11)
where for any (i, j) ∈ {(0, 1) , (0, 2) , (1, 2)} and any h = (h (0,1) , h (0,2) , h (1,2) ) ∈ H, (i, j) Sn (h)
=n
−1/2
[0,τ ]
−1 (i, j) (i, j) h (i, j) (y) 1 − Λ0 {y} Mn (dy) .
Hence, for any m ∈ N+ and any h 1 , . . . , h m ∈ H, (log Rn (h 1 ) , . . . , log Rn (h m )) weakly converges to a Gaussian random vector in Rm with mean 1 − (h 1 H , . . . , h m H ) 2 and covariance matrix ⎛
h 1 , h 1 H . . . ⎜ .. .. ⎝ . . h m , h 1 H · · ·
⎞ h 1 , h m H ⎟ .. ⎠. .
h m , h m H
Next, we verify the Differentiability Assumption (Assumption VIII.3.2 of Andersen et al. 1993).
123
Semi-Markov model for dependent censoring
For any (i, j) ∈ {(0, 1) , (0, 2) , (1, 2)}, any h = (h (0,1) , h (0,2) , h (1,2) ) ∈ H, any t ∈ [0, τ ], (i, j) (i, j) (i, j) h (i, j) (y) Λ0 (dy) . n 1/2 Λn,h (t) − Λ0 (t) = [0,t]
Note that, for any (i, j) ∈ {(0, 1) , (0, 2) , (1, 2)}, any h = (h (0,1) , h (0,2) , h (1,2) ) ∈ H, and any t ∈ [0, τ ],
(i, j)
[0,t]
h (i, j) (y) Λ0
(i, j)
(dy) hH Λ0
−1 (i) . (τ ) L0 (τ −)
Hence, the Differentiability Assumption is verified. (0,1) ˆ (0,2) ˆ (1,2) ˆ is regular. Next, we show that Λn , Λn , Λn Recall the weak convergence of the log-likelihood ratio log Rn (h). The continuity follows from the Le Cam’s third Lemma and the weak convergence can be shown via similar arguments in the previous subsection. Hence, the regularity follows. It remains to check Equation (8.3.5) of Andersen et al. (1993) for each coordinate. For any t ∈ [0, τ ], define (0,1)
γt
(0,1) (0,2) (0,2) (1,2) (1,2) = ϕt , 0, 0 , γt = 0, ϕt , 0 , γt = 0, 0, ϕt
where for any (i, j) ∈ {(0, 1) , (0, 2) , (1, 2)} and any s ∈ [0, τ ], (i, j)
ϕt
(i, j) (i) (s) = I {s t} L0 (s−)−1 1 − Λ0 {y} .
By (10) and (11), for any (i, j) ∈ {(0, 1) , (0, 2) , (1, 2)} and any t ∈ [0, τ ], 1 (i, j) (i, j) (i, j) + h2H = o p n −1/2 . Λˆ n (t) − Λ0 (t) − n −1/2 log Rn γt 2 Therefore, combining all the above results with Theorem VIII.3.2 and Theorem (0,1) (0,2) (1,2) VIII.3.3 of Andersen et al. (1993), it follows that Λˆ n , Λˆ n , Λˆ n is asymptotically efficient. Combining Theorem VIII.3.4 of Andersen et al. (1993) with Lemma 3, the efficiency of Sˆ n follows.
References Andersen PK, Borgan O, Gill RD, Keiding N (1993) Statistical models based on counting processes. Springer, New York Datta S, Satten GA, Datta S (2000) Nonparametric estimation for the three-stage irreversible illness-death model. Biometrics 56:841–847 Kiefer J, Wolfowitz J (1956) Consistency of the maximum likelihood estimator in the presence of infinitely many incidental parameters. Ann Math Stat 27(4):887–906
123
Z. Zhao et al. Lagakos SW, Williams JS (1978) Models for censored survival analysis: a cone class of variable-sum models. Biometrika 65(1):181–189 Lee SY, Tsai WY (2005) An estimator of the survival function based on the semi-markov model under dependent censorship. Lifetime Data Anal 11:193–211 Lee SY, Wolfe RA (1998) A simple test for independent censoring under the proportional hazards model. Biometrics 54:1176–1182 Scholz FW (1980) Towards a unified definition of maximum likelihood. Can J Stat 8:193–203 Tsiatis AA (1975) A nonidentifiability aspect of the problem of competing risks. Proc Natl Acad Sci USA 72:20–22 Van der Vaart AW, Wellner JA (1996) Weak convergence and empirical processes: with applications to statistics. Springer, New York
123