Construction of Expanded Continuous Life Tables: A Generalization of Abridged and Complete Life Tables

JOHN J. HSIEH
Department of Preventive Medicine and Biostatistics, University of Toronto, Ontario M5S 1A8, Canada

Received 11 May 1990; revised 20 August 1990

ABSTRACT

This article extends the recent abridged life-table method of Hsieh [7]. It generalizes the conventional discrete (abridged and complete) life tables into a continuous life table that can produce life-table functions at any age, and it develops a unified method of life-table construction that simplifies the disparate, laborious procedures used in the traditional approach of constructing abridged and complete life tables. A set of precise procedures based on the complete cubic spline for the main body of the table and a mortality law for advanced ages is developed for estimating the basic and nonbasic life-table functions from a given mortality schedule. The proposed method can also produce more life-table functions than other existing methods. The method is illustrated with Canadian data.

1. INTRODUCTION

Traditionally, widely different approaches have been used in the construction of abridged versus complete life tables in current (period) analysis. A further deficiency of the existing life-table procedure is the lack of a precise method for estimating life-table functions at fractional ages and for constructing more detailed life tables than the complete life table. In this article, I introduce the concept of a continuous life table as a generalization of the existing discrete (abridged and complete) life tables and present a unified method for constructing these life tables for the entire age span from a given mortality schedule. In addition to avoiding the traditional laborious graduation procedures and providing more accurate estimation of the conventional life-table functions, the present method features two advantages: (1) It allows calculation of additional useful functions such as the death density function, the hazard function (force of mortality), and the generalized and conditional expectations of life that are not available in a conventional current life table, and, more important, (2) it allows one to compute these functions at any age points and intervals, in contrast to the traditional life-table method, which gives life-table functions only at certain integral ages such as the age division points and intervals corresponding to the age grouping of the population and death data. The developments given below are an extension of my recent article on abridged life-table methods (Hsieh [7]).

MATHEMATICAL BIOSCIENCES 103:287-302 (1991). © Elsevier Science Publishing Co., Inc., 1991, 655 Avenue of the Americas, New York, NY 10010. 0025-5564/91/$03.50

Suppose we partition the whole age span $[0, \omega)$, where $\omega$ is the maximum life length, into a mesh consisting of $k + 1$ intervals $I_i = [x_i, x_{i+1})$, $i = 0, 1, \ldots, k$, with $x_0 = 0$ and $x_{k+1} = \omega$, so that the lengths of the age intervals $\Delta_i = x_{i+1} - x_i$ are $\Delta_0 = 1$, $\Delta_1 = 4$, $\Delta_i = 5$, $i = 2, 3, \ldots, k - 1$ (where $k$ is normally taken to be either 18 or 19, corresponding to $x_{18} = 85$ or $x_{19} = 90$). The above partitioning of the age span corresponds to the conventional age groupings of published data on vital statistics and midyear population estimates and censuses from which the mortality schedule is derived for constructing life tables. The reason for employing 5-year, rather than shorter, age groupings is to reduce random as well as systematic errors inherent in the raw data (including such errors as age misreporting). The first age interval is taken to be 1 year long because of the heavy death toll toward the beginning of life and because the pattern of infant mortality departs considerably from that of the rest of the age intervals.

A life table describes, in terms of various functions, the steady-state distribution of the first-passage time to the occurrence of a well-defined point event, such as death of an individual, in a population subject to a set of age-specific (or duration-specific) mortality probabilities $q_i$, the conditional probability of death in $I_i$ given survival to age $x_i$, $i = 0, 1, \ldots, k$.
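The age mesh described above can be written down in a few lines. The following is only an illustrative sketch: the function names `age_mesh` and `interval_widths` are hypothetical, and the open-ended last interval $[x_k, \omega)$ is left out because no value of $\omega$ is fixed at this point in the text.

```python
def age_mesh(k=18):
    """Division points x_0, ..., x_k for interval widths 1, 4, 5, 5, ..., 5."""
    if k < 2:
        raise ValueError("k must be at least 2")
    # x_0 = 0, x_1 = 1, and x_i = 5 * (i - 1) for i = 2, ..., k
    return [0, 1] + [5 * (i - 1) for i in range(2, k + 1)]


def interval_widths(xs):
    """Widths Delta_i = x_{i+1} - x_i of the closed-off intervals I_0, ..., I_{k-1}."""
    return [b - a for a, b in zip(xs, xs[1:])]


mesh = age_mesh(18)
print(mesh[-1])                   # last division point x_18
print(interval_widths(mesh)[:3])  # widths of I_0, I_1, I_2
```

With `k=19` the same function extends the mesh by one further 5-year interval, matching the alternative choice of $k$ mentioned in the text.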
In this article, I assume that the input data are the $q_i$'s rather than the observed occurrence-exposure death rates $M_i$, and my aim is to construct an expanded continuous life table from $\{q_i\}$. In current analysis, the transition from $M_i$ to $q_i$ belongs to one of the problems of abridged life-table construction that has been dealt with by many authors (see Hsieh [7] and references therein). In cohort analysis, this transition is not required.

2. THE GENERAL ESTIMATION PROCEDURES

2.1. DESCRIPTION OF LIFE-TABLE FUNCTIONS

Let $X$ be the lifetime random variable defined on $[0, \omega)$ and assumed to be absolutely continuous so that its density exists for almost all points. The probability measure $P$ induced by $X$ is defined on the Borel sets generated by the subsets of the sample space $[0, \omega)$. Our expanded life table will contain 10 life-table functions (the first five being point functions and the remaining five set functions). They are expressed in terms of the probability distribution of the random variable $X$ and are given interpretations with respect to the event death as follows. [In the formulas below, $0 \le x < y < z < \omega$; $I(A)$ is the indicator function, which takes the value 1 if the event $A$ is true and 0 if $A$ is false; and $l(0)$ is an arbitrary positive number called the radix, normally taken as 1 or 100,000.]

(1) Survival function: the number of survivors to age $x$ out of an initial cohort of size $l(0)$,

$$l(x) = l(0)\,P(X > x).$$

(2) Person-years above $x$: the total number of person-years lived (or total number of persons alive in the steady-state life-table population) beyond age $x$ among an initial cohort of size $l(0)$,

$$T(x) = l(0)\int_x^{\omega} P(X > u)\,du = \int_x^{\omega} l(y)\,dy.$$

(3) Death density function: instantaneous probability per unit time of death at age $x$, $f(x)$ [...]

(4) Hazard function: $h(x)$ [...]

[...]

The three point functions $l(x)$, $T(x)$, and $e(x)$ and the two set functions $d[x, y]$ and $L[x, y]$ are as in the conventional life table, except that this article provides a method of computing these functions at any age points $x$ and age intervals $[x, y)$, with $x < y$, and not just at the prescribed age division points $x_i$ and intervals $I_i$. The remaining functions, which are no less important, are generally not available in traditional life tables.

It should be noted that of all the life-table functions described above, only three need be estimated (from the prescribed data $\{q_i\}$). The remaining life-table functions are all directly calculable from these three basic functions without approximations. The three basic functions can be chosen from either $l(x)$ or $d[x, y]$, either $f(x)$ or $h(x)$, and either $T(x)$ or $L[x, y]$. Except for the first ($i = 0$) and the last ($i = k$) age intervals, we shall use spline interpolation, integration, and differentiation procedures to estimate the three basic functions $l(x)$, $L[x, y]$, and $f(x)$ from the given mortality probabilities $q_i$. For the first year of life $I_0$, spline procedures are not appropriate, because mortality is extremely high at birth and drops very sharply right after birth. For the last open-ended age interval $I_k = [x_k, \omega)$, spline procedures are not applicable because data by five-year or finer age groupings are not available within this interval, and even if they were available, they would not likely be reliable, owing to either sparsity or poor quality, and are therefore disregarded. We therefore adopt parametric estimation procedures for these two extreme intervals and use an appropriate mortality law to close the life table.

The procedure for constructing life tables for the first year of life ($i = 0$) is given in Hsieh [5] and hence will not be repeated here. In this article the construction of life tables begins with age 1. Once the $q_i$, $i = 0, 1, \ldots, k - 1$, are given (either taken from an abridged life table or from a given set of formulas for computing $q_i$), the survival functions $l_i = l(x_i)$, $i = 1, 2, \ldots, k + 1$, are obtained from

$$l_i = l_0 \prod_{j=0}^{i-1} (1 - q_j), \tag{1}$$

where $l(0) = l_0$ is the arbitrary radix. Note that $q_k = 1$, because everyone must eventually die, so $l_{k+1} = 0$. Equation (1) implies that knowing the $q_i$'s is equivalent to knowing the $l_i$'s. Henceforth we shall use $\{l_i\}$ as the prescribed data to construct life tables.
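The equivalence between the $q_i$'s and the $l_i$'s stated by Equation (1) can be sketched as a running product and its inverse. The function names and the short mortality schedule below are hypothetical illustrations, not data from the article (only the first value is chosen so that a radix of 100,000 gives 98,909 survivors at age 1).

```python
def survivors(q, radix=100_000.0):
    """Return [l_0, l_1, ..., l_{k+1}] from [q_0, ..., q_k] via Equation (1)."""
    l = [float(radix)]
    for q_j in q:
        if not 0.0 <= q_j <= 1.0:
            raise ValueError("each q_j must lie in [0, 1]")
        # l_{j+1} = l_j * (1 - q_j), i.e. the running product in Equation (1)
        l.append(l[-1] * (1.0 - q_j))
    return l


def probabilities(l):
    """Recover q_j = 1 - l_{j+1} / l_j from consecutive survivor counts."""
    return [1.0 - b / a for a, b in zip(l, l[1:])]


# Hypothetical schedule; the final q equals 1 so that the table closes at zero.
q = [0.01091, 0.0020, 0.0015, 1.0]
l = survivors(q)
print(l)
print(probabilities(l))
```

Setting the last probability to 1 reproduces the closing condition $l_{k+1} = 0$ noted in the text.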

2.2. METHOD FOR THE LAST AGE INTERVAL

The Gompertz law of mortality has proved useful as a model for fitting the lifetime distribution for the last open-ended age interval (see Hsieh [7]). Unlike other life distributions such as the Weibull, gamma, lognormal, and Pareto distributions, the Gompertz distribution possesses a desirable invariance property: Aside from the parameters, the functional form of the Gompertz distribution remains unchanged under the residual life transformation $X \to X \mid X > x$, for every $x > 0$. However, there are two drawbacks with the Gompertz model: (1) The hazard function of the Gompertz distribution does not tend to infinity at any finite age, implying that the lifespan is not finite under this model, and (2) the exponential hazard function of the Gompertz distribution tends to overestimate the force of mortality at advanced ages (from 80 to 105 years). To correct these two defects, we modify the differential equation in terms of the hazard function $h(x)$ for the Gompertz model, in which $h'(x)$ is proportional to $h(x)$ and which yields an exponential hazard function as solution, to the more general differential equation

$$h'(x) = \gamma\,[\alpha + \beta h(x) + h(x)^2], \tag{2}$$

with [...]

[...] (15)

where the $s_i$'s, as in (10), are obtained from solving (11) in conjunction with (12). For $x \in I_k$, an estimate of the death density function is given in Section 2.2 by (7). For $x$ in $[x_1, x_k)$, we divide (15) by (10) to get a spline estimate of the hazard function:

$$h(x) \doteq s_h(x) = s_f(x)/s_l(x). \tag{16}$$

For ages $x \in I_k$, the estimate of the hazard function is given in Section 2.2 by (4).
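As a rough illustration of estimating a hazard by spline differentiation, the sketch below builds a complete (clamped) cubic spline through given survivor values and evaluates $-s_l'(x)/s_l(x)$. Several things here are assumptions rather than the article's own formulas: the knot slopes are obtained from the standard tridiagonal system for a cubic spline with prescribed end slopes, the end slopes are simply passed in as arguments (the article's own end conditions are not reproduced here), the hazard uses the standard identity $h(x) = -l'(x)/l(x)$, and the data are synthetic exponential survivors. All function names are hypothetical.

```python
import bisect
import math


def complete_spline_slopes(x, y, slope_first, slope_last):
    """Knot slopes of the cubic spline through (x_i, y_i) with given end slopes."""
    n = len(x)
    s = [0.0] * n
    s[0], s[-1] = slope_first, slope_last
    m = n - 2  # number of interior (unknown) slopes
    if m <= 0:
        return s
    a, b, c, r = [0.0] * m, [0.0] * m, [0.0] * m, [0.0] * m
    for j in range(m):
        i = j + 1
        hl, hr = x[i] - x[i - 1], x[i + 1] - x[i]
        # Continuity of the second derivative at the interior knot x_i
        a[j], b[j], c[j] = hr, 2.0 * (hl + hr), hl
        r[j] = 3.0 * (hr * (y[i] - y[i - 1]) / hl + hl * (y[i + 1] - y[i]) / hr)
    r[0] -= a[0] * s[0]      # move the known first slope to the right-hand side
    r[-1] -= c[-1] * s[-1]   # move the known last slope to the right-hand side
    for j in range(1, m):    # forward elimination (Thomas algorithm)
        w = a[j] / b[j - 1]
        b[j] -= w * c[j - 1]
        r[j] -= w * r[j - 1]
    s[m] = r[m - 1] / b[m - 1]
    for j in range(m - 2, -1, -1):  # back substitution
        s[j + 1] = (r[j] - c[j] * s[j + 2]) / b[j]
    return s


def spline_value_and_slope(x, y, s, t):
    """Value and first derivative at t of the piecewise cubic with knot slopes s."""
    i = min(max(bisect.bisect_right(x, t) - 1, 0), len(x) - 2)
    h = x[i + 1] - x[i]
    u = x[i + 1] - t  # distance from t to the right-hand knot
    c2 = ((s[i] + 2.0 * s[i + 1]) * h + 3.0 * (y[i] - y[i + 1])) / h**2
    c3 = ((s[i] + s[i + 1]) * h + 2.0 * (y[i] - y[i + 1])) / h**3
    value = y[i + 1] - s[i + 1] * u + c2 * u**2 - c3 * u**3
    slope = s[i + 1] - 2.0 * c2 * u + 3.0 * c3 * u**2
    return value, slope


def spline_hazard(x, y, s, t):
    """Hazard estimate -l'(t) / l(t) from the spline (standard identity, assumed)."""
    value, slope = spline_value_and_slope(x, y, s, t)
    return -slope / value


# Synthetic check: exponential survivors have a constant hazard equal to `rate`.
rate = 0.01
knots = [1.0, 5.0, 10.0, 15.0, 20.0]
l_vals = [100_000.0 * math.exp(-rate * t) for t in knots]
slopes = complete_spline_slopes(knots, l_vals, -rate * l_vals[0], -rate * l_vals[-1])
print(spline_hazard(knots, l_vals, slopes, 12.3))  # close to 0.01
```

Because the end slopes enter the system directly, a poor choice of end conditions propagates into every interior slope, which is one way to see why the end conditions matter for the derivative-based estimates.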


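Since differentiation of a spline results in a spline (of one lower order) that still possesses optimum approximation properties, the use of the spline method of differentiation, unlike other numerical differentiation procedures, retains the accuracy of the estimates of both death density and hazard functions.

To obtain estimates of $T(x)$ for any $x \in [x_i, x_{i+1})$, $i = 1, \ldots, k - 1$, we need the integral $L[x, x_{i+1}] \doteq \int_x^{x_{i+1}} s_l(t)\,dt$. This is obtained by integrating (10) from $x$ to $x_{i+1}$, yielding

$$L[x, x_{i+1}] \doteq -\frac{(x_{i+1}-x)^4}{4\Delta_i^3}\bigl[(s_i + s_{i+1})\Delta_i + 2(l_i - l_{i+1})\bigr] + \frac{(x_{i+1}-x)^3}{3\Delta_i^2}\bigl[(s_i + 2s_{i+1})\Delta_i + 3(l_i - l_{i+1})\bigr] - \frac{(x_{i+1}-x)^2 s_{i+1}}{2} + l_{i+1}(x_{i+1}-x). \tag{17}$$

When $x = x_i$, (17) reduces to

$$L_i = L[x_i, x_{i+1}] \doteq \frac{\Delta_i (l_i + l_{i+1})}{2} + \frac{\Delta_i^2 (s_i - s_{i+1})}{12}, \tag{18}$$

so that

$$T(x_i) = \int_{x_i}^{\omega} l(t)\,dt \doteq \int_{x_i}^{x_k} s_l(t)\,dt + \int_{x_k}^{\omega} l(t)\,dt = \sum_{j=i}^{k-1} L_j + T(x_k), \tag{19}$$

where $T(x_k)$ is given by (9). The tail person-years integral $T(x)$ for $x \in I_i$ can now be obtained by adding $L[x, x_{i+1}]$ to $T(x_{i+1})$ given by (19):

$$T(x) = L[x, x_{i+1}] + T(x_{i+1}). \tag{20}$$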

Since integration of a spline results in a spline (of one higher order) that still possesses optimum approximation properties, formula (18) is more accurate than other existing numerical solutions or life-table methods of estimation for $L_i$, and so is (20). For ages $x$ beyond $x_k$, the estimate of the person-years lived beyond $x$ is given by (8).
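The integration step can be sketched numerically as follows. This is a hedged illustration that assumes the spline piece on each interval is the cubic determined by the knot values $l_i$, $l_{i+1}$ and knot slopes $s_i$, $s_{i+1}$; the function names are hypothetical, and the person-years beyond the last knot are passed in as a number rather than computed from the mortality law of Section 2.2.

```python
def partial_person_years(x_i, x_next, l_i, l_next, s_i, s_next, x):
    """Integral of the cubic piece from x to x_{i+1}, i.e. L[x, x_{i+1}]."""
    delta = x_next - x_i
    u = x_next - x  # distance from x to the right-hand knot
    c2 = ((s_i + 2.0 * s_next) * delta + 3.0 * (l_i - l_next)) / delta**2
    c3 = ((s_i + s_next) * delta + 2.0 * (l_i - l_next)) / delta**3
    return l_next * u - s_next * u**2 / 2.0 + c2 * u**3 / 3.0 - c3 * u**4 / 4.0


def interval_person_years(delta, l_i, l_next, s_i, s_next):
    """Whole-interval case x = x_i: trapezoid term plus a slope correction."""
    return delta * (l_i + l_next) / 2.0 + delta**2 * (s_i - s_next) / 12.0


def tail_person_years(x, knots, l, s, tail_at_last_knot):
    """T(x) for knots[0] <= x < knots[-1], given T at the last knot."""
    i = max(j for j in range(len(knots) - 1) if knots[j] <= x)
    total = partial_person_years(
        knots[i], knots[i + 1], l[i], l[i + 1], s[i], s[i + 1], x
    )
    # Add the whole intervals between x_{i+1} and the last knot
    for j in range(i + 1, len(knots) - 1):
        total += interval_person_years(
            knots[j + 1] - knots[j], l[j], l[j + 1], s[j], s[j + 1]
        )
    return total + tail_at_last_knot


# Sanity check on a curve whose integral is known exactly: y = t**3.
knots = [1.0, 3.0, 4.0]
vals = [t**3 for t in knots]
slps = [3.0 * t**2 for t in knots]
print(tail_person_years(2.0, knots, vals, slps, 0.0))  # integral of t**3 from 2 to 4
```

A cubic test curve is used because a cubic piece defined by knot values and slopes reproduces it exactly, so any discrepancy would point to an error in the coefficients rather than to approximation error.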


With the tail person-year integral $T(x)$ computed as above, the life expectancy (mean residual life function) is obtained by

$$e(x) = T(x)/l(x), \tag{21}$$

where for $x \in [x_1, x_k)$, $l(x) \doteq s_l(x)$ and $T(x)$ are given by (10) and (20), respectively, and for $x \in I_k$, $l(x)$ and $T(x)$ are given by (6) and (8), respectively.

For $0 \le x < y$, the stationary population or person-years lived between ages $x$ and $y$, $L[x, y]$, can be obtained by subtracting $T(y)$ from $T(x)$ given in (8) or (20) depending on whether $x$ (or $y$) is greater than $x_k$ or not, namely,

$$L[x, y] = T(x) - T(y). \tag{22}$$
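Formulas (21) and (22) amount to one division and one subtraction. The short sketch below applies them to the survivor counts and person-year totals that Table 1 lists for ages 0, 1, and 2; the dictionary layout and function names are hypothetical.

```python
# l(x) and T(x) at ages 0, 1, 2 as listed in Table 1 (Canadian males, 1980-1982)
l = {0: 100_000, 1: 98_909, 2: 98_790}
T = {0: 7_186_696, 1: 7_087_662, 2: 6_988_997}


def life_expectancy(x):
    """e(x) = T(x) / l(x), formula (21)."""
    return T[x] / l[x]


def person_years_between(x, y):
    """L[x, y] = T(x) - T(y), formula (22)."""
    return T[x] - T[y]


for age in (0, 1, 2):
    print(age, round(life_expectancy(age), 2))
print(person_years_between(0, 1))
```

The rounded expectancies agree with the estimated life expectancy column of Table 1 at these ages.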

For $0 < x < y < z$, the generalized conditional mean lifetime, $e[x; y, z]$, which represents the mean number of years lived in $[y, z]$ for a person alive at age $x$, is obtained by dividing $L[y, z]$ given in (22) by $l(x)$ given in (6) or (10) depending on whether $x$ is greater than $x_k$ or not, namely,

$$e[x; y, z] = L[y, z]/l(x). \tag{23}$$

If $y = x$, then (23) reduces to the well-known Markov mean residence time. When $y = x$ and $z = \omega$, then (23) reduces to the usual life expectancy $e(x)$ of (21). The generalized mortality probabilities $q[x; y, z]$ and mean lifetimes $e[x; y, z]$ are useful life-table functions that have not received sufficient attention. They would provide additional instruments for mortality analysis.

3.2. COMPLETE AND ABRIDGED LIFE TABLES AS SPECIAL CASES

An abridged life table is characterized by the partition of the entire age span into $k + 1$ (normally equal to 19 or 20) intervals according to the given mesh described in Section 1. In a complete life table, all age intervals are of one-year length, so that the age division points are the positive integers. These two existing discrete life tables are special cases of the continuous life table introduced in this article. Estimates of the abridged life-table functions are obtained by substituting the $x_i$ values for the exact ages $x$ in the estimation formulas for the continuous life-table functions given in Section 3.1 for $x \le x_k$ only. (Estimation formulas for $x > x_k$ are not required for the construction of abridged life tables.) Estimates of the complete life-table functions are obtained by substituting appropriate integer values of $x$ into the estimation formulas for the continuous life-table functions in Section 3.1. For the abridged life table, the estimation formulas for $l(x)$ and $f(x)$ would reduce to the relevant prescribed data and/or spline parameters at the age division points. The resulting estimation formulas, which take simpler forms for abridged and complete life tables, are described below.

When $x = x_i$, the equation for the estimate of the survival function, Equation (10), reduces to the abridged life-table values $s_l(x_i) = l_i$, $i = 1, \ldots, k$. When $z = x + 1$ and $y = x$ is a nonnegative integer, Equations (13) for life-table deaths and (14) for generalized mortality probability would reduce to the conventional complete life-table functions $d_x = l(x) - l(x + 1)$ and $q_x = 1 - l(x + 1)/l(x)$, respectively. Furthermore, if $z = x_{i+1}$ and $x = y = x_i$, then (14) and (13) reduce to the prescribed data $q_i$ and the conventional abridged life-table functions $d_i = l_i q_i = l_i - l_{i+1}$. For the last age interval ($i = k$), we have $q_k = 1$, so that $l_{k+1} = 0$ and $d_k = l_k$ for the abridged life table.

When $x = x_i$, the equation for the estimate of the death density, Equation (15), reduces to the abridged life-table values $f(x_i) \doteq s_f(x_i) = -s_i/l_0$, and Equation (16) for the estimate of the hazard function reduces to the abridged life-table values $h(x_i) \doteq s_h(x_i) = -s_i/l_i$.

For the abridged life table, the estimate of $L_i$, the person-years lived in the age interval $I_i$, is given by (18), and the estimate of $T(x_i)$, the person-years lived beyond age $x_i$ for $i = 1, \ldots, k$, is given by (19). The life expectancy at age $x_i$, $e(x_i)$, $i = 1, 2, \ldots, k$, is then estimated by dividing (19) by $l_i$ to get $e(x_i) = T(x_i)/l_i$.

TABLE 1
An Extract of a Continuous Life Table for Canadian Males 1980-1982

Columns: (1) Age $x_j$; (2) Survival function $l(x_j)$; (3) Life table deaths $d[x_j, x_{j+1}]$; (4) Proportion dying in interval ($\times 10^5$); (5) Death density $10^5 f(x_j)$; (6) Hazard function $10^5 h(x_j)$; (7) Stationary population in interval $L[x_j, x_{j+1}]$; (8) Stationary population above $x_j$, $T(x_j)$; (9) Estimated life expectancy $e(x_j)$; (10) Official life expectancy (a).

[Table body not recoverable.]

(a) Source: Statistics Canada, Life Tables, Canada and Provinces, 1980-1982, Catalogue 84-532, Ottawa, 1984. Official life table figures not available for fractional ages.

4. AN EXAMPLE

The method just presented has been used to construct (expanded) complete and continuous life tables for Canadian populations for various time periods using the $l_i$, $i = 1, \ldots, 18$, values derived from the abridged life-table method of Hsieh [7]. Table 1 displays some specimen entries of a continuous life table. Note that different continuous life tables can be obtained by changing the way the age span is partitioned, with accompanying changes in the values of the life-table functions. To obtain the values of the life-table columns in the table, the spline methods of Section 2.3 (for ages less than $x_{18} = 85$) and the procedure of fitting the mortality law described in Section 2.2 (for ages greater than 85) were employed to estimate $l(x)$ (using (10) for $x \in (1, 85)$ and (6) for $x \in [85, \omega)$), $f(x)$ (using formula (15) for $x \in (1, 85)$ and (7) for $x \in [85, \omega)$), $h(x)$ (using (16) for $x \in (1, 85)$ and (4) for $x \in [85, \omega)$), and $T(x)$ (using (17), (19), and (20) for $x \in (1, 85)$ and (8) for $x \in [85, \omega)$) from the prescribed $l_i$ values given in Hsieh [7]. The remaining life-table functions are computed directly from these basic functions, namely $d[x, y]$ from (13), $q[x; y, z]$ from (14), $L[x, y]$ from (22), and $e[x; y, z]$ from (23). The two set functions $d[x, y]$ and $L[x, y]$ are given in columns 3 and 7, respectively, of Table 1. For $x = y < z$, the set function $q[x; y, z]$ is given in column 4 of Table 1.


For $x < y < z$, the two set functions $q[x; y, z]$ and $e[x; y, z]$ are not given in the life table (because of the large volume of values these functions produce and the limited space), but they can be directly calculated from the point functions $l(x)$ and $T(x)$. For example, from Table 1 we have

$$q[40.2; 44.5, 54.7] = \frac{93{,}620 - 88{,}099}{94{,}705} = 0.0583$$

and

$$e[40.2; 44.5, 54.7] = \frac{2{,}868{,}643 - 1{,}937{,}340}{94{,}705} = 9.833 \text{ years},$$

giving, respectively, the probability of dying and the average number of years lived, between the ages of 44.5 and 54.7 for a person alive at age 40.2 randomly chosen from the stationary population for Canadian males based on the 1980-1982 mortality schedule. A comparison of the complete life-table values of the official Canadian life table [9] with the corresponding values constructed by the present method shows only slight differences in all life-table columns. For example, the life expectancy column differs invariably in the second decimal place (see columns 9 and 10 of Table 1). For fractional ages, both the life-table functions displayed in Table 1 and the generalized life-table functions that can be calculated therefrom are new products. No published results are available to make comparisons with.
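The worked example can be retraced with two small helper functions. The sketch follows the arithmetic shown in the example (a difference of survivor counts, or of person-year totals, divided by the survivors at the conditioning age); the function names are hypothetical, and $l(44.5)$ is read as 93,620.

```python
def generalized_death_probability(l_x, l_y, l_z):
    """q[x; y, z]: probability of dying between ages y and z for a person alive at x."""
    return (l_y - l_z) / l_x


def generalized_mean_lifetime(l_x, T_y, T_z):
    """e[x; y, z]: mean years lived between ages y and z for a person alive at x."""
    return (T_y - T_z) / l_x


# Values quoted in the example for x = 40.2, y = 44.5, z = 54.7
l_402, l_445, l_547 = 94_705, 93_620, 88_099
T_445, T_547 = 2_868_643, 1_937_340

q = generalized_death_probability(l_402, l_445, l_547)
e = generalized_mean_lifetime(l_402, T_445, T_547)
print(round(q, 4))  # probability of dying between 44.5 and 54.7
print(e)            # about 9.83 years
```

Setting `l_x` equal to `l_y` in the first function gives the ordinary conditional probability of dying in an interval, which is the special case tabulated in column 4 of Table 1.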

5. TESTS OF ACCURACY AND CONCLUDING REMARKS

I have performed detailed tests of accuracy of the proposed life-table method using the procedures described in Hsieh [7] and found that my continuous life-table method is as accurate as my abridged life-table method, which was compared with and found to be more accurate than other existing life-table methods in Hsieh [7]. To demonstrate the importance of accurate determination of the end conditions for spline interpolation, differentiation, and integration, I have also compared the complete spline interpolant with the natural spline interpolant and found that the former performed considerably better than the latter. [The details are omitted here and are available upon request.] It should be pointed out that the main goal of this paper is not so much to improve the accuracy of the life-table functions, but rather to be able (1) to cover the extreme age intervals, (2) to produce life-table function values at fractional ages, and (3) to calculate nonconventional life-table functions. It is also hoped that the mortality law derived in Section 2.2 will be useful in other areas of research.


Of all the formulas developed for the construction of continuous life tables in this paper, the key ones are (4), (6), (8)-(10), (12), (15), and (17).

Besides the life-table functions studied in this article, other useful functions that can be accurately estimated by the spline method include generalized survival functions of orders greater than 2. Just as the second-order survival function $T(x)$, which is the tail integral of the first-order survival function $l(x)$, can be accurately estimated by integrating the complete cubic spline interpolating to the $l_i$ values, so can the third-order survival function $Y(x)$, which is the tail integral of the second-order survival function $T(x)$, be accurately estimated by integrating the complete cubic spline interpolating to the $T_i = T(x_i)$ values, and so on. $Y(x)$ represents stationary forward or backward recurrence times in renewal theory (see Feller [3], pp. 368-372) and is useful for calculating the average age at death of any segment of the stationary population and for estimating the standard errors of the estimate of $e(x)$ (see Hsieh [5]).

In a different direction, the new method can be extended to construct competing-risk life tables, which are concerned with estimation of net and partial-crude probabilities and other life-table functions derived from them (see Hsieh [6]), as well as increment-decrement (multistate) life tables. These extensions as well as the new life-table functions mentioned in the last paragraph will be dealt with further in a separate paper.

This research was supported by Natural Sciences and Engineering Research Council of Canada operating grant OGP0009253. The author wishes to thank Jian Li for his assistance in testing the continuous life-table method, Nathan Keyfitz who read an earlier draft of this paper, and the anonymous referees for their helpful comments.

REFERENCES

1. J. H. Ahlberg, E. H. Nilson, and J. L. Walsh, The Theory of Splines and Their Applications, Academic, New York, 1967.
2. C. de Boor, A Practical Guide to Splines, Springer-Verlag, New York, 1978.
3. W. Feller, An Introduction to Probability Theory and Its Applications, Vol. 2, Wiley, New York, 1971, pp. 368-372.
4. J. J. Hsieh, Use of cardinal splines in the construction of life tables, Proc. Comput. Sci. Stat. 12:327-331 (1979).
5. J. J. Hsieh, Construction of expanded infant life tables: a method based on a new mortality law, Math. Biosci. 76:221-242 (1985).
6. J. J. Hsieh, A probabilistic approach to the construction of competing-risk life tables, Biomet. J. 31:339-357 (1989).
7. J. J. Hsieh, A general theory of life table construction and a precise life table method, Biomet. J. 32 (forthcoming).
8. L. L. Schumaker, Spline Functions, Wiley, New York, 1981.
9. Statistics Canada, Life Tables, Canada and Provinces, 1970-72 and 1980-82, Authority of the Ministry of Industry, Trade and Commerce, Ottawa, 1974, 1984.
