The Science of the Total Environment, 5 (1976) 139-169 © Elsevier Scientific Publishing Company, Amsterdam - Printed in Belgium

MEASURES OF ASSOCIATION OF SOME AIR POLLUTANTS, NATURAL IONIZING RADIATION AND CIGARETTE SMOKING WITH MORTALITY RATES *

RICHARD C. SCHWING
Societal Analysis Department, General Motors Research Laboratories, General Motors Corp., Warren, Mich. 48090 (U.S.A.)

GARY C. McDONALD
Mathematics Department, General Motors Research Laboratories, General Motors Corp., Warren, Mich. 48090 (U.S.A.)

(Received June 2[?]th, 19[??])

ABSTRACT

Two methods are employed to estimate the association of hydrocarbons, sulfur compounds, nitrogen compounds, natural ionizing radiation, and cigarette smoking with some age stratified and disease specific United States mortality rates for white males. The first method is based on a ridge regression technique and the second on a sign constrained least squares analysis. The measures of association between these environmental factors and mortality are quantified as elasticities; i.e., the indicated percentage change in the average mortality rate corresponding to a 1% change in the average level of the environmental factor. Elasticities are estimated for age specific and disease specific mortality rates, and these values are then aggregated and compared to estimates corresponding to total mortality rates. Overall, consistent results are obtained using the above methods for sulfur compounds and cigarette smoking. Many of these results differ considerably from corresponding results obtained from the ordinary least squares regression analysis, highlighting the need for applying the appropriate estimation methods. In addition to the variables already specified, these analyses take into consideration the following groups of explanatory variables:
Climate: Precipitation, January temperature, July temperature, humidity, and solar radiation.
Socioeconomic: Age, education, sound housing, population per household, population density, % non-white, % white collar, income, and city size.

* Presented at the Int. Symp. Recent Advan. Assessment Health Effects Environ. Pollut., Paris, France, 24 to 28 June 1974.


1. INTRODUCTION

The purpose of this study is to estimate the association between certain air pollutants and specific mortality rates. Other multiple regression analyses on large data banks of mortality rates and pollution measures have indicated an association between high environmental pollution levels and increased mortality. Lave and Seskin, Hickey et al. and Carnow and Meier are among several authors who have recently studied the chronic health effects of pollution by means of multivariate regression analysis including as explanatory variables a number of interdependent urban factors which affect health. Neyman has illustrated pitfalls in multipollutant, multilocality studies. Specifically, he pinpoints the problem of incomplete comprehensiveness of the set of pollutants studied and the subsequent results in the following statement: "if the study involves a certain number s of pollutants, say P1, P2, ..., Ps, but neglects another pollutant Px, that happens to be important, the conclusions regarding P1, P2, ..., Ps, suggested by even very highly significant findings, may be completely misleading". To address this criticism of previous studies, we have included in this study a rather broad (but still incomplete) list of explanatory variables. Both gaseous and particulate forms of pollutants from both mobile and stationary sources are included in this study along with natural ionizing radiation, smoking, four climate and ten socioeconomic variables. Rienke has pointed out that in many studies where data are not obtained from a well-designed or controlled experiment, as is the case in air pollution studies involving socioeconomic, climate and other uncontrolled variables, non-orthogonality requires that estimation of individual effects be handled by techniques other than ordinary least squares. He suggested that ridge regression, as first described by Hoerl, provides a promising method for avoiding distortion due to non-orthogonality.

Recently, McDonald and Schwing have provided an example illustrating the differences between ridge regression estimates and least squares regression estimates of the coefficients in a model relating air pollution to a total mortality rate. This paper is primarily concerned with point estimates of the association of several indices of mortality with pollution based on a multiple linear regression model containing 23 explanatory variables. A description of the variables is contained in Section 2. Least squares, ridge regression and sign constrained methodologies are described in Section 3. Four results sections follow: Section 4.1, Disease Specific Results; Section 4.2, Age Specific Results; Section 4.3, Age Specific Results for the Lung Cancer and Arteriosclerotic Heart Disease Categories; and Aggregation of Associations, Section 4.4. The procedures of estimation and the estimates themselves are discussed in Section 5.

It is appropriate to emphasize that any regression model has numerous limitations. A linear model, as used in this study, can be considered as only an approximation to a more complicated underlying model. This approximation has greater validity in the neighborhood of the point where each variable assumes its mean value. Further, in any epidemiological study such as this, the explanatory variables cannot be controlled and thus may be associated in unknown ways with other hidden but influential variables. Averaged explanatory variables and aggregated response variables do not completely define the exposure patterns and mortality patterns of the large populations in our study communities. Migration habits, diagnostic biases, exposure histories, synergisms and personal and dietary habits have not been quantified and therefore are not usually included in global studies of this type. Most important, correlations or associations between environmental factors and health effects do not prove causation. One can only infer the degree of association (see A. B. Hill). Nevertheless, the model is frequently used as a tool in quantifying the association of certain environmental factors with health indices such as mortality rates.

2. DESCRIPTION OF VARIABLES CONSIDERED

To refine our estimates of the associations of certain environmental factors with health, three groups of mortality data for the years 1959-1961 from Duffy and Carroll (ref. 6) are studied as dependent variables. The three groups of white male mortality rates with summary statistics are given in Table 1. Briefly, they are:
(1) The highest fifteen specific white male disease categories which make up the bulk (65%) of the total white male mortality rate.
(2) Total white male and age stratified white male mortality rates.
(3) Age stratified white male mortality rates due to arteriosclerotic heart and coronary and age stratified mortality due to lung cancer, the largest and third largest white male disease categories, respectively.
Preference is given to using the white male category of the mortality rates given in ref. 6 since these rates are usually based on large populations with a large number of deaths, and are thought to represent a wide exposure to the various environmental conditions. A total of 23 explanatory or "independent" variables are used in this study.
These socioeconomic, climate, pollution, cigarette smoking, and natural ionizing radiation variables are described in Table 2. Summary statistics of the explanatory variables are contained in Table 2 for the 46 Standard Metropolitan Statistical Areas (SMSA's) in this sample. Many of these explanatory variables tend to be highly correlated. The degree of correlation is particularly high among pollution variables, in part because prevailing weather factors often determine whether or not pollutants emitted into a community accumulate or disperse. For example, the correlation coefficient between some pollutants is as high as 0.877. It is believed that the associations are affected not only by the long-time average concentration of a pollutant but also by its physical and chemical form as well as by its pattern of short-time variation. Consequently, to investigate the possible effects of such pattern variation, the mean and minimum of some of the pollutants are included as separate variables in the regression analyses. Difficulties in fitting the linear model which arise as a result of high correlations and possible [...]





For k > 0, the estimator β̂(k) is biased; however, the total variance, i.e., the sum of the variances of the individual coefficients, decreases as k increases. It has been shown by Hoerl and Kennard that there does exist a value of the ridge parameter, say k', such that the distance between β̂(k') and the unknown coefficient vector β is less than the corresponding distance between the least squares estimator β̂ and β. An explicit method for determining this k' is not available, so in our study we employ the somewhat subjective criterion of stability of a ridge trace. The ridge trace is a plot of each of the coefficients β̂(k) versus the parameter k. This trace is then used to obtain a minimum value of k such that the coefficients depict small changes in a neighborhood about this point. In other words, a minimum value of k is chosen, call it k*, so that large coefficient changes, sign reversals and criss-crossing are primarily confined to the interval (0, k*). In the forty disease cases analyzed in this study, the values of k* ranged from 0.15 to 0.20 with most values occurring near the 0.20 value. To facilitate the communication of our results, we have uniformly adopted the value of 0.2 and report the elasticity estimates based on β̂(0.2) in Section 4. These estimates we believe to be more appropriately suited for estimation of individual contributions than their least squares counterparts.

As a third method of arriving at point estimates of regression coefficients, the residual sum of squares was minimized subject to inequality constraints on the seven explanatory pollutant variables: hydrocarbon potential, sulfur dioxide potential, NO2, minimum sulfate, mean sulfate, minimum nitrate and mean nitrate (vars. 12, 13, 17, 19, 20, 22 and 23, respectively). These coefficient estimates were constrained to be non-negative. In all cases, the unconstrained least squares solution violated one or more of the constraints. Since the restricted solution must then lie on at least one of the violated boundaries, the data were fit subject to all 2^7 possible combinations of imposed restrictions, i.e. variable deletions.* The desired solution was then that equation yielding the smallest residual sum of squares for which the retained constrained coefficient estimates were non-negative. The restricted solution is biased, as is the ridge solution. This sign constrained technique is simply another means of addressing the problem of collinearity by eliminating certain variables from the final regression model.

4. RESULTS - POLLUTANTS, RADIATION AND SMOKING

The results from this study for nine of the explanatory variables are given in three groups of white male mortality rates: the fifteen high disease categories; the age stratified total mortality rates; and the age stratified lung cancer rates and age stratified arteriosclerotic heart and coronary rates. The nine explanatory variables of concern here are the seven pollution variables (vars. 12, 13, 17, 19, 20, 22, and 23), the natural ionizing radiation variable (var. 21), and the smoking index variable (var. 18). For each of the three groups, nine bar charts are given, one corresponding to each of the explanatory variables under consideration. These bar charts indicate which disease categories are associated with each of the explanatory variables and provide a measure of the extent of this association.

The "measure" presented in this section is an elasticity, i.e., the estimated percent change (either increase or decrease) in the average mortality rate of the given disease and/or age category corresponding to a one percent increase in the average of the given explanatory variable, holding all other variables fixed at their average values. The elasticity of a given mortality rate and explanatory variable is obtained using the regression coefficient estimates given in Appendix A. The elasticity is then computed by multiplying this estimate for the appropriate mortality rate and explanatory variable by the ratio of the coefficient of variation of the mortality rate to the coefficient of variation of the explanatory variable. The "coefficient of variation" is the ratio of the sample standard deviation to the sample mean; these can be computed from Tables 1 and 2. Each of the bars on the subsequent charts indicates two elasticity values, one based on the ridge regression estimates and one based on the sign constrained least squares estimates.

For each of the three groups, a table is given listing in order of decreasing elasticity the explanatory variables having estimated positive elasticity with the given mortality rate.
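The elasticity computation described above can be illustrated with a minimal Python sketch. It assumes that the coefficient estimates are expressed on standardized variables (which is what makes the coefficient-of-variation ratio the right conversion factor) and that the ridge estimate has the usual Hoerl and Kennard form (R + kI)^-1 r, where R is the correlation matrix of the explanatory variables and r their correlations with the mortality rate. Neither assumption is stated explicitly in the surviving text, and the function names and synthetic data are hypothetical.

```python
import numpy as np


def standardize(a):
    """Center each column and scale it to unit sample standard deviation."""
    a = np.asarray(a, dtype=float)
    return (a - a.mean(axis=0)) / a.std(axis=0, ddof=1)


def ridge_standardized(X, y, k):
    """Ridge coefficients on standardized data: solve (R + k I) b = r.

    Assumed form (Hoerl-Kennard); k = 0 gives standardized least squares.
    """
    Z, z = standardize(X), standardize(y)
    n, p = Z.shape
    R = Z.T @ Z / (n - 1)  # correlation matrix of the explanatory variables
    r = Z.T @ z / (n - 1)  # correlations of each variable with the response
    return np.linalg.solve(R + k * np.eye(p), r)


def elasticities(beta_std, X, y):
    """Standardized coefficient times CV(response) / CV(explanatory variable)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    cv_x = X.std(axis=0, ddof=1) / X.mean(axis=0)
    cv_y = y.std(ddof=1) / y.mean()
    return beta_std * cv_y / cv_x


# Synthetic illustration only: 46 hypothetical areas, 3 explanatory variables.
rng = np.random.default_rng(0)
X_demo = rng.normal(loc=10.0, scale=2.0, size=(46, 3))
y_demo = 50.0 + X_demo @ np.array([1.5, -0.5, 0.8]) + rng.normal(0.0, 1.0, 46)
print(elasticities(ridge_standardized(X_demo, y_demo, k=0.2), X_demo, y_demo))
```

With k = 0 the result reduces to the raw least squares slope multiplied by the ratio of the explanatory mean to the response mean, which is the usual definition of an elasticity evaluated at the means.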

* Although not employed in this study, an algorithm for solving problems of this type and a discussion of the statistical properties of the restricted least squares estimator is given by Ma[...]
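The exhaustive search used for the sign constrained estimates in Section 3 (fit every combination of deleted constrained variables, then keep the smallest residual sum of squares among fits whose retained constrained coefficients are non-negative) can be sketched as follows. This is an illustration under stated assumptions, not the authors' program: the function name, the inclusion of an intercept, and the synthetic data are all hypothetical.

```python
from itertools import combinations

import numpy as np


def sign_constrained_ls(X, y, constrained):
    """Least squares with non-negativity on the `constrained` columns.

    Enumerates all 2^m subsets of the m constrained columns. For each subset
    the remaining constrained columns are deleted, ordinary least squares
    (with an intercept, an assumption of this sketch) is fitted, and the fit
    is accepted only if every retained constrained coefficient is >= 0.
    Returns (rss, intercept, beta) for the accepted fit with the smallest RSS.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    free = [j for j in range(p) if j not in constrained]
    best = None
    for m in range(len(constrained) + 1):
        for kept in combinations(constrained, m):
            cols = free + list(kept)
            A = np.column_stack([np.ones(n), X[:, cols]])
            coef, *_ = np.linalg.lstsq(A, y, rcond=None)
            resid = y - A @ coef
            rss = float(resid @ resid)
            slopes = dict(zip(cols, coef[1:]))
            feasible = all(slopes[j] >= 0.0 for j in kept)
            if feasible and (best is None or rss < best[0]):
                beta = np.zeros(p)  # deleted variables keep a zero coefficient
                for j in cols:
                    beta[j] = slopes[j]
                best = (rss, float(coef[0]), beta)
    return best


# Synthetic illustration only: column 1 has a negative unconstrained slope.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(46, 3))
y_demo = 2.0 * X_demo[:, 0] - 1.0 * X_demo[:, 1] + rng.normal(0.0, 0.5, 46)
rss_demo, intercept_demo, beta_demo = sign_constrained_ls(X_demo, y_demo, constrained=[0, 1])
print(rss_demo, beta_demo)
```

The fit with every constrained variable deleted is always feasible, so the search always returns a solution. With seven constrained variables it requires 128 least squares fits, matching the 2^7 combinations mentioned in the text.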


Fig. 1. Association of explanatory variables with white male mortality rates in fifteen disease categories. 1 = arteriosclerotic heart and coronary; 2 = subarach. and cerebral hemorrhage; 3 = neoplasm lung, bronchus; 4 = cerebral embolism and thrombosis; 5 = influenza and pneumonia except newborn; 6 = endocarditis and myocardial degeneration; 7 = hypertensive heart; 8 = general arteriosclerosis; 9 = cirrhosis of liver; 10 = neoplasm large intestine; 11 = diabetes mellitus; 12 = neoplasm stomach; 13 = congenital malformations; 14 = emphysema; 15 = other vascular lesions. * Estimates are nearly equal.

[Fig. 1 (continued). Legend: Ridge; Constrained Least Squares.]

4.1 Fifteen high disease categories for white males

In Fig. 1, the elasticities based on the ridge estimates are indicated by the height of the open rectangles, and those based on the sign constrained least squares estimates are indicated by the height of the shaded rectangles. For example, with respect to the hydrocarbon variable (var. 12) and the sixth regression, the ridge estimate yields an elasticity of 0.047% and the sign constrained method yields a value of 0.153%. Instances where the elasticities, based on the two methods, are approximately equal are appropriately indicated on the charts. Table 3 lists in decreasing order the variables (out of the nine being considered here) which have an indicated positive association with the respective mortality rates. The upper entry corresponds to ridge estimates while the lower entry corresponds to the sign constrained estimates.

Except for regressions 6, 8, and 15, all of the estimates are based on 46 observations. For regression 6, endocarditis and myocardial degeneration, the Baltimore SMSA was deleted from the final analyses based on an examination of a plot of residuals versus predicted mortality rates. It is interesting to note that, based on the sample of 46 SMSA's, Baltimore has the largest death rate due to endocarditis and myocardial degeneration, namely 127.6; the next largest value for this rate is 59.15. For regression 8, general arteriosclerosis, the Toledo SMSA was deleted as a "wild" point based on a residual analysis. For this disease category, Toledo has a rate of 53.42 which is a maximum among the 46 SMSA's and substantially larger than the next largest value of 29.35. Finally, for regression 15, other vascular lesions, Grand Rapids and Milwaukee were also deleted in the final analysis based on analysis of residuals.
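The residual screening described above was done by inspecting plots of residuals versus predicted mortality rates. As a rough numerical stand-in, the sketch below fits ordinary least squares and flags observations whose scaled residual exceeds a threshold. The threshold of 3, the function name and the synthetic data are assumptions made for illustration; the paper does not state a numerical cut-off.

```python
import numpy as np


def flag_wild_points(X, y, threshold=3.0):
    """Fit OLS with an intercept and flag large scaled residuals.

    Returns (fitted, residuals, flagged), where `flagged` lists the indices
    whose residual, divided by the residual standard error, exceeds
    `threshold` in absolute value (a hypothetical cut-off).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = X.shape[0]
    A = np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    fitted = A @ coef
    resid = y - fitted
    dof = n - A.shape[1]                 # residual degrees of freedom
    sigma = np.sqrt(resid @ resid / dof)  # residual standard error
    scaled = resid / sigma
    flagged = [int(i) for i in np.flatnonzero(np.abs(scaled) > threshold)]
    return fitted, resid, flagged


# Synthetic illustration only: one area given an unusually high rate.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(46, 2))
y_demo = 30.0 + X_demo @ np.array([2.0, 1.0]) + rng.normal(0.0, 1.0, 46)
y_demo[7] += 40.0
print(flag_wild_points(X_demo, y_demo)[2])
```

A flagged observation is only a candidate for deletion; plotting `resid` against `fitted`, as the authors did, remains the more informative check.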
Although Columbus was not deleted from any of these studies, it was noted that its total mortality rates for white males were the highest in this sample for age categories 0-4 and 35-44, and lowest for the age category 15-24. Columbus also had the highest rate of hypertensive heart disease in this sample. These data, while not lying so far outside the range of our sample as to be considered outliers, represent an unusual data point.

There appear to be no substantive conclusions which can be stated and verified across all disease and/or explanatory variables considered. The specific results of this study, by their nature, must be presented and stated in a rather sharply defined context. However, there are some general observations which can be made with respect to the methods of analysis, the explanatory variables considered and the disease categories investigated. These observations and comments do, of course, have some exceptions in our results. In comparing the elasticities derived from the ridge method and the sign constrained method, the latter method usually magnifies the positive association of the diseases with the pollutants which may be indicated by the ridge method. With respect to the radiation and smoking variables, two variables which were not sign constrained, the elasticities of the diseases based on the sign constrained method are generally larger in magnitude than the ridge counterparts. There are a few sign discrepancies in elasticities, i.e., one method indicates a positive association [...]


