This article was downloaded by: [Ecole Hautes Etudes Commer-Montreal] On: 08 September 2015, At: 04:52 Publisher: Routledge Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: 5 Howick Place, London, SW1P 1WG

Multivariate Behavioral Research Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/hmbr20

Hypothesis Testing and Model Comparison in Two-level Structural Equation Models Sik-Yum Lee & Xin-Yuan Song Published online: 10 Jun 2010.

To cite this article: Sik-Yum Lee & Xin-Yuan Song (2001) Hypothesis Testing and Model Comparison in Two-level Structural Equation Models, Multivariate Behavioral Research, 36:4, 639-655, DOI: 10.1207/S15327906MBR3604_07 To link to this article: http://dx.doi.org/10.1207/S15327906MBR3604_07

PLEASE SCROLL DOWN FOR ARTICLE Taylor & Francis makes every effort to ensure the accuracy of all the information (the “Content”) contained in the publications on our platform. However, Taylor & Francis, our agents, and our licensors make no representations or warranties whatsoever as to the accuracy, completeness, or suitability for any purpose of the Content. Any opinions and views expressed in this publication are the opinions and views of the authors, and are not the views of or endorsed by Taylor & Francis. The accuracy of the Content should not be relied upon and should be independently verified with primary sources of information. Taylor and Francis shall not be liable for any losses, actions, claims, proceedings, demands, costs, expenses, damages, and other liabilities whatsoever or howsoever caused arising directly or indirectly in connection with, in relation to or arising out of the use of the Content. This article may be used for research, teaching, and private study purposes. Any substantial or systematic reproduction, redistribution, reselling, loan,

Downloaded by [Ecole Hautes Etudes Commer-Montreal] at 04:52 08 September 2015

sub-licensing, systematic supply, or distribution in any form to anyone is expressly forbidden. Terms & Conditions of access and use can be found at http://www.tandfonline.com/page/terms-and-conditions

Multivariate Behavioral Research, 36 (4), 639-655 Copyright © 2001, Lawrence Erlbaum Associates, Inc.

Hypothesis Testing and Model Comparison in Two-level Structural Equation Models Downloaded by [Ecole Hautes Etudes Commer-Montreal] at 04:52 08 September 2015

Sik-Yum Lee and Xin-Yuan Song Department of Statistics The Chinese University of Hong Kong

One basic and important problem in two-level structural equation modeling is to find a good model for the observed sample data. This article demonstrates the use of the well-known Bayes factor in the Bayesian literature for hypothesis testing and model comparison in general two-level structural equation models. It is shown that the proposed methodology is flexible, and can be applied to situations with a wide variety of nonnested models. Moreover, some problems encountered in using existing methods for goodness-of-fit assessment of the proposed model can be alleviated. An illustrative example with some real data from an AIDS care study is presented.

Introduction Traditionally, many statistical methods assume identically and independently distributed observations. However, in practice, it is often to encounter hierarchically structured data which are collected from individuals that are nested within larger clusters (see, e.g., Aitkin & Longford, 1986; Goldstein, 1986; Goldstein & McDonald, 1988; among others). Examples might well be drawing of random samples of students from within random samples of classes or schools; or individuals from within random samples of families, etcetera. For this kind of multilevel data, the assumption of independence is not realistic, because individuals within a cluster (or a group) are expected to share certain common influential factors and hence produce correlated observations. Analyzing multilevel data as a single random sample can lead to erroneous results. Hence, a lot of attention has been devoted to develop models and methods that take account of the hierarchical This research was supported by a Hong Kong RGC Earmarked Grant CUHK 4088/ 99H. The authors are grateful to the Editor and two reviewers for many helpful comments. We are also indebted to D. E. Morisky and J. A. Stein for the use of their AIDS data in our example. The assistance of Michael K. H. Leung and Esther L. S. Tam is gratefully acknowledged. Correspondence concerning this article should be addressed to Prof. S. Y. Lee, Department of Statistics, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong, E-mail: [email protected]. MULTIVARIATE BEHAVIORAL RESEARCH

639

Downloaded by [Ecole Hautes Etudes Commer-Montreal] at 04:52 08 September 2015

S.-Y. Lee and X.-Y. Song

structure (see, e.g., the above listed citations, de Leeuw & Kreft, 1986; Raudenbush, 1995; Raudenbush & Bryk, 1988; Thum, 1997; among others). Structural equation modeling is a widely used multivariate method in assessing causations and correlations among observed and latent variables. The need for more general models that take into account the correlated structure of multilevel data is well recognized; see for example, McDonald and Goldstein (1989), Lee (1990), and McDonald (1993) for some theoretical developments; Muthén (1991, 1994), Lee and Poon (1992), Longford and Muthén (1992), Raudenbush (1995), and Lee and Tsang (1999) for developments of various computational algorithms. One important statistical inference beyond estimation is related to testing of various hypotheses about the model; for example, the goodness-of-fit of the posited model to the sample data. In the field of structural equation modeling, a common approach in hypothesis testing is to use the significance tests on the basis of P-values that are determined by some asymptotic distributions of the test statistics. As pointed out in the literature (see, e.g., Berger & Delampady, 1987; Berger & Sellke, 1987; Kass & Raftery, 1995) there are difficulties with such an approach. Some of those related to structural equation modeling are as follows. 1. Tests on the basis of P-values tend to reject the null hypothesis frequently with large sample sizes; see Raftery (1986) for a concrete example. Various descriptive fit indexes, such as the well-known normed or non-normed fit indexes (Bentler & Bonett, 1980) and the comparative fit index (Bentler, 1992) have been proposed as complementary measures for the goodness-of-fit of the model. Very often, the values of the fit indexes are over 0.95, but the P-values of the ␹2-test are less that 0.01. Under these situations, conclusions drawn from these two testing methods seem contradictory. 2. The P-value of a significance tests in hypothesis testing is a measure of evidence against the null model, not a means of supporting/proving the model. 3. The significance tests as well as the descriptive fit indexes mentioned above cannot be applied to test nonnested hypotheses or to compare nonnested models. In addition to the above problems relating to standard structural equation model (SEM) with independently and identically distributed (i.i.d.) observations, we encounter one further difficulty in hypothesis testing with the more complex multilevel models. The goodness-of-fit test as well as the fit indexes are all based on some test statistics which are claimed to have a central or a non-central ␹2 distribution with certain degrees of freedom (see, Bentler, 1992). The fundamental background of these test statistics is based 640

MULTIVARIATE BEHAVIORAL RESEARCH

Downloaded by [Ecole Hautes Etudes Commer-Montreal] at 04:52 08 September 2015

S.-Y. Lee and X.-Y. Song

on the theory of likelihood ratio test in conjunction with maximum likelihood (ML) estimation with i.i.d. observations. For multilevel models with unbalanced data, in order to get independent observations, individual observations (of dimensions p) within a group, say group g with sample size Ng, have to be pooled together to form a single long vector of dimension pN g. The resulting likelihood function is the product of G independent but nonidentical random vectors with different dimensions pNg. Hence, the basic i.i.d. assumption in the ML estimation is not satisfied. How serious this violation of assumption to hypothesis testing with the likelihood ratio chisquare test is unknown. Therefore, it is important to find a testing procedure to overcome the above practical and theoretical problems. In this article, an approach using the Bayes factor (e.g. Berger, 1985) is proposed. The development of Bayes factor is quite general/universal and it has been applied to a wide variety of models, see Kass and Raftery (1995). The main aim of this article is to illustrate how it can be used for hypothesis testing and model comparison in the context of two-level SEMs. As a by-product, this approach can be applied to test or compare functional constraints among the parameters in the model. Model Consider a data set of p-variate observations xgi, i = 1, ..., Ng, within groups g = 1, ..., G. The group-level sample sizes Ng may differ from group to group. Assuming that, conditional on the group mean vg, observations within each group are i.i.d. as N(vg, ⌺gW); that is (1)

D

(xgi|vg) = N(vg, ⌺gW),

where ⌺gW is the within-group covariance matrix. Moreover, we assume that the mean vectors {vg, g = 1, ..., G} are i.i.d. and (2)

D

vg = N(␮, ⌺ B),

where ⌺ B is the between-group covariance matrix. The covariance matrices ⌺gW and ⌺B can have any general form such as the LISREL model (Jöreskog & Sörbom, 1996) or the Bentler and Weeks (1980) model in EQS (Bentler, 1992). Hence, the model considered is a general two-level model with unbalanced designs. To achieve valid statistical inference on this general model, certain assumptions on the sample sizes have to be satisfied. For example, G has to be sufficiently large in analyzing the between group MULTIVARIATE BEHAVIORAL RESEARCH

641

Downloaded by [Ecole Hautes Etudes Commer-Montreal] at 04:52 08 September 2015

S.-Y. Lee and X.-Y. Song

structures ⌺B, and Ng has to be sufficiently large in estimating the specific parameters involved in group g. In many applications of the two-level models, it is assumed that ⌺gW = ⌺W. For simplicity, we assume that ⌺gW = ⌺W and ␮ = 0; but keep in mind that the proposed procedure can be extended to the general case with minor modifications. Let ␪ be the q × 1 vector of parameters, so that ⌺B = ⌺B(␪) and ⌺W = ⌺W(␪) are corresponding matrix-valued functions of ␪. For example, a two-level factor analysis model is given by (3)

xgi = ⌳B␰g + ε g + ⌳W␩ gi + ␦gi, g = 1, ..., G, i = 1, ..., N g,

where ␰g is the common factor at the group level with distribution N(0, ⌽B), ␩gi is the common factors at the individual level with distribution N(0, ⌽W), ε g and ␦gi are the error measurements at the group level and individual level with distributions N(0, ⌿B) and N(0, ⌿W) respectively, where ⌿B and ⌿W are diagonal covariance matrices. In this model, (4)

⌺B = ⌳B⌽B⌳BT + ⌿B, and ⌺W = ⌳W⌽W⌳WT + ⌿W.

The parameter vector ␪ contains unknown parameters in ⌳B, ⌳W, ⌽B, ⌽W, ⌿B and ⌿W. Suppose we are interesting in a class of specific models under the framework of the general model defined by Equations 1 and 2, say M1, ..., Mk. We call the model that is in common with every Mk the basic model. Thus, each particular model can be represented by a basic model, which may be the saturated model where ⌺ B and ⌺W are general positive definite matrices without any structures, or a special model of the general model given in Equations 1 and 2 together with an appropriate set of constraints in ␪. More specifically, a model Mk can be defined by a basic model, and the following constraints among its parameters in ␪: (5)

Ck(␪) = [C(k)1(␪), ..., Crk(k)(␪)] = 0,

where C 1(k), ..., Crk(k) are any linear or nonlinear functions of ␪. Let ␪k be the vector of parameters in Mk that satisfies C k(␪) = 0. For example, a twolevel factor analysis model with equal cross-level loading matrices can be defined by a basic model as specified in Equation 3 together with the constraint: ⌳B - ⌳W = 0, or ⌳B = ⌳W = ⌳. In this special case, (6)

642

⌺B = ⌳⌽B⌳T + ⌿B, ⌺ W = ⌳⌽W⌳T + ⌿W,

MULTIVARIATE BEHAVIORAL RESEARCH

S.-Y. Lee and X.-Y. Song

and the parameter vector contains the unknown parameters in ⌳, ⌽B, ⌽W, ⌿B and ⌿W. Another special case can be defined with the constraint: ⌽B = ⌽W = ⌽. In this case, ⌺B = ⌳B⌽⌳TB + ⌿B, ⌺W = ⌳W⌽⌳TW + ⌿W.

(7)

Downloaded by [Ecole Hautes Etudes Commer-Montreal] at 04:52 08 September 2015

Note that models given in Equations 6 and 7 are not nested within each other. Bayes Factor for Hypothesis Testing or Model Comparison We will use the Bayes factor to compare two competing models M1 and M2 for a given data set. Bayes factor is a well-known method (see, e.g. Berger, 1985), and it is still the focus of attention in recent Bayesian literature for hypothesis testing and model selection, see for example Aitkin (1991), Raftery (1995), Newton and Raftery (1994), Gelfand and Dey (1994), Berger and Pericchi (1996), Draper (1995), and O’Hagan (1995), among others. Application of this approach to model selection in single level SEMs has been proposed by Raftery (1993). Here, we first describe briefly its general development and then demonstrate its application to two-level SEMs. Suppose the given data D with sample size n have arisen under one of the two competing models M1 and M2 according to a probability density p(D|M1) or p(D|M2). For model Mk, k = 1, 2, let p(Mk) be the prior probability and p(Mk|D) be the posterior probability. From the Bayes theorem, we obtain

b

g pb D| M gppbbDM| Mg +gppbbDM| Mg g pb M g , k = 1, 2.

p Mk | D =

k

1

k

1

2

2

Hence,

(8)

b b

g b g b

gb g gb g

p M1 | D p D| M1 p M1 = . p M2 | D p D| M 2 p M 2

The Bayes factor is defined as

(9)

B12 =

b b

g g

p D| M1 , p D| M 2

MULTIVARIATE BEHAVIORAL RESEARCH

643

Downloaded by [Ecole Hautes Etudes Commer-Montreal] at 04:52 08 September 2015

S.-Y. Lee and X.-Y. Song

which is a summary of the evidence provided by the data in favor of M1, as opposed to M2. When the prior probabilities of the two hypotheses are equal, the Bayes factor is the posterior odds. Hence, the Bayes factor can be used as a quantity for model comparison. Unlike the significance test approach that is based on P-values, this comparison does not depend on the assumption that either model is ‘true’, and can be applied to nonnested models. When the two competing models M1 and M2 only involve single distributions with no unknown parameters, the Bayes factor B12 is the likelihood ratio. In the general case where there are unknown parameters in either or both M1 and M2, B12 still has the form of a likelihood ratio. However, this test statistic is different from the usual large-sample likelihood ratio test in which the null hypothesis must be nested within the alternative hypothesis. Moreover, the densities p(D|Mk) in Equation 9, k = 1, 2, are obtained by integrating over the parameter space, that is

b

g

zb

gb

g

p D| M k = p D|␪ k , M k ␲ ␪ k | M k d␪ k , where ␪k is the dk × 1 parameter vector in Mk, ␲(␪k|Mk) is its prior density, and p(D|␪k, Mk) is the probability density of D given ␪k, which may also be interpreted as the likelihood function of ␪ evaluated at ␪k under Mk. The dimension of the above complicated integral is equal to the dimension of ␪k. Hence, very often, it is very difficult to obtain B12 analytically, and various analytic and numerical approximations have been proposed in the literature (Kass & Raftery, 1995). For example, see the Laplace method of approximation (Tierney & Kadane, 1986) and its alternative forms (Kass & Vaidyanathan, 1992; Tierney, Kass & Kadane, 1989), as well as Chib’s (1995) procedure with the Gibbs sampler algorithm. The above methodologies are rather involved and depend on the prior density ␲(␪k|Mk). In this article, the following simple approximation that is not depending on the prior density is used to approximate 2logB12: (10)

2 log B12 =& 2 S = 2 log p D| ␪$ 1 , M1 − log p D| ␪$ 2 , M 2 − d1 − d 2 log n,

e

j

e

j b

g

where S is often called the Schwarz criterion (Schwarz, 1978), ␪$ 1 and ␪$ 2 are the maximum likelihood estimates of ␪1 and ␪2 under M1 and M2, respectively. According to Kass and Raftery (1995), 2logB12, with B12 defined in Equation 9, can be interpreted via the following criterion:

644

MULTIVARIATE BEHAVIORAL RESEARCH

S.-Y. Lee and X.-Y. Song

Downloaded by [Ecole Hautes Etudes Commer-Montreal] at 04:52 08 September 2015

(11)

2logB12

Evidence against M2

10

Negative (supports M2) Barely worth mentioning Positive (supports M1) Strong Decisive

This commonly used criterion provides a descriptive scale about standards of evidence. Since it uses the asymmetric natural logarithm transformation of B12, the cut-points are not symmetric. It gives more detailed guidelines relating to M1. We can evaluate 2logB21 if more detailed descriptive statements about evidence of M2 are required. As n tends to infinity, it has been shown (Schwarz, 1978) that

(12)

S − log B12 → 0, log B12

thus S may be viewed as an approximation to logB12. This approximation is of order 0(1), hence it does not give the exact logB12 even for large samples. However, since the interpretation is on the natural logarithm scale, it provides a reasonable indication of evidence. As pointed out by Kass and Raftery (1995), it can be used for scientific reporting as long as the number of degrees of freedom (d1 – d2) involved in the comparison is small relative to the sample size n. Suppose the competing models M1 and M2 are defined by an identical basic model (e.g. the saturated model) together with appropriate sets of constraints C1(␪) = 0 and C 2(␪) = 0, respectively. Comparing M1 and M2 is then equivalent to comparing the evidence for C 1(␪) = 0 and C2(␪) = 0. The Bayes factor B12 is a summary of the evidence provided by the data in favor of the constraint C 1(␪) = 0 as opposed to constraint C 2(␪) = 0. In the computation of 2logB12 via Equation 10, ␪$ k is the corresponding constrained ML estimate and dk is replaced by dk – rk. In the special case where M2 is the saturated model that involves no constraints, r2 = 0 and d2 is the number of unknown parameters in the saturated model. Substituting r2 = 0 and d2 in Equation 10, we can compute 2logB12 which gives the summary of the evidence in favor of C1(␪) = 0 as opposed to the saturated model. As a result, arbitrary constraints (linear or nonlinear) on the parameter vectors can be compared or tested. Since statistical theory on the constrained ML estimation of two-level SEMs is not yet established, this kind of inference cannot be done rigorously via the existing classical likelihood ratio test. MULTIVARIATE BEHAVIORAL RESEARCH

645

S.-Y. Lee and X.-Y. Song

Downloaded by [Ecole Hautes Etudes Commer-Montreal] at 04:52 08 September 2015

Let zTg = (xTg1, ..., xTgN ) be the 1 × pN g vector that contains all observations g in group g. The distribution of zg is N[0, (Jg ⊗ ⌺B) + (Ig ⊗ ⌺ W)], where Jg is an Ng × Ng square matrix of unit elements and Ig is the identity matrix of order N g. Since observations from different groups are independent, zh and zg are independent for h ≠ g. It can be shown that the log-likelihood function (ignoring a constant which remains the same for different models) based on the data D = {z1, ..., zG} is given by

b

g

(13) log p D| ␪, M = −

1 G ∑ N g log|⌺g |+tr ⌺−g1S g + N g − 1 log|⌺W |+tr ⌺W−1S gW 2 g =1

d

{

i d

i

d

i },

where ⌺g = ⌺W + Ng⌺ B, and Ng Ng

S g = N g−2 ∑ ∑ x gi x Tgj ; S gW = N g N g − 1

d

i

i =1 j =1

−1

Ng Ng

∑ ∑ x gi dx gi − x gj i

T

.

i =1 j =1

i≠j In computing the Bayes factor, logp(D| ␪$ k, Mk) will be evaluated via Equation 13 at the constrained likelihood estimate ␪$ k under model Mk. Once we have ␪$ k, the computation is straightforward. The proposed Bayes factor approach can be applied as follows to test the goodness-of-fit of a posited two-level model as follows. Let Ms corresponds to the saturated two-level model and M1 be the posited two-level model, ␪$ s and ␪$ 1 be the corresponding ML estimates. Then, 2logB1s can be computed via Equation 10. The strength of evidence in favor of the model M1 is reflected by 2logB1s. On the basis of the criterion given in Equation 11, if 2logB1s < 2, the posited model M1 is rejected. If 2logB1s > 2, it can be concluded that M1 is the better model. Hence, it can be regarded as a plausible model for the observed data. If 2logB1s > 6, we have more confidence about M1 and more definite conclusion can be drawn; otherwise more caution should be taken. To compare a model M2 with constraints C2(␪) = 0 with another proposed model M3 with constraints C3(␪) = 0, we compute (14)

2 log B23 =& 2 log p D|␪$ 2 , M 2 − log p D| ␪$ 3 , M 3 − d 2 − r2 − d3 − r3 log n,

e

j

e

j b

g b

g

where ␪$ 2 and ␪$ 3 are the constrained ML estimates of ␪2 and ␪3; r2 and r3 are the numbers of constraints in C 2(␪) = 0 and C3(␪) = 0, respectively, and 646

MULTIVARIATE BEHAVIORAL RESEARCH

Downloaded by [Ecole Hautes Etudes Commer-Montreal] at 04:52 08 September 2015

S.-Y. Lee and X.-Y. Song

n = N1 + ... + N G. On the basis of the criterion given in Equation 11, if 2logB23 < 0, M3 is the better model; while M2 is the better model if 2logB23 > 2. No conclusion can be drawn if 0 < 2logB23 < 2. For model selection problem that involves a large number of possible models, the search strategy given in Raftery (1995) can be used. To save space, this strategy is not presented here. For most applications of the two-level SEMs, constraints considered in = constant, the literature so far are very simple. They are either in form: ␪(k) j (k) ␪ = 0. Longford and Muthén (1992) developed a scoring say 0 or 1; or ␪(k) i j type algorithm for obtaining ML estimates of the parameters that subject to the above simple constraints in a two-level factor analysis model. Lee and Poon (1998) developed an efficient EM type algorithm that can be applied to general SEMs. Moreover, they showed that if ⌺B and ⌺ W are the LISREL model or the Bentler and Weeks (1980) model, the well-known software LISREL (Jöreskog & Sörbom, 1996) or EQS (Bentler, 1992) can be used to obtain the ML estimate. Once we get the ML estimate, the Bayes factor can be calculated. Constrained ML estimates that are subjected to more general linear or nonlinear constraints can be obtained via the EM type algorithm with more computational effort, see Lee and Tsang (1999). However, the M-step of their proposed EM algorithm can be completed via LISREL (Jöreskog & Sörbom, 1996) or EQS (Bentler, 1992) with simple auxiliary programs. Illustrative Example A portion of the real data set in a study of the relationship between AIDS and the use of condom among Philippines sex workers (Morisky et al., 1998) is used to illustrate the proposed Bayes factor approach. The whole data set was collected from female commercial sex workers in 97 establishments (bars, right clubs, Karaoke TV, and massage parlors) in cities of Philippines. The questionnaire contains items on socio-demographic background, knowledge, attitudes, belief, behaviours, self efficacy for condom use, alcohol and drug use and social desirability. As pointed out by Morisky et al., influence of the establishment policies is substantial. Hence, a two-level model should be used to take into account the influential factor of the establishments. As an illustration of the proposed method, only six variables are used. The questions corresponding to the first three variables are: How great is the risk of getting AIDS or the AIDS virus from (a) kissing a person with the AIDS virus on the cheek; (b) deep kissing with someone who has the AIDS virus; and (c) sexual intercourse with someone who has the AIDS virus using a condom; while the questions corresponding to the last three variables are: (d) how much of a threat do you think AIDS is to the health MULTIVARIATE BEHAVIORAL RESEARCH

647

Downloaded by [Ecole Hautes Etudes Commer-Montreal] at 04:52 08 September 2015

S.-Y. Lee and X.-Y. Song

of people; (e) what are the chances that you yourself might get AIDS; and (f) how worried are you about getting AIDS. These are polytomous variables measured in a five-point scale. Since our main purpose is to illustrate the proposed methodology, these variables are treated as continuous variables with a multivariate normal distribution. The final results may be affected by the polytomous nature of the variable and hence real conclusion for the theory from analysis of this AIDS data set should be drawn with caution. Also, for simplicity, we deleted those observations with missing entries in the analysis. The within-establishment sample sizes are summarized in Table 1. For example, there are 6 establishments with sample sizes equal to 1, 11 establishments with sample size equals to 2, and so on. The total sample size is equal to 758. To unify the scales of the variables, the raw data were standardized. Motivated by Morisky et al. to take into account the second level effect on the influence of the establishment policies, the resulting data set was analyzed via a two-level confirmatory factor analysis model. On the basis of some exploratory analysis (see also Lee & Tsang, 1999), we begin our analysis with the following specifications:

T

b g b g b g b g = 10 . , ⌿ = diag ⌿ b11 , g,..., ⌿ b6,6g ; L⌳ b11, g ⌳ b2,1g ⌳ b3,1g 0 OP , 0 0 =M 0 0 ⌳ b4,2g ⌳ b5,2g ⌳ b6,2gQ N 0 L 10. ⌽ b2,1gOP, ⌿ = diag ⌿ b1,1g,..., ⌿ b6,6g . =M N⌽ b2,1g 10. Q

, ⌳ B 2,1 ⌳ B 3,1 0* ⌳ B 5,1 0* , ⌳ B = ⌳ B 11 ⌽B

*

B

B

B

*

(15)

⌳W

W

W

*

*

T

*

W

*

*

W

W

W

*

⌽W

W

W

*

W

W

W

Table 1 The Distribution of Ng in the AIDS Data Ng Frequency

1 6

Ng Frequency

11 3

648

2 11 12 6

3 13

4 6 13 3

5 5 15 2

6 6 16 3

7 11 17 2

8 7 19 1

9 7 28 2

10 Subtotal 2 74 59 1

Subtotal 23

Total

97

MULTIVARIATE BEHAVIORAL RESEARCH

Downloaded by [Ecole Hautes Etudes Commer-Montreal] at 04:52 08 September 2015

S.-Y. Lee and X.-Y. Song

where elements with an asterisk are fixed at the preassigned values. The total number of unknown parameters is 23. The between-group covariance structure is a single factor model with the group factor representing the influence of the establishments on the sex workers attitude to the risk of getting AIDS and their worry of AIDS. The within-group covariance structure is a factor model with two non-overlapping factors, where the first three manifest variables are loaded on the first factor only and the last three manifest variables are loaded on the second factor only. The first one can be roughly interpreted as one about the risk of getting AIDS, while the other as one about the worry of AIDS. To illustrate the use of Bayes factor in testing the goodness-of-fit of a model, we consider a two-level model M1 that is defined by the model with specifications Equation 15 together will the following constraints C 1(␪) = 0 given in Lee and Tsang (1999):

b g b g b g b g . , for j = 4 , 5, 6, ⌳ b j ,1g + ⌿ b j , j g + ⌳ b j ,2 g + ⌿ b j , j g = 10 ⌳ b4,2g = ⌳ b5,2 g = ⌳ b6,2 g 1 1 , g = ⌳ b2,1g. ⌳ b3,1g = ⌳ b11 3 4

(16) (i) ⌳2B j ,1 + ⌿ B j , j + ⌳2W j ,1 + ⌿W j , j = 10 . , for j = 1, 2, 3, (ii) (iii) (iv)

2 B

B

W

W

W

2 W

W

W

W

W

The first six nonlinear constraints were used to fix the diagonal elements of the estimated covariance matrix of the manifest variables to 1.0; while the last four linear constraints specify some relationships among the parameters in the loading matrix of the within-group structure. For completeness, the constrained ML estimates are presented in Table 2. At these ML estimates, logp(D| ␪$ 1, M1) is equal to -1947.1, while the corresponding value for the saturated model is equal to -1928.9. Also, n = 758, d1 = 23, r1 = 10, and ds = 42, hence from Equation 13, 2logB1s = 155.87. On the basis of the criterion given in Equation 11, it can be concluded that the data provide decisively evidence in favor of M1. Hence, a two-level confirmatory factor analysis model has been established for the given data. Note that the application of the approach with significance test via P-values to assess the goodness-of-fit of this posited model with nonlinear constraints is questionable. The nonzero estimates of the between-group level factor loadings indicate substantial heterogeneity across establishments. From the constrained estimates of the within-group factor loadings, the influences of the non-overlapping factors to the observed variables satisfied the prescribed linear relationships. For MULTIVARIATE BEHAVIORAL RESEARCH

649

S.-Y. Lee and X.-Y. Song

Downloaded by [Ecole Hautes Etudes Commer-Montreal] at 04:52 08 September 2015

Table 2 Constrained ML Estimates of Parameters in Model 1 Parameter

Estimate

Parameter

Estimate

⌳B (1, 1) ⌳B (2, 1) ⌳B (3, 1) ⌳B (4, 1) ⌳B (5, 1) ⌳B (6, 1) ⌿B (1, 1) ⌿B (2, 2) ⌿B (3, 3) ⌿B (4, 4) ⌿B (5, 5) ⌿B (6, 6)

0.265 0.316 0.196 0.(fixed) -.329 0.(fixed) 0.018 0.025 0.044 0.090 0.135 0.165

⌳W (1, 1) ⌳W (2, 1) ⌳W (3, 1) ⌳W (4, 2) ⌳W (5, 2) ⌳W (6, 2) ⌿W (1, 1) ⌿W (2, 2) ⌿W (3, 3) ⌿W (4, 4) ⌿W (5, 5) ⌿W (6, 6) ⌽W (2, 1)

0.590 0.784 0.196 0.370 0.370 0.370 0.560 0.258 0.876 0.770 0.619 0.697 0.250

example, the factor about the worry of AIDS has equal effects on the observed variables. To reaffirm the heterogeneity across establishments using the Bayes factor, we consider the following model M0 with basic specifications in Equation 15 and constraints: (17)

C0(␪) = 0: ⌳B = 0, ⌿B(j, j) = 0, j = 1, ..., 6.

This model is equivalent to a single-level confirmatory factor analysis model that does not take into account the influential factor of the establishments. It has 13 unknown parameters in ⌳W, ⌽W and ⌿W. We observed that logp(D| ␪$ 0, M0) is equal to -2059.0. To compare with the saturated model, we found that 2logB0s = -67.73. Hence, it can be concluded from Equation 11 that this single-level model should be rejected. To illustrate the proposed approach further, the following models are compared using the Bayes factor. M2. A model with specifications given in Equation 15 and the following constraints: (i), (ii) and (iii) as given in Equation 16, and ⌳W(1, 1) = ⌳W(2, 1) = ⌳W(3, 1). 650

MULTIVARIATE BEHAVIORAL RESEARCH

S.-Y. Lee and X.-Y. Song

M3. A model with two factors in ⌺B such that

⌳B

L⌳ b11, g =M N 0 L 10. =M N⌽ b2,1g B

*

*

Downloaded by [Ecole Hautes Etudes Commer-Montreal] at 04:52 08 September 2015

⌽B

B

b g

b g

⌳ B 2,1 0*

⌳ B 3,1 0*

b gOP , ⌿ Q

⌽B 2,1 10 . *

T

B

0* ⌳ B 4,2

OP , b gQ

0* ⌳ B 5,2

0* ⌳ B 6,2

T

b g b g , g,..., ⌿ b6,6g , = diag ⌿ b11 B

B

and same specifications of ⌳W, ⌽W, and ⌿W as given in Equation 15, together with same constraints as given in Equation 16. M4. Same specifications in the parameter matrices as given in M3 for ⌺B and ⌺W, together with constraints (i) and (ii) given in Equation 16, and for j = 1, ..., 6, ⌳B(j, 1) = ⌳W(j, 1), and ⌳B(j, 2) = ⌳W(j, 2). Note that M1, M2, M3 and M4 are nonnested models. The motivation to consider M3 with two factors in ⌺B is to study whether the influence of the establishments can be better accounted by two factors instead of a single factor as before. The motivation for M4 is to assess cross-level relationship of the factor loadings. To save space, constrained ML estimates of parameters in these models are not presented. The logp(D| ␪$ k, Mk) values for M2, M3 and M4 are equal to -1993.7, -1956.3 and -1975.5; respectively. The computed 2logBhs and 2logBhk, with h, k = 1, 2, 3, 4 are presented in Table 3. For example, 2logB4s = 91.76, 2logB21 = -93.20, and 2logB43 = 25.14. From 2logBhs, it seems that all the models considered are more preferable than the saturated model. Based on more comparisons among themselves, we can conclude that M1 is the best among these four models.

Table 3 2logBhs and 2logBhk for Models Comparison Modelh s M1 M2 M3

M1 155.87

MULTIVARIATE BEHAVIORAL RESEARCH

M2 62.69 -93.20

M3 117.58 -38.29 54.91

M4 91.76 -63.43 29.77 25.14 651

Downloaded by [Ecole Hautes Etudes Commer-Montreal] at 04:52 08 September 2015

S.-Y. Lee and X.-Y. Song

To conclude, we have shown that a single-level confirmatory factor analysis model is inadequate to fit the AIDS data set. Several nonnested plausible two-level models have been found. Using the Bayes factor approach, we are able to identify the best model among these plausible models. The above conclusion on model comparison cannot be achieved by existing theory in two-level structural equation modeling. From the above illustration with various models and their associated constraints, it can be seen that the Bayes factor is rather flexible and useful in model testing and comparison for two-level SEMs. The stepwise strategies proposed by Raftery (1993) can be used for situations with a large number of competitive models. Discussion Bayes factor (see Berger, 1985; Kass & Raftery, 1995) is the most important statistic in Bayesian model selection. It has been applied extensively to a lot of statistical problems, such as nonnested regression models (Carlin & Chib, 1995), density estimation (Escobar & West, 1995; Roeder & Wasserman, 1997), multiple change-point problems (Green, 1995), mixture models (Chib, 1995; Richardson & Green, 1997), multisample factor analysis model and multivariate linear model with polytomous variables (Song & Lee, 2001, in press), among many others. In this article, a Bayes factor approach is proposed for hypothesis testing and model comparison in general two-level SEMs with unbalanced designs and parameters subject to nonlinear constraints. For two-level SEMs, some software programs give some chi-square goodness-of-fit statistics on the basis of the classical likelihood ratio test that is associated with the ML estimation. However, the basic i.i.d. assumption required by ML estimation is violated. Hence, we feel very uneasy to use the classical asymptotic result to claim directly that the likelihood ratio statistic has a chi-square distribution. The Bayes factor approach does not have the above theoretical problem. Moreover, this approach can alleviate other difficulties and deficiencies in using fitting indexes or significance tests. For example, it can be applied to nonnested models and does not tend to reject the null hypothesis frequently with large sample sizes. Although the approach is described in the context of two-level models with ␮ = 0 and invariant within-group covariance structures, it can be applied to more general models. For example, if ␮ ≠ 0 in M1, then the parameter vector ␪1 will involve ␮ and the Bayes factor can be similarly computed via Equation 10 with a modified d1. In this article, the natural logarithm of the Bayes factor is approximated via the Schwarz criterion. Minus twice the Schwarz criterion is commonly 652

MULTIVARIATE BEHAVIORAL RESEARCH

S.-Y. Lee and X.-Y. Song

called the Bayesian information criterion (BIC); sometimes an arbitrary constant is added. As pointed out by Kass and Raftery (1995), the Schwarz criterion is easy to use, does not require evaluation of prior distribution and provides a reasonable indication of the evidence. The Akaike (1973) information criterion in comparing models M1 and M2 is given by

Downloaded by [Ecole Hautes Etudes Commer-Montreal] at 04:52 08 September 2015

(18)

AIC = -2[logp(D| ␪$ 1, M1) - logp(D| ␪$ 2, M2)] + (d1 - d2),

which does not involve the sample size n. Comparing Equations 10 and 18, we see that our approach tends to favor simpler models more than those selected by AIC. The computation of the Bayes factor or its various approximations has received a great deal of attention in the literature of statistics and it is still an area of active research, see the survey papers by Kass and Raftery (1995), and DiCiccio, Kass, Raftery, and Wasserman (1997) and references therein. For two-level SEMs, the popularity of the Bayes factor approach depends on how easy the general users can obtain the ML estimates. As pointed out by Lee and Tsang (1999), only minor modification of the existing software such as LISREL (Jöreskog & Sörbom, 1996) and EQS (Bentler, 1992) is required in obtaining ML estimates that subject to simple linear constraints. Hopefully, we expect to see such options to be included in these common software in the near future. In the absence of the ML estimates, a further approximation of the Bayes factor can be computed in terms of the less optimal but more familiar estimates, such as those given in Muthén (1994). The evaluation of the performance of such an approximation via simulation studies would be interesting. The performance of the exact Bayes factor should also be evaluated, as well as its approximation by S. References Aitkin, M. (1991). Posterior Bayes factors (with discussion). Journal of the Royal Statistical Society, Ser. B, 53, 114-142. Aitkin, M. & Longford, N. (1986). Statistical modeling issues in school effectiveness studies (with discussion). Journal of the Royal Statistical Society, Ser. A, 149, 1-43. Akaike, H. (1973). Information theory and an extension of the maximum likelihood principle. In B. N. Petrox & F. Caski (Eds.), Second international symposium on information theory (p. 267). Budapest: Akademiai Kiado. Bentler, P. M. (1992). EQS structural equations program manual. Los Angeles: BMDP Statistical Software. Bentler, P. M. & Bonett, D. G. (1980). Significance tests and goodness of fit in the analysis of covariance structures. Psychological Bulletin, 88, 588-606. Bentler, P. M. & Weeks, D. G. (1980). Linear structural equations with latent variables. Psychometrika, 45, 289-308.

MULTIVARIATE BEHAVIORAL RESEARCH

653

Downloaded by [Ecole Hautes Etudes Commer-Montreal] at 04:52 08 September 2015

S.-Y. Lee and X.-Y. Song Berger, J. O. (1985). Statistical decision theory and Bayesian analysis. New York: Springer-Verlag. Berger, J. O. & Delampady, M. (1987). Testing precise hypotheses (with discussion). Statistical Science, 3, 317-352. Berger, J. O. & Pericchi, L. R. (1996). The intrinsic Bayes factor for model selection and prediction. Journal of the American Statistical Association, 91, 109-122. Berger, J. O. & Sellke, T. (1987). Testing a point null hypothesis: The irreconcilability of P values and evidence. Journal of the American Statistical Association, 82, 112-122. Carlin, B. P. & Chib, S. (1995). Bayesian model choice via Markov Chain Monte Carlo. Journal of the Royal Statistical Society, Ser. B, 57, 473-484. Chib, S. (1995). Marginal likelihood from the Gibbs output. Journal of the American Statistical Association, 90, 1313-1321. de Leeuw, J. & Kreft, I. (1986). Random coefficient models for multilevel analysis. Journal of Educational Statistics, 11, 57-85. Diciccio, T. J., Kass, R. E., Raftery, A., & Wasserman, L. (1997). Computing Bayes factors by combining simulation and asymptotic approximations. Journal of the American Statistical Association, 92, 903-915. Draper, D. (1995). Assessment and propagation of model uncertainty. Journal of the Royal Statistical Society, Ser. B, 57, 45-97. Escobar, M. D. & West, M. (1995). Bayesian density estimation and inference using mixtures. Journal of the American Statistical Association, 90, 577-589. Gelfand, A. E. & Dey, D. K. (1994). Bayesian model choice: Asymptotic and exact calculations. Journal of the Royal Statistical Society, Ser. B, 56, 501-514. Goldstein, H. I. (1986). Multilevel mixed linear model analysis using iterative generalized least squares. Biometrika, 73, 43-56. Goldstein, H. & McDonald, R. P. (1988). A general model for the analysis of multilevel data. Psychometrika, 53, 455-468. Green, P. J. (1995). Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. Biometrika, 82, 711-732. Jöreskog, K. G., & Sörbom D. (1996). LISREL 8: Structural equation modeling with the SIMPLIS command language. London: Scientific Software International. Kass, R. E. & Raftery, A. E. (1995). Bayes factors. Journal of the American Statistical Association, 90, 773-795. Kass, R. E. & Vaidyanathan, S. (1992). Approximate Bayes factors and orthogonal parameters, with application to testing equality of two binomial proportions. Journal of the Royal Statistical Society, Ser. B, 54, 129-144. Lee, S. Y. (1990). Multilevel analysis of structural equation models. Biometrika, 77, 763772. Lee, S. Y. & Poon, W. Y. (1992). Two level analysis of covariance structures for unbalanced design with small level one samples. British Journal of Mathematical and Statistical Psychology, 45, 109-123. Lee, S. Y. & Poon, W. Y. (1998). Analysis of two-level structural equation models via EM type algorithms. Statistica Sinica, 8, 749-766. Lee, S. Y. & Tsang, S. Y. (1999). Constrained maximum likelihood estimation of two-level covariance structure model via EM type algorithms. Psychometrika, 64, 435-450. Longford, N. T. & Muthén, B. O. (1992). Factor analysis for clustered observations. Psychometrika, 57, 581-597. McDonald, R. P. (1993). A general model for two-level data with responses missing at random. Psychometrika, 58, 575-585.

654

MULTIVARIATE BEHAVIORAL RESEARCH

Downloaded by [Ecole Hautes Etudes Commer-Montreal] at 04:52 08 September 2015

S.-Y. Lee and X.-Y. Song McDonald, R. P. & Goldstein, H. I. (1989). Balanced versus unbalanced designs for linear structural relations in two-level data. British Journal of Mathematical and Statistical Psychology, 42, 215-232. Morisky, D. E., Tiglao, T. V., Sneed, C. D., Tempongko, S. B., Baltazar, J. C., Detels, R. & Stein, J. A. (1998). The effects of establishment practices, knowledge and attitudes on condom use among Filipino sex workers. AIDS Care. 10, 213-220. Muthén, B. (1991). Multilevel factor analysis of class and student achievement components. Journal of Educational Measurement, 28, 338-354. Muthén, B. (1994). Multilevel covariance structure analysis. Sociological Methods Research, 22, 376-398. Newton, M. A. & Raftery, A. E. (1994). Approximate Bayesian inference by the weighted likelihood boostrap (with discussion). Journal of the Royal Statistical Society, Ser. B, 56, 3-48. O’Hagan, A. (1995). Fractional Bayes factor for model comparison. Journal of the Royal Statistical Society, Ser. B, 57, 99-138. Raftery, A. E. (1986). Choosing models for cross-classifications. American Sociological Review, 51, 145-146. Raftery, A. E. (1993). Bayesian model selection in structural equation models. In K. A. Bollen & J. S. Long (Eds), Testing structural equation models. London: New Delhi. Raftery, A. E. (1995). Bayesian model selection in social research (with discussion). In P. V. Marsden (Ed), Sociological methodology 1995. Cambridge, MA: Blackwells. Raudenbush, S. W. (1995). Maximum likelihood estimation for unbalanced multilevel covariance structure models via the EM algorithm. British Journal of Mathematical and Statistical Psychology, 48, 359-370. Raudenbush, S. & Bryk, A. (1988). Methodological advances in studying effects of schools and classrooms on student learning. In E. Z. Roth (Ed.), Review of research in education (pp. 423-475). Washington, DC: American Educational Research Association. Richardson, S. & Green, P. J. (1997). On Bayesian analysis of mixtures with unknown number of components (with discussion). Journal of the Royal Statistical Society, Ser. B, 59, 731-792. Roeder, K. & Wasserman, L. (1997). Practical Bayesian density estimation using mixtures of normals. Journal of the American Statistical Association, 92, 894-902. Schwarz, G. (1978). Estimating the dimension of a model. The Annals of Statistics, 6, 461464. Song, X. Y. & Lee, S. Y. (2001). Bayesian estimation and test for factor analysis model with continuous and polytomous data in several populations. British Journal of Mathematical and Statistical Psychology, 54, 237-263. Song, X. Y. & Lee, S. Y. (in press). Bayesian estimation and model selection of multivariate linear model with polytomous variables. Multivariate Behavioral Research. Thum, Y. M. (1997). Hierarchical linear models for multivariate outcomes. Journal of Educational and Behavioral Statistics, 22, 77-108. Tierney, L. & Kadane, J. B. (1986). Accurate approximations for posterior moments and marginal densities. Journal of the American Statistical Association, 81, 82-86. Tierney, L., Kass, R. E. & Kadane, J. B. (1989). Fully exponential Laplace approximations to expectations and variances of nonpositive functions. Journal of the American Statistical Association, 84, 710-716.

Accepted October, 2000. MULTIVARIATE BEHAVIORAL RESEARCH

655

Hypothesis Testing and Model Comparison in Two-level Structural Equation Models.

One basic and important problem in two-level structural equation modeling is to find a good model for the observed sample data. This article demonstra...
564B Sizes 1 Downloads 3 Views