psychometrika doi: 10.1007/s11336-015-9461-1

PROFILE LIKELIHOOD-BASED CONFIDENCE INTERVALS AND REGIONS FOR STRUCTURAL EQUATION MODELS

Jolynn Pek YORK UNIVERSITY

Hao Wu BOSTON COLLEGE

Structural equation models (SEM) are widely used for modeling complex multivariate relationships among measured and latent variables. Although several analytical approaches to interval estimation in SEM have been developed, a comprehensive review of these methods is lacking. We review the popular Wald-type and lesser-known likelihood-based methods in linear SEM, emphasizing profile likelihood-based confidence intervals (CIs). Existing algorithms for computing profile likelihood-based CIs are described, including two newer algorithms which are extended to construct profile likelihood-based confidence regions (CRs). Finally, we illustrate the use of these CIs and CRs with two empirical examples, and provide practical recommendations on when to use Wald-type CIs and CRs versus profile likelihood-based CIs and CRs. OpenMx example code is provided in an Online Appendix for constructing profile likelihood-based CIs and CRs for SEM.

Key words: confidence intervals, confidence regions, profile likelihood, Wald, multi-parameter.

The broad application of structural equation models (SEM) in psychology and the social sciences may be attributed to the fact that many scientific theories can be expressed as a statistical model within this comprehensive framework. In SEM, a network of directional and nondirectional relationships between latent and measured variables (LVs and MVs) is hypothesized and formulated. Estimates in SEM are simultaneously obtained for multiple parameters that are a combination of means, intercepts, slopes, variances, residual variances, and covariances. The generality of SEM is evident in that it encompasses or overlaps a myriad of well-developed modeling techniques such as multiple linear regression, path analysis, factor analysis, linear mixed models (Bauer, 2003; Curran, 2003), and even state space models (Chow, Ho, Hamaker, & Dolan, 2010). Users of SEM are often interested in making inferences about several parameter estimates, or functions of parameter estimates, and typically rely on point-wise Wald-type confidence intervals (CIs) or their commensurate z- or t-tests. Much progress by way of inferential devices has been made for SEM beyond these Wald-type approaches, especially likelihood-based CIs, and a review of these techniques remains wanting. This paper presents a review of developments in the two primary non-empirical inferential devices in SEM (Wald-type and likelihood-based approaches), placing emphasis on CIs to reflect the push towards using and reporting point estimates and their associated measures of uncertainty due to sampling variability over null hypothesis significance tests (American Psychological Association, 2001; American Educational Research Association, 2006; Cohen, 1994; Meehl, 1997; Nickerson, 2000; Wilkinson & Task Force on Statistical Inference, 1999). We also extend the review to include confidence regions (CRs) for multiple parameters and detail a newer set of algorithms for their computation. 
Electronic supplementary material The online version of this article (doi:10.1007/s11336-015-9461-1) contains supplementary material, which is available to authorized users. Correspondence should be made to Jolynn Pek, Department of Psychology, York University, 322 Behavioural Science Building, 4700 Keele Street, Toronto, ON M3J 1P3 Canada. Email: [email protected]

© 2015 The Psychometric Society

PSYCHOMETRIKA

To establish notation and frame the problem, the paper begins with a brief description of maximum likelihood (ML) estimation of parameters in SEM, followed by a review of the construction of Wald-type and likelihood-based methods for point-wise CIs. Next, we describe how both types of CIs are constructed for a single function of model parameters and highlight several differences between likelihood-based and Wald-type approaches to inference. In the section on likelihood-based inference, we briefly review the issue of nuisance parameters and highlight the profile likelihood approach. Existing algorithms for constructing profile likelihood-based CIs are described, followed by a detailed description of two more recent algorithms for estimating such CIs. We then extend inference about a single parameter and a single function of parameters to multiple parameters and multiple functions of parameters, and describe how Wald-type and likelihood-based CRs for such multiple parameters and functions of parameters are constructed. Two empirical examples are then used to illustrate Wald-type and profile likelihood-based CIs and CRs. In these examples, we highlight the advantages of using profile likelihood-based CIs and CRs over Wald-type CIs and CRs in certain contexts. The first empirical example is an LV mediation model from a published study in social psychology (Schmitt, Branscombe, Kobrynowicz, & Owen, 2002) where CRs for multiple parameters involving the indirect and total effects are constructed. This example emphasizes the added interpretational value of constructing CRs over joint tests and multiple point-wise CIs when multiple parameters are simultaneously evaluated. The second example is a linear latent curve model (LCM) on aggregate parent and teacher scores measuring children’s Delinquent Peer Association (Stoolmiller, 1994), taken from Hancock and Choi (2006).
CIs and CRs are constructed for parameters concerning the aperture, or the point in time where adolescents are most similar to one another in terms of their Delinquent Peer Association scores. In this example, the conditions where Wald-type and profile likelihood-based CIs and CRs strongly diverge are demonstrated.

1. Model Estimation

An SEM analysis involves fitting a hypothesized system of linear equations among MVs and LVs to the covariances of p MVs. The p × p population covariance matrix of the MVs is typically denoted by Σ, and the k × 1 vector of model parameters is denoted by θ. The SEM is expressed as Σ = Σ(θ), implying that the population covariance matrix Σ for the MVs is a function of the model parameters θ. Parameter estimates θ̂ are obtained by minimizing a discrepancy function that measures the discrepancy between the model-implied population matrix Σ(θ) and the sample covariance matrix S = (1/(N − 1)) Σ_{i=1}^N (y_i − ȳ)(y_i − ȳ)ᵀ. Here, N is the sample size, y_i is the p × 1 vector of values on the p MVs for case i, and ȳ is the p × 1 vector of MV means. Different discrepancy functions exist for parameter estimation in SEM, such as generalized least squares (GLS) and asymptotic distribution free (ADF; Browne, 1984) approaches. The method of maximum Wishart likelihood (MWL) is one of the dominant techniques and is asymptotically efficient under the normality assumption. For normally distributed MVs, minimizing the MWL discrepancy function is equivalent to maximizing the likelihood function of the sample covariance matrix S. The MWL discrepancy function is given by

F(Σ, S) = ln|Σ| + tr(SΣ⁻¹) − ln|S| − p,

(1)

where p is the number of MVs, and tr(·) denotes the trace of a square matrix. The estimated sample discrepancy function is F̂ = F(Σ(θ̂), S). Under the Wishart distribution (Browne & Arminger, 1995), the goodness-of-fit test statistic


X² = (N − 1)F̂

(2)

asymptotically follows a χ² distribution with p(p + 1)/2 − k degrees of freedom (df) under the null hypothesis that the population covariance matrix Σ fits the model perfectly.¹ Recall that p is the total number of MVs and k is the total number of parameters estimated under the specified model. This goodness-of-fit test is also known as the likelihood ratio test (LRT; Neyman & Pearson, 1928), as the statistic X² compares the likelihoods of the specified model against the saturated model that perfectly reconstructs the sample covariance matrix (Bollen, 1989, p. 292). In this review, we focus on the use of the MWL discrepancy function and its test statistics under the multivariate normality assumption, for which profile likelihood methods were originally designed. Similar to the Wald test, test-statistic-based CIs and CRs could also be obtained using other discrepancy functions and test statistics without the normality assumption. To our knowledge, such attempts have not been made in psychometrics.

2. Constructing Confidence Intervals

A useful supplement to a single point estimate θ̂ or a single function of parameter estimates g(θ̂) is a CI, which conveys uncertainty due to sampling variability. In addition to communicating information regarding estimate precision, CIs also inform of results regarding a null hypothesis significance test (NHST), because CIs are typically constructed by inverting a NHST about a single parameter.

2.1. Wald-Type Confidence Intervals

2.1.1. A Single Parameter. Let θ = (θ₁, . . . , θ_k) represent the k × 1 vector of model parameters. A Wald-type CI is obtained by inverting a Wald test (Wald, 1943) for a single parameter of interest using the Wald statistic W = (θ̂_j − θ_{j0})/SE_{θ̂_j}. In general, the standard error of estimate, SE_{θ̂_j}, is the square root of the jth diagonal element of the inverse of the Fisher information matrix, I(θ) = −E[∂²l(θ)/∂θ∂θᵀ] (Fisher, 1922), where l(θ) is the log-likelihood. In practice, the Fisher information is evaluated at the ML estimate θ̂; its inverse communicates information about the curvature of the log-likelihood surface at the ML estimate and serves to approximate the log-likelihood surface with a quadratic form under certain regularity conditions (Steiger, 1980; see Figure 3). Wald-type CIs are constructed using the familiar formula: estimate ± critical value × standard error of estimate, where the critical value is determined by some chosen confidence level (1 − α) or error rate α and a reference distribution such as the z, t, F or central χ² distribution. For instance, by inverting a Wald-type z-test with α = .05, the 95% Wald-type CI is θ̂_j ± 1.96 SE_{θ̂_j}. Note that Wald-type CIs are symmetric by definition.

The performance of Wald-type CIs directly depends on how well the estimated standard errors SE_{θ̂} approximate the curvature of the likelihood surface. The quadratic approximation of the Wald approach holds when sample sizes are large (Meeker & Escobar, 1995), such that sampling distributions approximate multivariate normality, or when the covariance matrix of the estimates shows little variability in the neighborhood of the ML estimate. Such a requirement of asymptotic normality is not often met in applications of SEM with finite

¹ Alternatively, the multivariate normal likelihood function could be maximized with an additional mean structure, F(Σ, S, μ, ȳ) = ln|Σ| + tr(SΣ⁻¹) − ln|S| − p + (ȳ − μ)ᵀΣ⁻¹(ȳ − μ). The associated ML estimate is S = (1/N) Σ_{i=1}^N (y_i − ȳ)(y_i − ȳ)ᵀ for this discrepancy function, and the multiplier for the test statistic X² in Equation 2 would be N instead of (N − 1). We exclude the mean structure to simplify the expressions of later equations.


sample size (Neale & Miller, 1997), resulting in Wald tests that tend to have suboptimal Type I (α) error rates and Wald-type CIs with suboptimal coverage (1 − α) rates (Stryhn & Christensen, 2003). In practice, Wald-type estimated standard errors and CIs for the most part serve well to approximate the sampling variability of parameter estimates for large sample sizes. When data are not assumed to be normally distributed or when other discrepancy functions are used, the standard errors of parameter estimates can be calculated from a sandwich-type asymptotic covariance matrix (e.g., see the review by Savalei, 2014).

2.1.2. A Function of Parameters. Oftentimes, SEM parameters that are directly estimated do not immediately shed light on substantive hypotheses. Instead, parameters are passed through a single function g(·) such that the transformed parameter g(θ̂) is more readily interpretable (Preacher & Hancock, 2012). Some examples of functions of parameters include the intra-class correlation (McGraw & Wong, 1996), the indirect effect in mediation models (Baron & Kenny, 1986; Sobel, 1982), and the aperture in a latent growth model (Hancock & Choi, 2006). As an extension of the Wald test for a k = 1 parameter, the Wald test for a single function of parameters g(θ) is defined as W = (g(θ̂) − g(θ₀))/SE_{g(θ̂)}.

There are two approaches to obtaining g(θ̂) and SE_{g(θ̂)}. The first method involves re-estimation, where the initial model is re-parameterized such that g(θ̂) and SE_{g(θ̂)} are estimated directly (e.g., Preacher & Hancock, 2015). The second method does not involve re-estimation, and passes the original estimates θ̂ through the function of interest g(·) to obtain the point estimate; the variance of g(θ̂) is then approximated using the delta method when g(·) is a nonlinear function. The delta method is a general analytical approach for deriving the approximate sampling distribution of a function of an asymptotically normal estimator (see Raykov & Marcoulides, 2004, for a good tutorial). Specifically, a first-order Taylor series expansion is employed to linearize the function, g(θ̂) ≈ g(θ) + (θ̂ − θ)ᵀg′(θ), where g′(θ) is the vector of partial first derivatives of g(·) taken with respect to each parameter. The estimated variance of g(θ̂) is VAR[g(θ̂)] ≈ g′(θ̂)ᵀI(θ̂)⁻¹g′(θ̂), and the standard error of estimate is SE_{g(θ̂)} = VAR[g(θ̂)]^{1/2}. Accuracy of the delta method standard error depends on the distance between θ and θ̂ and on the accuracy of the linearization, which further depends on the smoothness of the derivatives and the sample size (Bollen & Stine, 1990). Delta method CIs for g(θ) are available in SEM software such as LISREL (Jöreskog & Sörbom, 2006), Mplus (Muthén & Muthén, 2012), and the ‘lavaan’ package in R (Rosseel, 2012). When non-normal data or a different discrepancy function is considered, the delta method still applies and leads to a similar expression of VAR[g(θ̂)] in which I(θ̂)⁻¹ is replaced by the sandwich-type asymptotic covariance matrix.

In general, Wald-type inferential devices have the disadvantage of not being invariant to transformations. Consider inference about a single parameter θ and its transformed counterpart g(θ) = ln(θ). Intuitively, tests and CIs for these two expressions of θ should be consistent with each other.
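To make the transformation non-invariance concrete, the following sketch (Python used for illustration; the sample size, estimate, and the normal-variance model are hypothetical, not from this paper) computes a Wald-type CI for a variance θ and a delta-method Wald-type CI for ln(θ), then back-transforms the latter:

```python
import math

# Hypothetical toy setup (not from the paper): a single variance parameter
# theta estimated from N = 50 zero-mean normal observations.  The Fisher
# information for a normal variance is I(theta) = N / (2 * theta**2), so
# SE(theta_hat) = theta_hat * sqrt(2 / N).
N, theta_hat = 50, 2.0
se_theta = theta_hat * math.sqrt(2 / N)

# 95% Wald-type CI directly on theta (symmetric by construction).
ci_theta = (theta_hat - 1.96 * se_theta, theta_hat + 1.96 * se_theta)

# 95% Wald-type CI on g(theta) = ln(theta): the delta method gives
# SE(ln theta_hat) = g'(theta_hat) * SE(theta_hat) = SE(theta_hat) / theta_hat.
se_log = se_theta / theta_hat
log_hat = math.log(theta_hat)
ci_log = (log_hat - 1.96 * se_log, log_hat + 1.96 * se_log)

# Back-transforming the CI for ln(theta) does NOT recover the CI for theta.
ci_back = (math.exp(ci_log[0]), math.exp(ci_log[1]))
print(ci_theta)  # symmetric about 2.0
print(ci_back)   # asymmetric about 2.0: a different interval
```

The two intervals disagree even though they concern the same quantity; a likelihood-based CI would instead map exactly between the θ and ln(θ) scales.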
However, the non-invariance of Wald methods results in different p-values for tests about θ and ln(θ), and the Wald-type CIs for θ and ln(θ) are not direct transformations of each other. The performance of Wald-based approaches to inference is also dependent on parameterization (Sprott, 1980). For example, a test of whether two latent variables are correlated may depend on how the latent variable model is identified (Gonzalez & Griffin, 2001). Although there could exist a group of transformations that ensures the invariance of Wald-type approaches (e.g., see Lehmann & Romano, 2006, pp. 11–14), such developments have yet to be formally extended to SEM. Research on the practical consequences of the transformation non-invariance of Wald approaches has received some recent attention. Simulation studies focusing on


mediation show that Wald-type approaches perform reasonably well when the size of the indirect effect and the sample size are large, but likelihood-based and resampling approaches are preferred when the size of the indirect effect and the sample size are small (e.g., Cheung, 2007; Falk & Biesanz, 2015). Beyond indirect effects, Wald-type CIs tend not to perform as well as likelihood-based CIs for bounded parameters such as Pearson correlation coefficients (Cheung, 2009).

2.2. Likelihood-Based Confidence Intervals

Likelihood-based approaches to inference have several advantages over Wald-type approaches, although the two are asymptotically equivalent (Cox & Hinkley, 1974, p. 321). Wald approaches perform well when the sampling distribution of the parameter estimates follows a multivariate normal distribution. Unfortunately, this requirement of asymptotic normality is not usually met in applications of SEM with finite sample size, resulting in Wald tests with less than optimal Type I (α) error rates and Wald-type CIs with less than optimal coverage (1 − α) rates. Under the same condition of finite sample size in SEM, the LRT statistic follows its assumed sampling distribution more closely than the Wald test. Likelihood-based approaches in SEM thus have more accurate error or coverage rates than Wald-based ones (Neale & Miller, 1997), and this improved accuracy is achieved with greater computational effort. In practice, with large N, Wald-type and likelihood-based CIs for most θ show much similarity in terms of coverage and efficiency (e.g., see Cheung, 2009). Likelihood-based methods also have the added advantage over Wald-type methods of being invariant to transformations and parameterization. Different expressions of the same parameter have identical p-values for LRTs, and their likelihood-based CIs are direct transformations of each other.
For instance, likelihood-based CIs for the indirect effect associated with a mediator have better coverage and Type I error rates when the size of the effect and the sample size are small (Cheung, 2007; Falk & Biesanz, 2015). It is also noteworthy that likelihood-based CIs have been evaluated against the more popular resampling methods, especially for indirect effects; targeted simulations show that likelihood-based CIs are comparable and sometimes superior to resampling methods (Cheung, 2007; Falk & Biesanz, 2015). Future research should compare resampling methods against likelihood-based CIs beyond the context of indirect effects. Importantly, the advantages of likelihood-based inference have motivated the development of several algorithms for obtaining likelihood-based CIs for a single parameter.

2.2.1. Single Parameter Case. If the model has a single parameter, the likelihood-based CI for this parameter is constructed by inverting a certain type of LRT which tests the hypothesis H₀: θ = θ₀. This LRT statistic is expressed as G² = 2[l(θ̂) − l(θ₀)], where l(θ̂) is the log-likelihood at the ML estimate and l(θ₀) is the log-likelihood with θ fixed at the value of the null hypothesis. The value of θ₀ is determined by the user and is typically chosen to be zero. When a single parameter is tested, G² follows a χ²_k distribution with k = 1 df. The critical value associated with G² is χ²_{1,1−α}. To obtain a 100(1 − α)% likelihood-based CI, the LRT is inverted by searching for values of θ₀ such that the test statistic G² is non-significant. Two unique boundary values of θ₀, each denoted θ_B, exist where the equality G² = χ²_{1,1−α} is met, and these two values are the lower and upper bounds of the likelihood-based CI for θ. A search algorithm is typically employed to compute these boundary values of the CI.
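As a concrete sketch of such a search (Python for illustration; the exponential model and the data summaries are hypothetical, not from this paper), a simple bisection finds the two boundary values θ_B at which G² equals the χ²_{1,.95} critical value of 3.841:

```python
import math

# Toy single-parameter model (not from the paper): exponential data with
# rate lambda.  With n observations and total sum_y, the log-likelihood is
# l(lambda) = n*ln(lambda) - lambda*sum_y, and the MLE is n / sum_y.
n, sum_y = 30, 60.0           # hypothetical data summaries
mle = n / sum_y               # 0.5

def loglik(lam):
    return n * math.log(lam) - lam * sum_y

def G2(lam0):                 # LRT statistic 2[l(mle) - l(lam0)]
    return 2.0 * (loglik(mle) - loglik(lam0))

CRIT = 3.841                  # chi-square(1) critical value, alpha = .05

def bisect(lo, hi, tol=1e-8):
    """Solve G2(lam) = CRIT on an interval where G2 - CRIT changes sign."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if (G2(mid) - CRIT) * (G2(lo) - CRIT) <= 0:
            hi = mid          # sign change in [lo, mid]
        else:
            lo = mid          # sign change in [mid, hi]
    return 0.5 * (lo + hi)

lower = bisect(1e-6, mle)     # boundary value below the MLE
upper = bisect(mle, 5 * mle)  # boundary value above the MLE
print(round(lower, 4), round(upper, 4))
```

Note that the resulting interval is asymmetric about the MLE, unlike a Wald-type CI.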
In the case of non-normal data (e.g., use of the ADF discrepancy function) or use of a non-ML discrepancy function (e.g., the GLS discrepancy function with normal data) where the test statistic has a χ²₁ distribution, the same procedure of inverting an LRT to obtain profile likelihood-based CIs can be directly applied. If the test statistic has a different asymptotic distribution approximated by a scaled χ² (Satorra & Bentler, 2001; Satterthwaite, 1941), the algorithm can search for unique boundary values of θ₀ that satisfy T = cχ²_{k′,1−α}. Note that in this case c and k′ may depend on


the value θ₀. The actual coverage performance of the test-statistic-based CI depends on the finite sample performance of its asymptotic sampling distribution (or its approximation).

2.2.2. Eliminating Nuisance Parameters. Structural equation models typically have numerous parameters that are required to complete the model, and many SEMs have nuisance parameters that are not of inferential interest. Unlike Wald-type approaches, likelihood-based methods have the added complication of eliminating nuisance parameters θ_n such that inferences about a scalar focal parameter θ_f can be made.² Let the joint likelihood of both focal and nuisance parameters be denoted by L(θ_f, θ_n). Most of the analytical complexities in the theory and application of the likelihood are associated with the problem of nuisance parameters, and many elimination methods have been devised (Kalbfleisch & Sprott, 1970; Basu, 1977). The main theoretical methods to eliminate nuisance parameters without bias are conditioning and marginalizing (Pawitan, 2001).

2.2.3. Marginalizing and Conditioning. The method of marginalizing is appropriate when the joint likelihood of focal and nuisance parameters can be expressed as L(θ_f, θ_n) = p_{θ_f}(y_f) p_{θ_f,θ_n}(y_n|y_f) = L_m(θ_f)L_c(θ_f, θ_n), where p(·) denotes some probability distribution function and y is the vector of MVs associated with the parameters in θ. The marginal likelihood of θ_f is then L_m(θ_f) = p_{θ_f}(y_f), where this expression does not contain the nuisance parameters θ_n. Marginal likelihoods have received much attention in nonlinear SEM models (e.g., Bock & Lieberman, 1970). The method of conditioning is used when the joint likelihood can be re-expressed as L(θ_f, θ_n) = p_{θ_f}(y_f|y_n) p_{θ_f,θ_n}(y_n) = L_c(θ_f)L_m(θ_f, θ_n), where the conditional likelihood for θ_f is L_c(θ_f) = p_{θ_f}(y_f|y_n). Note that this conditional likelihood excludes the vector of nuisance parameters θ_n.
The likelihood for SEMs is a function of simultaneous equations of the parameters and cannot easily be re-expressed as a product of marginal and conditional likelihoods. Only in special cases involving limited types of models have conditional likelihoods been worked out (e.g., du Toit & Cudeck, 2009). Therefore, we focus on the more easily applied profile likelihood method.

2.2.4. Profile Likelihood. The profile likelihood approach is a reasonable method to eliminate nuisance parameters by replacing θ_n with its ML estimate at each fixed value of θ_f. More formally, the profile likelihood of θ_f is

L(θ_f) = max_{θ_n} L(θ_f, θ_n),

(3)

where ML estimation of θ_n is performed at fixed values of θ_f to obtain θ̃_n. To construct a profile likelihood-based CI, we begin with an LRT for the focal parameter with null hypothesis H₀: θ_f = θ_{0f}, where θ_{0f} is a value determined by the user. The LRT statistic to be inverted is modified to reflect the partitioning of nuisance and focal parameters

G² = 2[l(θ̂) − l(θ_{0f}, θ̃_n)],

(4)

² Profile likelihood-based CRs for a vector of focal parameters θ_f are detailed later.


where l(θ̂) remains the log-likelihood value associated with the ML estimates θ̂ = (θ̂_f, θ̂_n)ᵀ, and l(θ_{0f}, θ̃_n) is the joint log-likelihood associated with the k_f = 1 population focal parameter under the null, θ_{0f}, and the ML nuisance parameter estimates θ̃_n. Note that θ̂_n is distinct from θ̃_n: θ̂_n is estimated jointly with θ̂_f, whereas θ̃_n is estimated holding θ_f fixed. This G² test statistic follows an asymptotic χ² distribution with k_f = 1 degree of freedom, and an approximate (1 − α)100% profile likelihood-based CI is the set of values of θ_{0f} that satisfy

G² ≤ χ²_{1,1−α}.

(5)

The upper and lower bounds of the CI are identified when G² = χ²_{1,1−α}, and this CI has an overall coverage of 1 − α + O(N^{−1/2}) (cf. DiCiccio & Tibshirani, 1991). The profile likelihood approach can also be used for non-normal data or for non-ML discrepancy functions, in which case the G² statistic is replaced by the statistic T of the difference test of H₀: θ_f = θ_{0f}. The test-statistic-based CI is defined as the set of values of θ_{0f} that satisfy T ≤ cχ²_{k′,1−α}, where c = k′ = 1 if T itself follows a χ²₁ distribution. However, c and k′ could depend on θ_{0f} (both directly and through θ̃_n) if an approximation method is used.
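A minimal numeric illustration of Equations 4 and 5 (Python; toy normal data invented for this sketch, with the nuisance mean profiled out in closed form, not any example from this paper) makes the "set of θ_{0f} values with a non-significant LRT" definition explicit:

```python
import math

# Minimal sketch (toy data, not from the paper): normal model with nuisance
# mean mu and focal variance sigma2.  For fixed sigma2 the inner ML step
# always gives mu-tilde = ybar, so the profile log-likelihood has a closed
# form and inverting the LRT reduces to a one-dimensional search.
y = [4.1, 5.3, 3.8, 6.0, 4.9, 5.5, 4.4, 5.1, 4.7, 5.9]
n = len(y)
ybar = sum(y) / n
s2 = sum((v - ybar) ** 2 for v in y) / n     # ML estimate of sigma2

def profile_loglik(sigma2):
    # mu has been profiled out at its conditional MLE (ybar)
    return -0.5 * n * math.log(2 * math.pi * sigma2) - n * s2 / (2 * sigma2)

CRIT = 3.841                                 # chi-square(1), alpha = .05

def G2(sigma2_0):                            # Equation-4 style statistic
    return 2.0 * (profile_loglik(s2) - profile_loglik(sigma2_0))

# The CI is the set of null values whose LRT is non-significant (Equation 5);
# a fine grid makes that set explicit.
grid = [0.01 * k for k in range(1, 1000)]
inside = [v for v in grid if G2(v) <= CRIT]
print(round(min(inside), 2), round(max(inside), 2))
```

In a real SEM the inner maximization over θ_n has no closed form, so each grid value would require a constrained ML fit.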

2.2.5. A Function of Parameters. If the research interest lies with a single focal function of the parameters g_f(θ), there are two approaches to constructing likelihood-based CIs for g_f(θ). These two methods produce identical CIs because likelihood-based methods are invariant to transformation and parameterization. First, the model could be re-parameterized such that g_f(θ) is directly estimated, and the LRT for testing this estimated function of parameters is inverted to obtain the desired CI. Alternatively, a multi-parameter profile likelihood-based CR for the set of k parameters in θ could be computed first with df = 1, and then passed through the function g_f(·). Geometrically, these transformed CR values are a projection of the k-dimensional CR (calculated with df = 1) onto the re-parameterized one-dimensional space of g_f(θ). The minimum and maximum of the projected boundary points of the transformed CR values are the lower and upper bounds of the desired CI. If the function g_f only involves a subset of the parameters in θ, the CR of that subset can be calculated first before passing it through the function g_f.

The generality of the profile likelihood as an approach to eliminate nuisance parameters comes at the price of potential bias in the ML estimate and overly optimistic precision even with large N (see Pawitan, 2001, pp. 273–278, for an example). In most situations, when the number of nuisance parameters k_n is small relative to the sample size N, bias will be limited relative to precision; bias is reduced with increasing sample size. However, when k_n is of the same order of magnitude as N, bias will tend to be large relative to estimate precision, resulting in inconsistent estimation (Kalbfleisch & Sprott, 1970). According to recommended practice, sample size N should be large relative to k when fitting SEMs to data and when constructing these CIs about θ_f.
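The projection idea can be sketched numerically. Assuming two asymptotically independent, normally distributed estimates (hypothetical numbers, e.g., two path coefficients; under these assumptions the df = 1 likelihood region is approximately the ellipse z₁² + z₂² = 3.841 in standardized units), projecting the region through g_f(θ) = θ₁θ₂ and taking the minimum and maximum of the projected values yields a CI for the product. For this bilinear g_f the extremes over the region occur on its boundary, so walking the boundary suffices:

```python
import math

# Toy illustration (not the paper's models) of the projection approach for
# g_f(theta) = theta1 * theta2.  With two independent normal estimates the
# df = 1 likelihood region is approximately an ellipse; its boundary is
# walked below and each point is passed through g_f.
est = (0.40, 0.50)                 # hypothetical estimates (e.g., a and b paths)
se = (0.10, 0.12)                  # hypothetical standard errors
r = math.sqrt(3.841)               # region radius in standardized units

vals = []
for k in range(3600):              # walk the ellipse boundary
    a = 2 * math.pi * k / 3600
    t1 = est[0] + r * se[0] * math.cos(a)
    t2 = est[1] + r * se[1] * math.sin(a)
    vals.append(t1 * t2)           # project through g_f

print(round(min(vals), 4), round(max(vals), 4))   # CI for theta1 * theta2
```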
Future simulation studies would be informative regarding the ratio of N to k required to obtain profile likelihood-based CIs in SEM with proper Type I error rates and coverage.

2.2.6. Existing Algorithms. Computing profile likelihood-based CIs involves finding roots of a function where each function evaluation requires a constrained ML estimation (Meeker & Escobar, 1995). The constraint arises because values of θ_f are held fixed while θ_n is re-estimated during the search for the limits of the CI of θ_f. Additionally, computations are simplified by treating the estimation of each boundary point of the CI separately.

Obtaining profile likelihood-based CIs for θ_f is, by definition, most intuitively understood as a grid search (Stryhn & Christensen, 2003). Let θ̂_L and θ̂_U denote the lower and upper bounds of the profile likelihood-based CI. To obtain θ̂_L, a reasonable lower bound θ_l is first selected; for


example, let θ_l = θ̂_f − 5 × SE_{θ̂_f}. This is followed by constructing a grid of candidate values, each denoted θ̇_L, ranging from θ_l to θ̂_f, and re-estimating the nuisance parameters θ_n via ML at each θ̇_L value. A G² test statistic value is obtained for each set of grid values (θ̇_L, θ̃_n), and the estimated lower bound θ̂_L is the smallest grid value θ̇_L for which the equality in Equation 5 holds; θ̂_U is similarly estimated.

More efficient algorithms than the grid search have been proposed to obtain profile likelihood-based CIs. Venzon and Moolgavkar (1988) developed a two-fold modified Newton-Raphson algorithm based on first and second derivatives to solve for the limits in Equation 5, allowing profile likelihood-based CIs to be obtained efficiently. In the special case of the profile likelihood of θ_f having a quadratic form, the algorithm finds each boundary point of the CI in two steps. Note that this algorithm is not flexible in that the fitted model has to be re-parameterized if a CI for a function of parameters g_f(θ) is desired. This algorithm does not yet seem to be implemented in SEM software, but exists for other specialized models such as capture-recapture models in the R (R Development Core Team, 2013) package ‘Rcapture’ (Baillargeon & Rivest, 2007).

Cook and Weisberg (1990) proposed an iterative procedure involving dynamic increments of step sizes or perturbations κ from the ML estimate θ̂_f in the search. The computations of θ̂_L and θ̂_U are treated separately. Let θ_L = θ̂_f + κ. The size of κ is based on the shape of the profile likelihood of θ_f, such that large steps are taken where curvature is low and vice versa. As the procedure is iterative, the set of nuisance parameters θ_n is re-estimated at each candidate θ̇_L value until convergence to θ̂_L. This algorithm can also produce confidence curves, or CIs of different confidence levels.
Cook and Weisberg’s (1990) algorithm is implemented in NL2SOL for nonlinear regression models (Dennis, Gay, & Welsch, 1981), but has not been implemented for SEMs. In a technical report, DiCiccio and Tibshirani (1991) outline a two-step iterative algorithm to accurately locate each boundary point of profile likelihood-based CIs. The profile likelihood surface of θ_f is first computed on a grid, using a Lagrangian for maximizing l(θ) subject to fixed θ_f. Next, θ_n is evaluated along this surface with a Newton-Raphson procedure. Compared to the Venzon and Moolgavkar (1988) algorithm, DiCiccio and Tibshirani’s algorithm can calculate the profile likelihood surface as well as compute CIs for functions of parameters. This algorithm was originally implemented in two S language functions for a general class of models including generalized linear models, but does not seem to have been extended to S-PLUS (Insightful Corporation, 2007), R (R Development Core Team, 2013), or any specialized SEM software.

Within the SEM framework, Neale and Miller (1997) implemented an algorithm to obtain slightly biased profile likelihood CIs for θ_f in Mx (Neale, Boker, Xie, & Maes, 2003) and OpenMx (Boker et al., 2011; Neale et al., 2015). Each boundary point of the CI is defined by a modified discrepancy function (cf. Equation 1) that has a quadratic form. In particular, the discrepancy function for the lower bound θ_L is

F_L = [F(Σ̂, S) + χ²_{k_f,1−α} − F(Σ(θ_L, θ̃_n), S)]² + θ_L,

(6)

where k_f = 1. The lower bound θ̂_L is obtained when this redefined discrepancy function is optimized. The upper bound θ̂_U is similarly estimated with a redefined discrepancy function that is a reflection of Equation 6. Although the algorithm is faster than a systematic search, the CI estimates obtained from the redefined discrepancy functions are biased by a constant that depends on the slope of the function at the boundary points of the CI. Bias tends to be small with a steep slope, implying a sharp likelihood and high information. Conversely, bias is large in instances of a shallow slope or blunt likelihood.
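The mechanism can be mimicked on a toy problem. The sketch below (Python; a hypothetical single-parameter normal-variance problem, not the paper's SEM examples, with the fit placed on the (N − 1)F scale on which Mx-style fit functions operate) minimizes an Equation-6-style objective by a crude grid search; the minimizer lands near, but slightly off, the point where the fit exceeds its minimum by the critical value, illustrating the bias:

```python
import math

# Sketch of the Neale-Miller redefined-fit-function idea (toy numbers, not
# from the paper): a single focal parameter t, a normal variance with MLE
# s2, and the fit measured on the chi-square scale (N - 1) * F.
N, s2 = 40, 1.5

def X2(t):
    # (N - 1) * F for a univariate normal variance; zero at t = s2
    return (N - 1) * (math.log(t) + s2 / t - math.log(s2) - 1.0)

CRIT = 3.841  # chi-square(1) critical value, alpha = .05

def redefined(t):
    # Equation-6-style objective for the lower bound: squared distance of
    # the fit from its target value, plus t itself
    return (X2(t) - CRIT) ** 2 + t

# Crude grid minimization in place of a full numerical optimizer over all
# free parameters.
grid = [0.0005 * k for k in range(1, 10001)]          # 0.0005 .. 5.0
lower = min(grid, key=redefined)
print(round(lower, 3), round(X2(lower), 3))
```

At the minimizer, X2 sits slightly above CRIT rather than exactly at it; the size of that discrepancy shrinks as the slope of the fit function at the boundary steepens, matching the bias behavior described above.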


In a separate development, Schoenberg (1997) implemented a modified secant algorithm in Constrained Maximum Likelihood (CML) to obtain unbiased profile likelihood CIs. Constrained Maximum Likelihood is a set of procedures (Schoenberg, 1995) available as a GAUSS application (Aptech Systems, Inc., 2002) under Constrained Maximum Likelihood MT, and has the flexibility to fit a wide class of statistical models with general parametric constraints, including SEMs. Popular SEM software such as AMOS (Arbuckle, 2006), EQS (Bentler, 2006; E. Wu, personal communication, July 18, 2014), LISREL (Jöreskog & Sörbom, 2006), and Mplus (Muthén & Muthén, 2012) has not yet implemented profile likelihood-based CIs. The CALIS procedure in SAS does not yet have the capability to compute profile likelihood-based CIs (Y. Yung, personal communication, September 27, 2013). Similarly, the ‘lavaan’ package (Rosseel, 2012) in R does not have profile likelihood-based CIs implemented.

2.2.7. Recent Algorithms for Profile Likelihood-Based Confidence Intervals. We outline two relatively newer algorithms for computing profile likelihood-based CIs about θ_f, specific to SEM, and extend them to CRs about a vector-valued θ_f in the following section. These algorithms have the advantage of obtaining unbiased profile likelihood-based CIs for θ_f and g_f(θ), and are generalizable to compute CRs. The preferred algorithm also boasts computational efficiency and is readily programmable in OpenMx (Boker et al., 2011; Neale et al., 2015). We begin with a basic computational method which obtains downward-biased or approximate profile likelihood-based CIs, extend it to obtain profile likelihood-based CIs, and finally introduce an equivalent algorithm that promotes computational efficiency. Recall that profile likelihood-based CIs are constructed by inverting the LRT in Equation 4 under the constraint in Equation 5.
One newer algorithm makes use of a root finding algorithm by Brent (1973) to estimate boundary points of the CI, given the profile likelihood of θ_f. For details on this algorithm, see Press, Teukolsky, Vetterling, and Flannery (1992). We begin by re-expressing G² in Equation 4 in terms of the MWL discrepancy function in Equation 1, using the relationship established in Equation 2,

\[ G^2 = (N - 1)\left[ F\big(\Sigma(\theta_{0f}, \theta_n), S\big) - \hat{F} \right], \tag{7} \]

where N is the sample size, Σ(θ_{0f}, θ_n) is the population covariance matrix under the null hypothesis H₀: θ_f = θ_{0f}, and F̂ is the estimated discrepancy function value at θ̂. The unique points forming the profile likelihood-based CI, θ̂_L and θ̂_U, or generally denoted as θ_{Bt}, are determined under the constraint

\[ (N - 1)\left[ F\big(\Sigma(\theta_{Bt}, \theta_n), S\big) - \hat{F} \right] = \chi^2_{k_f,\,1-\alpha}, \tag{8} \]

where k_f = 1 for a single focal parameter. Each one of the T = 2 boundary points of the CI is defined as a perturbation from the ML estimate θ̂_f by κ_t in the direction of d_t,

\[ \theta_{Bt} = \hat{\theta}_f + \kappa_t d_t. \tag{9} \]

Note the similarity of this setup with Cook and Weisberg's (1990) algorithm. For the profile likelihood-based CI, the lower bound θ̂_L has direction vector d_1 = −1 and the upper bound θ̂_U has direction vector d_2 = 1. A similar algorithm was developed by MacCallum, Browne, and Lee (2009) and MacCallum, Lee, and Browne (2012) for obtaining fungible parameter values (c.f., Waller, 2008).


Each boundary point of the profile likelihood-based CI is estimated by substituting Equation 9 into Equation 8 and solving for κ_t in the following expression

\[ (N - 1)\left[ F\big(\Sigma(\hat{\theta}_f + \kappa_t d_t, \theta_n), S\big) - \hat{F} \right] = \chi^2_{k_f,\,1-\alpha} \tag{10} \]

or equivalently,

\[ F\big(\Sigma(\hat{\theta}_f + \kappa_t d_t, \theta_n), S\big) = \hat{F} + \chi^2_{k_f,\,1-\alpha}/(N - 1). \tag{11} \]

From Equation 11, θ̂_f, d_t, S, F̂, χ²_{k_f,1−α} and N are known; θ_n is to be obtained in a constrained ML estimation, and the root finding algorithm searches for the solution to κ_t. The criterion by which the scaling constant κ_t is determined could be conceptualized as a perturbed value of F̂, where the value of this perturbation is χ²_{k_f,1−α}/(N − 1). As ML estimates θ̂ are determined when F̂ is at its minimum, increasing the value of F̂ by a perturbation based on the critical value stemming from the LRT obtains boundary points of the profile likelihood-based CI (see Figure 3). Equation 11 defines how the algorithm locates boundary points for the CI, given the profile likelihood of θ_f, that is, when θ_n has been estimated. Equation 10 can also be used for non-normal distributions or for non-ML discrepancy functions, but the critical value χ²_{k_f,1−α} needs to be replaced by cχ²_{k_f,1−α}, where c and k_f may depend on κ_t. When c or k_f depend on κ_t, it is not possible to separate out a constant term as on the right-hand side of Equation 11. Next, we describe two approaches to estimating θ_n, and finally introduce an efficient algorithm that makes use of a Lagrangian (c.f. DiCiccio & Tibshirani, 1991).

2.2.8. Approximate Profile Likelihood-Based Confidence Intervals
A straightforward approach to eliminating θ_n is to substitute θ̂_n from the initial ML solution of θ̂ = (θ̂_f, θ̂_n)^T when computing F̂. Such a substitution obtains an approximate profile likelihood of θ_f, L_a(θ_f) = L(θ_f, θ̂_n). By replacing θ_n with θ̂_n in Equation 11, no additional computation is required beyond searching for κ_t. This method has been used for computing fungible values in SEM (MacCallum et al., 2009, 2012). By making use of θ̂_n, the approximate likelihood does not account for the extra uncertainty due to the nuisance parameters (c.f. θ̃_n in Equation 3 of the profile likelihood of θ_f). Computed CIs based on this approximate likelihood are generally downward biased such that they are more liberal compared to those obtained from the profile likelihood as well as Wald-type CIs. In addition, CIs based on the approximate likelihood depend on the parametrization of the nuisance parameters. For example, if a function of the k parameters is focal, its approximate likelihood-based CI may depend on which k − 1 remaining parameters are chosen to be nuisance. For the most part, boundary points obtained from this computational procedure are useful as starting values in the next procedure, which obtains profile likelihood-based CIs.

2.2.9. Profile Likelihood-Based Confidence Intervals
The profile likelihood of θ_f is obtained when θ_n is estimated by ML, resulting in θ̃_n, for each iterate used in the search for κ_t. For every candidate scaling constant κ̇_t assessed by the root finding algorithm (Brent, 1973), the nuisance parameters θ_n are re-estimated by optimizing the MWL discrepancy function in Equation 1 while holding the values of the focal parameters fixed at the candidate confidence boundary point θ̂_f + κ̇_t d_t. For every κ̇_t, a new set of θ̃_n is estimated.
Once Equation 11 is satisfied, κ̇_t is taken to be κ_t and the boundary points of the profile likelihood-based CI are calculated as θ̂_f + κ_t d_t. Although profile likelihood CIs are obtained with this algorithm, it is computationally inefficient because it is an iterative procedure nested within another iterative procedure. For non-normal data or non-ML discrepancy functions, the algorithm has the extra complexity of updating the scaling factor c and the df k_f for the critical value cχ²_{k_f,1−α}.
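To make the nested structure concrete, the following Python sketch (a toy illustration with a simple likelihood, not the SEM discrepancy function or the authors' implementation) profiles the mean of a normal sample with the variance as nuisance parameter: Brent's root finder searches for the point where the likelihood-ratio statistic crosses the χ² critical value, re-estimating the nuisance at every candidate value. The approximate CI of Section 2.2.8, which fixes the nuisance at its initial ML estimate, is computed alongside for comparison.

```python
import numpy as np
from scipy.optimize import brentq, minimize_scalar
from scipy.stats import chi2

# Toy setting: focal parameter mu (mean), nuisance parameter s2 (variance).
rng = np.random.default_rng(1)
x = rng.normal(loc=5.0, scale=2.0, size=50)
n = len(x)

def negloglik(mu, s2):
    """Negative log-likelihood of N(mu, s2) for the sample x."""
    return 0.5 * n * np.log(2 * np.pi * s2) + 0.5 * np.sum((x - mu) ** 2) / s2

mu_hat = x.mean()
s2_hat = x.var()                      # ML variance estimate (divisor n)
nll_hat = negloglik(mu_hat, s2_hat)   # analogue of the minimized F-hat
crit = chi2.ppf(0.95, df=1)           # chi-square critical value for k_f = 1

def profile_lr(mu):
    """LR statistic with the nuisance re-optimized at each fixed mu
    (the inner iteration nested within the root search)."""
    inner = minimize_scalar(lambda s2: negloglik(mu, s2),
                            bounds=(1e-6, 100.0), method="bounded")
    return 2.0 * (inner.fun - nll_hat)

def approx_lr(mu):
    """Approximate version: nuisance fixed at its initial ML estimate."""
    return 2.0 * (negloglik(mu, s2_hat) - nll_hat)

# Brent's root finder locates where each LR curve crosses the critical value.
half = 4.0 * x.std() / np.sqrt(n)     # bracketing width for the root search
lo_p = brentq(lambda m: profile_lr(m) - crit, mu_hat - half, mu_hat)
hi_p = brentq(lambda m: profile_lr(m) - crit, mu_hat, mu_hat + half)
lo_a = brentq(lambda m: approx_lr(m) - crit, mu_hat - half, mu_hat)
hi_a = brentq(lambda m: approx_lr(m) - crit, mu_hat, mu_hat + half)

print("profile likelihood-based CI:", (lo_p, hi_p))
print("approximate CI:", (lo_a, hi_a))
```

Consistent with the discussion of Section 2.2.8, the approximate interval comes out narrower (liberal) than the profile interval, because fixing the nuisance at θ̂_n ignores its uncertainty.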


2.2.10. Computationally Efficient Profile Likelihood-Based Confidence Intervals
Recently, Wu and Neale (2012) developed an algorithm to obtain unbiased profile likelihood-based CIs for a bounded focal parameter that is analogous to the previously detailed algorithm. Recall that θ_B generally denotes a boundary point of the CI for θ_f. This algorithm optimizes θ_B, subject to the constraint

\[ F\big(\Sigma(\theta_B, \theta_n), S\big) = \hat{F} + \chi^2_{k_f,\,1-\alpha}/(N - 1), \tag{12} \]

with respect to all parameters (θ_B, θ_n), so as to obtain the lower and upper bounds of the CI. For the lower confidence bound, θ_B is minimized. Conversely, for the upper confidence bound of θ_f, θ_B is maximized. Note the similarity of Equation 12 with Equation 11; this algorithm produces results equivalent to the root finding approach and is readily programmable in OpenMx (see Online Appendix for example code). This algorithm is also easily extended to allow for the construction of CIs about a function of parameters g_f(θ) by making the appropriate substitutions of θ_f with g_f(θ), without the need to re-parameterize the model (c.f. Venzon & Moolgavkar, 1988). For instance, to construct a CI for g_f(θ), the constraint in Equation 12 remains the same and θ_B is replaced by g_f(θ)_B; the lower bound is obtained by minimizing g_f(θ)_B and the upper bound is obtained by maximizing g_f(θ)_B. However, it is not straightforward to extend this algorithm to the situation where the sampling distribution is not χ²₁ but is approximated by a scaled χ² distribution, because the scaling factor c and the df could depend on both θ_B and θ_n.
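On the same toy normal model, the constrained formulation can be sketched as a single optimization, with scipy's SLSQP standing in for the constrained optimizer (an illustration of the idea, not the Wu and Neale, 2012, implementation): the focal parameter is minimized or maximized over all parameters subject to the likelihood-ratio constraint, with no nested iterations.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import chi2

# Same toy setting as before: theta = (mu, s2), with mu focal and s2 nuisance.
rng = np.random.default_rng(1)
x = rng.normal(loc=5.0, scale=2.0, size=50)
n = len(x)

def negloglik(theta):
    mu, s2 = theta
    return 0.5 * n * np.log(2 * np.pi * s2) + 0.5 * np.sum((x - mu) ** 2) / s2

mu_hat, s2_hat = x.mean(), x.var()
nll_hat = negloglik([mu_hat, s2_hat])
crit = chi2.ppf(0.95, df=1)

# Equality constraint: the LR statistic equals the critical value
# (the analogue of Equation 12, with -2 log L in place of (N - 1) F).
con = {"type": "eq", "fun": lambda th: 2.0 * (negloglik(th) - nll_hat) - crit}

def bound(sign):
    """sign = +1 minimizes mu (lower bound); sign = -1 maximizes mu (upper)."""
    start = [mu_hat - sign * x.std() / np.sqrt(n), s2_hat]
    res = minimize(lambda th: sign * th[0], start, method="SLSQP",
                   constraints=[con], bounds=[(None, None), (1e-6, None)])
    return res.x[0]

lo, hi = bound(+1.0), bound(-1.0)
print("profile likelihood-based CI:", (lo, hi))
```

For this toy model the profile bounds are available in closed form, which makes the sketch easy to check against the nested root-finding version.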

3. Constructing Confidence Regions
Joint inference about a set of k_f ≥ 1 focal parameters θ_f is of interest when focus is placed on the simultaneous effect of multiple parameters. In this case, point-wise approaches should be avoided due to the problem of multiple comparisons (see Benjamini, 2010, for a commentary). As an alternative, multiple adjusted CIs may be constructed, but CRs about θ_f are more precise and informative about the combinations of values the parameters could take (see our first empirical example). Simultaneous CIs prescribe a plausible cubic region in the parameter space for the vector of focal parameters θ_f, whereas a CR gives a plausible region that is more flexible in shape. If the desired conclusion is a statement concerning individual effects carried by multiple focal parameters, controlling for the presence of other important parameters, then multiple adjusted CIs are a reasonable and convenient choice for inference. In contrast, if specific combinations of parameter estimates are the focus of research, then a joint CR is recommended; here, the plausible combinations that the multiple parameters may take on are of importance. The following developments for CRs are general for k_f parameters, and CRs for k_f = 2 are emphasized for practical purposes, as CRs for k_f > 2 are challenging to visualize.

3.1. Wald-Type Confidence Regions
For a k_f vector of focal parameters θ_f, a Wald-type CR is generated by inverting a joint Wald test. The (1 − α)100% Wald-type CR is defined as the set of θ_{0f} in the ellipsoid that satisfies

\[ \big(\hat{\theta}_f - \theta_{0f}\big)^{T}\, \mathrm{VAR}\big[\hat{\theta}_f\big]^{-1} \big(\hat{\theta}_f - \theta_{0f}\big) \le \chi^2_{k_f,\,1-\alpha}, \tag{13} \]

where VAR[θ̂_f] is the estimated asymptotic covariance matrix of θ̂, which is the block that corresponds to θ_f in the inverse of the information matrix I(θ̂). For non-normal data or non-ML discrepancy functions, it is a block of the sandwich-type asymptotic covariance matrix. Consider


the case of k_f = 2. The geometrical construction of a Wald-type CR for k_f = 2 involves drawing a unit circle in the two-dimensional parameter space, scaling the circle by the critical value χ²_{2,1−α} (c.f. Scheffé, 1953), and applying the estimated asymptotic covariance matrix VAR[θ̂_f] to the scaled circle to shape it into an ellipse that reflects the scale of the parameter estimates and takes into account the correlation between them. Wald-type CRs about a vector of functions of parameters g_f(θ) are similarly generated by substituting g_f(θ) in place of θ_f. As with Wald-type CIs, Wald-type CRs about functions of parameters suffer from transformation non-invariance.

3.2. Profile Likelihood-Based Confidence Regions
Profile likelihood-based CRs for k_f focal parameters are a straightforward extension of the k_f = 1 parameter case. Here, the joint LRT to be inverted is

\[ G^2 = (N - 1)\left[ F\big(\Sigma(\theta_{0f}, \theta_n), S\big) - \hat{F} \right], \tag{14} \]

which is a generalization of Equation 7. Boundary points of the profile likelihood-based CR are identified as sets of values of θ_{0f}, or θ_B, when the equality in Equation 5 is achieved. When k_f = 2, the likelihood-based CR takes the form of a two-dimensional region, with the ML estimate located within the center of the CR. Test-statistic-based CRs for non-normal data or non-ML discrepancy functions can be similarly defined with the critical value replaced by one from an appropriately scaled χ² distribution.

3.2.1. Computing Profile Likelihood-Based Confidence Regions
The generalization of computing profile likelihood-based CIs to CRs, using the root finding algorithm (Brent, 1973), involves combining the inversion of the joint LRT in Equation 14 with a generalization of Equation 9, where the scalars θ_{Bt}, θ̂_f and d_t are replaced by their vector-valued counterparts. Recall that with k_f = 1, d_t covers both directions of a line with T = 2 scalar-valued direction vectors. With k_f = 2, T vector-valued direction vectors d_t are selected to cover different directions in the two-dimensional plane (see Figure 1, where T = 101 direction vectors were specified). With k_f = 3, direction vectors are chosen on a sphere in the three-dimensional parameter space. It follows that for k_f parameters, d_t are chosen on a (k_f − 1)-dimensional sphere in the k_f-dimensional space. The computational effort required to produce CRs increases geometrically with the dimensionality of θ_f. Note that CIs and CRs involving functions of parameters g_f(θ) could be constructed using this algorithm without model re-specification, as described in the section on profile likelihood-based CIs. Ritter and Bates (1993) describe a similar profiling method for a set of parameters.

3.2.2. Computationally Efficient Profile Likelihood-Based Confidence Regions
The computationally efficient algorithm for profile likelihood-based CIs can be generalized to obtain CRs about two focal parameters (or focal functions of parameters). First, the profile likelihood-based CI of the first focal parameter, θ̂_L and θ̂_U, is obtained using the critical value χ²_{2,1−α}. Next, T points between θ̂_L and θ̂_U of the first parameter are chosen. Finally, with the first focal parameter fixed to each of the T values, the second focal parameter is minimized or maximized subject to the constraint in Equation 12 to yield the boundary points of the CR. This procedure can be further generalized to multiple focal parameters (or functions of focal parameters). Compared to the other profile likelihood-based algorithms described, the modified Wu and Neale (2012) approach has several important advantages. First, the method generates unbiased
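A minimal sketch of this grid-based procedure, using a hypothetical two-parameter quadratic −2 log-likelihood-ratio surface Q(θ) = (θ − m)ᵀV⁻¹(θ − m), so that the exact 95% CR is an ellipse and the computed boundary can be checked; the values of m and V are made up for illustration, and scipy's SLSQP stands in for the constrained optimizer:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import chi2

# Hypothetical estimates and covariance matrix for two focal parameters.
m = np.array([0.5, 1.0])
V = np.array([[0.040, 0.018],
              [0.018, 0.025]])
Vinv = np.linalg.inv(V)
crit = chi2.ppf(0.95, df=2)            # joint critical value for k_f = 2

def q(theta):
    """-2 log LR; the 95% CR is the set {theta : q(theta) <= crit}."""
    d = theta - m
    return d @ Vinv @ d

# Step 1: bounds of the first parameter at the joint critical value.
half1 = np.sqrt(crit * V[0, 0])
t1_lo, t1_hi = m[0] - half1, m[0] + half1

# Step 2: T grid points strictly between the bounds.
T = 20
grid = np.linspace(t1_lo, t1_hi, T + 2)[1:-1]

# Step 3: with the first parameter fixed, min/max the second parameter
# subject to q(theta) = crit (the analogue of Equation 12).
boundary = []
for t1 in grid:
    for sign in (+1.0, -1.0):
        con = {"type": "eq",
               "fun": lambda t2, t1=t1: q(np.array([t1, t2[0]])) - crit}
        start = [m[1] - sign * np.sqrt(crit * V[1, 1])]
        res = minimize(lambda t2: sign * t2[0], start,
                       method="SLSQP", constraints=[con])
        boundary.append((t1, res.x[0]))

# Every computed boundary point should lie on the ellipse q(theta) = crit.
errs = [abs(q(np.array(p)) - crit) for p in boundary]
print("points:", len(boundary), "max constraint violation:", max(errs))
```

In a real SEM application the quadratic Q would be replaced by the (N − 1)-scaled discrepancy function, with all nuisance parameters included in the optimization.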


Figure 1. Simultaneous confidence regions (df = 3) and point-wise confidence intervals (df = 1) for the direct and indirect effects of perceived discrimination on psychological well-being. CR = confidence region.

profile likelihood-based CRs (c.f. Neale & Miller, 1997; MacCallum et al., 2009, 2012). Second, it is relatively computationally efficient as it does not involve nested iterations (c.f. the computational method involving the root finding algorithm of Brent, 1973). Third, the method can obtain CRs about functions of parameters without re-specifying the model (c.f. Venzon & Moolgavkar, 1988). Finally, and most importantly, this algorithm seamlessly fits into the framework of OpenMx (Boker et al., 2011; Neale et al., 2015). We recommend the use of this algorithm for constructing profile likelihood-based CIs and CRs in SEM for certain contexts, as illustrated below, and provide example code in the Online Appendix.

4. Empirical Examples
4.1. Latent Variable Mediation
Our first illustration presents Wald-type and profile likelihood-based CIs and CRs for a published example in social psychology (Schmitt et al., 2002). The authors fit a latent variable mediation model to data obtained from a sample of N = 220 women, where the effect of perceived discrimination (PD) against one's in-group on psychological well-being (PWB) was hypothesized to be mediated by in-group identification (IGI). Based on the Rejection-Identification model (Branscombe, Schmitt, & Harvey, 1999), it was expected that PD would exert a direct negative effect on PWB and an indirect positive effect on PWB mediated by IGI. Specifically, PD would increase IGI, and IGI would serve as a coping strategy that improves PWB. The authors were therefore interested in three parameters simultaneously: the first two make up the indirect effect of PD on PWB, mediated by IGI, and the third is the direct effect of PD on PWB. Perceived discrimination (η₁) was indicated by four MVs—in-group disadvantage (y₁), out-group privilege (y₂), prejudice across contexts (y₃), and past experience with discrimination (y₄). In-group identification (η₂) was indicated by four MVs quantifying the extent of emotional attachment to one's group—liking one's group (y₅), valuing one's group (y₆), having pride in one's group (y₇), and having positive experiences due to being a member of one's group (y₈). Psychological well-being (η₃) was indicated by five MVs—life satisfaction (y₉), self-esteem (y₁₀), positive affect (y₁₁), anxiety (y₁₂), and depression (y₁₃). To identify the model, all LVs had


their variances fixed to 1.0, resulting in standardized estimates. The chi-square goodness-of-fit is 126.74 with df = 62. The (standardized) factor loading matrix and the regression coefficient matrix for the LVs are

\[
\Lambda = \begin{pmatrix}
.71 & 0 & 0 \\
.64 & 0 & 0 \\
.38 & 0 & 0 \\
.63 & 0 & 0 \\
0 & .90 & 0 \\
0 & .80 & 0 \\
0 & .81 & 0 \\
0 & .76 & 0 \\
0 & 0 & .78 \\
0 & 0 & .88 \\
0 & 0 & .61 \\
0 & 0 & -.61 \\
0 & 0 & -.71
\end{pmatrix}
\quad \text{and} \quad
B = \begin{pmatrix}
0 & 0 & 0 \\
.17 & 0 & 0 \\
-.18 & .19 & 0
\end{pmatrix}.
\]

The (standardized) residual variances are the complements of the squared loadings and are not presented here. Figure 1 presents the ML estimates of the focal parameters with Wald-type and profile likelihood-based CIs and CRs generated using the root finding algorithm of Brent (1973) with T = 101 direction vectors.³ The left panel of Figure 1 depicts results for the mediation or indirect effect, made up of the effect of PD on IGI and the effect of IGI on PWB. The indirect effect of PD on PWB is often represented by a single parameter, the product of the effect of PD on IGI and the effect of IGI on PWB. The right panel of Figure 1 plots this one-parameter indirect effect against the direct effect, the two components of the total effect of PD on PWB. Geometrically, the indirect effect in the right panel of Figure 1 is a projection of the two effects shown in the left panel. Within each panel of Figure 1, the solid dots represent the ML estimates of the effects; the point-wise CIs are constructed with df = 1 and the CRs are constructed with df = 3 for joint inference on all three paths. With positive point estimates for the effect of PD on IGI and the effect of IGI on PWB, as well as a negative estimate for the effect of PD on PWB, these results descriptively support the directional pathways hypothesized by the Rejection-Identification model. In addition, the mediation or indirect effect has an ML estimate of .17 × .19 = .033.

4.1.1. Point-Wise Confidence Intervals
Given that the direction of results supports their hypotheses, Schmitt et al. (2002) concluded that the Rejection-Identification model was confirmed because point-wise Wald-type tests for each of the three effects (PD on IGI, IGI on PWB, and PD on PWB), separate of one another, were significant at α = .05. In Figure 1, point-wise Wald-type CIs are represented by vertical and horizontal lines without symbols; these CIs are equivalent to multiple non-simultaneous Wald tests. It may be observed that point-wise Wald tests for the effect of PD on IGI and the effect of IGI on PWB are each significant (p < .05), as the 95% Wald-type CIs in the left panel of Figure 1 do not include the null value of zero, θ_{0f} = 0. Similarly, the point-wise Wald test for the direct effect of PD on PWB is significant (p < .05) because its commensurate 95% Wald-type CI in the right panel of Figure 1 does not contain θ_{0f} = 0. The indirect effect of PD on PWB could also be analytically evaluated for significance using the Sobel test (Sobel, 1982), which is a Wald test based on the delta method, and would be non-significant (p = .1064). This result mirrors the point-wise Wald-type CI for the indirect effect of PD on PWB in the right panel of Figure 1, which was constructed by re-parameterizing the model. Point-wise profile likelihood-based CIs are also presented in Figure 1 as vertical and horizontal lines bounded by circles. They suggest point-wise significance of all three paths and the indirect effect (p < .05). Although LRTs could alternatively and succinctly be employed to test these effects, CIs communicate estimate precision or sampling variability, and provide the range of plausible values for these estimates. Comparing the Wald-type CIs to the profile likelihood-based CIs, it can be observed that the two almost coincide for the three path coefficients but deviate substantially from each other for the indirect effect. This is because the estimates of the three path coefficients are close to normally distributed; in contrast, the indirect effect is the product of two approximately normally distributed estimates and has a distribution far from normal. The profile likelihood-based CI indicates significance of the mediation whereas the Wald-type CI does not. Because the performance of profile likelihood-based CIs does not depend on the asymptotic normality of parameter estimates, the indirect effect should be taken as significant according to the profile likelihood-based CI.

³ OpenMx code to generate profile likelihood-based CIs and CRs for these effects is provided in the accompanying Online Appendix. The OpenMx code uses the computationally efficient algorithm with T = 50 equally spaced points between the upper and lower limits of the first focal parameter to obtain the (conditional) range of the second focal parameter. It generates essentially the same graph.

4.1.2. Confidence Regions for Multiple Parameters
The Rejection-Identification model explicitly states that the k_f = 3 effects of PD on IGI, IGI on PWB, and PD on PWB should be simultaneously significant. We do not evaluate the indirect effect of mediation here because the Rejection-Identification model posits specific relationships for each of the two paths that make up the mediation. Inference in this example therefore concerns the joint effect of the three stated pathways among PD, IGI, and PWB.
The Wald-type CRs in Figure 1 are represented by solid black confidence ellipses; the shape of the Wald-type CRs reflects the quadratic form used to approximate the curvature of the likelihood surface. The T = 101 open circles in Figure 1 form the profile likelihood-based CRs. Two different parametrizations of the model were used to generate Figure 1: the CRs in the left panel are computed from a model where the three paths are three separate parameters in B, and the CRs in the right panel are estimated from a model where the direct effect, the indirect effect, and the path from IGI to PWB are the three parameters. The differences noted between the point-wise CIs are mirrored in the CRs. The CRs in the left panel coincide because the estimates of the two path coefficients have a joint distribution close to a bivariate normal distribution. However, because the indirect effect does not follow a normal distribution (e.g., MacKinnon, Lockwood, & Williams, 2004), the CRs in the right panel differ substantially from each other. The quadratic approximation underlying the Wald approach for evaluating the indirect effect performs poorly, so its use should be avoided in this context. In general, although profile likelihood-based CIs and CRs require more effort to compute, they more appropriately preserve Type I error rates, maintain ideal levels of coverage, and are invariant to parameterization compared to Wald-type CIs and CRs.

4.1.3. Confidence Intervals Versus Confidence Regions
Regardless of inferential approach (Wald or profile likelihood-based), point-wise CIs should be avoided when joint effects of multiple parameters are of interest because these CIs communicate overly optimistic precision. In Figure 1, the widths of all the CIs are much smaller than the widths of their respective CRs, as the CIs are based on df = 1 instead of df = 3. Point-wise CIs are valid when inference is made about a single parameter, separate of the others.
Support for the Rejection-Identification model requires three effects to be jointly significant, and point-wise CIs may be adjusted to allow for multi-parameter inference. For instance, a Bonferroni correction with α = .05/3 may be used to construct adjusted CIs for each of the three effects of interest. These adjusted CIs would be associated with better Type I error rates and coverage compared to point-wise CIs. Multiple adjusted CIs, however, communicate less precise information about the multiple estimates compared to CRs. For instance, the CRs in the


left panel of Figure 1 indicate that the effect of PD on IGI could be negative only when the effect of IGI on PWB is approximately between .05 and .35. Such information would not be readily available from multiple adjusted CIs.

4.1.4. Profile Likelihood-Based Confidence Regions for Inference
We now focus on the use of profile likelihood-based CRs for inference given their noted advantages over adjusted CIs and Wald-type CRs and, importantly, their relevance to the hypothesized combinations among the focal parameters. The 95% profile likelihood-based CR in the right panel of Figure 1 indicates that the three parameters of interest are not jointly significantly different from zero at α = .05 because the CR contains values of zero for various combinations of the three parameters. Such a statement is distinct from failing to reject the null hypothesis H₀: θ_{0f} = (0, 0, 0)^T. The LRT associated with testing the stated null hypothesis will be rejected when the profile likelihood-based CR in the right panel of Figure 1 does not contain the origin. In the left panel of Figure 1, the profile likelihood-based CR does not contain θ_{0f} = (0, 0)^T, implying that the effects of PD on IGI and IGI on PWB are not jointly zero. Rejection of this null hypothesis does not provide information on whether one path of the indirect effect is significant when the other path is non-significant. Confidence regions, however, contain such information, which could enhance the interpretation of results. Beyond the joint significance of the three postulated effects, the researchers had specific hypotheses about the combination of the three effects on which CRs would communicate specific information.

4.1.5. Profile Likelihood-Based Confidence Regions for Interpretation
Much more information about the phenomena of interest may be gleaned from CRs as they communicate the joint precision of the estimates and provide a range of combinations of plausible population parameter values.
In the left panel of Figure 1, the profile likelihood-based CR indicates that the joint effect of IGI on PWB (together with the effect of PD on IGI and PD on PWB) is most plausibly positive, as the area bounded by the open circles is mostly located above the solid horizontal reference line representing a zero effect of IGI on PWB. In contrast, there is less evidence in support of the positive effect of PD on IGI, as the vertical solid zero reference line clearly intersects the CR formed by the open circles. The profile likelihood-based CR in the right panel of Figure 1 suggests that the indirect effect may take on negative values only when the direct effect is negative, whereas the indirect effect will be positive only when the direct effect is positive. These observations in the right panel of Figure 1 may be better understood when they are interpreted in relation to the left panel of Figure 1. Using the two profile likelihood-based CRs in Figure 1 to interpret the relationships between the three variables of PD, IGI, and PWB, several interesting observations may be noted. First, despite statistical non-significance, the data mostly support a positive indirect and a negative direct effect of PD on PWB; the areas bounded by the open circles in Figure 1 mostly fall onto areas of the parameter space that are consistent with the Rejection-Identification model. The profile likelihood-based CRs also point to a plausible negative direct effect of PD on PWB when the indirect effect is also negative. From the left panel of Figure 1, a negative indirect effect occurs most plausibly when the effect of PD on IGI is negative and the effect of IGI on PWB is positive. Although such a negative indirect effect of PD on PWB is inconsistent with the Rejection-Identification model, there exists research in support of this negative indirect effect, where the effect of PD on IGI may be moderated by the importance of group membership (Branscombe, Wann, Noel, & Coleman, 1993).
Here, members of the group perceiving discrimination may instead reduce their identification with the discriminated in-group if group membership is unimportant, supporting the plausible negative effect of PD on IGI. These results suggest the presence of moderated mediation, where importance of group membership is the moderator. Information in CRs could therefore be useful for further refining research regarding how PD, IGI, and PWB relate to one another.


We have shown in this example that, when researchers are interested in the combinations of multiple focal parameters, CRs for more than one parameter of interest provide three types of useful information. First, CRs indicate whether a joint test (Wald or LRT) of multiple parameters is statistically significant. Second, CRs provide the range of combinations of plausible parameter values and convey the joint precision or sampling variability associated with the focal parameter point estimates. Third, it is from CRs, and not adjusted CIs, that researchers may obtain precise information about plausible combinations of parameter estimates associated with their phenomena of interest. Such additional information affords CRs the flexibility of testing unconventional null hypotheses such as mediation (c.f., Biesanz, Falk, & Savalei, 2010).

4.2. Latent Curve Model
The Wald and profile likelihood-based approaches to constructing CRs are more highly differentiated for (a) highly nonlinear functions of parameters (e.g., ratios or products such as the indirect effect) and (b) parameters with a natural boundary on the parameter space (e.g., variances). The sampling distributions of these two types of parameter estimates typically cannot be closely approximated by a normal distribution, resulting in less accurate Wald-type CIs and CRs relative to their profile likelihood-based counterparts. The next empirical example illustrates the extent to which Wald-type and profile likelihood-based CIs and CRs differ under these two conditions. The second example is an LCM that was fit to scores measuring children's Delinquent Peer Association derived from an aggregation of parent and teacher responses (Stoolmiller, 1994), as described in Hancock and Choi (2006, p. 370). Complete data for N = 198 children were available for four measurement occasions collected during the fourth (y₁), sixth (y₂), seventh (y₃), and eighth (y₄) grades.
A standard LCM was fit to the data with a latent intercept (α) located at the first time point, a latent slope (β), and unequal residual variances. The chi-square goodness-of-fit is 0.250 with df = 5. Save for the factor loading matrix Λ, which is fixed, the model matrices below contain parameter estimates:

\[
\Lambda = \begin{pmatrix} 1 & 0 \\ 1 & 2 \\ 1 & 3 \\ 1 & 4 \end{pmatrix}, \quad
\tau = \begin{pmatrix} 3.29 \\ 0.23 \end{pmatrix}, \quad
\Psi = \begin{pmatrix} 5.69 & 0.11 \\ 0.11 & 0.30 \end{pmatrix}, \quad
\Theta = \mathrm{diag}[5.25, 5.67, 4.75, 4.53].
\]
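With these estimates, the model-implied moments of the four repeated measures follow the standard LCM structure, mean Λτ and covariance ΛΨΛᵀ + Θ; a quick numerical sketch:

```python
import numpy as np

# Fixed loadings and reported estimates for the LCM.
Lam = np.array([[1.0, 0.0],
                [1.0, 2.0],
                [1.0, 3.0],
                [1.0, 4.0]])
tau = np.array([3.29, 0.23])                  # latent intercept and slope means
Psi = np.array([[5.69, 0.11],
                [0.11, 0.30]])                # latent covariance matrix
Theta = np.diag([5.25, 5.67, 4.75, 4.53])     # residual variances

mu_implied = Lam @ tau                        # implied means of y1..y4
Sigma_implied = Lam @ Psi @ Lam.T + Theta     # implied covariance matrix

print("implied means:", mu_implied)
print("implied variances:", np.diag(Sigma_implied))
```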

The intercept mean is denoted by τ_α, the slope mean by τ_β, the intercept variance by σ²_α, the slope variance by σ²_β, and the covariance between intercept and slope by σ_αβ. Of interest are parameters about the aperture (Hancock & Choi, 2006), which is the point in time where individuals are estimated to be the most similar in terms of their Delinquent Peer Association scores; such homogeneity could enhance the efficacy of an intervention to improve Delinquent Peer Association scores.

4.2.1. Point-Wise Confidence Intervals
The location of the aperture is the point in time, implied by the model, where the variance of children's Delinquent Peer Association scores is smallest. The estimate of the relative aperture location (RAL; Hancock & Choi, 2006) for this example is

\[ \mathrm{RAL} = \frac{-\hat{\sigma}_{\alpha\beta}}{(\lambda_p - \lambda_1)\,\hat{\sigma}^2_{\beta}} = -0.09, \]

where λ_p = 4 and λ_1 = 0 denote the last and first factor loading values of the slope factor, respectively. An RAL value of 1 indicates that the aperture is located at the last measurement occasion. In this example, children had the most similar Delinquent Peer Association scores about 0.09 × 48 ≈ 4 months before the first measurement occasion of grade four in Fall 1998. Additionally, the estimated mean level of Delinquent Peer Association scores at the aperture is

\[ \hat{\tau}_{\alpha^*} = \hat{\tau}_{\alpha} - \frac{\hat{\sigma}_{\alpha\beta}}{\hat{\sigma}^2_{\beta}}\,\hat{\tau}_{\beta} = 3.20. \]
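The reported values can be recovered from the printed estimates (up to rounding of the published figures); the aperture location minimizes the model-implied variance function Var(t) = σ²_α + 2tσ_αβ + t²σ²_β:

```python
# Reported LCM estimates (see the model matrices above).
tau_a, tau_b = 3.29, 0.23              # intercept and slope means
s2_a, s2_b, s_ab = 5.69, 0.30, 0.11    # intercept/slope variances, covariance
lam_1, lam_p = 0.0, 4.0                # first and last slope loadings

# Aperture in growth units: minimizer of Var(t) = s2_a + 2*t*s_ab + t**2 * s2_b.
t_star = -s_ab / s2_b

# Relative aperture location and mean Delinquent Peer Association at the aperture.
ral = t_star / (lam_p - lam_1)         # equals -s_ab / ((lam_p - lam_1) * s2_b)
tau_star = tau_a + t_star * tau_b      # equals tau_a - (s_ab / s2_b) * tau_b

print(round(ral, 2), round(tau_star, 3))
```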


Confidence intervals were computed to quantify the sampling variability of the estimates of RAL and τα∗. The 95 % delta method Wald-type and profile likelihood-based CIs for RAL are (−0.72, 0.54) and (−2.20, 0.32), respectively. The 95 % delta method Wald-type and profile likelihood-based CIs for τα∗ are (2.46, 3.94) and (3.10, 3.30), respectively. Although a sample size of N = 198 is reasonably large, the Wald-type and profile likelihood-based CIs are highly dissimilar. For RAL, the Wald-type CI is much more liberal than the profile likelihood-based CI; the Wald-type CI is almost half the length of the profile likelihood-based CI. Conversely, the Wald-type CI for τα∗ is much more conservative than the profile likelihood-based CI; the Wald-type CI is about 7 times wider than the profile likelihood-based CI. Profile likelihood-based CIs are preferred over Wald-type CIs because the former have more accurate error or coverage rates (Neale & Miller, 1997). The CI for RAL implies, with 95 % confidence, that the true location of the aperture ranges from about 8.8 years (2.2 × 4 years) before the first measurement occasion to about 1.3 years after the first measurement occasion in the fourth grade. Additionally, the true value of Delinquent Peer Association scores at the aperture is estimated with much precision; this value is expected to range between 3.10 and 3.30 points with 95 % confidence.

4.2.2. Confidence Regions for Multiple Parameters

Two parameters about the aperture, the relative aperture variance (RAV) and the relative gradient (RG), merit joint consideration as interest is in the plausible combinations between them (see Figure 1 in Hancock & Choi, 2006). The RAV = σα∗²/(σα∗² + σβ²), with σα∗² denoting the variance at the aperture, is a scale-free index of the degree of dilation at the aperture. RAV is bounded between 0 and 1 and compares the variance at the aperture to the variance one growth unit away from either side. When RAV = 0, there is a strong degree of convergence among the individual trajectories at the aperture, which could be pictured as a bowtie. Conversely, when RAV is close to 1, the aperture is extremely wide; in the degenerate case of RAV = 1, the growth trajectories are parallel. In this example, the estimated RAV is .95, indicating that the aperture is relatively wide, characterized by slow convergence to and divergence from the aperture. The RG quantifies the general inclination or tilt of trajectories at the aperture and is defined as RG = τβ/σβ. The scale of RG is in standard units, and it is the inverse of the coefficient of variation. Large values of RG indicate that individuals have a generally similar trajectory orientation, and small values imply the converse. In this example, the estimated RG is 0.42, implying that children's trajectories of Delinquent Peer Association scores are rather varied in the direction and magnitude of their slopes. Taken together, these estimates of RAV and RG characterize growth trajectories of Delinquent Peer Association scores that are mostly increasing over time and have slow convergence to and divergence from the aperture. Figure 2 presents several Wald-type and profile likelihood-based CRs for RAV and RG in the left and right panels, respectively. To illustrate the shape of the profile likelihood surface for RAV and RG, and its quadratic approximation, seven levels of coverage for the two types of CRs are presented as a contour map. The poor approximation of the Wald-type to the profile likelihood-based CR is more evident in a dynamic graphic in the Online Appendix. The highest point in each contour map, which is the ML estimate of RAV and RG, is depicted as a bold black dot in each panel of Figure 2.
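As a concrete illustration of these definitions, RAV and RG are simple functions of the latent moments. The following is a sketch, not the authors' code; the one-growth-unit slope variance of 0.30 below is back-solved to match the reported RAV and is therefore hypothetical, while 5.68 and 0.42 are estimates reported later in this section.

```python
def rav(var_aperture, var_one_unit_away):
    """Relative aperture variance: variance at the aperture relative to the
    variance contributed by the slope one growth unit away (bounded in [0, 1])."""
    return var_aperture / (var_aperture + var_one_unit_away)

def rg(mean_slope, sd_slope):
    """Relative gradient: mean slope in standard units
    (the inverse of the coefficient of variation)."""
    return mean_slope / sd_slope

# Moments chosen to reproduce the reported estimates (0.30 is hypothetical):
print(round(rav(5.68, 0.30), 2))  # close to the reported RAV of .95
print(round(rg(0.42, 1.0), 2))    # equals the reported RG of 0.42 when the slope SD is 1
```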
Wald-type CRs are represented by the ellipses in the left panel, and analogous profile likelihood-based CRs are represented by the same line types in the right panel. Both Wald-type and profile likelihood-based CRs in Figure 2 were obtained with a parametrization in which the loadings of the slope factor are a, (3a + b)/4, (a + b)/2, (a + 3b)/4, and b (with a and b freely estimated), but the correlation between the factors is set to 0 and the variance of the slope factor is set to 1. All remaining parameters of the model follow the standard LCM specification. Under this parametrization, the intercept is the score measured at the aperture (i.e., α = α∗), the mean of the slope is RG, and RAV = σα²/(σα² + (b − a)²/16).
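Under this parametrization the slope loadings are equally spaced between a and b, so one growth unit corresponds to the spacing (b − a)/4, which is where the (b − a)²/16 term comes from. A small Python check of this algebra (the values of a and b are hypothetical, chosen only for illustration):

```python
def slope_loadings(a, b):
    # five equally spaced loadings: a, (3a+b)/4, (a+b)/2, (a+3b)/4, b
    return [a + i * (b - a) / 4 for i in range(5)]

def rav_reparam(var_intercept, a, b):
    # with the slope variance fixed to 1, the variance one growth unit away
    # from the aperture is ((b - a)/4)**2 = (b - a)**2 / 16
    return var_intercept / (var_intercept + (b - a) ** 2 / 16)

print(slope_loadings(0.0, 4.0))    # [0.0, 1.0, 2.0, 3.0, 4.0]
print(rav_reparam(1.0, 0.0, 4.0))  # 0.5
```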


Figure 2. Simultaneous confidence regions for the relative aperture variance (RAV) and relative gradient (RG). CR = confidence region; the bold black dot is the ML estimate of RAV and RG. A dynamic version of this figure is available in the Online Appendix.

From Figure 2, the Wald-type and profile likelihood-based CRs depart strikingly from each other at increasing levels of coverage (see Online Appendix for a dynamic graphic). It is clear from this example that the Wald-type CR is erroneous because it extends a quadratic approximation, valid only locally near the MLE, to a much wider range. In general, both types of CRs imply that RG is positive at all plausible combinations of RAV. The Wald-type CR suggests a small range of plausible values for RAV, whereas the profile likelihood-based CRs suggest a wide range. As RAV is bounded between 0 and 1, the 95 % profile likelihood-based CR (depicted by the bold line) implies much imprecision in the estimation of RAV. The Wald-type CRs erroneously suggest a positive relation between plausible values of RAV and RG, whereas the profile likelihood-based CRs show little correlation between plausible values of RAV and RG. Information about the sampling variability of RG from the 95 % profile likelihood CR implies that trajectories of Delinquent Peer Association from grades four to nine are variable and tend to be positive. Importantly, contrary to the Wald-type CR, the 95 % profile likelihood CR indicates that the data provide little information about RAV. The wide range of plausible values of RAV depicted by the 95 % profile likelihood-based CR in Figure 2 is due to the flatness of the profile likelihood function of the intercept variance σ̂α². Recall that RAV = σα²/(σα² + (b − a)²/16), and an intercept variance of 0 results in an RAV of 0. Figure 3 presents a function of the profile likelihood for the intercept variance in terms of the minimum discrepancy function (Equation 1) for the second model specification where σαβ is fixed to 0 and σβ² is fixed to 1. The solid line represents the profile likelihood, and the dot-dashed quadratic function represents the approximation afforded by the Wald approach. The dashed vertical reference line at the minimum value of the discrepancy function marks the ML estimate σ̂α² = 5.68. The solid horizontal line depicts the perturbed value of the discrepancy function for α = .05 (see Equations 11 and 12). The points where this horizontal line intersects the discrepancy function yield the 95 % profile likelihood-based CI of (0, 8.26), as shown by the dashed vertical reference lines located on each side of the ML estimate. Similarly, the 95 % Wald-type CI of (2.68, 8.67) is obtained from the intersection of the horizontal reference line with the quadratic function. Clearly, the Wald-based quadratic approximation of the likelihood function performs poorly in this example.

Figure 3. Profile likelihood of the intercept variance: the minimum discrepancy function F̂ML plotted against σ̂α² (solid line: profile likelihood; dot-dashed line: Wald approximation).
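The construction illustrated in Figure 3, finding the points where the discrepancy (equivalently, the likelihood-ratio statistic) crosses a χ² threshold, can be sketched generically. The example below inverts the profile likelihood-ratio statistic for the variance of an i.i.d. normal sample, with the mean profiled out analytically; it is a minimal stand-in for the SEM discrepancy function, not the paper's algorithm.

```python
import math

def profile_deviance(var, var_hat, n):
    # 2 * (max log-likelihood - profile log-likelihood at `var`) for the
    # variance of an i.i.d. normal sample; the mean's MLE is the sample
    # mean for every value of `var`, so it profiles out analytically
    return n * (math.log(var / var_hat) + var_hat / var - 1.0)

def bisect(f, lo, hi, tol=1e-10):
    # simple bisection for a sign change of f on [lo, hi]
    flo = f(lo)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(mid) * flo > 0:
            lo, flo = mid, f(mid)
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

def profile_ci_variance(var_hat, n, crit=3.841):
    # 95% CI: the two points where the deviance equals the chi-square(1) critical value
    g = lambda v: profile_deviance(v, var_hat, n) - crit
    lower = bisect(g, var_hat * 1e-6, var_hat)
    upper = bisect(g, var_hat, var_hat * 1e3)
    return lower, upper

lo, hi = profile_ci_variance(var_hat=1.0, n=50)
# the interval is asymmetric around the MLE, unlike a Wald interval
```

The asymmetry of the resulting interval (a longer right tail than left) is exactly the feature the quadratic Wald approximation cannot capture.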

The LCM example provides additional evidence in support of the main advantage of likelihood-based CIs and CRs over Wald-type CIs and CRs: likelihood-based procedures do not rely on the quadratic approximation of the likelihood surface, which is inadequate when nonlinearly transformed or bounded parameters are involved. It is recommended that profile likelihood-based CIs and CRs be constructed for bounded parameters or parameters that undergo a nonlinear transformation. Although profile likelihood-based CIs and CRs require more effort to compute, their construction in these contexts is worthwhile because their more easily available Wald-type counterparts tend to be unreasonable approximations.
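To see why Wald intervals misbehave for a bounded parameter, consider the variance of a normal sample (a hedged sketch, not taken from the paper): the Wald interval is forced to be symmetric around the MLE and can extend below zero, a region where a variance cannot lie.

```python
import math

def wald_ci_variance(var_hat, n, z=1.96):
    # the asymptotic variance of the ML variance estimate is about 2*var_hat**2/n,
    # so the Wald interval is symmetric about var_hat and can cross the zero boundary
    se = var_hat * math.sqrt(2.0 / n)
    return var_hat - z * se, var_hat + z * se

lo, hi = wald_ci_variance(var_hat=1.0, n=5)
print(lo < 0)  # True: the lower limit is negative, an impossible variance value
```

A profile likelihood-based interval for the same parameter respects the boundary at zero by construction, since the likelihood is only evaluated at admissible values.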

5. Summary and Conclusions

We have presented a review of Wald-type and likelihood-based approaches in linear SEM, highlighting the computation and interpretation of CIs and CRs over NHSTs as the former provide additional information about the sampling variability of parameter estimates. Profile likelihood-based CIs and CRs are computationally more difficult to obtain than their Wald-type counterparts, and the review describes existing algorithms devised for computing profile likelihood-based CIs. More recent algorithms for computing profile likelihood-based CRs for multiple parameters and functions of parameters in SEM are also detailed. We note that research on profile likelihood-based CIs for non-normal data or for non-ML discrepancy functions remains wanting, and the limited use of and research about profile likelihood-based CIs in SEM reflects the reality that many of these algorithms are not yet implemented in specialized SEM software. OpenMx is a viable option for constructing profile likelihood-based CIs and CRs for linear SEM, and could be used in future research on profile likelihood-based CIs and CRs. We provide example OpenMx code in our Online Appendix for computing profile likelihood-based CIs and CRs. When research questions are posed about multiple parameters, multiple adjusted CIs are a simple and reasonable approach to inference if the desired conclusion is a statement concerning


individual effects, controlling for the presence of other important parameters. We recommend constructing CRs in place of multiple adjusted CIs, especially when the plausible combinations that multiple parameters could take on are of substantive interest. For k = 2 parameters, CRs depict the covariance between the parameter estimates and communicate estimate precision. More importantly, CRs communicate more precise information about the combinations of plausible parameter values than multiple adjusted CIs do. Such additional information is anticipated to be useful for testing unconventional hypotheses (e.g., mediation), and informative for the interpretation of results as well as the refinement of theoretical developments in psychology. The choice between computing Wald-type versus profile likelihood-based CIs and CRs depends on several practical and theoretical issues, and researchers should weigh the benefits against the costs of each approach in relation to their research context. When parameter estimates are close to normally distributed, Wald-type CIs and CRs perform well and are much more easily computed than profile likelihood-based CIs and CRs; this is especially true when sample size N is relatively large. However, when bounded parameters or nonlinear functions of parameters are involved, profile likelihood-based CIs and CRs are recommended, as their Wald-type counterparts tend to perform poorly in terms of coverage. The Online Appendix includes a dynamic graphic illustrating the extent to which Wald-type and profile likelihood-based CRs can diverge. Sample size N, relative to the number of estimated nuisance parameters k_f, should be large for profile likelihood-based methods to perform well. Profile likelihood-based CIs and CRs are not new developments in SEM, but they are superior and viable alternatives to Wald-type CIs and CRs for inference and quantifying estimate precision about less conventional parameters.

Acknowledgments This work was supported by the Natural Sciences and Engineering Research Council of Canada Discovery Grant awarded to Jolynn Pek. We thank Robert C. MacCallum for helpful comments on an earlier draft. References American Educational Research Association. (2006). Standards for reporting on empirical social science research in AERA publications American Educational Research Association. Educational Researcher, 35, 33–40. doi:10.3102/ 0013189X035006033. American Psychological Association. (2001). Publication manual of the American Psychological Association (6th ed.). Washington, DC: American Psychological Association. Aptech Systems, Inc. (2002). GAUSS user guide [Computer software manual]. Maple Valley, WA: Aptech Systems Inc. Arbuckle, J. L. (2006). Amos 7.0 user’s guide [Computer software manual]. Chicago, IL: SPSS. Baillargeon, S., & Rivest, L.-P. (2007). Rcapture: Loglinear models for capture-recapture in R. Journal of Statistical Software, 19, 1–31. Retrieved from http://www.jstatsoft.org/v19/i05. Baron, R. M., & Kenny, D. A. (1986). The moderator-mediator variable distinction in social psychological research: Conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51, 1173–1182. doi:10.1037/0022-3514.51.6.1173. Basu, D. (1977). On the elimination of nuisance parameters. Journal of the American Statistical Association, 72, 355–366. doi:10.2307/2286800. Bauer, D. J. (2003). Estimating multilevel linear models as structural equation models. Journal of Educational and Behavioral Statistics, 28, 135–167. doi:10.3102/10769986028002135. Benjamini, Y. (2010). Simultaneous and selective inference: Current successes and future challenges. Biometrical Journal, 52, 708–721. doi:10.1002/bimj.200900299. Bentler, P. M. (2006). EQS 6 structural equations program manual [Computer software manual]. Encino, CA: Multivariate Software Inc. Biesanz, J. C., Falk, C. F., & Savalei, V. (2010). 
Assessing mediational models: Testing and interval estimation for indirect effects. Multivariate Behavioral Research, 45, 661–701. doi:10.1080/00273171.2010.498292. Bock, R. D., & Lieberman, M. (1970). Fitting a response model from dichotomously scored items. Psychometrika, 35, 179–197. doi:10.1007/BF02291262. Boker, S., Neale, M., Maes, H., Wilde, M., Spiegel, M., Brick, T., et al. (2011). OpenMx: An open source extended structural equation modeling framework. Psychometrika, 76, 306–317. doi:10.1007/s11336-010-9200-6.

Bollen, K. A. (1989). Structural equation models with latent variables. New York: Wiley. Bollen, K. A., & Stine, R. (1990). Direct and indirect effects: Classical and bootstrap estimates of variability. Sociological Methodology, 20, 115–140. doi:10.2307/271084. Branscombe, N. R., Schmitt, M. T., & Harvey, R. D. (1999). Perceiving pervasive discrimination among African Americans: Implications for group identification and well-being. Journal of Personality and Social Psychology, 77, 135–149. doi:10.1037/0022-3514.77.1.135. Branscombe, N. R., Wann, D. L., Noel, J. G., & Coleman, J. (1993). In-group or out-group extremity: Importance of the threatened social identity. Personality and Social Psychology Bulletin, 19, 381–388. doi:10.1177/0146167293194003. Brent, R. P. (1973). Algorithms for minimization without derivatives. Englewood Cliffs, NJ: Prentice-Hall. Browne, M. W. (1984). Asymptotically distribution-free methods for the analysis of covariance structures. British Journal of Mathematical and Statistical Psychology, 37, 62–83. doi:10.1111/j.2044-8317.1984.tb00789.x. Browne, M. W., & Arminger, G. (1995). Specification and estimation of mean- and covariance-structure models. In G. Arminger, C. C. Clogg, & M. E. Sobel (Eds.), Handbook of statistical modeling for the social and behavioral sciences (pp. 185–249). New York, NY: Springer. Cheung, M. W. L. (2007). Comparison of approaches to constructing confidence intervals for mediating effects using structural equation models. Structural Equation Modeling, 14, 227–246. doi:10.1080/10705510709336745. Cheung, M. W. L. (2009). Constructing approximate confidence intervals for parameters with structural equation models. Structural Equation Modeling, 16, 267–294. doi:10.1080/10705510902751291. Chow, S.-M., Ho, M.-H. R., Hamaker, E. L., & Dolan, C. V. (2010). Equivalence and differences between structural equation modeling and state-space modeling techniques. Structural Equation Modeling, 17, 303–332. doi:10.1080/10705511003661553. Cohen, J. (1994). The earth is round (p < .05). American Psychologist, 49, 997–1003. doi:10.1037/0003-066X.49.12.997. Cook, R. D., & Weisberg, S. (1990). Confidence curves in nonlinear regression. Journal of the American Statistical Association, 85, 544–551. doi:10.1080/01621459.1990.10476233. Cox, D. R., & Hinkley, D. V. (1974). Theoretical statistics. London: Chapman & Hall. Curran, P. J. (2003). Have multilevel models been structural equation models all along? Multivariate Behavioral Research, 38, 529–569. doi:10.1207/s15327906mbr3804_5. Dennis, J. E., Gay, D. M., & Walsh, R. E. (1981). An adaptive nonlinear least-squares algorithm. ACM Transactions on Mathematical Software, 7, 348–368. doi:10.1145/355958.355965. DiCiccio, T. J., & Tibshirani, R. (1991). On the implementation of profile likelihood methods (Tech. Rep. No. 9107). Toronto, ON: University of Toronto, Department of Statistics. du Toit, S. H., & Cudeck, R. (2009). Estimation of the nonlinear random coefficient model when some random effects are separable. Psychometrika, 74, 65–82. doi:10.1007/s11336-008-9107-7. Falk, C. F., & Biesanz, J. C. (2015). Inference and interval estimation for indirect effects with latent variable models. Structural Equation Modeling, 22, 24–38. doi:10.1080/10705511.2014.935266. Fisher, R. A. (1922). On the mathematical foundations of theoretical statistics. Philosophical Transactions of the Royal Society of London, Series A, 222, 309–368. doi:10.1098/rsta.1922.0009. Gonzalez, R., & Griffin, D. (2001). Testing parameters in structural equation modeling: Every “one” matters. Psychological Methods, 6, 258–269. doi:10.1037/1082-989X.6.3.258. Hancock, G. R., & Choi, J. (2006). A vernacular for linear latent growth models. Structural Equation Modeling, 13, 352–377. doi:10.1207/s15328007sem1303_2. Insightful Corporation. (2007). S-PLUS 8 for Windows user’s guide [Computer software manual]. Seattle, WA: Insightful Corporation. Jöreskog, K.
G., & Sörbom, D. (2006). LISREL 8.8 for windows [Computer software manual]. Skokie, IL: Scientific Software International Inc. Kalbfleisch, J. D., & Sprott, D. A. (1970). Application of likelihood methods to models involving large numbers of parameters. Journal of the Royal Statistical Society. Series B, 32, 175–208. Retrieved from http://www.jstor.org/ stable/2984524. Lehmann, E. L., & Romano, J. P. (2006). Testing statistical hypotheses (3rd ed.). New York: Springer. MacCallum, R. C., Browne, M. W., & Lee, T. (2009). Fungible parameter estimates in structural equation modeling. Paper presented at the Society of Multivariate Experimental Research. Salishan Resort, OR. MacCallum, R. C., Lee, T., & Browne, M. W. (2012). Fungible parameter estimates in latent curve models. In M. C. Edwards & R. C. MacCallum (Eds.), Current topics in the theory and application of latent variable models (pp. 183–197). New York, NY: Routledge. MacKinnon, D. P., Lockwood, C. M., & Williams, J. (2004). Confidence limits for the indirect effect: Distribution of the product and resampling methods. Multivariate Behavioral Research, 39, 99–128. doi:10.1207/s15327906mbr3901_ 4. McGraw, K. O., & Wong, S. (1996). Forming inferences about some intraclass correlation coefficients. Psychological Methods, 1, 30–46. doi:10.1037/1082-989X.1.1.30. Meehl, P. E. (1997). The problem is epistemology, not statistics: Replace significance tests by confidence intervals and quantify accuracy of risky numerical predictions. In L. L. Harlow, S. A. Mulaik, & J. H. Steiger (Eds.), What if there were no significance tests (pp. 393–425). Mahwah, NJ: Erlbaum. Meeker, W. Q., & Escobar, L. A. (1995). Teaching about approximate confidence regions based on maximum likelihood estimation. The American Statistician, 49, 48–53. doi:10.1080/00031305.1995.10476112.

Muthén, L. K., & Muthén, B. O. (2012). Mplus user’s guide (7th ed.) [Computer software manual]. Los Angeles, CA: Muthén & Muthén. Neale, M. C., Boker, S. M., Xie, G., & Maes, H. H. (2003). Mx: Statistical modeling (6th ed.) [Computer software manual]. Richmond, VA: Department of Psychiatry. Neale, M. C., Hunter, M. D., Pritikin, J., Zahery, M., Brick, T. R., Kirkpatrick, R. M., et al. (2015). OpenMx 2.0: Extended structural equation and statistical modeling. Psychometrika. doi:10.1007/s11336-014-9435-8. Neale, M. C., & Miller, M. B. (1997). The use of likelihood-based confidence intervals in genetic models. Behavior Genetics, 27, 113–120. doi:10.1023/A:1025681223921. Neyman, J., & Pearson, E. S. (1928). On the use and interpretation of certain test criteria for purposes of statistical inference: Part I. Biometrika, 20A, 175–240. doi:10.1093/biomet/20A.1-2.175. Nickerson, R. S. (2000). Null hypothesis significance testing: A review of an old and continuing controversy. Psychological Methods, 5, 241–301. doi:10.1037/1082-989X.5.2.241. Pawitan, Y. (2001). In all likelihood: Statistical modelling and inference using likelihood. Oxford, UK: Oxford University Press. Preacher, K. J., & Hancock, G. R. (2012). On interpretable reparameterizations of linear and nonlinear latent growth curve models. In J. R. Harring & G. R. Hancock (Eds.), Advances in longitudinal methods in the social and behavioral sciences (pp. 25–58). Charlotte, NC: Information Age Publishing. Preacher, K. J., & Hancock, G. R. (2015). Meaningful aspects of change as novel random coefficients: A general method for reparameterizing longitudinal models. Psychological Methods, 20, 84–101. doi:10.1037/met0000028. Press, W. H., Teukolsky, S. A., Vetterling, W. T., & Flannery, B. P. (1992). Numerical recipes in Fortran 77: The art of scientific computing. Cambridge, UK: Cambridge University Press. R Development Core Team. (2013).
R: A language and environment for statistical computing [Computer software manual]. Vienna, Austria. Retrieved from http://www.R-project.org. Raykov, T., & Marcoulides, G. A. (2004). Using the delta method for approximate interval estimation of parameter functions in SEM. Structural Equation Modeling, 11, 621–637. doi:10.1207/s15328007sem1104_7. Ritter, C., & Bates, D. M. (1993). Profile methods (Tech. Rep. No. 93–31). Louvain-la-Neuve, Belgium: Institut de Statistique, Université Catholique de Louvain. Rosseel, Y. (2012). lavaan: An R package for structural equation modeling. Journal of Statistical Software, 48, 1–36. Retrieved from http://www.jstatsoft.org/v48/i02/paper. Satorra, A., & Bentler, P. (2001). A scaled difference chi-square test statistic for moment structure analysis. Psychometrika, 66(4), 507–514. doi:10.1007/BF02296192. Satterthwaite, F. (1941). Synthesis of variance. Psychometrika, 6(5), 309–316. doi:10.1007/BF02288586. Savalei, V. (2014). Understanding robust corrections in structural equation modeling. Structural Equation Modeling, 21, 149–160. doi:10.1080/10705511.2013.824793. Scheffé, H. (1953). A method for judging all contrasts in the analysis of variance. Biometrika, 40, 87–110. doi:10.1093/biomet/40.1-2.87. Schmitt, M. T., Branscombe, N. R., Kobrynowicz, D., & Owen, S. (2002). Perceiving discrimination against one’s gender group has different implications for well-being in women and men. Personality and Social Psychology Bulletin, 28, 197–210. doi:10.1177/0146167202282006. Schoenberg, R. (1995). CML users guide [Computer software manual]. Maple Valley, WA: Aptech Systems Inc. Schoenberg, R. (1997). Constrained maximum likelihood. Computational Economics, 10, 251–266. doi:10.1023/A:1008669208700. Sobel, M. E. (1982). Asymptotic confidence intervals for indirect effects in structural equation models. Sociological Methodology, 13, 290–312. doi:10.2307/270723. Sprott, D. (1980).
Maximum likelihood in small samples: Estimation in the presence of nuisance parameters. Biometrika, 67, 515–523. doi:10.1093/biomet/67.3.515. Steiger, J. H. (1980). Testing pattern hypotheses on correlation matrices: Alternative statistics and some empirical results. Multivariate Behavioral Research, 15, 335–352. doi:10.1207/s15327906mbr1503_7. Stoolmiller, M. (1994). Antisocial behavior, delinquent peer association, and unsupervised wandering for boys: Growth and change from childhood to early adolescence. Multivariate Behavioral Research, 29, 263–288. doi:10.1207/ s15327906mbr2903_4. Stryhn, H., & Christensen, J. (2003). Confidence intervals by the profile likelihood method, with applications in veterinary epidemiology. In Proceedings of the 10th international symposium on veterinary epidemiology and economics. Vina del Mar, Chile. Venzon, D., & Moolgavkar, S. (1988). A method for computing profile-likelihood-based confidence intervals. Applied Statistics, 37, 87–94. doi:10.2307/2347496. Wald, A. (1943). Tests of statistical hypotheses concerning several parameters when the number of observations is large. Transactions of the American Mathematical Society, 54, 426–482. doi:10.2307/1990256. Waller, N. G. (2008). Fungible weights in multiple regression. Psychometrika, 73, 691–703. doi:10.1007/ s11336-008-9066-z. Wilkinson, L., & Task Force on Statistical Inference. (1999). Statistical methods in psychology journals: Guidelines and explanations. American Psychologist, 54, 594–604. doi:10.1037/0003-066X.54.8.594. Wu, H., & Neale, M. C. (2012). Adjusted confidence intervals for a bounded parameter. Behavior Genetics, 42, 886–898. doi:10.1007/s10519-012-9560-z. Manuscript Received: 12 MAR 2013
