Article

Item Response Theory Analysis of the Life Orientation Test-Revised: Age and Gender Differential Item Functioning Analyses

Assessment 2015, Vol. 22(3) 341–350 © The Author(s) 2014 Reprints and permissions: sagepub.com/journalsPermissions.nav DOI: 10.1177/1073191114544471 asm.sagepub.com

Patrizia Steca1, Dario Monzani1, Andrea Greco1, Francesca Chiesi2, and Caterina Primi2

Abstract

This study is aimed at testing the measurement properties of the Life Orientation Test-Revised (LOT-R) for the assessment of dispositional optimism by employing item response theory (IRT) analyses. The LOT-R was administered to a large sample of 2,862 Italian adults. First, confirmatory factor analyses demonstrated the theoretical conceptualization of the construct measured by the LOT-R as a single bipolar dimension. Subsequently, IRT analyses for polytomous, ordered response category data were applied to investigate the items’ properties. The equivalence of the items across gender and age was assessed by analyzing differential item functioning. Discrimination and severity parameters indicated that all items were able to distinguish people with different levels of optimism and adequately covered the spectrum of the latent trait. Additionally, the LOT-R appears to be gender invariant and, with minor exceptions, age invariant. Results provided evidence that the LOT-R is a reliable and valid measure of dispositional optimism.

Keywords

Life Orientation Test-Revised, optimism assessment, item response theory, differential item functioning, item equivalence

Dispositional optimism is defined as a generalized expectancy of positive future outcomes (Scheier & Carver, 1985) and it plays an important role in the behavioral self-regulation process (Carver & Scheier, 1998). As shown in a broad range of the literature, dispositional optimism is a powerful personal characteristic influencing several aspects of individual psychosocial functioning.
It positively affects health outcomes (Cozzarelli, 1993; Curbow, Somerfield, Baker, Wingard, & Legro, 1993; Fitzgerald, Tennen, Affleck, & Pransky, 1993; Litt, Tennen, Affleck, & Klock, 1992) and health-related behaviors (Jones, DeMore, Cohen, O’Connell, & Jones, 2008; Yarcheski, Mahon, & Yarcheski, 2004), as well as the adoption of adaptive coping responses in facing life stressors (Fournier, de Ridder, & Bensing, 2002; Friedman et al., 1992; Grove & Heard, 1997; Jerusalem, 1993; Nes & Segerstrom, 2006; Scheier, Weintraub, & Carver, 1986; Steed, 2002). Optimism also promotes higher psychological (Creed, Patton, & Bartrum, 2002; Lai, 2009) and subjective well-being (Vacek, Coyle, & Vera, 2010), stronger self-esteem (Creed et al., 2002; Jackson, Pratt, Hunsberger, & Pancer, 2005; Weber, Puskar, & Ren, 2010), and more perceived social support (Chong, Huan, Yeo, & Ang, 2006; Weber et al., 2010). At the same time, dispositional optimism is inversely related to negative outcomes, such as levels of academic stress (Chang & Sanna, 2003; Huan, Yeo, Ang, & Chong, 2006), hopelessness and suicidal ideation (Ayub, 2009), depressive symptoms (Chang & Sanna, 2003; Jackson et al., 2005; Murberg, 2012; Puskar, Sereika, Lamb, Tisai-Mumford, & McGuinness, 1999; Weber et al., 2010; Wong & Lim, 2009), perceived loneliness (Rius-Ottenheim et al., 2012), state and trait anger (Puskar et al., 1999), and somatic complaints (Murberg, 2012). The powerful role of this individual difference in fostering personal psychosocial adjustment has also been highlighted by clinicians and practitioners who have put great effort into devising effective strategies and intervention programs. These programs are meant to enhance the positive expectations about the future that are at the core of the dispositional optimism construct (e.g., Meevissen, Peters, & Alberts, 2011). In light of these findings, it is critical to reliably and validly assess individual differences in dispositional optimism.

1 University of Milan–Bicocca, Milan, Italy
2 University of Florence, Florence, Italy

Corresponding Author: Dario Monzani, Department of Psychology, University of Milan–Bicocca, Piazza dell’Ateneo Nuovo, 1-20126 Milano, Italy. Email: [email protected]

Downloaded from asm.sagepub.com at CAMBRIDGE UNIV LIBRARY on August 9, 2015


Scheier and Carver (1985) first developed the Life Orientation Test (LOT) to assess generalized expectancies for future positive events. This was later replaced by the Life Orientation Test-Revised (LOT-R; Scheier, Carver, & Bridges, 1994), a brief and easy-to-use self-report measure, consisting of 10 items, which is currently the most widely used instrument to assess dispositional optimism in different ages. Over the past few years, the psychometric properties of the LOT-R have been investigated according to classical test theory (CTT). CTT is a traditional and widely used approach in evaluating psychological assessment instruments. According to this approach, the dimensionality of the LOT-R has been investigated in order to confirm the theoretical conceptualization of optimism as a single bipolar dimension, according to which people with low scores on this scale are considered pessimists and people with high scores are considered optimists (Monzani, Steca, & Greco, 2014; Rauch, Schweizer, & Moosbrugger, 2007; Vautier, Raufaste, & Cariou, 2003). In this study, we investigated the psychometric properties of the LOT-R by employing item response theory (IRT) in which characteristics of items on a test (i.e., item parameters) and the characteristics of individuals (i.e., latent traits) are related to the probability of a positive response (i.e., a trait-consistent endorsement of an item). IRT is a parametric statistical modeling procedure that involves fitting a hypothetical model to sample data. As such, the importance of IRT is considered to be the development of models where the characteristics of examinees and tests can be separated (Thomas, 2011). It has been argued that IRT models allow for estimating item-invariant and person-invariant parameters. This means that the parameters that characterize the examinees are independent from the specific items that are administered. 
In turn, the parameters that characterize the items are independent of the ability distribution in the normative sample (Hambleton, Swaminathan, & Rogers, 1991). Concerning the LOT-R, only one study has previously assessed its psychometric properties by applying an IRT analysis (Chiesi, Galli, Primi, Innocenti Borgi, & Bonacchi, 2013). This study showed that the global scale as well as each item on it performed well in measuring the construct. Specifically, the item parameters showed that all items were able to distinguish adequately among people with different levels of the trait being measured and adequately covered the spectrum of the latent trait. Additionally, good information values were obtained for a large range of the trait. These values attested to the precision of the scale along almost the entire pessimism–optimism continuum.

Our study aimed at adding new information about a crucial measurement issue, which is more adequately addressed by IRT than by CTT. We refer to the issue of whether group (e.g., gender) differences in mean levels of a construct reflect true differences among the groups or instead arise from a lack of measurement equivalence (i.e., items combined in aggregated scores have different measurement properties in different groups). Psychometric measurement equivalence of a test is a basic requirement for valid demographic subgroup comparisons. Indeed, lack of measurement equivalence renders group comparisons ambiguous since we cannot ascertain if the differences are a function of the measured trait or if they are artifacts of the measurement process. To ensure valid interpretation of group differences, the assessment of differential item functioning (DIF) inside the IRT framework is especially useful (Embretson & Reise, 2000; Reise & Waller, 2009). DIF analysis is used to study the performance of items in scales, and it examines whether or not the likelihood of item category endorsement is equal across subgroups that are matched on the trait measure. Thus, DIF analysis involves three factors: item response, trait level, and subgroup membership. Specifically, DIF examines the relationship between item response and another variable, called the group variable (e.g., gender), conditional on a measure of an underlying construct (e.g., dispositional optimism). Its aim is to ascertain whether, after controlling for the underlying construct, the response to an item is related to group membership. If so, the item manifests DIF. For example, a randomly selected woman with a specific level of dispositional optimism and a randomly selected man with the same level of dispositional optimism should have the same chance of endorsing the same category of an item about the expectation of positive outcomes. If this is not the case, DIF is present. Unlike the previous study applying IRT to the LOT-R (Chiesi et al., 2013), in this study DIF was explored to test the equivalence of the LOT-R items across gender and age in order to ensure valid interpretations of group differences, about which previous research has reported mixed results.
With regard to gender, early adolescent females were found by Carvajal, Garner, and Evans (1998) to report higher optimism than males whereas the opposite was found among older adolescents (Extremera, Duran, & Rey, 2007) and among adults of different ages (Glaesmer et al., 2012). Controversial results were also found for age differences. Older Americans displayed a higher level of dispositional optimism than did younger Americans (You, Fung, & Isaacowitz, 2009), whereas older individuals showed a lower level of dispositional optimism than did their younger counterparts among Chinese (You et al., 2009) and Europeans (Glaesmer et al., 2012). In sum, the aims of this article are to confirm and extend the analyses of the psychometric properties of the LOT-R in Italians, by applying IRT.

Method

Participants and Measures

The sample comprised 2,862 participants (46% males) ranging in age from 17 to 95 years (M = 44.4, SD = 18.4).


Most participants (51.2%) possessed a high school diploma, and 20.5% a university degree. The remaining 28.3% had a lower educational level. Most participants (51.6%) were married or cohabiting, 37.0% were single, 7.1% were widowed, and 4.2% divorced. These distributions closely matched the national profile (ISTAT, 2013). Sampling was based on the “snowball” method. Volunteers were solicited by a group of undergraduate students to participate in the study and were encouraged to recruit their acquaintances to participate as well. Participants were provided an informed consent form and a self-report questionnaire. They were asked to carefully read and sign the informed consent form, individually complete the questionnaire, and then return both to the principal investigator. Participants did not receive any incentive for their participation.

Each participant was assessed with the Italian version of the LOT-R, which consists of 10 items. Three items (1, 4, and 10) are positively worded, whereas three items (3, 7, and 9) are negatively worded. The remaining four items are fillers. Respondents indicated the extent to which they agreed or disagreed with each of the items on a 5-point Likert-type scale ranging from strongly disagree to strongly agree. The Italian version of the LOT-R was obtained using a forward translation method. Two nonprofessional translators worked independently and then compared their translations for the purpose of assessing their equivalence. This first version was read by five people who checked its understandability and linguistic adequacy. Then, after minor revisions, the final version was obtained. The Italian version of the LOT-R was used in previous studies investigating the concurrent validity of the LOT-R for the assessment of optimism.
As expected, findings from these studies showed that optimism is positively linked to self-esteem and both psychological and subjective wellbeing and negatively related to negative affectivity and depression (Chiesi et al., 2013; Monzani et al., 2014). In gender DIF analyses, the male group comprised 1,312 cases (mean age = 45.28, SD = 17.84; range 18-90), and the female group comprised 1,550 cases (mean age = 43.60, SD = 18.79; range 17-95). For age DIF analyses, the participants were divided into three groups: Age 1 (N = 1,052; 44% males, mean age = 25.50, SD = 4.5; range 17-35), Age 2 (N = 1,005; 46% males, mean age = 46.6, SD = 5.9; range 36-55), and Age 3 (N = 805; 48% males, mean age = 67.7, SD = 7.8; range 56-95).

Statistical Analyses

Analyses were conducted excluding the four filler items; the six remaining items were labeled as follows: Item 1, “I rarely count on good things happening to me”; Item 2, “Overall, I expect more good things to happen to me than bad”; Item 3, “I hardly expect things to go my way”; Item 4, “In uncertain times, I usually expect the best”; Item 5, “If something can go wrong for me, it will”; Item 6, “I am always optimistic about my future.”

Prior to conducting the analyses, we looked at missing values in the data. For each item, we computed the percentage of missing responses, and we tested whether missing data occurred completely at random (MCAR). Little (1988) has provided a statistical test of the MCAR assumption. It is a chi-square test, and a significant value indicates that the data are not MCAR. Even when data are MCAR, direct maximum likelihood methods are more efficient (Enders, 2006), so we used an expectation maximization (EM) algorithm to impute the missing data (see Scheffer, 2002). Both Little’s test and the EM algorithm are implemented in SPSS 20.0.

As an important preliminary step, we examined the assumption of the unidimensionality of the scale through a confirmatory factor analysis in the total sample and in each group (male, female, and the three age groups). The one-factor structure of the scale was tested through categorical weighted least squares confirmatory factor analysis implemented in the Mplus software (Muthén & Muthén, 2004).

Because the data presented here were ordinal with five ordered response categories (k_i), the analyses were conducted using the graded response model (GRM; Samejima, 1969) for polytomous, ordered response categories. In this model, the probability that a response is in category k or higher for each value of the trait (θ) is estimated. The curve that relates the probability of an item response to the underlying trait measured by the item set is known as an Item Characteristic Curve (ICC). This curve can be characterized by an average discrimination parameter across response categories (denoted as a) and location (also called threshold, or severity) parameters (denoted as b_i), each of which is the point of inflection of the curve representing one response option.
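The model described above can be written out explicitly. The block below gives the standard logistic form of Samejima's GRM for a five-category item, using the symbols of the text (a, b_i, θ, k); the article itself does not print the formula, and whether the software applied an additional scaling constant is not stated, so this is a sketch of the usual parameterization rather than a record of the exact specification used.

```latex
% Probability of responding in category i or higher (i = 1, ..., 4):
P^{*}_{i}(\theta) = \frac{1}{1 + \exp\left[-a\,(\theta - b_{i})\right]},
\qquad P^{*}_{0}(\theta) = 1, \quad P^{*}_{5}(\theta) = 0 .

% Probability of responding exactly in category k (k = 0, ..., 4):
P_{k}(\theta) = P^{*}_{k}(\theta) - P^{*}_{k+1}(\theta) .
```

Each b_i is the trait level at which the probability of responding in category i or higher equals .50, which is why four thresholds suffice for five response options.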
Thus, the GRM estimates only one discrimination parameter per item, indicating the ability of the item to discriminate among people with different levels of the underlying trait. The number of threshold parameters per item is the number of response options minus 1.

The fit of the GRM was assessed by goodness-of-fit statistics and fit plots obtained with the MODFIT software (Stark, 2001). In the first step of model fit evaluation, following the method developed by Drasgow, Levine, Tsien, and Mead (1995), we tested the frequency pattern statistics not for the full contingency table,1 but for tables of single items (singlets), pairs of items (doublets), and three items (triplets) at a time. In the second step of model fit evaluation, fit plots (i.e., graphical inspections of observed vs. expected item response curves) were considered for all category response functions. The item fit under the GRM was tested in the total sample (including the Test Information Function [TIF], which provides test reliability estimations for each level of the latent trait) and in each group (male, female, younger, older) in order to perform subsequent DIF analyses (see Reise, Morizot, & Hays, 2007).

Analyses of DIF across genders and ages were performed by applying the Item Response Theory Likelihood Ratio test approach (IRTLR; Thissen, Steinberg, & Wainer, 1998) implemented in IRTPRO software (Cai, Thissen, & du Toit, 2011). In IRTLR modeling (Thissen, Steinberg, & Wainer, 1993), the DIF detection procedure is based on a nested model comparison approach. First, a more parsimonious model, with all parameters of a studied item constrained to be equal across groups, is tested against an augmented model in which one or more parameters of the studied item are freed to be estimated separately for the two groups (a focal group and a reference group). This procedure involves comparing differences in log-likelihoods (distributed as chi-square) associated with the nested models. Initial DIF estimates can be obtained by treating each item as a studied item while using the rest as “anchor” items. Anchor items are assumed to be free of DIF and are used to estimate the trait and to link the two groups being compared in terms of trait levels. Anchor items are selected through a process of log-likelihood comparison performed iteratively. This procedure is called the “purification” of the conditioning variable. If the item set includes items with DIF, the estimation of a person’s trait level may be incorrect. Indeed, when items with DIF are included, the total score may be a contaminated estimate of the underlying construct. In such cases, inaccurate DIF detection may result. Items flagged as having DIF at the first step are removed and a new total score for the remaining items is calculated. During this iterative process, the DIF status of items may change as a result of using a less than optimal conditioning variable at various steps in the analyses.
The final estimate of the attribute uses all items, but only after parameters have been appropriately set as freely or equally estimated depending on whether each item showed DIF or not. Thus, to investigate measurement equivalence, the analysis refers to item parameters, so that DIF analyses examine differences in these parameters.

The two basic types of DIF examined are uniform and nonuniform. Uniform DIF indicates that the DIF is in the same direction across the entire spectrum of the construct to be measured (i.e., at all levels of the trait one group is consistently more likely than another to endorse an item). This kind of DIF refers to severity parameters. Nonuniform DIF can be viewed as a significant group by trait interaction (i.e., it is observed when one group is more likely to endorse an item at certain levels of the trait while the other group is more likely to endorse the item at other levels). For example, the likelihood of item endorsement is higher for women than for men at lower levels of dispositional optimism but the reverse is observed for higher levels of the trait. This kind of DIF refers to discrimination parameters. Finally, differences in severity parameters are interpreted as uniform DIF only if the test of the discrimination parameter is not significant. In this case, when tests of b parameters are performed, the a parameters are constrained to be equal.

Although chi-square tests of significance are available, they have been found to be too stringent, overidentifying DIF, especially in large samples. Thus, the magnitude of DIF was also quantified, that is, the degree of difference in item performance between groups, conditional on the trait being examined. One method for quantifying the difference in the average expected item scores is the noncompensatory DIF Index (NCDIF; Raju, 1999; Raju et al., 2009) computed by the DFITP5 program (Raju, 1999). The cutoff values are controversial (see Teresi et al., 2009). In this study, given the large sample size (i.e., >500/group), we adopted the cutoff of .016 for polytomous items with five response options recommended by Flowers, Oshima, and Raju (1999) instead of the larger cutoff of .096 originally proposed by Raju (1999). Indeed, they demonstrated through simulation studies that this NCDIF cutoff improved power and Type I error rates. Because NCDIF is expressed as the average squared difference between expected scores for members of the focal group and members of the reference group, the square root of NCDIF provides an effect size in terms of the original metric.
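The NCDIF definition in the preceding paragraph (an average squared difference between expected item scores, evaluated for focal-group members) can be illustrated with a minimal Python sketch. It assumes a logistic GRM with categories scored 0 to 4; the function names, the trait values, and the item parameters are hypothetical and chosen for illustration only. They are not estimates from this study, and the sketch does not reproduce the linking or estimation steps of the DFITP5 program.

```python
import math


def expected_score(theta, a, thresholds):
    """Expected item score (categories scored 0..4) under a logistic GRM."""
    cumulative = [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    # With categories scored 0, 1, 2, ..., the expected score equals the sum
    # of the probabilities of responding in category k or higher.
    return sum(cumulative)


def ncdif(focal_thetas, focal_params, reference_params):
    """Average squared difference in expected scores over focal-group trait levels."""
    squared = [
        (expected_score(t, *focal_params) - expected_score(t, *reference_params)) ** 2
        for t in focal_thetas
    ]
    return sum(squared) / len(squared)


# Hypothetical values for illustration only (not estimates from the study).
focal_thetas = [-2.0, -1.0, 0.0, 1.0, 2.0]
reference_item = (1.5, [-2.5, -1.0, 0.0, 1.2])
focal_item = (1.5, [-2.5, -1.0, 0.3, 1.5])  # same slope, shifted thresholds

value = ncdif(focal_thetas, focal_item, reference_item)
print(f"NCDIF = {value:.3f}, effect size = {math.sqrt(value):.2f} score points")
print(f"cutoff .016 corresponds to {math.sqrt(0.016):.3f} score points")
```

Taking the square root of the .016 cutoff shows what it means on the response scale: an average difference of roughly an eighth of a scale point between groups at the same trait level.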

Results

Minimal data were missing across all variables. For each item, missing values remained at or below 0.6% of the total cases in the sample, and no case had more than two missing responses. Additionally, data were missing completely at random as indicated by a nonsignificant Little’s (1988) MCAR test, χ²(54) = 71.78, p > .05. Then, the EM algorithm was applied for missing data imputation.

Parameter Estimation

Results showed that the unidimensionality assumption was met. For the total sample, the values of the comparative fit index (CFI) and the Tucker–Lewis index (TLI) were both .90, and the standardized root mean square residual (SRMR) was .07, indicating a good fit (i.e., CFI and TLI at or above .90, SRMR below .08; Hu & Bentler, 1999). Factor loadings were all significant (p < .001), ranging from .50 to .75. Across the five gender and age groups, values of CFI and TLI ranged from .90 to .92, and SRMR ranged from .06 to .08, indicating a good fit. Factor loadings were all significant (p < .001), ranging from .46 to .79.

In the first step of GRM fit evaluation, results showed that the chi-square/df ratios never exceeded the criterion of 3 (Drasgow et al., 1995) for all singlets, doublets, and triplets, indicating a good fit. In the second step of model fit evaluation, results showed that for all plots, the expected response functions were always included in the 95% confidence interval for the observed response functions, indicating a good fit of the model. Therefore, both chi-square statistics and fit plots showed the GRM’s adequacy to the data. Figure 1 shows the fit plots for Item 6, which are representative of those for the other items.

The four b threshold parameters indicated a noticeable, moderate to large increase in the level of the latent trait at each subsequent response dichotomy (i.e., all items showed higher values in the level of the latent trait at each subsequent response category and a corresponding shift to higher trait level with higher response options). For all items, the trait values for b1, b2, b3, and b4 were quite evenly spaced, with b1 and b2 well below the mean trait (fixed at 0.00, SD = 1.00), b3 at around the mean trait, and b4 above it. Specifically, the b1 values for all items were around 2 SD below the mean trait level; the b2 values were around 1 SD below the mean trait level; the b3 values were around 0, and the b4 values around 1 SD above the mean trait level. The item a discrimination parameters ranged from 1.03 ± .05 to 2.13 ± .09. Following Baker’s (2001) cutoffs,2 items had a moderate to high discriminative power. Thus, the items can adequately distinguish between individuals with different levels of dispositional optimism (see Table A in the appendix for the parameters of the total sample).

The TIF that provides test reliability estimations (information values) for each level of the latent trait was estimated for the total sample.
As shown in Figure A in the appendix, within the range of trait from −3.00 logit (3 SD below the mean trait) to +1.60 logit (1.6 SD above the mean trait) the amount of test information was equal to or greater than 4 (which yields a standard error of estimate equal to or less than .50), indicating that the instrument was sufficiently informative for almost the full range of the trait.3 In line with the previous study (Chiesi et al., 2013), the LOT-R adequately measures optimism ranging from low to quite high levels, whereas it is less precise for the highest levels of the trait.

Measurement Equivalence Across Gender and Age

In the first step of gender DIF analyses (in which the female group was the reference group), Items 1, 2, 3, and 5 were identified as anchor items. In the next step, the remaining two items were the studied items, and results indicated that only Item 6 showed uniform DIF. The NCDIF was computed to test the magnitude of DIF for Item 6. This index was .014 (representing an absolute difference of .12 on a 5-point scale), which was lower than the criterion value of .016, attesting that the difference in severity parameters of Item 6 across groups was negligible.

Concerning age DIF, comparing Age 1 (reference group) and Age 2, all items except for Item 4 were identified as anchor items. In the next step, results indicated that Item 4 showed uniform DIF. The NCDIF index was .031 (constituting an absolute difference of .18 on a 5-point scale). This value revealed a difference in severity parameters for Item 4 across the two groups since it was higher than the criterion value of .016. Comparing Age 1 (reference group) and Age 3, Items 1, 5, and 6 were identified as anchor items in the first step. In the next step, results indicated that Items 2 and 3 showed uniform DIF. The NCDIF index for Item 2 was .014 (representing an absolute difference of .12 on a 5-point scale) and for Item 3 it was .039 (representing an absolute difference of .20 on a 5-point scale). These values attested that across groups, the difference in severity parameters for Item 2 was negligible, whereas there was a difference in severity parameters for Item 3. Finally, comparing Age 2 (reference group) and Age 3, no items showed DIF from the first step of the analyses.

Table 1 shows the final results for gender and age DIF with the separately estimated parameters for the groups. Additionally, since it has been stated that graphical displays are helpful in highlighting the magnitude of the effect sizes (Steinberg & Thissen, 2006), the ICCs of Items 3 and 4 are reported for the different age groups in Figure 2. For Item 4, the trace lines show that the probability of endorsing the item categories as a function of the value of the underlying construct is quite similar between the Age 1 and Age 2 groups. For Item 3, these probabilities differ slightly between the Age 1 and Age 3 groups.
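The arithmetic behind the reported effect sizes and significance decisions can be checked with a short Python sketch. The NCDIF values and the .016 cutoff are taken from the text; the degrees of freedom are an assumption (1 for the test of the a parameter, 4 for the test of the four b thresholds), which the article does not state but which is consistent with the p values printed in Table 1. The closed-form chi-square tail functions below are valid only for those two df values.

```python
import math


def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variate with 1 df."""
    return math.erfc(math.sqrt(x / 2.0))


def chi2_sf_df4(x):
    """Upper-tail probability of a chi-square variate with 4 df (closed form)."""
    return math.exp(-x / 2.0) * (1.0 + x / 2.0)


CUTOFF = 0.016      # NCDIF criterion adopted in the study
BONFERRONI = 0.004  # significance threshold reported in the note to Table 1

# NCDIF values reported in the text for items flagged with uniform DIF.
NCDIF = {
    "Item 6, male vs. female": 0.014,
    "Item 4, Age 1 vs. Age 2": 0.031,
    "Item 2, Age 1 vs. Age 3": 0.014,
    "Item 3, Age 1 vs. Age 3": 0.039,
}

for label, value in NCDIF.items():
    size = math.sqrt(value)  # difference expressed in response-scale points
    verdict = "above cutoff" if value > CUTOFF else "negligible"
    print(f"{label}: NCDIF = {value:.3f}, sqrt = {size:.2f}, {verdict}")

# Chi-square statistics for the b-parameter tests in Table 1 (df = 4 assumed).
for stat in (12.8, 17.8, 18.6, 18.0, 24.9, 14.0):
    p = chi2_sf_df4(stat)
    print(f"chi-square = {stat}: p = {p:.4f}, significant = {p < BONFERRONI}")
```

Only Items 3 and 4 combine a significant b-parameter test with an NCDIF above the cutoff, which matches the items singled out in the text and plotted in Figure 2.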

Discussion This article aimed at investigating the psychometric properties of the LOT-R through the IRT approach to determine its advantages over the CTT approach in evaluating psychological assessment instruments. Our findings reinforce confidence in the assessment accuracy of the LOT-R and support the large number of studies that employed the scale in the various fields of empirical and applied psychology. Confirming previous studies (Rauch et al., 2007; Vautier et al., 2003), our analyses verified the theoretical conceptualization of the construct measured by the LOT-R as a single bipolar dimension, ranging from pessimism to optimism (Scheier et al., 1994) across gender and age. In line with Chiesi et al. (2013), our analyses suggested that the model fit the data adequately under the graded response model (Samejima, 1969). Discrimination and severity parameters indicated that all items were able to distinguish adequately well among people with different levels of the trait being measured (i.e., dispositional optimism) and adequately covered the spectrum of the latent trait. Additionally, good information values were obtained for a large range of the trait. These values attest to the precision of the scale along almost the entire pessimism–optimism continuum. Indeed, only for the highest optimism levels does the LOT-R appear slightly less effective. These findings may

Downloaded from asm.sagepub.com at CAMBRIDGE UNIV LIBRARY on August 9, 2015

346

Assessment 22(3)

Figure 1.  Fit plots for the five categories (ranging from 0 to 4) of Item 6 of the LOT-R.

Note. Displayed are the expected category response functions (EMP) under the GRM and the observed category functions (ORF) with a 95% confidence interval.

Downloaded from asm.sagepub.com at CAMBRIDGE UNIV LIBRARY on August 9, 2015

347

Steca et al.

Table 1.  Item Discrimination (a) and Category Threshold (b) Estimates With Standard Errors (in Parentheses) for the Studied Items in Differential Item functioning (DIF) Analyses of the LOT-R Item Set: Comparison of Gender and Age Groups. Groupa Male versus female       Age 1 versus Age 2   Age 1 versus Age 3          

Item

a

4

1.40 (0.09) 1.30 (0.09) 2.07 (0.13) 1.93 (0.13) 1.41 (0.10) 1.58 (0.13) 2.45 (0.16) 2.42 (0.22) 1.66 (0.11) 1.31 (0.12) 1.40 (0.09) 1.81 (0.17)

6 4 2 3 4

b1

b2

b3

b4

−3.13 (0.19) −3.05 (0.18) −2.22 (0.11) −2.39 (0.13) −2.61 (0.17) −2.61 (0.21) −2.26 (0.12) −1.87 (0.16) −2.48 (0.15) −2.54 (0.26) −2.62 (0.17) −2.20 (0.20)

−1.68 (0.10) −1.65 (0.10) −1.25 (0.06) −1.09 (0.07) −1.16 (0.09) −1.44 (0.12) −1.10 (0.06) −1.10 (0.10) −0.93 (0.07) −0.98 (0.12) −1.16 (0.08) −1.27 (0.12)

−0.55 (0.06) −0.73 (0.06) −0.37 (0.04) −0.20 (0.05) −0.26 (0.06) −0.55 (0.07) −0.35 (0.05) −0.20 (0.06) −0.15 (0.05) 0.22 (0.07) −0.26 (0.06) −0.42 (0.07)

1.12 (0.08) 1.11 (0.09) 1.01 (0.06) 1.30 (0.09) 1.26 (0.09) 1.01 (0.08) 0.87 (0.06) 1.04 (0.08) 1.29 (0.08) 1.67 (0.14) 1.26 (0.09) 1.02 (0.09)

aDIFb

bDIFb

0.7 (0.416)

12.8 (0.012)   17.8 (0.0014)   18.6 (0.0009)   18.0 (0.0012)   24.9 (0.0001)   14.0 (0.0073)  

0.5 (0.464) 1.1 (0.301) 0.1 (0.909) 4.7 (0.031) 4.6 (0.032)

a. Age 1 = 17-35 years; Age 2 = 36-55 years; and Age 3 = 56-95 years. b. Chi-square statistic and associated p (in parentheses). DIF is significant if p < .004 (after Bonferroni correction adjustment).

Item 4

1.0

1.0

0.8

0.8 Probability

Probability

Item 3

0.6 0.4

0.6 0.4 0.2

0.2 0.0

0.0 -3

-2

-1

0

1

2

3

Theta

-3

-2

-1

0

1

2

3

Theta

Figure 2.  The Item Characteristic Curve (ICC) of Item 3 of the LOT-R for the Age 1 group (solid line) and the Age 3 group (dashed line) and ICC of Item 4 of the LOT-R for the Age 1 group (solid line) and the Age 2 group (dashed line). Latent trait (Theta) is shown on the horizontal axis and the probability of endorsing the item options is shown on the vertical axis. Note. Age 1 = 17-35 years; Age 2 = 36-55 years; and Age 3 = 56-95 years.

contribute to the improvement of the instrument, as they suggest that the addition of some items or the use of a response scale with more categories (e.g., a 6-point scale) might help measure the highest levels of optimism, for which the scale is less precise. Indeed, additional items or response options should allow for the entire continuum of the latent trait to be covered, including the extreme positive polarity. For assessment instruments, an individual’s tendency to endorse a particular item category should reflect his/her level of the trait being measured and should not differ based on variables other than the construct of interest, such as gender or age. If this assumption is violated, the item exhibits DIF; this can be considered as a potential source of measurement bias. Results from this study suggest that the

LOT-R can be considered immune from this kind of bias when referring to gender since the measurement equivalence at the item level was confirmed for males and females. Indeed, even though uniform DIF was detected for one item, the effect size of the difference in severity parameters was very small. It can thus be considered irrelevant in practical applications of the LOT-R for measuring dispositional optimism in male and female respondents. Concerning age, some differences were found between the younger group and the two older groups. Given this result, we should reflect on whether age constitutes a potential source of measurement bias for the LOT-R. Our conclusion is that the impact of these differences appears to be very low. This statement can be justified as follows. First of all, no age-related differences were found for discrimination

Downloaded from asm.sagepub.com at CAMBRIDGE UNIV LIBRARY on August 9, 2015


Appendix

Table A.  Item Discrimination (a) and Category Threshold (b) Estimates Along With Standard Errors (in Parentheses) for Each Item of the LOT-R.

Item   a            b1            b2            b3            b4
1      1.47 (.06)   −2.57 (.10)   −0.98 (.04)   −0.11 (.03)   1.21 (.05)
2      2.07 (.08)   −2.39 (.08)   −1.32 (.04)   −0.46 (.03)   0.95 (.04)
3      1.34 (.06)   −2.82 (.11)   −1.06 (.05)   −0.07 (.04)   1.46 (.06)
4      1.45 (.06)   −2.78 (.11)   −1.47 (.06)   −0.53 (.04)   1.11 (.05)
5      1.03 (.05)   −3.28 (.15)   −1.46 (.07)   0.03 (.04)    1.67 (.08)
6      2.13 (.09)   −2.08 (.07)   −1.01 (.04)   −0.19 (.03)   1.15 (.04)


parameters. That is, all six items discriminated equally well among different levels of dispositional optimism regardless of the respondents’ ages. Second, two items showed uniform DIF: Item 3 between the younger group and the older group, and Item 4 between the younger group and the middle-aged group. This means that these items, while equally discriminating across age groups, differed in their severity parameters. Nonetheless, the graphical displays of the item parameters, which highlight the differences among the b parameters, show that the DIF for Item 4 is rather small: the trace lines for the Likert-type responses indicate that the probability of endorsing the item categories as a function of the underlying construct is not very different for younger people and middle-aged people. For Item 3, the trace lines appear to be slightly different between the younger group and the older group. However, the variation among the b parameters was slight: the difference between the trait level required for one group to endorse a response category and the level required for the other group to endorse it never exceeded half a standard unit. Thus, we can reasonably conclude that age-related uniform DIF for Item 3 induces only slight differences in the summed scores of young and old respondents. In sum, these findings confirm that the LOT-R is suitable for all of the above-mentioned groups. They suggest that previously reported gender and age differences (Carvajal et al., 1998; Extremera et al., 2007; Glaesmer et al., 2012) are not artifacts of the measurement process but truly reflect differences in optimism. Future studies should apply the same analyses to examine the measurement equivalence of the LOT-R with respect to other variables, such as individuals’ country of origin.
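To make the notion of uniform DIF concrete, the following Python sketch compares the expected item score of two groups whose category thresholds differ by a constant shift. It is only an illustration under stated assumptions: the reference parameters are the pooled Item 3 estimates from Appendix Table A, the shift of 0.5 is a hypothetical value chosen to match the "half a standard unit" upper bound mentioned above (the group-specific estimates are not reproduced here), and the logistic metric without a scaling constant is assumed. All function and variable names are hypothetical.

```python
import math


def logistic(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def expected_score(theta, a, thresholds):
    """Expected item score under the graded response model.

    With categories coded 0..K, the expected score equals the sum of the
    cumulative probabilities P(X >= k) for k = 1..K.
    """
    return sum(logistic(a * (theta - b)) for b in thresholds)


# Pooled Item 3 estimates from Appendix Table A, treated as the reference group.
A_ITEM3 = 1.34
B_REFERENCE = [-2.82, -1.06, -0.07, 1.46]

# HYPOTHETICAL uniform DIF: every threshold of the focal group is shifted by
# the same amount, while the discrimination parameter is left unchanged.
SHIFT = 0.5
B_FOCAL = [b + SHIFT for b in B_REFERENCE]

# Trait levels from -3 to 3 in steps of 0.1.
THETA_GRID = [i / 10.0 for i in range(-30, 31)]

# Difference in expected item score (reference minus focal) at each trait level.
GAPS = [
    expected_score(t, A_ITEM3, B_REFERENCE) - expected_score(t, A_ITEM3, B_FOCAL)
    for t in THETA_GRID
]
MAX_GAP = max(GAPS)

if __name__ == "__main__":
    print(f"largest expected-score gap on the 0-4 item scale: {MAX_GAP:.3f}")
```

Because the shift is the same for every threshold and the slope is shared, the gap has the same sign at every trait level, which is what distinguishes uniform from nonuniform DIF.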
Current findings are based on an Italian sample using an Italian translation of the original LOT-R, so they cannot be generalized to other languages or cultures (e.g., American). Additionally, to provide further evidence of the properties of the LOT-R, the scale should be administered to other samples, such as clinical populations, to test its adequacy in different populations. This research contributed to the study of optimism in several respects. First, it provided the opportunity to examine the properties of a widely used instrument across ages and genders. Second, by adopting the IRT approach, it provided strong and novel evidence for the measurement quality of the LOT-R, which had previously been investigated with the more traditional CTT approach. Importantly, this evidence was gathered by analyzing data collected on a very large sample of adults, a fact that strengthens the robustness of the findings. Third, current findings add to the list of the positive characteristics of the LOT-R. Because it is brief, easy to use, and easy to understand, it is particularly suitable for people who are impaired by long or complex tasks (e.g., ill or old people) and for situations characterized by time restrictions.


Figure A.  Test Information Function (TIF) of the Life Orientation Test-Revised (LOT-R).

Note. Latent trait (Theta) is shown on the horizontal axis, and the amount of information and the standard error yielded by the test at any trait level is shown on the vertical axis.

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

Notes

1. In this case, the number of cells for the full contingency table is p = 5^6 (where 6 denotes the number of items and 5 the number of response categories), and therefore the large majority of the cells will have observed and expected frequencies of 0.
2. According to Baker (2001), values 0.01 to 0.24 are very low, 0.25 to 0.64 are low, 0.65 to 1.34 are moderate, 1.35 to 1.69 are high, and more than 1.7 are very high.

3. We can interpret the information magnitude by computing the associated reliability (r = 1 − 1/Information). Thus, reliability was equal to or greater than .75 within the range described.
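The arithmetic behind Notes 1 to 3 can be checked with a short Python sketch. It counts the cells of the full contingency table, labels the discrimination estimates from Appendix Table A using the cut points as quoted in Note 2, and converts information into reliability and standard error. The function names are hypothetical, and the treatment of values that fall between the quoted ranges (e.g., between 1.69 and 1.7) is an assumption of this sketch.

```python
import math

# Note 1: five response categories for each of six items.
N_CELLS = 5 ** 6  # 15625 possible response patterns


def discrimination_label(a):
    """Label a discrimination estimate with the cut points quoted in Note 2.

    Values between two quoted ranges are assigned to the lower label,
    which is an assumption of this sketch.
    """
    if a < 0.25:
        return "very low"
    if a < 0.65:
        return "low"
    if a < 1.35:
        return "moderate"
    if a <= 1.70:
        return "high"
    return "very high"


def reliability(information):
    """Note 3: reliability associated with a given amount of information."""
    return 1.0 - 1.0 / information


def standard_error(information):
    """Standard error of the trait estimate for a given amount of information."""
    return 1.0 / math.sqrt(information)


# Discrimination estimates from Appendix Table A.
A_ESTIMATES = {1: 1.47, 2: 2.07, 3: 1.34, 4: 1.45, 5: 1.03, 6: 2.13}
LABELS = {item: discrimination_label(a) for item, a in A_ESTIMATES.items()}

if __name__ == "__main__":
    print(N_CELLS)
    print(LABELS)
    print(reliability(4.0), standard_error(4.0))
```

Under the formula in Note 3, a reliability of .75 corresponds to an information value of 4, that is, a standard error of 0.5 on the latent trait scale.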

References

Ayub, N. (2009). Measuring hopelessness and life orientation in Pakistani adolescents. Crisis: The Journal of Crisis Intervention and Suicide Prevention, 30, 153-160. doi:10.1027/0227-5910.30.3.153
Baker, F. B. (2001). The basics of item response theory (2nd ed.). College Park: University of Maryland.
Cai, L., Thissen, D., & du Toit, S. H. C. (2011). IRTPRO 2.1 for Windows. Chicago, IL: Scientific Software International.
Carvajal, S. C., Garner, R. L., & Evans, R. I. (1998). Dispositional optimism as a protective factor in resisting HIV exposure in sexually active inner-city minority adolescents. Journal of Applied Social Psychology, 28, 2196-2211.
Carver, C. S., & Scheier, M. F. (1998). On the self-regulation of behavior. New York, NY: Cambridge University Press.
Chang, E. C., & Sanna, L. J. (2003). Optimism, accumulated life stress, and psychological and physical adjustment: Is it always adaptive to expect the best? Journal of Social & Clinical Psychology, 22, 97-115.
Chiesi, F., Galli, S., Primi, C., Innocenti Borgi, P., & Bonacchi, A. (2013). The accuracy of the Life Orientation Test-Revised (LOT-R) in measuring dispositional optimism: Evidence from item response theory analyses. Journal of Personality Assessment, 95, 523-529. doi:10.1080/00223891.2013.781029
Chong, W. H., Huan, V. S., Yeo, L. S., & Ang, R. P. (2006). Asian adolescents’ perceptions of parent, peer, and school support and psychological adjustment: The mediating role of dispositional optimism. Current Psychology: Developmental, Learning, Personality, Social, 25, 212-228.
Cozzarelli, C. (1993). Personality and self-efficacy as predictors of coping with abortion. Journal of Personality and Social Psychology, 65, 1224-1236.
Creed, P. A., Patton, W., & Bartrum, D. (2002). Multidimensional properties of the LOT-R: Effects of optimism and pessimism on career and well-being related variables in adolescents. Journal of Career Assessment, 10, 42-61. doi:10.1177/1069072702010001003
Curbow, B., Somerfield, M. R., Baker, F., Wingard, J. R., & Legro, M. W. (1993). Personal changes, dispositional optimism, and psychological adjustment to bone marrow transplantation. Journal of Behavioral Medicine, 16, 423-443.
Drasgow, F., Levine, M. V., Tsien, S., Williams, B., & Mead, A. D. (1995). Fitting polytomous item response theory models to multiple-choice tests. Applied Psychological Measurement, 19, 143-166. doi:10.1177/014662169501900203
Embretson, S. E., & Reise, S. P. (2000). Item response theory for psychologists. Mahwah, NJ: Erlbaum.
Enders, C. K. (2006). Analyzing structural equation models with missing data. In G. Hancock & R. Mueller (Eds.), Structural equation modeling: A second course (pp. 313-342). Greenwich, CT: Information Age.
Extremera, N., Duran, A., & Rey, L. (2007). Perceived emotional intelligence and dispositional optimism-pessimism: Analyzing their role in predicting psychological adjustment among adolescents. Personality and Individual Differences, 42, 1069-1079. doi:10.1016/j.paid.2006.09.014
Fitzgerald, T. E., Tennen, H., Affleck, G., & Pransky, G. S. (1993). The relative importance of dispositional optimism and control appraisals in quality of life after coronary artery bypass surgery. Journal of Behavioral Medicine, 16, 25-43.
Flowers, C. P., Oshima, T. C., & Raju, N. S. (1999). A description and demonstration of the polytomous DFIT framework. Applied Psychological Measurement, 23, 309-332.
Fournier, M., de Ridder, D., & Bensing, J. (2002). How optimism contributes to the adaptation of chronic illness. A prospective study into the enduring effects of optimism on adaptation moderated by the controllability of chronic illness. Personality and Individual Differences, 33, 1163-1183.
Friedman, L. C., Nelson, D. V., Baer, P. E., Lane, M., Smith, F. E., & Dworkin, R. J. (1992). The relationship of dispositional optimism, daily life stress, and domestic environment to coping methods used by cancer patients. Journal of Behavioral Medicine, 15, 127-141.
Glaesmer, H., Rief, W., Martin, A., Mewes, R., Brähler, E., Zenger, M., & Hinz, A. (2012). Psychometric properties and population-based norms of the Life Orientation Test-Revised (LOT-R). British Journal of Health Psychology, 17, 432-445. doi:10.1111/j.2044-8287.2011.02046.x
Grove, J. R., & Heard, N. P. (1997). Optimism and sport confidence as correlates of slump related coping among athletes. The Sport Psychologist, 11, 400-410.
Hambleton, R. K., Swaminathan, H., & Rogers, H. J. (1991). Fundamentals of item response theory. Newbury Park, CA: Sage.
Hu, L., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6, 1-55.
Huan, V. S., Yeo, L. S., Ang, R. P., & Chong, W. H. (2006). The influence of dispositional optimism & gender on adolescents’ perception of academic stress. Adolescence, 41, 533-546.
ISTAT. (2013). Rapporto annuale 2013 [Annual report of statistics, 2013]. Rome, Italy: Author.
Jackson, L. M., Pratt, M. W., Hunsberger, B., & Pancer, S. M. (2005). Optimism as a mediator of the relation between perceived parental authoritativeness and adjustment among adolescents: Finding the sunny side of the street. Social Development, 14, 273-304. doi:10.1111/j.1467-9507.2005.00302.x
Jerusalem, M. (1993). Personal resources, environmental constraints, and adaptational processes: The predictive power of a theoretical stress model. Personality and Individual Differences, 14, 15-24.
Jones, T., DeMore, M., Cohen, L. L., O’Connell, C., & Jones, D. (2008). Childhood healthcare experience, healthcare attitudes, and optimism as predictors of adolescents’ healthcare behavior. Journal of Clinical Psychology in Medical Settings, 15, 234-240. doi:10.1007/s10880-008-9126-7
Lai, J. C. L. (2009). Dispositional optimism buffers the impact of daily hassles on mental health in Chinese adolescents. Personality and Individual Differences, 47, 247-249. doi:10.1016/j.paid.2009.03.007
Litt, M. D., Tennen, H., Affleck, G., & Klock, S. (1992). Coping and cognitive factors in adaptation to in vitro fertilization failure. Journal of Behavioral Medicine, 15, 171-187.
Little, R. J. A. (1988). A test of missing completely at random for multivariate data with missing values. Journal of the American Statistical Association, 83, 1198-1202.


Meevissen, Y. M. C., Peters, M. L., & Alberts, H. J. E. M. (2011). Become more optimistic by imagining a best possible self: Effects of a two week intervention. Journal of Behavior Therapy and Experimental Psychiatry, 42, 371-378. doi:10.1016/j.jbtep.2011.02.012
Monzani, D., Steca, P., & Greco, A. (2014). Brief report: Assessing dispositional optimism in adolescence—Factor structure and concurrent validity of the Life Orientation Test-Revised. Journal of Adolescence, 37, 94-101. doi:10.1016/j.adolescence.2013.11.006
Murberg, T. A. (2012). The influence of optimistic expectations and negative life events on somatic symptoms among adolescents: A one-year prospective study. Psychology, 3, 123-127. doi:10.4236/psych.2012.32018
Muthén, L. K., & Muthén, B. O. (2004). Mplus: The comprehensive modeling program for applied researchers. User’s guide (3rd ed.). Los Angeles, CA: Muthén & Muthén.
Nes, L. S., & Segerstrom, S. C. (2006). Dispositional optimism and coping: A meta-analytic review. Personality and Social Psychology Review, 10, 235-251. doi:10.1207/s15327957pspr1003_3
Puskar, K. R., Sereika, S. M., Lamb, J., Tusaie-Mumford, K., & McGuinness, T. (1999). Optimism and its relationship to depression, coping, anger, and life events in rural adolescents. Issues in Mental Health Nursing, 20, 115-130.
Raju, N. S. (1999). DFITP5: A Fortran program for calculating dichotomous DIF/DTF. Chicago: Illinois Institute of Technology.
Raju, N. S., Fortmann-Johnson, K. A., Kim, W., Morris, S. B., Nering, M. L., & Oshima, T. C. (2009). The item parameter replication method for detecting differential functioning in the polytomous DFIT framework. Applied Psychological Measurement, 33, 133-147.
Rauch, W. A., Schweizer, K., & Moosbrugger, H. (2007). Method effects due to social desirability as a parsimonious explanation of the deviation from unidimensionality in LOT-R scores. Personality and Individual Differences, 42, 1597-1607. doi:10.1016/j.paid.2006.10.035
Reise, S. P., Morizot, J., & Hays, R. D. (2007). The role of the bifactor model in resolving dimensionality issues in health outcomes measures. Quality of Life Research: An International Journal of Quality of Life Aspects of Treatment, Care & Rehabilitation, 16, 19-31. doi:10.1007/s11136-007-9183-7
Reise, S. P., & Waller, N. G. (2009). Item response theory and clinical measurement. Annual Review of Clinical Psychology, 5, 25-46.
Rius-Ottenheim, N., Kromhout, D., van der Mast, R. C., Zitman, F. G., Geleijnse, J. M., & Giltay, E. J. (2012). Dispositional optimism and loneliness in older men. International Journal of Geriatric Psychiatry, 27, 151-159. doi:10.1002/gps.2701
Samejima, F. (1969). Estimation of latent ability using a response pattern of graded scores. Psychometrika Monograph Supplement, 34(4), 100-100.
Scheffer, J. (2002). Dealing with missing data. Research Letters in the Information and Mathematical Sciences, 3, 153-160. doi:10.1.1.18.3086
Scheier, M. F., & Carver, C. S. (1985). Optimism, coping, and health: Assessment and implications of generalized outcome expectancies. Health Psychology, 4, 219-247. doi:10.1037/0278-6133.4.3.219
Scheier, M. F., Carver, C. S., & Bridges, M. W. (1994). Distinguishing optimism from neuroticism (and trait anxiety, self-mastery, and self-esteem): A reevaluation of the life orientation test. Journal of Personality and Social Psychology, 67, 1063-1078.
Scheier, M. F., Weintraub, J. K., & Carver, C. S. (1986). Coping with stress: Divergent strategies of optimists and pessimists. Journal of Personality and Social Psychology, 51, 1257-1264.
Stack, S. (2001). MODFIT: A computer program for model-data fit. Urbana-Champaign: University of Illinois at Urbana-Champaign. Retrieved from http://work.psych.uiuc.edu/irt
Steed, L. G. (2002). A psychometric comparison of four measures of hope and optimism. Educational and Psychological Measurement, 62, 466-482. doi:10.1177/00164402062003005
Steinberg, L., & Thissen, D. (2006). Using effect sizes for research reporting: Examples using item response theory to analyze differential item functioning. Psychological Methods, 11, 402. doi:10.1037/1082-989X.11.4.402
Teresi, J. A., Ocepek-Welikson, K., Kleinman, M., Eimicke, J. P., Crane, P. K., Jones, R. N., . . . Cella, D. (2009). Analysis of differential item functioning in the depression item bank from the Patient Reported Outcome Measurement Information System (PROMIS): An item response theory approach. Psychology Science Quarterly, 51, 148-180.
Thissen, D., Steinberg, L., & Wainer, H. (1993). Detection of differential item functioning using the parameters of item response models. In D. Thissen, L. Steinberg & H. Wainer (Eds.), Differential item functioning (pp. 67-113). Hillsdale, NJ: Lawrence Erlbaum.
Thissen, D., Steinberg, L., & Wainer, H. (1998). Use of item response theory in the study of group differences in trace lines. In H. Wainer & H. I. Braun (Eds.), Test validity (pp. 149-169). Hillsdale, NJ: Erlbaum.
Thomas, M. L. (2011). The value of item response theory in clinical assessment: A review. Assessment, 18, 291-307. doi:10.1177/1073191110374797
Vacek, K. R., Coyle, L. D., & Vera, E. M. (2010). Stress, self-esteem, hope, optimism, and well-being in urban ethnic minority adolescents. Journal of Multicultural Counseling and Development, 38, 99-111. doi:10.1002/j.2161-1912.2010.tb00118.x
Vautier, S., Raufaste, E., & Cariou, M. (2003). Dimensionality of the Revised Life Orientation Test and the status of filler items. International Journal of Psychology, 38, 390-400.
Weber, S., Puskar, K. R., & Ren, D. (2010). Relationships between depressive symptoms and perceived social support, self-esteem, & optimism in a sample of rural adolescents. Issues in Mental Health Nursing, 31, 584-588. doi:10.3109/01612841003775061
Wong, S. S., & Lim, T. (2009). Hope versus optimism in Singaporean adolescents: Contributions to depression and life satisfaction. Personality and Individual Differences, 46, 648-652. doi:10.1016/j.paid.2009.01.009
Yarcheski, T. J., Mahon, N. E., & Yarcheski, A. (2004). Depression, optimism, and positive health practices in young adolescents. Psychological Reports, 95, 932-934. doi:10.2466/PR0.95.7.932-934
You, J., Fung, H. H. L., & Isaacowitz, D. M. (2009). Age differences in dispositional optimism: A cross-cultural study. European Journal of Ageing, 6, 247-252. doi:10.1007/s10433-009-0130-z
